
# Section: Federated Learning

# Lesson: Introducing Federated Learning

Federated Learning is a technique for training Deep Learning models on data to which you do not have access. Basically:

Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload "model updates" to a central server.

Use Cases:

    - app company (Texting prediction app)
    - predictive maintenance (automobiles / industrial engines)
    - wearable medical devices
    - ad blockers / autocomplete in browsers (Firefox/Brave)
    
Challenge Description: data is distributed among many sources, but we cannot aggregate it because of:

    - privacy concerns: legal, user discomfort, competitive dynamics
    - engineering: the bandwidth/storage requirements of aggregating the larger dataset
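To make "bring the model to the data" concrete, here is a minimal plain-Python sketch of one federated round for a one-parameter model `y = w * x`. It does not use PySyft; the names `local_update` and `federated_round` and the toy data are assumptions made up for illustration. Each client computes a weight update on its own private data, and the server only ever sees (and averages) those updates.

```python
# Plain-Python sketch of federated rounds; all names here are illustrative.

def local_update(w, data, lr=0.1):
    """One gradient step of the MSE loss on a client's private (x, y) pairs.

    Only the weight delta leaves the client, never the data itself.
    """
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return -lr * grad


def federated_round(w, clients, lr=0.1):
    """Server side: average the clients' deltas and apply them to the model."""
    deltas = [local_update(w, data, lr) for data in clients]
    return w + sum(deltas) / len(deltas)


# Two clients whose private data follow y = 2 * x.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
# w is now very close to 2.0
```

The raw `(x, y)` pairs never appear on the server side of this sketch; only `deltas` do.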

# Lesson: Introducing / Installing PySyft

In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. This will require a new set of tools. Specifically, we will use an extension of PyTorch called PySyft.

### Install PySyft

- If you are using Google Colab, you can simply install PySyft using the following command:
`! pip install syft`

- If you are using PySyft locally, the easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html). Create a new environment, then install the dependencies in that environment. In your terminal:

```bash
conda create -n pysyft python=3
conda activate pysyft # some older versions of conda require "source activate pysyft" instead.
conda install jupyter notebook
pip install syft
pip install numpy
```

If you have any errors relating to zstd - run the following (if everything above installed fine then skip this step):

```
pip install --upgrade --force-reinstall zstd
```

and then retry installing syft (pip install syft).

If you are using Windows, I suggest installing [Anaconda and using the Anaconda Prompt](https://docs.anaconda.com/anaconda/user-guide/getting-started/) to work from the command line. 

With this environment activated and in the repo directory, launch Jupyter Notebook:

```bash
jupyter notebook
```

and re-open this notebook on the new Jupyter server.



In [1]:
! pip install syft
import syft

Collecting syft
  Downloading https://files.pythonhosted.org/packages/dc/78/fc404dd6236e876f679b9f2f66f6d648f07d9d4938d8126a870b0a44fc4e/syft-0.1.28a1-py3-none-any.whl (309kB)
Collecting websocket-client>=0.56.0 (from syft)
  Downloading https://files.pythonhosted.org/packages/29/19/44753eab1fdb50770ac69605527e8859468f3c0fd7dc5a76dd9c4dbd7906/websocket_client-0.56.0-py2.py3-none-any.whl (200kB)
Collecting lz4>=2.1.6 (from syft)
  Downloading https://files.pythonhosted.org/packages/5d/5e/cedd32c203ce0303188b0c7ff8388bba3c33e4bf6da21ae789962c4fb2e7/lz4-2.2.1-cp36-cp36m-manylinux1_x86_64.whl (395kB)
Collecting torch==1.1 (from syft)
  Downloading https://files.pythonhosted.org/packages/69/60/f685fb2cfb3088736bafbc9bdbb455327bdc8906b606da9c9a81bae1c81e/torch-1.1.0-cp36-cp36m-

Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.15.0-rc3.so'






In [0]:
import torch as th
import syft as sy

In [0]:
hook = sy.TorchHook(th)  # hooks PyTorch: extends torch tensors with Syft methods such as .send() and .get()

In [4]:
x = th.tensor([1,2,3,4,5]) # notice that Torch functionalities still behave the same
x

tensor([1, 2, 3, 4, 5])

# Lesson: Basic Remote Execution in PySyft

## PySyft => Remote PyTorch

The essence of Federated Learning is the ability to train models in parallel on a wide number of machines. Thus, we need the ability to tell remote machines to execute the operations required for Deep Learning.

Thus, instead of using Torch tensors - we're now going to work with **pointers** to tensors. Let me show you what I mean. First, let's create a "pretend" machine owned by a "pretend" person - we'll call him Bob.

In [0]:
bob = sy.VirtualWorker(hook, id="bob") # creates a virtual worker (a simulation that stands in for Bob's machine)

In [6]:
print(f'Type: {type(bob._objects)} \nValue: {bob._objects}')

Type: <class 'dict'> 
Value: {}


In [0]:
x = th.tensor([1,2,3,4,5])

In [0]:
x =  x.send(bob) # send this data to bob

In [9]:
bob._objects

{20350864914: tensor([1, 2, 3, 4, 5])}

In [10]:
# What's the type of the pointer, and what's the reason behind this type?
# What's its value?

print(f'Type: {type(x)} \nValue: {x}')

Type: <class 'torch.Tensor'> 
Value: (Wrapper)>[PointerTensor | me:60370760932 -> bob:20350864914]


In [11]:
x.location # where is the tensor located?

<VirtualWorker id:bob #objects:1>

In [12]:
x.id # the pointer ID at our machine

60370760932

In [13]:
x.id_at_location # the ID of the tensor at the remote worker

20350864914

In [14]:
x.owner

<VirtualWorker id:me #objects:0>

In [15]:
x = x.get()
x

tensor([1, 2, 3, 4, 5])

In [16]:
x

tensor([1, 2, 3, 4, 5])

In [17]:
bob._objects

{}
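The mechanics above (an object store on the worker, and a local pointer that records `location` and `id_at_location`) can be mimicked in a few lines of plain Python. This is only a sketch of the idea, not PySyft's implementation: the `Worker` and `Pointer` classes and the `send` function below are hypothetical stand-ins, and a Python list plays the role of the tensor.

```python
import random


class Worker:
    """Hypothetical stand-in for a VirtualWorker: an id plus an object store."""

    def __init__(self, worker_id):
        self.id = worker_id
        self._objects = {}


class Pointer:
    """Hypothetical stand-in for a PointerTensor."""

    def __init__(self, location, id_at_location):
        self.location = location              # which worker holds the data
        self.id_at_location = id_at_location  # key in that worker's store

    def get(self):
        # Fetch the value and remove it from the remote store.
        return self.location._objects.pop(self.id_at_location)


def send(value, worker):
    """Store `value` on `worker` and return a pointer to it."""
    remote_id = random.randint(0, 10**11)
    worker._objects[remote_id] = value
    return Pointer(worker, remote_id)


bob = Worker("bob")
x = send([1, 2, 3, 4, 5], bob)
stored_remotely = x.id_at_location in bob._objects  # the data lives on bob
x = x.get()  # back to a plain local list; bob's store is empty again
```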

# Project: Experience with Remote Tensors

In this project, I want you to .send() and .get() a tensor to TWO workers by calling .send(bob,alice). This will first require the creation of another VirtualWorker called alice.

In [0]:
# create a second Virtual worker and call it Alice
alice = sy.VirtualWorker(hook, id='alice')

In [0]:
# 1 - create some data (a tensor)
# 2- send the data to bob and alice
x = th.tensor([2,3,4])
x = x.send(bob, alice)

In [0]:
# notice what the send function returns when given two workers: a multi-pointer
x

(Wrapper)>[MultiPointerTensor]
	-> [PointerTensor | me:56454933827 -> bob:90495103448]
	-> [PointerTensor | me:85796864791 -> alice:61563688226]

In [0]:
# what does .child on the pointer object return?
# what does .child.child on the pointer object return? 
print(f'Type: {type(x.child)} \nValue: {x.child}')
print(f'Type: {type(x.child.child)} \nValue: {x.child.child}')

Type: <class 'syft.generic.pointers.multi_pointer.MultiPointerTensor'> 
Value: [MultiPointerTensor]
	-> [PointerTensor | me:56454933827 -> bob:90495103448]
	-> [PointerTensor | me:85796864791 -> alice:61563688226]
Type: <class 'dict'> 
Value: {'bob': [PointerTensor | me:56454933827 -> bob:90495103448], 'alice': [PointerTensor | me:85796864791 -> alice:61563688226]}


In [0]:
bob._objects

{90495103448: tensor([2, 3, 4])}

In [0]:
alice._objects

{61563688226: tensor([2, 3, 4])}

In [0]:
# try the .get() on the pointer
x = x.get()
x

[tensor([2, 3, 4]), tensor([2, 3, 4])]

In [0]:
alice._objects

{}

In [0]:
bob._objects

{}

In [0]:
# 1 - create some data (a tensor)
# 2- send the data to bob and alice
x = th.tensor([2, 5, 4])
x = x.send(bob, alice)

In [0]:
# try the .get(sum_results=True) on your pointer
total = x.get(sum_results=True)  # named `total` to avoid shadowing the built-in sum()
total

tensor([ 4, 10,  8])
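As a rough mental model of what the multi-pointer does, the plain-Python sketch below copies a value to every worker and then either returns all copies or their elementwise sum. The helper names `send_to_all` and `get_all` are hypothetical and are not part of PySyft; dictionaries stand in for the workers' object stores.

```python
def send_to_all(value, stores):
    """Copy `value` into every worker store; return {worker_name: key}."""
    pointers = {}
    for name, store in stores.items():
        key = len(store)
        store[key] = list(value)
        pointers[name] = key
    return pointers


def get_all(pointers, stores, sum_results=False):
    """Retrieve (and remove) every remote copy; optionally sum elementwise."""
    results = [stores[name].pop(key) for name, key in pointers.items()]
    if sum_results:
        return [sum(column) for column in zip(*results)]
    return results


stores = {"bob": {}, "alice": {}}
x = send_to_all([2, 5, 4], stores)
total = get_all(x, stores, sum_results=True)  # [4, 10, 8]
```

Because each worker holds an identical copy, the sum over two workers is simply twice the original values, which matches the output above.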

# Lesson: Introducing Remote Arithmetic

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(bob)

In [0]:
x # a pointer to a remote tensor located at bob

(Wrapper)>[PointerTensor | me:96502616908 -> bob:83999397113]

In [0]:
y # a pointer to another remote tensor located at bob

(Wrapper)>[PointerTensor | me:35900922850 -> bob:59092289169]

In [0]:
z = x + y # use the pointers x and y as if they were local tensors; the addition is actually executed remotely on bob
z

(Wrapper)>[PointerTensor | me:12996980457 -> bob:54222168809]

In [0]:
z = z.get()
z

tensor([2, 3, 4, 5, 6])

In [0]:
x = th.tensor([1.,2,3,4,5], requires_grad=True).send(bob)
y = th.tensor([1.,1,1,1,1], requires_grad=True).send(bob)

In [0]:
z = (x + y).sum()
z = z.get()
z

tensor(20., requires_grad=True)

# Project: Learn a Simple Linear Model

In this project, I'd like you to create a simple linear model that fits the dataset below. You should use only Variables (tensors with `requires_grad=True`) and .backward() to do so (no optimizers or nn.Modules).

Furthermore, you must do so with both the data and the model being located on Bob's machine.

In [0]:
import torch as th
import syft as sy

hook = sy.TorchHook(th)  
bob = sy.VirtualWorker(hook, id="bob")

bob



<VirtualWorker id:bob #objects:7>

In [0]:
# create some toy data for our model
input_data = th.tensor([[1., 1],[0.5, 1],[1, 0],[0, 0]], requires_grad=True).send(bob)
output_data = th.tensor([[1.],[1],[0],[0]], requires_grad=True).send(bob)

In [0]:
# create some linear weights and send them to bob
weights = th.tensor([[0.01],[0.01]], requires_grad = True).send(bob)

In [0]:
# create a linear model and train it on Bob's machine
# remember how to create a linear model? No? :( 
# Here are the main steps:
#    1- compute a prediction
#    2- calculate the loss (mean squared error)
#    3- backpropagate using the backward() function
#    4- update the weights:
#       weights.data.sub_(weights.grad * lr)
#    5- DO NOT forget to clear your gradients after updating the weights

prediction = input_data.mm(weights)
prediction

(Wrapper)>[PointerTensor | me:7479747335 -> bob:2114841713]

In [0]:
lr = 0.1

for i in range(10):
  prediction = input_data.mm(weights)
  loss = ((prediction - output_data)**2).mean()
  loss.backward()
  weights.data.sub_(weights.grad * lr)
  weights.grad *= 0

  print(loss.get().data)

tensor(0.0468)
tensor(0.0385)
tensor(0.0357)
tensor(0.0332)
tensor(0.0310)
tensor(0.0290)
tensor(0.0271)
tensor(0.0254)
tensor(0.0238)
tensor(0.0223)
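For reference, the loop above is plain gradient descent on the mean squared error. The notation here is introduced only for illustration: $X$ stands for `input_data`, $y$ for `output_data`, $w$ for `weights`, $\eta$ for `lr`, and $N$ for the number of rows (four in this dataset).

$$
L(w) = \frac{1}{N}\sum_{i=1}^{N}\left(x_i^\top w - y_i\right)^2,
\qquad
\nabla_w L = \frac{2}{N}\, X^\top (Xw - y),
\qquad
w \leftarrow w - \eta\, \nabla_w L
$$

`loss.backward()` computes $\nabla_w L$ on Bob's machine, `weights.data.sub_(weights.grad * lr)` applies the update, and `weights.grad *= 0` resets the accumulated gradient before the next iteration.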


# Lesson: Garbage Collection and Common Errors


In [0]:
bob = bob.clear_objects() # remove every object stored on the remote worker

In [0]:
bob._objects

{}

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [0]:
bob._objects

{85910937763: tensor([1, 2, 3, 4, 5])}

In [0]:
del x  # delete the pointer to the remote object

In [0]:
bob._objects

{}

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [0]:
x.child.garbage_collect_data  # True by default

True

In [0]:
bob._objects

{84615314564: tensor([1, 2, 3, 4, 5])}

In [0]:
x = "asdf"

In [0]:
bob._objects

{}

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [0]:
x

(Wrapper)>[PointerTensor | me:39795892651 -> bob:12911623213]

In [0]:
bob._objects

{12911623213: tensor([1, 2, 3, 4, 5])}

In [0]:
x = "asdf"

In [0]:
bob._objects  # the tensor is still here: Jupyter cached the pointer when we displayed x above, so a reference to it survives

{12911623213: tensor([1, 2, 3, 4, 5])}

In [0]:
del x

In [0]:
bob._objects

{12911623213: tensor([1, 2, 3, 4, 5])}

In [0]:
bob = bob.clear_objects() # force-erase everything stored on bob
bob._objects

{}

In [0]:
for i in range(1000):
    x = th.tensor([1,2,3,4,5]).send(bob)

In [0]:
bob._objects # notice that there is only a single tensor in bob

{15601760445: tensor([1, 2, 3, 4, 5])}
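The behaviour in this lesson (deleting or rebinding the pointer removes the remote tensor) can be imitated with Python's `__del__` hook. The sketch below is an assumption about the general mechanism rather than PySyft's actual code: `Worker`, `Pointer`, and `send` are hypothetical, and the timing relies on CPython's reference counting, which runs `__del__` as soon as the last reference disappears.

```python
class Worker:
    def __init__(self):
        self._objects = {}


class Pointer:
    def __init__(self, worker, key, garbage_collect_data=True):
        self.worker = worker
        self.key = key
        self.garbage_collect_data = garbage_collect_data

    def __del__(self):
        # When the last local reference disappears, drop the remote object too.
        if self.garbage_collect_data:
            self.worker._objects.pop(self.key, None)


def send(value, worker, key):
    """Store `value` on `worker` under `key` and return a pointer to it."""
    worker._objects[key] = value
    return Pointer(worker, key)


bob = Worker()
for i in range(1000):
    # Rebinding x drops the previous pointer, which deletes its remote value.
    x = send([1, 2, 3, 4, 5], bob, key=i)

remaining = len(bob._objects)  # only the most recent value survives
del x
after_del = len(bob._objects)  # nothing is left on bob
```

Any extra reference to the pointer, such as a notebook's cached cell output, keeps `__del__` from running, which is why the remote tensor can linger.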

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1])

In [0]:
z = x + y

TensorsNotCollocatedException: You tried to call a method involving two tensors where one tensor is actually locatedon another machine (is a PointerTensor). Call .get() on the PointerTensor or .send(bob) on the other tensor.

Tensor A: [PointerTensor | me:46419059800 -> bob:14412738960]
Tensor B: tensor([1, 1, 1, 1, 1])

In [0]:
alice = sy.VirtualWorker(hook, id="alice")
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(alice)

In [0]:
z = x + y

TensorsNotCollocatedException: ignored
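Both errors above follow from the same rule: an operation can only run where all of its operands live. The plain-Python sketch below illustrates that rule together with remote addition. Every name in it (`Worker`, `Pointer`, `NotCollocatedError`) is a hypothetical stand-in and not PySyft's implementation.

```python
class NotCollocatedError(Exception):
    """Hypothetical analogue of TensorsNotCollocatedException."""


class Worker:
    def __init__(self, name):
        self.name = name
        self._objects = {}
        self._next_id = 0

    def store(self, value):
        """Keep `value` on this worker and hand back a pointer to it."""
        self._next_id += 1
        self._objects[self._next_id] = value
        return Pointer(self, self._next_id)


class Pointer:
    def __init__(self, location, id_at_location):
        self.location = location
        self.id_at_location = id_at_location

    def __add__(self, other):
        # Both operands must be pointers to objects on the same worker.
        if not isinstance(other, Pointer) or other.location is not self.location:
            raise NotCollocatedError("operands live on different machines")
        objs = self.location._objects
        a, b = objs[self.id_at_location], objs[other.id_at_location]
        # The addition runs "on the worker"; only a pointer comes back.
        return self.location.store([p + q for p, q in zip(a, b)])

    def get(self):
        return self.location._objects.pop(self.id_at_location)


bob, alice = Worker("bob"), Worker("alice")
x = bob.store([1, 2, 3, 4, 5])
y = bob.store([1, 1, 1, 1, 1])
z = (x + y).get()  # [2, 3, 4, 5, 6]

try:
    x + alice.store([1, 1, 1, 1, 1])  # operands on bob and alice
    raised = False
except NotCollocatedError:
    raised = True
```

The fix is the one the error message suggests: bring both operands to the same place, either with `.get()` on the pointer or `.send()` on the other tensor.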

# Lesson: Toy Federated Learning

Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:

- a toy dataset
- a model
- some basic training logic for training a model to fit the data.

In [0]:
from torch import nn, optim

In [0]:
# A Toy Dataset
data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)

In [0]:
# A Toy Model
model = nn.Linear(2,1)

In [0]:
opt = optim.SGD(params=model.parameters(), lr=0.1)

In [0]:
def train(iterations=10):
    for iter in range(iterations):
        opt.zero_grad()

        pred = model(data)

        loss = ((pred - target)**2).mean()

        loss.backward()

        opt.step()

        print(loss.data)
        
train()

tensor(0.0060)
tensor(0.0056)
tensor(0.0052)
tensor(0.0048)
tensor(0.0044)
tensor(0.0041)
tensor(0.0038)
tensor(0.0036)
tensor(0.0033)
tensor(0.0031)


Nothing was federated up to this point!

In [18]:
# let's repeat the previous experiment in an FL approach:

import torch as th
import syft as sy

hook = sy.TorchHook(th)  

bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")

bob = bob.clear_objects()
alice = alice.clear_objects()



In [0]:
# create local datasets at Bob and Alice
data_bob = th.tensor([[1.,1],[0,1]], requires_grad=True).send(bob)
target_bob = th.tensor([[1.],[1]], requires_grad=True).send(bob)

data_alice = th.tensor([[1., 0],[0, 0]], requires_grad=True).send(alice)
target_alice = th.tensor([[0.],[0]], requires_grad=True).send(alice)

In [20]:
datasets = [(data_bob, target_bob), (data_alice, target_alice)]
datasets

[((Wrapper)>[PointerTensor | me:85445367186 -> bob:6936485306],
  (Wrapper)>[PointerTensor | me:49031248423 -> bob:32816380286]),
 ((Wrapper)>[PointerTensor | me:97470465344 -> alice:78629342950],
  (Wrapper)>[PointerTensor | me:18522179404 -> alice:95907302740])]

In [0]:
def train(iterations=3):

    model = nn.Linear(2,1)
    # our model created locally
    optimizer = optim.SGD(params=model.parameters(), lr=0.1)
    # our optimizer created locally
    
    for iter in range(iterations):

        for _data, _target in datasets:

            # send model to the data
            model = model.send(_data.location)

            # do training on the remote machine
            # 1 zero the gradients
            optimizer.zero_grad()

            # 2 calculate predictions
            predictions = model(_data)

            # 3 calculate loss -- MSE
            loss = ((predictions - _target)**2).mean()

            # 4 calculate gradients
            loss.backward()

            # 5 update weights
            optimizer.step()

            # return the model to the local machine
            model = model.get()
            
            
        print(loss.get())

In [32]:
train(3)

tensor(0.1859, requires_grad=True)
tensor(0.1285, requires_grad=True)
tensor(0.0973, requires_grad=True)


# Lesson: Advanced Remote Execution Tools

In the last section we trained a toy model using Federated Learning. We did this by calling .send() and .get() on our model: sending it to the location of the training data, updating it, and then bringing it back. However, at the end of the example we realized that we needed to go a bit further to protect people's privacy. Namely, we want to average the gradients BEFORE calling .get(). That way, we never see anyone's exact gradient, which better protects their privacy.

But, in order to do this, we need a few more pieces:

- use a pointer to send a Tensor directly to another worker

And in addition, while we're here, we're going to learn about a few more advanced tensor operations as well which will help us both with this example and a few in the future!

In [0]:
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [0]:
x = x.send(alice)

In [0]:
bob._objects

{681755396: tensor([1, 2, 3, 4, 5])}

In [0]:
alice._objects

{59232059007: (Wrapper)>[PointerTensor | alice:59232059007 -> bob:681755396]}

In [0]:
y = x + x

In [0]:
y

(Wrapper)>[PointerTensor | me:35076229352 -> alice:14700048988]

In [0]:
bob._objects

{681755396: tensor([1, 2, 3, 4, 5]), 61259427323: tensor([ 2,  4,  6,  8, 10])}

In [0]:
alice._objects

{14700048988: (Wrapper)>[PointerTensor | alice:14700048988 -> bob:61259427323],
 59232059007: (Wrapper)>[PointerTensor | alice:59232059007 -> bob:681755396]}

In [0]:
jon = sy.VirtualWorker(hook, id="jon")

In [0]:
bob.clear_objects()
alice.clear_objects()



<VirtualWorker id:alice #objects:0>

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob).send(alice)
y = th.tensor([1,2,3,4,5]).send(bob).send(jon)

In [0]:
z = x + y

TensorsNotCollocatedException: ignored

In [0]:
x = x.get() # to get the data back
x

(Wrapper)>[PointerTensor | me:59854104348 -> bob:36602545298]

In [0]:
bob._objects

{36602545298: tensor([1, 2, 3, 4, 5]), 89784608003: tensor([1, 2, 3, 4, 5])}

In [0]:
alice._objects

{}

In [0]:
x = x.get()
x

tensor([1, 2, 3, 4, 5])

In [0]:
bob._objects

# Lesson: Pointer Chain Operations

In [0]:
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [0]:
bob._objects

{36597101211: tensor([1, 2, 3, 4, 5])}

In [0]:
alice._objects

{}

In [0]:
x.move(alice)

(Wrapper)>[PointerTensor | me:16851411436 -> alice:16851411436]

In [0]:
x

(Wrapper)>[PointerTensor | me:16851411436 -> alice:16851411436]

In [0]:
bob._objects

{}

In [0]:
alice._objects

{16851411436: tensor([1, 2, 3, 4, 5])}

In [0]:
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

In [0]:
x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [0]:
bob._objects

{81869613150: tensor([1, 2, 3, 4, 5])}

In [0]:
alice._objects

{99685627431: (Wrapper)>[PointerTensor | alice:99685627431 -> bob:81869613150]}

In [0]:
x

(Wrapper)>[PointerTensor | me:83688287202 -> alice:99685627431]

In [0]:
x.remote_get() # inplace operation

(Wrapper)>[PointerTensor | me:83688287202 -> alice:99685627431]

In [0]:
bob._objects

{}

In [0]:
alice._objects

{99685627431: tensor([1, 2, 3, 4, 5])}

In [0]:
x.move(bob)

(Wrapper)>[PointerTensor | me:83688287202 -> bob:83688287202]

In [0]:
x

(Wrapper)>[PointerTensor | me:83688287202 -> bob:83688287202]

In [0]:
bob._objects

{83688287202: tensor([1, 2, 3, 4, 5])}

In [0]:
alice._objects

{}
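The pointer-chain operations in this lesson can be pictured with plain dictionaries. In the sketch below a "pointer" is just a `(location, id)` tuple, and `put`, `move`, and `remote_get` are hypothetical helpers written for illustration. Unlike PySyft's in-place `x.move(...)`, these helpers return the new pointer.

```python
stores = {"bob": {}, "alice": {}}


def put(worker, key, value):
    """Store `value` on `worker`; a "pointer" is just (location, id)."""
    stores[worker][key] = value
    return (worker, key)


def move(pointer, destination):
    """Relocate the pointed-to object; return a pointer to its new home."""
    worker, key = pointer
    value = stores[worker].pop(key)
    return put(destination, key, value)


def remote_get(pointer):
    """For a chain me -> A -> B: ask A to pull the object from B and hold it."""
    worker, key = pointer
    inner_worker, inner_key = stores[worker][key]  # A holds a pointer to B
    stores[worker][key] = stores[inner_worker].pop(inner_key)
    return pointer


# Equivalent of th.tensor([...]).send(bob).send(alice): me -> alice -> bob
on_bob = put("bob", 1, [1, 2, 3, 4, 5])
x = put("alice", 2, on_bob)

x = remote_get(x)   # the data now sits on alice; bob is empty
x = move(x, "bob")  # the data is relocated to bob; alice is empty
```

At no point does the data pass through the local machine as a plain value, which is what makes these operations useful for aggregation on a third worker.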


**Exercise:**

To avoid exposing gradients among participants, you need to send the gradients to a TRUSTED third party (a trusted aggregator) who will aggregate the models and then send the final model to the server (local worker). This way, the server only receives the aggregated model and never sees an individual worker's model!

1. create a dataset for each worker (create two)
2. create a model for each worker and train it remotely on each worker
3. send those two models using the *move()* function to a third worker
4. the third worker aggregates the two models (find their mean)
5. send the aggregated model to the main server (local worker) using the *get()* function
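The aggregation in step 4 is an elementwise mean of the two models' parameters. Here is a minimal plain-Python sketch of that step, with models represented as dictionaries of lists. The function `average_models` and the parameter values are made up for illustration, and this is not the PySyft solution to the exercise.

```python
def average_models(models):
    """Elementwise mean of parameter dicts that share the same keys."""
    n = len(models)
    return {
        name: [sum(vals) / n for vals in zip(*(m[name] for m in models))]
        for name in models[0]
    }


# Models as trained on each worker (illustrative numbers only).
model_bob = {"weight": [0.5, 1.0], "bias": [0.25]}
model_alice = {"weight": [0.25, 0.0], "bias": [-0.25]}

# Steps 3-4: both models are moved to the aggregator, which averages them.
aggregator_inbox = [model_bob, model_alice]
aggregated = average_models(aggregator_inbox)

# Step 5: only `aggregated` is handed back to the local worker.
```

In the PySyft version, the same arithmetic would run on the third worker through pointers, and only the averaged weight and bias would be fetched with `.get()`.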



---

To set the weights: `model.weight.set_()`



In [0]:
import syft as sy
import torch as th
from torch import nn, optim

In [0]:
# create workers

bob1 = sy.VirtualWorker(hook, id="bob1")
alice1 = sy.VirtualWorker(hook, id="alice1")
secureWorker = sy.VirtualWorker(hook,id="secureWorker")
bob1 = bob1.clear_objects()
alice1 = alice1.clear_objects()
secureWorker = secureWorker.clear_objects()

In [189]:
# Make each worker aware of the other workers
bob1.add_workers([alice1,secureWorker])
alice1.add_workers([bob1,secureWorker])
secureWorker.add_workers([bob1, alice1])



<VirtualWorker id:secureWorker #objects:0>

In [0]:
# A Toy Dataset
data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)

In [0]:
# create local datasets at Bob and Alice
data_bob1 = th.tensor([[1.,1],[0,1]], requires_grad=True).send(bob1)
target_bob1 = th.tensor([[1.],[1]], requires_grad=True).send(bob1)

data_alice1 = th.tensor([[1., 0],[0, 0]], requires_grad=True).send(alice1)
target_alice1 = th.tensor([[0.],[0]], requires_grad=True).send(alice1)

In [0]:
# create a linear model at local worker
model = nn.Linear(2,1)

In [0]:
# Send copies of the linear model to alice and bob
model_bob = model.copy().send(bob1)
model_alice = model.copy().send(alice1)

In [0]:
# create two optimizers, one for each remote copy of the model
opt_bob = optim.SGD(params=model_bob.parameters(), lr=0.1)
opt_alice = optim.SGD(params=model_alice.parameters(), lr=0.1)

In [0]:
opt_bob.zero_grad()
opt_alice.zero_grad()
b_predictions = model_bob(data_bob1)
a_predictions = model_alice(data_alice1)
b_loss = ((b_predictions - target_bob1)**2).mean()
a_loss = ((a_predictions - target_alice1)**2).mean()
b_loss.backward()
a_loss.backward()
opt_bob.step()
opt_alice.step()



In [203]:

# move the models to the third worker 
model_bob.move(secureWorker)
model_alice.move(secureWorker)
# aggregate the models (their average)
# --- use model.weight.data to access the weights, and model.bias.data to access bias
weight = (model_bob.weight.data + model_alice.weight.data) / 2
bias = (model_bob.bias.data + model_alice.bias.data) / 2


TypeError: ignored

In [209]:
# send the model back to the local worker
# --- use model.weight.set_(new_weights) to update the weights
model.weight.set_(weight.get())
# --- use model.bias.set_(new_bias) to update the bias
model.bias.set_(bias.get())

AttributeError: ignored

In [212]:
# train the aggregated model locally on the full toy dataset and print the loss at each step
def train(iterations=10):
  opt = optim.SGD(params=model.parameters(), lr=0.1)
  for iter in range(iterations):
    opt.zero_grad()
    pred = model(data)
    loss = ((pred - target)**2).mean()
    loss.backward()
    opt.step()
    print(loss.data)
        
train()

tensor(2.0549)
tensor(0.9718)
tensor(0.4677)
tensor(0.2324)
tensor(0.1221)
tensor(0.0699)
tensor(0.0447)
tensor(0.0321)
tensor(0.0254)
tensor(0.0216)
