# Section: Federated Learning

# Lesson: Introducing Federated Learning

Federated Learning is a technique for training Deep Learning models on data to which you do not have access. Basically:

Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload "model updates" to a central server.
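
To make the idea concrete, here is a minimal sketch in plain Python (no PySyft involved). The function names `local_update` and `federated_round`, the one-parameter model `y = w * x`, and the two tiny datasets are all assumptions made up for this illustration. Each "device" computes an update from its own data, and the "server" only ever sees those updates.

```python
def local_update(w, data, lr=0.1):
    """Runs on the device: one gradient step of y = w * x on the local data.

    Only the resulting change in w leaves the device, never the data itself.
    """
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return -lr * grad


def federated_round(w, clients, lr=0.1):
    """Runs on the server: average the updates from all devices and apply them."""
    updates = [local_update(w, data, lr) for data in clients]
    return w + sum(updates) / len(updates)


clients = [
    [(1.0, 2.0), (2.0, 4.0)],  # data that stays on device A
    [(3.0, 6.0)],              # data that stays on device B
]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)

print(round(w, 3))  # both devices follow y = 2 * x, so w should approach 2
```

The server code never touches `clients[i]` directly; in a real deployment `local_update` would run on the remote machine.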

Use Cases:

    - app company (Texting prediction app)
    - predictive maintenance (automobiles / industrial engines)
    - wearable medical devices
    - ad blockers / autocomplete in browsers (Firefox/Brave)
    
Challenge Description: the data is distributed among many sources, but we cannot aggregate it because of:

    - privacy concerns: legal, user discomfort, competitive dynamics
    - engineering: the bandwidth/storage requirements of aggregating the larger dataset

# Lesson: Introducing / Installing PySyft

In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. This will require a new set of tools. Specifically, we will use an extension of PyTorch called PySyft.

### Install PySyft

The easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html). Create a new environment, then install the dependencies in that environment. In your terminal:

```bash
conda create -n pysyft python=3
conda activate pysyft # some older versions of conda require "source activate pysyft" instead
conda install jupyter notebook
pip install syft
pip install numpy
```

If you see any errors relating to zstd, run the following (skip this step if everything above installed without errors):

```
pip install --upgrade --force-reinstall zstd
```

and then retry installing syft (`pip install syft`).

If you are using Windows, I suggest installing [Anaconda and using the Anaconda Prompt](https://docs.anaconda.com/anaconda/user-guide/getting-started/) to work from the command line. 

With this environment activated and in the repo directory, launch Jupyter Notebook:

```bash
jupyter notebook
```

and re-open this notebook on the new Jupyter server.

If any part of this doesn't work for you (or any of the tests fail), first check the [README](https://github.com/OpenMined/PySyft.git) for installation help, then open a GitHub issue or ping the #beginner channel in our Slack: [slack.openmined.org](http://slack.openmined.org/)

In [None]:
import torch as th

In [None]:
x = th.tensor([1,2,3,4,5])
x

In [None]:
y = x + x

In [None]:
print(y)

In [None]:

import syft as sy

In [None]:
# the hook modifies PyTorch in place so that its tensors gain PySyft functionality
hook = sy.TorchHook(th)

In [None]:
#pytorch continues to work as before
th.tensor([1,2,3,4,5])

# Lesson: Basic Remote Execution in PySyft

## PySyft => Remote PyTorch

The essence of Federated Learning is the ability to train models in parallel on a wide number of machines. Thus, we need the ability to tell remote machines to execute the operations required for Deep Learning.

Thus, instead of using Torch tensors - we're now going to work with **pointers** to tensors. Let me show you what I mean. First, let's create a "pretend" machine owned by a "pretend" person - we'll call him Bob.

In [None]:
# a virtual worker simulates the interface we would have to another machine
bob = sy.VirtualWorker(hook, id="bob")

In [None]:
# the collection of objects (e.g. tensors) that bob currently holds
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5])

In [None]:
x = x.send(bob)

In [None]:
# bob now holds the tensor itself, e.g. {76957029230: tensor([1, 2, 3, 4, 5])}
# locally, x is only a pointer to that tensor
bob._objects

In [None]:
# the worker the pointer points to = <VirtualWorker id:bob #objects:1>
x.location

In [None]:
# id of the tensor on bob's machine = 76957029230
x.id_at_location

In [None]:
# id of the pointer itself on our machine = 83032657046
x.id

In [None]:
# the worker that owns the pointer = <VirtualWorker id:me #objects:0>
x.owner

In [None]:
# the local worker ("me") that the hook created for us
hook.local_worker

In [None]:
# the pointer tells the local worker which remote worker to contact and which object to use
#(Wrapper)>[PointerTensor | me:83032657046 -> bob:76957029230]
x

In [None]:
# .get() brings the tensor back from bob - should be = tensor([1, 2, 3, 4, 5])
x = x.get()
x

In [None]:
# bob no longer holds the tensor: result {}
bob._objects
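
If it helps to see the mechanics without any library, here is a toy model of what `.send()` and `.get()` did in the cells above. The classes `Worker` and `Pointer` and the function `send` are hypothetical stand-ins written for this note; PySyft's real implementation is considerably more involved (serialization, messaging between workers, tensor wrappers).

```python
import random


class Worker:
    """Stands in for a remote machine: it just stores objects by id."""

    def __init__(self, name):
        self.name = name
        self.objects = {}


class Pointer:
    """A local handle that records where the real object lives."""

    def __init__(self, location, id_at_location):
        self.location = location              # which worker holds the object
        self.id_at_location = id_at_location  # the object's id on that worker

    def get(self):
        # Remove the object from the remote store and hand it back to the caller.
        return self.location.objects.pop(self.id_at_location)


def send(obj, worker):
    """Store obj on the worker under a random id and return a pointer to it."""
    obj_id = random.randrange(10**11)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)


bob = Worker("bob")
x = send([1, 2, 3, 4, 5], bob)
print(x.location.name, len(bob.objects))  # the data now lives on bob

x = x.get()
print(x, bob.objects)  # the data is back with us and bob's store is empty
```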

# Project: Playing with Remote Tensors

In this project, I want you to .send() and .get() a tensor to TWO workers by calling .send(bob,alice). This will first require the creation of another VirtualWorker called alice.

In [None]:
# import PyTorch and create a tensor that we will later send to remote workers
import torch as th
x = th.tensor([1,2,3,4,5])
x

In [None]:
# import PySyft
# the hook modifies PyTorch in place
import syft as sy
hook = sy.TorchHook(th)

In [None]:
# create virtual workers bob and alice
# each worker simulates the interface we would have to another machine
bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")

In [None]:
# y = y.send(bob,alice)

In [None]:
# y.owner

In [None]:
# y.location

In [None]:
# bob._objects

In [None]:
# alice._objects

In [None]:
#y = y.get()
# y

In [None]:
# after adding alice as virtual worker
# x = th.tensor([1,2,3,4,5])
#x_ptr = x.send(bob, alice)   #this is a multipointer
# x_ptr

In [None]:
x_ptr = x.send(bob, alice)

In [None]:
# multipointer returns bob and alice
#(Wrapper)>[MultiPointerTensor]
#	-> (Wrapper)>[PointerTensor | me:55988065873 -> bob:9222176640]
#	-> (Wrapper)>[PointerTensor | me:20656805602 -> alice:7139222520]
x_ptr

In [None]:
#dictionary of virtual workers
#{'bob': (Wrapper)>[PointerTensor | me:55988065873 -> bob:9222176640],
# 'alice': (Wrapper)>[PointerTensor | me:20656805602 -> alice:7139222520]}
# x_ptr.child.child

In [None]:
# the multipointer points to a copy of the tensor on bob and on alice; .get() retrieves both
x_ptr.get()

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob,alice)

In [None]:
#sum results of bob and alice tensors
x.get(sum_results=True)

# Lesson: Introducing Remote Arithmetic

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(bob)

In [None]:
x

In [None]:
y

In [None]:
# add the two tensors; the addition is executed on bob's machine
z = x + y

In [None]:
z

In [None]:
# retrieve the result of the addition
z = z.get()
z

In [None]:
# another way of adding the tensors, using the torch function
z = th.add(x,y)
z

In [None]:
z = z.get()
z

In [None]:
# backpropagation creates gradients on x and y
x = th.tensor([1.,2,3,4,5], requires_grad=True).send(bob)
y = th.tensor([1.,1,1,1,1], requires_grad=True).send(bob)

In [None]:
z = (x + y).sum()

In [None]:
z.backward()

In [None]:
x = x.get()

In [None]:
x

In [None]:
x.grad

# Project: Learn a Simple Linear Model

In this project, I'd like you to create a simple linear model that fits the dataset below. You should use only Variables and .backward() to do so (no optimizers or nn.Modules). Furthermore, both the data and the model must be located on Bob's machine.

In [None]:
import torch as th
import syft as sy
hook = sy.TorchHook(th)

In [None]:
# a virtual worker simulates the interface we would have to another machine
bob = sy.VirtualWorker(hook, id="bob")

In [None]:
#create data for model
input = th.tensor([[1.,1],[0,1,],[1,0],[0,0]], requires_grad=True).send(bob)
target = th.tensor([[1.],[1],[0],[0]], requires_grad=True).send(bob)

In [None]:
weights = th.tensor([[0.],[0.]], requires_grad=True).send(bob)

In [None]:
#forward propagation - prediction
pred = input.mm(weights)

In [None]:
#prediction is also a pointer
pred

In [None]:
loss = ((pred - target)**2).sum()

In [None]:
#backward propagation
loss.backward()

weights.data.sub_(weights.grad * 0.1)
weights.grad *= 0

In [None]:
# expected output: tensor(2.)
print(loss.get().data)

In [None]:
# repeat the training step in a loop; the loss should go down as shown below
'''
tensor(0.5600)
tensor(0.2432)
tensor(0.1372)
tensor(0.0849)
tensor(0.0538)
tensor(0.0344)
tensor(0.0220)
tensor(0.0141)
tensor(0.0090)
tensor(0.0058)
'''
for i in range(10):
    pred = input.mm(weights)
    loss = ((pred - target)**2).sum()
    loss.backward()
    weights.data.sub_(weights.grad * 0.1)
    weights.grad *=0
    print(loss.get().data)
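
To check the numbers above by hand, here is the same update rule written out in plain Python with no tensors and no remote worker. This is only a sketch for verifying the arithmetic: the variable names are my own, and it uses the gradient of the summed squared error, dL/dw_j = sum_i 2 * (pred_i - target_i) * input_ij.

```python
# Same dataset as the cells above, as plain Python lists.
inputs = [[1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
targets = [1.0, 1.0, 0.0, 0.0]
weights = [0.0, 0.0]
lr = 0.1

losses = []
for step in range(11):
    # Forward pass: pred_i = sum_j inputs[i][j] * weights[j]
    preds = [sum(x * w for x, w in zip(row, weights)) for row in inputs]
    errors = [p - t for p, t in zip(preds, targets)]
    losses.append(sum(e * e for e in errors))

    # Backward pass: gradient of the summed squared error for each weight.
    grads = [sum(2 * e * row[j] for e, row in zip(errors, inputs)) for j in range(2)]

    # Gradient descent step, the equivalent of weights.data.sub_(weights.grad * 0.1).
    weights = [w - lr * g for w, g in zip(weights, grads)]

print([round(loss, 4) for loss in losses[:3]])  # → [2.0, 0.56, 0.2432]
```

The first value matches the single training step before the loop, and the following values match the first entries printed by the loop.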
    

# Lesson: Garbage Collection and Common Errors


In [None]:
import torch as th
import syft as sy
hook = sy.TorchHook(th)
#worker simulates interface we have to another machine
bob = sy.VirtualWorker(hook, id="bob")

In [None]:
bob = bob.clear_objects()

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects

In [None]:
del x

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
# result = True - this is the default when data is sent: deleting the pointer also deletes the remote tensor
x.child.garbage_collect_data

In [None]:
bob._objects

In [None]:
x = "asdf"

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

# gotcha - if a reference to the pointer is cached somewhere, garbage collection does not run, even if the variable is deleted


In [None]:
x="asdf"

In [None]:
#bob.objects

del x

In [None]:
# even after del x, the tensor still lives on
bob._objects


In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
x

In [None]:
bob._objects

In [None]:
x = "asdf"

In [None]:
bob._objects

In [None]:
del x

In [None]:
bob._objects

In [None]:
# start again from a clean worker
bob = bob.clear_objects()
bob._objects

In [None]:
for i in range(1000):
    x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects
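
The behaviour in the cells above can be imitated in plain Python. The sketch below is an assumption-laden toy, not PySyft's code: `Worker`, `Pointer`, and `send` are hypothetical names, and the cleanup relies on CPython running `__del__` as soon as the last reference to an object disappears.

```python
import itertools

_next_id = itertools.count(1)


class Worker:
    def __init__(self, name):
        self.name = name
        self.objects = {}


class Pointer:
    def __init__(self, location, id_at_location, garbage_collect_data=True):
        self.location = location
        self.id_at_location = id_at_location
        self.garbage_collect_data = garbage_collect_data

    def __del__(self):
        # When the last reference to the pointer goes away, delete the remote object too.
        if self.garbage_collect_data:
            self.location.objects.pop(self.id_at_location, None)


def send(obj, worker):
    obj_id = next(_next_id)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)


bob = Worker("bob")
for _ in range(1000):
    x = send([1, 2, 3, 4, 5], bob)       # each reassignment drops the previous pointer
count_after_loop = len(bob.objects)       # only the most recent object is left

cached = x                                # a second reference, like a notebook's output cache
del x
count_with_cached_ref = len(bob.objects)  # the object survives: `cached` still refers to the pointer

del cached
count_after_last_ref = len(bob.objects)   # now the remote object is gone

print(count_after_loop, count_with_cached_ref, count_after_last_ref)
```

The middle case is the gotcha: as long as anything still holds the pointer, deleting one variable name is not enough to free the remote data.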

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1])

In [None]:
# raises PureTorchTensorFoundError: x is a pointer to a tensor on bob, but y is a regular local tensor
z = x + y

In [None]:
# another example of a common error
# a second virtual worker, simulating the interface to another machine
alice = sy.VirtualWorker(hook, id="alice")
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(alice)

In [None]:
# raises TensorsNotCollocatedException: x lives on bob, but y lives on alice
z = x + y
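
Both errors come from the same rule: a remote operation needs all of its operands on the same worker. Here is a plain-Python sketch of that rule. The names `Worker`, `Pointer`, `send`, and `remote_add` are hypothetical, and the built-in exception types only stand in for PySyft's `PureTorchTensorFoundError` and `TensorsNotCollocatedException`.

```python
class Worker:
    def __init__(self, name):
        self.name = name
        self.objects = {}


class Pointer:
    def __init__(self, location, id_at_location):
        self.location = location
        self.id_at_location = id_at_location


def send(values, worker):
    obj_id = len(worker.objects)  # simple ids are fine here: nothing is ever removed
    worker.objects[obj_id] = values
    return Pointer(worker, obj_id)


def remote_add(a, b):
    """Add two remote lists element-wise on the worker that holds both of them."""
    if not (isinstance(a, Pointer) and isinstance(b, Pointer)):
        raise TypeError("one operand is a local value: send it to the worker first")
    if a.location is not b.location:
        raise ValueError("operands are on different workers: bring them together first")
    worker = a.location
    total = [u + v for u, v in zip(worker.objects[a.id_at_location],
                                   worker.objects[b.id_at_location])]
    return send(total, worker)


bob, alice = Worker("bob"), Worker("alice")
x = send([1, 2, 3, 4, 5], bob)

errors = []
for y in ([1, 1, 1, 1, 1], send([1, 1, 1, 1, 1], alice)):
    try:
        remote_add(x, y)  # first y is local, second y is on the wrong worker
    except (TypeError, ValueError) as exc:
        errors.append(type(exc).__name__)

z = remote_add(x, send([1, 1, 1, 1, 1], bob))  # both operands on bob: this works
print(errors, bob.objects[z.id_at_location])
```

In the notebook, the corresponding fix is to make sure both tensors have been sent to the same worker before combining them.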

# Lesson: Toy Federated Learning

Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:

- a toy dataset
- a model
- some basic training logic for training a model to fit the data.

In [2]:
# Train a federated learning model:
# distribute a tiny toy dataset across two different workers
# train a model while the data stays on those workers

import torch as th
import syft as sy
hook = sy.TorchHook(th)
from torch import nn, optim


In [3]:
# A Toy Dataset  - each has 2 inputs, 1 output
data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)

In [4]:
# a Toy Model - 2 inputs, 1 output
model = nn.Linear(2,1)

In [5]:
# create the optimizer - stochastic gradient descent
opt = optim.SGD(params=model.parameters(), lr=0.1)

In [6]:
# zero out gradients: opt.zero_grad()
# predict: pred = model(data)
# calculate the loss: loss = ((pred - target)**2).sum()
# backward propagation: loss.backward()
# optimizer step: opt.step()
# print the loss: print(loss.data)  # result: tensor(0.4472)
# run the cell below 20 times manually to watch the loss go down
'''
opt.zero_grad()
pred = model(data)
loss = ((pred - target)**2).sum()
loss.backward()
opt.step()
print(loss.data)
'''

In [7]:
#loop 20 times to see loss

def train(iterations=20):
    for iter in range(iterations):
        opt.zero_grad()

        pred = model(data)

        loss = ((pred - target)**2).sum()

        loss.backward()

        opt.step()

        print(loss.data)
        
train()

tensor(1.4409)
tensor(0.1688)
tensor(0.0514)
tensor(0.0287)
tensor(0.0181)
tensor(0.0116)
tensor(0.0074)
tensor(0.0048)
tensor(0.0031)
tensor(0.0020)
tensor(0.0013)
tensor(0.0008)
tensor(0.0005)
tensor(0.0003)
tensor(0.0002)
tensor(0.0001)
tensor(9.4990e-05)
tensor(6.2187e-05)
tensor(4.0865e-05)
tensor(2.6968e-05)


In [8]:
# Above is a train method - a simple linear model that learns from some toy data
# on a centralized server
# for federated learning we need to move the data and the model to individual machines
# the goal is for models to train on the individual machines
# split the data into different pieces
# send them to two different workers
# create 2 virtual workers: bob and alice

# the 2 virtual workers simulate the interface we would have to other machines
bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")

In [9]:
#verify workers exist
#result: <VirtualWorker id:bob #objects:0>, <VirtualWorker id:alice #objects:0>)
bob, alice

(<VirtualWorker id:bob #objects:0>, <VirtualWorker id:alice #objects:0>)

In [10]:
#send first 2 rows to bob
data_bob = data[0:2].send(bob)
target_bob = target[0:2].send(bob)

In [11]:
# send the last 2 rows to alice
data_alice = data[2:4].send(alice)
target_alice = target[2:4].send(alice)

In [12]:
# pair each worker's data with its targets
datasets = [(data_bob, target_bob), (data_alice, target_alice)]

In [13]:
'''
model = nn.Linear(2,1)
opt = optim.SGD(params=model.parameters(), lr=0.1)
'''

In [14]:
#train model
# result = tensor(1.7390, requires_grad=True)

'''
_data, _target = datasets[0]

# send the model to the data
model = model.send(_data.location)

# do normal training
opt.zero_grad()
pred = model(_data)
loss = ((pred - _target)**2).sum()
loss.backward()
opt.step()

#get smarter model back
model = model.get()

print(loss.get())

'''

In [15]:
# loop over the virtual workers and their data:
# the model trains across a distributed dataset,
# which helps preserve the users' privacy
# however, the model can still be reverse engineered:
# the diff between the model we sent and the model we got back reveals information about the data
# so ideally we would not send the model to bob, then back to us, then to alice

# solution: train multiple different models in parallel
# on different workers, each on a different person's dataset,
# and average the models together

def train(iterations=20):

    model = nn.Linear(2,1)
    opt = optim.SGD(params=model.parameters(), lr=0.1)
    
    for iter in range(iterations):

        for _data, _target in datasets:

            # send model to the data
            model = model.send(_data.location)

            # do normal training
            opt.zero_grad()
            pred = model(_data)
            loss = ((pred - _target)**2).sum()
            loss.backward()
            opt.step()

            # get smarter model back
            model = model.get()

            print(loss.get())

In [12]:
train()

tensor(10.6323, requires_grad=True)
tensor(0.4114, requires_grad=True)
tensor(0.4211, requires_grad=True)
tensor(0.3778, requires_grad=True)
tensor(0.1607, requires_grad=True)
tensor(0.2290, requires_grad=True)
tensor(0.0902, requires_grad=True)
tensor(0.1357, requires_grad=True)
tensor(0.0523, requires_grad=True)
tensor(0.0806, requires_grad=True)
tensor(0.0304, requires_grad=True)
tensor(0.0480, requires_grad=True)
tensor(0.0178, requires_grad=True)
tensor(0.0288, requires_grad=True)
tensor(0.0104, requires_grad=True)
tensor(0.0173, requires_grad=True)
tensor(0.0061, requires_grad=True)
tensor(0.0105, requires_grad=True)
tensor(0.0036, requires_grad=True)
tensor(0.0064, requires_grad=True)
tensor(0.0022, requires_grad=True)
tensor(0.0040, requires_grad=True)
tensor(0.0013, requires_grad=True)
tensor(0.0025, requires_grad=True)
tensor(0.0008, requires_grad=True)
tensor(0.0015, requires_grad=True)
tensor(0.0005, requires_grad=True)
tensor(0.0010, requires_grad=True)
tensor(0.0003, requ

# Lesson: Advanced Remote Execution Tools

In the last section we trained a toy model using Federated Learning. We did this by calling .send() and .get() on our model: sending it to the location of the training data, updating it, and then bringing it back. However, at the end of the example we realized that we needed to go a bit further to protect people's privacy. Namely, we want to average the gradients BEFORE calling .get(). That way, we never see anyone's exact gradient, which better protects their privacy.

But, in order to do this, we need a few more pieces:

- use a pointer to send a Tensor directly to another worker

While we're here, we'll also learn about a few more advanced tensor operations that will help us both with this example and with a few in the future!
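
Before we get to the tools, here is a rough plain-Python sketch of the averaging idea, using the same toy dataset split between bob and alice. Everything in it (`local_step`, `average`, `total_loss`, starting from all-zero parameters) is an assumption made for this illustration. In this sketch the averaging happens in ordinary local code; the point of the lessons that follow is to let it happen remotely, before anything is brought back with .get().

```python
def local_step(params, rows, targets, lr=0.1):
    """One gradient step of a linear model (2 weights + bias) on one worker's data."""
    w0, w1, b = params
    g0 = g1 = gb = 0.0
    for (x0, x1), t in zip(rows, targets):
        err = w0 * x0 + w1 * x1 + b - t
        g0 += 2 * err * x0
        g1 += 2 * err * x1
        gb += 2 * err
    return [w0 - lr * g0, w1 - lr * g1, b - lr * gb]


def average(models):
    """Average the parameters of several models, position by position."""
    return [sum(values) / len(models) for values in zip(*models)]


def total_loss(params, shards):
    w0, w1, b = params
    return sum((w0 * x0 + w1 * x1 + b - t) ** 2
               for rows, targets in shards
               for (x0, x1), t in zip(rows, targets))


shards = [
    ([(1.0, 1.0), (0.0, 1.0)], [1.0, 1.0]),  # bob's rows
    ([(1.0, 0.0), (0.0, 0.0)], [0.0, 0.0]),  # alice's rows
]

params = [0.0, 0.0, 0.0]
initial_loss = total_loss(params, shards)
for _ in range(200):
    # Each worker trains its own copy of the current model on its own rows...
    local_models = [local_step(params, rows, targets) for rows, targets in shards]
    # ...and only the average of those copies becomes the new shared model.
    params = average(local_models)
final_loss = total_loss(params, shards)

print(initial_loss, final_loss)
```

With one local step per round, averaging the models is the same as averaging the gradients, which is why the two phrasings are used interchangeably here.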

In [16]:
#clear workers
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

In [None]:
# normal tensor:  x = th.tensor([1,2,3,4])

In [17]:
# normal tensor sent to bob
x = th.tensor([1,2,3,4,5]).send(bob)

In [18]:
# send our pointer to alice: alice now holds a pointer to bob's tensor, and x points to alice
x = x.send(alice)

In [19]:
#pointer to it and bob has object: result = {55497021924: tensor([1, 2, 3, 4, 5])}
bob._objects

{55497021924: tensor([1, 2, 3, 4, 5])}

In [20]:
#pointer to it - alice has object: result = {43170828342: (Wrapper)>[PointerTensor |
# alice:43170828342 -> bob:55497021924]}
alice._objects

{43170828342: (Wrapper)>[PointerTensor | alice:43170828342 -> bob:55497021924]}

In [21]:
y = x + x

In [22]:
# y is a pointer to alice's machine, where a second pointer leads to the result on bob's machine
#(Wrapper)>[PointerTensor | me:31029599192 -> alice:9424618293]
y

(Wrapper)>[PointerTensor | me:31029599192 -> alice:9424618293]

In [23]:
# bob now holds 2 tensors: the original and the result of x + x
#{55497021924: tensor([1, 2, 3, 4, 5]),
# 69357532531: tensor([ 2,  4,  6,  8, 10])}
bob._objects

{55497021924: tensor([1, 2, 3, 4, 5]),
 69357532531: tensor([ 2,  4,  6,  8, 10])}

In [24]:
alice._objects

{43170828342: (Wrapper)>[PointerTensor | alice:43170828342 -> bob:55497021924],
 9424618293: (Wrapper)>[PointerTensor | alice:9424618293 -> bob:69357532531]}

In [25]:
#creating new virtual worker jon
jon = sy.VirtualWorker(hook, id="jon")

In [26]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [27]:
bob._objects

{2755384355: tensor([1, 2, 3, 4, 5])}

In [28]:
#alice has pointer to bob's tensor
alice._objects

{88057610199: (Wrapper)>[PointerTensor | alice:88057610199 -> bob:2755384355]}

In [29]:
# pointer to bob's tensor
x = x.get()
x

(Wrapper)>[PointerTensor | me:88057610199 -> bob:2755384355]

In [30]:
bob._objects

{2755384355: tensor([1, 2, 3, 4, 5])}

In [31]:
alice._objects

{}

In [None]:
# this will result in an error because the two pointer chains differ: x goes through alice, while y goes through jon
#x = th.tensor([1,2,3,4,5]).send(bob).send(alice)
#y = th.tensor([1,2,3,4,5]).send(bob).send(jon)
#z = x + y

In [32]:
# a second .get() follows the remaining pointer and retrieves the actual data from bob
x = x.get()
x

tensor([1, 2, 3, 4, 5])

In [33]:
alice._objects

{}

In [34]:
bob._objects

{}

In [35]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [36]:
bob._objects

{12771092517: tensor([1, 2, 3, 4, 5])}

In [37]:
alice._objects

{21115825920: (Wrapper)>[PointerTensor | alice:21115825920 -> bob:12771092517]}

In [38]:
#result:  (Wrapper)>[PointerTensor | me:21115825920 -> bob:12771092517]
x = x.get()
x

(Wrapper)>[PointerTensor | me:21115825920 -> bob:12771092517]

In [39]:
# bob still has the data, but alice's collection {} is empty because the .get() above moved her pointer to us
bob._objects, alice._objects

({12771092517: tensor([1, 2, 3, 4, 5])}, {})

In [40]:
#get data back: Result = tensor([1, 2, 3, 4, 5])
x = x.get()
x

tensor([1, 2, 3, 4, 5])

In [41]:
#bob has no data {}
bob._objects

{}

In [42]:
#garbage collector deletes the whole chain by deleting the pointer
del x

In [43]:
#bob is empty
bob._objects

{}

In [None]:
alice._objects
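
The pointer chains in this lesson can be imitated with a few lines of plain Python. As before, `Worker`, `Pointer`, and `send` are hypothetical stand-ins rather than PySyft's API; the sketch only shows why the first `.get()` returns another pointer and the second one returns the data.

```python
import itertools

_next_id = itertools.count(1)


class Worker:
    def __init__(self, name):
        self.name = name
        self.objects = {}


class Pointer:
    def __init__(self, location, id_at_location):
        self.location = location
        self.id_at_location = id_at_location

    def send(self, worker):
        # Sending a pointer stores the pointer itself on the other worker.
        return send(self, worker)

    def get(self):
        # Fetch whatever is stored at the other end: data or another pointer.
        return self.location.objects.pop(self.id_at_location)


def send(obj, worker):
    obj_id = next(_next_id)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)


bob, alice = Worker("bob"), Worker("alice")

x = send([1, 2, 3, 4, 5], bob).send(alice)  # chain: me -> alice -> bob
held_by_alice = list(alice.objects.values())[0]

first = x.get()      # alice gives us her pointer to bob; her store is now empty
data = first.get()   # bob gives us the actual data; his store is now empty

print(first.location.name, data, alice.objects, bob.objects)
```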

# Lesson: Pointer Chain Operations

In [None]:
bob.clear_objects()
alice.clear_objects()

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
# move the remote tensor from bob to alice
x.move(alice)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
# ask alice to call .get() on her pointer, which pulls the tensor from bob to alice
x.remote_get()

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
# move the tensor from alice back to bob
x.move(bob)

In [None]:
x

In [None]:
bob._objects

In [None]:
alice._objects
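
Here is one plausible reading of the two operations used in this lesson, written as a plain-Python toy. The helpers `move` and `remote_get` below are hypothetical functions that only mimic the intent of the PySyft methods with the same names; the real methods may differ in detail, so compare against the `_objects` output you get from the cells above.

```python
import itertools

_next_id = itertools.count(1)


class Worker:
    def __init__(self, name):
        self.name = name
        self.objects = {}


class Pointer:
    def __init__(self, location, id_at_location):
        self.location = location
        self.id_at_location = id_at_location


def send(obj, worker):
    obj_id = next(_next_id)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)


def move(pointer, destination):
    """Take the object off its current worker and store it on `destination`."""
    obj = pointer.location.objects.pop(pointer.id_at_location)
    return send(obj, destination)


def remote_get(pointer):
    """`pointer` refers to a second pointer held by a worker.

    That worker fetches the data its own pointer refers to, so the data ends up
    one hop closer to us while our pointer stays valid.
    """
    holder = pointer.location
    inner = holder.objects[pointer.id_at_location]
    holder.objects[pointer.id_at_location] = inner.location.objects.pop(inner.id_at_location)
    return pointer


bob, alice = Worker("bob"), Worker("alice")

x = move(send([1, 2, 3, 4, 5], bob), alice)
print(bob.objects, list(alice.objects.values()))  # bob is empty, alice holds the list

chain = send(send([6, 7, 8], bob), alice)         # chain: me -> alice -> bob
remote_get(chain)
print(bob.objects, alice.objects[chain.id_at_location])  # the data now sits on alice
```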