# Federated Learning

Federated Learning is a technique for training Deep Learning models on data to which you do not have access. Basically:

> Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload "model updates" to a central server.
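
To make the definition concrete, here is a minimal sketch of the idea in plain Python, with no PySyft involved. It assumes a one-parameter model `y = w * x`, two clients with made-up data, and a server that simply averages the uploaded weights. The function name `local_update`, the data values and the hyperparameters are illustrative assumptions, not part of any library.

```python
# Toy federated averaging on a 1-D model y = w * x (all names are illustrative).

def local_update(w, data, lr=0.1, steps=5):
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(steps):
        # gradient of the mean squared error (w*x - y)^2 with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# Each client keeps its own (x, y) pairs; the underlying relationship is y = 2x.
client_data = {
    "bob": [(1.0, 2.0), (2.0, 4.0)],
    "alice": [(3.0, 6.0), (0.5, 1.0)],
}

global_w = 0.0
for round_number in range(20):
    # Clients train locally and upload only their updated weight.
    local_weights = [local_update(global_w, data) for data in client_data.values()]
    # The server averages the uploaded weights; it never sees the raw data.
    global_w = sum(local_weights) / len(local_weights)

print(round(global_w, 3))
```

Only `local_weights` crosses the network in this sketch. The `(x, y)` pairs stay inside `client_data`, which is the point of the technique.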

Use Cases:

- app companies (e.g. a texting prediction app)
- predictive maintenance (automobiles / industrial engines)
- wearable medical devices
- ad blockers / autocomplete in browsers (Firefox/Brave)


Challenge Description: data is distributed amongst sources but we cannot aggregate it because of:

- privacy concerns: legal, user discomfort, competitive dynamics
- engineering: the bandwidth/storage requirements of aggregating the larger dataset

#### **Major benefit to data-driven organizations**
Reduce the bandwidth cost of uploading datasets to the cloud by letting training happen on the devices themselves.

## PySyft

In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. One tool we'll use for this is an extension of PyTorch called PySyft.

#### Install PySyft

The easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html). Create a new environment, then install the dependencies in that environment. In your terminal:

```bash
conda create -n pysyft python=3
conda activate pysyft # some older versions of conda require "source activate pysyft" instead.
conda install jupyter notebook
pip install syft
pip install numpy
```

If you get any errors relating to zstd, run the following (skip this step if everything above installed fine):

```bash
pip install --upgrade --force-reinstall zstd
```
and then retry installing syft (pip install syft).

With this environment activated and in the repo directory, launch Jupyter Notebook:

```bash
jupyter notebook
```

Reopen this notebook.

In [15]:
import torch as th

In [16]:
x = th.tensor([1,2,3,4,5])
x

tensor([1, 2, 3, 4, 5])

In [17]:
y = x + x

In [18]:
print(y)

tensor([ 2,  4,  6,  8, 10])


In [19]:
import syft as sy

W1003 17:20:49.167685 4793488832 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/Users/jee/.pyenv/versions/miniconda3-latest/envs/pysyft/lib/python3.7/site-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'
W1003 17:20:49.182069 4793488832 deprecation_wrapper.py:119] From /Users/jee/.pyenv/versions/miniconda3-latest/envs/pysyft/lib/python3.7/site-packages/tf_encrypted/session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.



In [20]:
hook = sy.TorchHook(th)

In [21]:
th.tensor([1,2,3,4,5])

tensor([1, 2, 3, 4, 5])

# Basic Remote Execution in PySyft

## PySyft => Remote PyTorch

The essence of Federated Learning is the ability to train models in parallel on a large number of machines. To do that, we need a way to tell remote machines to execute the operations required for Deep Learning.

So instead of working with Torch tensors directly, we're now going to work with **pointers** to tensors. First, let's create a "pretend" machine owned by a "pretend" person; we'll call him Bob.

In [22]:
bob = sy.VirtualWorker(hook, id="bob")

In [23]:
bob._objects

{}

In [24]:
input = th.tensor([range(10)])

In [None]:
x = input.send(bob)

In [15]:
bob._objects

{7593808629: tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])}

A pointer is itself a kind of tensor. However, instead of executing commands locally,
each command is serialized to a simple JSON format and sent to Bob,
who executes it on our behalf and returns a pointer to the new object.
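
The mechanism described above can be sketched with a toy in plain Python. This is not PySyft's implementation: `ToyWorker`, `ToyPointer`, `send` and the dict-shaped command are hypothetical stand-ins, chosen only to show a pointer forwarding an operation and receiving back nothing but an id.

```python
# Toy model of remote execution; ToyWorker and ToyPointer are hypothetical names.
import itertools

_ids = itertools.count(1)

class ToyWorker:
    def __init__(self, name):
        self.name = name
        self.objects = {}  # plays the role of bob._objects

    def execute(self, command):
        """Run a (pretend-serialized) command against locally stored objects."""
        op, arg_ids = command["op"], command["args"]
        args = [self.objects[i] for i in arg_ids]
        if op == "add":
            result = [a + b for a, b in zip(*args)]
        else:
            raise ValueError(f"unsupported op: {op}")
        result_id = next(_ids)
        self.objects[result_id] = result
        return result_id  # only the id travels back, never the data

class ToyPointer:
    def __init__(self, worker, remote_id):
        self.location = worker
        self.id_at_location = remote_id

    def __add__(self, other):
        command = {"op": "add", "args": [self.id_at_location, other.id_at_location]}
        return ToyPointer(self.location, self.location.execute(command))

    def get(self):
        """Pull the data back and remove it from the remote worker."""
        return self.location.objects.pop(self.id_at_location)

def send(values, worker):
    remote_id = next(_ids)
    worker.objects[remote_id] = list(values)
    return ToyPointer(worker, remote_id)

bob = ToyWorker("bob")
x = send([1, 2, 3, 4, 5], bob)
y = x + x            # executed inside bob's object store
result = y.get()     # the data only comes back when we ask for it
print(result)
```

Note that `x + x` returns another `ToyPointer`, mirroring how operations on pointers return pointers in the cells that follow.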

In [16]:
x.location

<VirtualWorker id:bob #objects:1>

In [17]:
x.id_at_location

7593808629

In [18]:
x.id

96649436933

In [23]:
# we are the local worker
x.owner

<VirtualWorker id:me #objects:0>

In [20]:
hook.local_worker

<VirtualWorker id:me #objects:0>

In [19]:
x

(Wrapper)>[PointerTensor | me:96649436933 -> bob:7593808629]

In [24]:
# get the information back from Bob.
x = x.get()
x

tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [26]:
# Bob no longer has any tensors since we retrieved the only one above
bob._objects

{}

# Playing with Remote Tensors

In this project, send a tensor to TWO workers at once by calling .send(bob, alice), then retrieve it with .get(). This first requires creating another VirtualWorker called alice.

In [14]:
# create two virtual workers (pseudo devices)
alice = sy.VirtualWorker(hook, id='alice')
bob = sy.VirtualWorker(hook, id='bob')



In [31]:
tensor = th.tensor(range(10))
tensor

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [32]:
x_pointer = tensor.send(alice, bob)

In [33]:
x_pointer

(Wrapper)>[MultiPointerTensor]
	-> (Wrapper)>[PointerTensor | me:41739231715 -> alice:32349965707]
	-> (Wrapper)>[PointerTensor | me:73732368390 -> bob:71309423397]

In [35]:
x_pointer.get()

[tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])]

In [39]:
# send the tensor to the virtual workers, and sum results
x = th.tensor([1,2,3,4,5]).send(alice, bob)
x.get(sum_results=True)

tensor([ 2,  4,  6,  8, 10])

# Introducing Remote Arithmetic

In [40]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(bob)

In [42]:
x

(Wrapper)>[PointerTensor | me:1186486666 -> bob:96224330477]

In [43]:
y

(Wrapper)>[PointerTensor | me:82049975451 -> bob:66183703328]

In [49]:
z = x + y

In [50]:
z

(Wrapper)>[PointerTensor | me:59520685906 -> bob:88976585696]

In [51]:
z = z.get()
z

tensor([2, 3, 4, 5, 6])

In [33]:
z = th.add(x,y)
z

(Wrapper)>[PointerTensor | me:28437210120 -> bob:28437210120]

In [34]:
z = z.get()
z

tensor([2, 3, 4, 5, 6])

In [56]:

x = th.tensor([1.,2,3,4,5], requires_grad=True).send(bob)
y = th.tensor([1.,1,1,1,1], requires_grad=True).send(bob)

In [57]:
z = (x + y).sum()

In [58]:
# back propagation
z.backward()

(Wrapper)>[PointerTensor | me:98360935105 -> bob:65217515287]

In [59]:
x = x.get()

In [60]:
x

tensor([1., 2., 3., 4., 5.], requires_grad=True)

In [40]:
x.grad

tensor([1., 1., 1., 1., 1.])

# Learn a Simple Linear Model

Let's create a simple linear model that fits the dataset below. We'll use only tensors and .backward() to do so (no optimizers or nn.Modules), and both the data and the model must be located on Bob's machine.

In [63]:
input = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True).send(bob)
target = th.tensor([[1.],[1],[0],[0]], requires_grad=True).send(bob)


In [64]:
weights = th.tensor([[0.],[0.]], requires_grad=True).send(bob)

In [69]:
for i in range(10):
    prediction = input.mm(weights)
    loss = ((prediction - target)**2).sum()
    loss.backward()
    weights.data.sub_(weights.grad * 0.1)
    weights.grad *= 0

    print(loss.get().data)

tensor(0.5600)
tensor(0.2432)
tensor(0.1372)
tensor(0.0849)
tensor(0.0538)
tensor(0.0344)
tensor(0.0220)
tensor(0.0141)
tensor(0.0090)
tensor(0.0058)
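
As a cross-check, the same update rule can be run in plain Python without PySyft or autograd. This sketch assumes weights that start at zero and writes the gradient of the summed squared error by hand. Under that assumption the first loss is 2.0, and steps two through four give 0.5600, 0.2432 and 0.1372, which line up with the first values recorded above.

```python
# Plain-Python version of the training loop above (no PySyft, no autograd).
inputs = [[1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
targets = [1.0, 1.0, 0.0, 0.0]
weights = [0.0, 0.0]
lr = 0.1

losses = []
for step in range(10):
    predictions = [sum(w * v for w, v in zip(weights, row)) for row in inputs]
    errors = [p - t for p, t in zip(predictions, targets)]
    losses.append(sum(e ** 2 for e in errors))
    # d(loss)/d(w_j) = sum_i 2 * error_i * input_ij
    grads = [sum(2 * e * row[j] for e, row in zip(errors, inputs)) for j in range(2)]
    weights = [w - lr * g for w, g in zip(weights, grads)]

print([round(l, 4) for l in losses[:4]])  # [2.0, 0.56, 0.2432, 0.1372]
```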


# Garbage Collection and Common Errors


In [71]:
# PySyft always assumes that when you create a tensor and send it to someone,
# you continue to control the life cycle of that tensor.
# E.g. if you delete the pointer to the tensor, the remote tensor
# associated with that pointer is deleted as well.
bob = bob.clear_objects()

In [72]:
bob._objects

{}

In [77]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [78]:
bob._objects

{68041866096: tensor([1, 2, 3, 4, 5])}

In [79]:
del x

In [80]:
bob._objects

{}

In [81]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [94]:
bob._objects

{}

In [95]:
# rebinding x drops the pointer, so Bob's copy is garbage collected
x = "asdf"

In [90]:
bob._objects

{}

In [96]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [97]:
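# Displaying x makes Jupyter keep a reference to the pointer in its output cache (Out / _),
# so Bob's tensor is no longer garbage collected when x is rebound or deleted below.
x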

(Wrapper)>[PointerTensor | me:21949707481 -> bob:80439636635]

In [98]:
bob._objects

{80439636635: tensor([1, 2, 3, 4, 5])}

In [99]:
x = "asdf"

In [100]:
bob._objects

{80439636635: tensor([1, 2, 3, 4, 5])}

In [101]:
del x

In [102]:
bob._objects

{80439636635: tensor([1, 2, 3, 4, 5])}

In [107]:
bob = bob.clear_objects()
bob._objects

{}

#### Demonstrating garbage collection
If we send a tensor to Bob 1000 times while reusing the same variable, PySyft garbage collects every tensor whose pointer was overwritten and keeps only the most recent one. Only one tensor should remain after the loop finishes.

In [108]:
for i in range(1000):
    x = th.tensor([1,2,3,4,5]).send(bob)

In [111]:
bob._objects

{67703940308: tensor([1, 2, 3, 4, 5])}
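
The behaviour above can be imitated with a toy pointer class whose destructor cleans up the remote copy. This is a sketch under stated assumptions: `ToyWorker` and `ToyPointer` are hypothetical classes, and the example relies on CPython's reference counting, which runs `__del__` as soon as the last reference to an object disappears. It is not how PySyft is implemented internally.

```python
# Toy illustration of pointer-driven garbage collection (hypothetical classes).
# Relies on CPython reference counting to call __del__ promptly.

class ToyWorker:
    def __init__(self):
        self.objects = {}

class ToyPointer:
    _next_id = 0

    def __init__(self, worker, value):
        ToyPointer._next_id += 1
        self.remote_id = ToyPointer._next_id
        self.worker = worker
        worker.objects[self.remote_id] = value  # "send" the value

    def __del__(self):
        # When the pointer dies, ask the worker to drop the remote object.
        self.worker.objects.pop(self.remote_id, None)

bob = ToyWorker()
for i in range(1000):
    # Rebinding x releases the previous pointer, which removes its remote object.
    x = ToyPointer(bob, [1, 2, 3, 4, 5])

remaining = len(bob.objects)     # only the object behind the last pointer survives
del x
after_delete = len(bob.objects)  # deleting the last pointer empties the store
print(remaining, after_delete)
```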

#### Tensors not being collocated error
We'll get this error when we try to call a method/operation involving two tensors where one tensor is actually located on another machine.

In [118]:

# Doing operations on tensors that are not collocated.
# notice we are sending x to bob but not y
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1])

In [119]:
z = x + y
# Notice the error below.

TensorsNotCollocatedException: You tried to call a method involving two tensors where one tensor is actually located on another machine (is a PointerTensor). Call .get() on the PointerTensor or .send(bob) on the other tensor.

Tensor A: [PointerTensor | me:93757846675 -> bob:96967233919]
Tensor B: tensor([1, 1, 1, 1, 1])

In [120]:
# let's try adding two pointer tensors on different machines.
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(alice)

In [121]:
z = x + y

TensorsNotCollocatedException: You tried to call __add__ involving two tensors which are not on the same machine! One tensor is on <VirtualWorker id:bob #objects:4> while the other is on <VirtualWorker id:alice #objects:2>. Use a combination of .move(), .get(), and/or .send() to co-locate them to the same machine.
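
Both errors come down to the same rule: an operation needs all of its operands on one machine. The following toy sketch shows one way such a check could look. `ToyPointer`, `NotCollocatedError` and `move` are hypothetical names invented for this illustration, and keeping the values inside the pointer object is a simplification that a real remote pointer would not make.

```python
# Hypothetical collocation check; not PySyft's actual implementation.

class NotCollocatedError(Exception):
    pass

class ToyPointer:
    def __init__(self, location, values):
        self.location = location      # name of the worker holding the data
        self.values = list(values)    # stands in for the remote tensor

    def __add__(self, other):
        if not isinstance(other, ToyPointer):
            raise NotCollocatedError("one operand is local; send it first or get the other")
        if self.location != other.location:
            raise NotCollocatedError(
                f"operands live on {self.location} and {other.location}"
            )
        return ToyPointer(self.location, [a + b for a, b in zip(self.values, other.values)])

def move(pointer, destination):
    """Return a pointer to a copy of the data on the destination worker."""
    return ToyPointer(destination, pointer.values)

x = ToyPointer("bob", [1, 2, 3, 4, 5])
y = ToyPointer("alice", [1, 1, 1, 1, 1])

try:
    z = x + y
except NotCollocatedError as error:
    message = str(error)

z = x + move(y, "bob")   # co-locate first, then add
print(message)
print(z.location, z.values)
```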

## Toy Federated Learning

Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:

- a toy dataset
- a model
- some basic training logic for training a model to fit the data.

In [46]:
import torch as th
from torch import nn, optim

# create two virtual workers (pseudo devices)
alice = sy.VirtualWorker(hook, id='alice')
bob = sy.VirtualWorker(hook, id='bob')


In [52]:
# A Toy Dataset
data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)

model = nn.Linear(2, 1)
# use a stochastic gradient descent optimizer with a learning rate of 0.1
# (it must be named opt, because train() below refers to it by that name)
opt = optim.SGD(params=model.parameters(), lr=0.1)

In [57]:
def train(iterations=20):
    """Train a simple linear model."""
    for i in range(iterations):
        opt.zero_grad()

        prediction = model(data)
        # sum of squared errors loss
        loss = ((prediction - target) ** 2).sum()
        # backpropagate
        loss.backward()
        opt.step()

        print(loss.data)
        
train()


In [58]:
# split our data and then send it to two different workers.
data_bob = data[0:2].send(bob) # first two rows
target_bob = target[0:2].send(bob) # first two rows of target

# do the same for alice's machine.
data_alice = data[2:4].send(alice)
target_alice = target[2:4].send(alice)

In [67]:
datasets = [(data_bob, target_bob), (data_alice, target_alice)]

def train():
    """Run one training step on each worker's data, sending the model to the data."""
    model = nn.Linear(2,1)
    opt = optim.SGD(params=model.parameters(), lr=0.1)
    
    for _data, _target in datasets:
        # Send the model to wherever the data is located
        # (this calls .send() on each tensor in model.parameters()).
        model = model.send(_data.location)
        
        # do normal training of model
        opt.zero_grad()
        prediction = model(_data)
        loss = ((prediction - _target) ** 2).sum()
        loss.backward()
        opt.step()
        
        # get the smarter model back
        model = model.get()
        
        print(loss.get())

train()


tensor(0.4400, requires_grad=True)
tensor(0.1665, requires_grad=True)


## Advanced Remote Execution Tools

We trained the model using federated learning by:
* calling .send() on our model to send it to the location of the training data
* calling .get() to bring the smarter model back

However, we need to go a bit further to protect people's privacy. We want to average the gradients **BEFORE** calling .get().
That way, we never see anyone's exact gradient, which better protects their privacy.

To do this, we need a few more pieces:

- use a pointer to send a Tensor directly to another worker
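
The idea of averaging before retrieval can be sketched in plain Python. This is an illustration under assumptions the text does not state: it introduces a third party called `secure_worker` where the averaging happens, it averages weight vectors rather than gradients, and it models each worker's object store as a dict. The helper names `move` and `average_on` are hypothetical.

```python
# Toy sketch: average model weights on a trusted third worker before retrieval.
# Worker names and the dict-based "object store" are illustrative assumptions.

stores = {"bob": {}, "alice": {}, "secure_worker": {}}

# Each worker finishes local training with its own weights (made-up values).
stores["bob"]["model"] = [0.9, 0.2]
stores["alice"]["model"] = [0.5, 0.6]

def move(key, source, destination, new_key):
    """Transfer an object directly between workers, bypassing the caller."""
    stores[destination][new_key] = stores[source].pop(key)

move("model", "bob", "secure_worker", "bob_model")
move("model", "alice", "secure_worker", "alice_model")

def average_on(worker, keys, result_key):
    """Average the named weight vectors inside the worker's own store."""
    vectors = [stores[worker].pop(k) for k in keys]
    stores[worker][result_key] = [sum(col) / len(col) for col in zip(*vectors)]

average_on("secure_worker", ["bob_model", "alice_model"], "averaged")

# Only the averaged weights are ever pulled back to the caller.
averaged = stores["secure_worker"].pop("averaged")
print(averaged)
```

The caller's code never reads `bob_model` or `alice_model` individually; only their average leaves `secure_worker`.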

First, let's learn about some advanced tensor operations that will help us with this example and with others in the future!

In [83]:
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

In [84]:
x = th.tensor([1,2,3,4,5]).send(bob)
x

(Wrapper)>[PointerTensor | me:31829264911 -> bob:91654797179]

In [85]:

# send the above pointer to alice and replace it with a pointer to that pointer 
x = x.send(alice)

print("Bob has:", bob._objects)
print("Alice now has:", alice._objects)

Bob has: {91654797179: tensor([1, 2, 3, 4, 5])}
Alice now has: {31829264911: (Wrapper)>[PointerTensor | alice:31829264911 -> bob:91654797179]}


In [86]:
y = x + x 

In [87]:
y

(Wrapper)>[PointerTensor | me:33133663683 -> alice:17692086572]

In [90]:
bob._objects

{91654797179: tensor([1, 2, 3, 4, 5]),
 81075422456: tensor([ 2,  4,  6,  8, 10])}

In [92]:
alice._objects

# as you can see, alice holds pointers to tensors that live on bob's machine.
# if we were to do an operation on two tensors that don't have the same chain
# structure, it would lead to an error

{31829264911: (Wrapper)>[PointerTensor | alice:31829264911 -> bob:91654797179],
 17692086572: (Wrapper)>[PointerTensor | alice:17692086572 -> bob:81075422456]}

In [97]:
jon = sy.VirtualWorker(hook, id="jon")

x = th.tensor([1,2,3]).send(bob).send(alice)
y = th.tensor([1,2,3]).send(bob).send(jon)

# this operation won't work because we are trying to add two tensors 
# that are not on the same machine.(y hasn't been sent to alice)
w = x + y

TensorsNotCollocatedException: You tried to call __add__ involving two tensors which are not on the same machine! One tensor is on <VirtualWorker id:alice #objects:3> while the other is on <VirtualWorker id:jon #objects:2>. Use a combination of .move(), .get(), and/or .send() to co-locate them to the same machine.

In [149]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [150]:
bob._objects

{66445387514: tensor([1, 2, 3, 4, 5])}

In [151]:
alice._objects

{73172654515: (Wrapper)>[PointerTensor | alice:73172654515 -> bob:66445387514]}

In [152]:
# if we get the data from alice by calling get, then alice 
# won't have the pointer because she sent it to us
x = x.get()
x

(Wrapper)>[PointerTensor | me:73172654515 -> bob:66445387514]

In [153]:
bob._objects

{66445387514: tensor([1, 2, 3, 4, 5])}

In [154]:
alice._objects

{}

In [155]:
# calling it again means that Bob will also not have the data
# since he's sending it to us.
x = x.get()
x

tensor([1, 2, 3, 4, 5])

In [156]:
bob._objects

{}

In [157]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [158]:
bob._objects

{93506586597: tensor([1, 2, 3, 4, 5])}

In [159]:
alice._objects

{36924676809: (Wrapper)>[PointerTensor | alice:36924676809 -> bob:93506586597]}

In [160]:
# garbage collection also works if we explicitly delete x.
# Bob won't have the tensor anymore, and Alice won't have the pointer to it.
del x

In [161]:
bob._objects

{}

In [162]:
alice._objects

{}

## Pointer Chain Operations

In [174]:

#first, let's clear objects
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

In [175]:
# create some data and send to bob and then to alice.
x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [176]:
bob._objects

{99266811107: tensor([1, 2, 3, 4, 5])}

In [177]:
alice._objects

{30334613606: (Wrapper)>[PointerTensor | alice:30334613606 -> bob:99266811107]}

In [178]:
# pull the data from Bob onto Alice; afterwards Bob holds no data.
x.remote_get()

(Wrapper)>[PointerTensor | me:35479271960 -> alice:30334613606]

In [179]:
bob._objects

{}

In [180]:
alice._objects

{30334613606: tensor([1, 2, 3, 4, 5])}

In [181]:
# move the data back to Bob (since he no longer has any)
x.move(bob)

(Wrapper)>[PointerTensor | me:35479271960 -> bob:35479271960]

In [192]:
x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [193]:
# Bob should now have two objects
bob._objects

{35479271960: tensor([1, 2, 3, 4, 5]), 50634907307: tensor([1, 2, 3, 4, 5])}

In [194]:
alice._objects

{95212642338: tensor([1, 2, 3, 4, 5]),
 24477462908: (Wrapper)>[PointerTensor | alice:24477462908 -> bob:50634907307]}

In [195]:
x.remote_get()

(Wrapper)>[PointerTensor | me:83400783648 -> alice:24477462908]

In [196]:
bob._objects

{35479271960: tensor([1, 2, 3, 4, 5])}

In [197]:
alice._objects

{95212642338: tensor([1, 2, 3, 4, 5]), 24477462908: tensor([1, 2, 3, 4, 5])}

In [198]:
x.move(bob)

(Wrapper)>[PointerTensor | me:83400783648 -> bob:83400783648]

In [199]:
x

(Wrapper)>[PointerTensor | me:83400783648 -> bob:83400783648]

In [200]:
bob._objects

{35479271960: tensor([1, 2, 3, 4, 5]), 83400783648: tensor([1, 2, 3, 4, 5])}

In [201]:
alice._objects

{95212642338: tensor([1, 2, 3, 4, 5])}
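
The two chain operations used above can be summarised with a toy model in plain Python. This is a sketch, not PySyft code: `Ref`, `remote_get` and `move` are hypothetical re-implementations over dict-based stores, written only to show which store holds what after each step.

```python
# Toy pointer chain: me -> alice -> bob (hypothetical, dict-based stores).

stores = {"bob": {}, "alice": {}}

class Ref:
    """A pointer to the object stored under `key` on `worker`."""
    def __init__(self, worker, key):
        self.worker, self.key = worker, key

def remote_get(ref):
    """Ask the worker behind `ref` to resolve its own pointer by one level."""
    inner = stores[ref.worker][ref.key]  # e.g. alice's pointer to bob
    stores[ref.worker][ref.key] = stores[inner.worker].pop(inner.key)

def move(ref, destination):
    """Transfer the object behind `ref` to `destination` and repoint `ref`."""
    stores[destination][ref.key] = stores[ref.worker].pop(ref.key)
    ref.worker = destination

# Toy equivalent of th.tensor([1,2,3,4,5]).send(bob).send(alice):
stores["bob"]["data"] = [1, 2, 3, 4, 5]
stores["alice"]["ptr"] = Ref("bob", "data")
x = Ref("alice", "ptr")

remote_get(x)   # bob's data is pulled onto alice; bob is left empty
after_remote_get = (dict(stores["bob"]), stores["alice"]["ptr"])

move(x, "bob")  # the data travels to bob; alice is left empty
print(after_remote_get)
print(stores)
```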