# Section: Federated Learning

# Lesson: Introducing Federated Learning

Federated Learning is a technique for training Deep Learning models on data to which you do not have access. Basically:

Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload "model updates" to a central server.

Use Cases:

    - app company (Texting prediction app)
    - predictive maintenance (automobiles / industrial engines)
    - wearable medical devices
    - ad blockers / autocomplete in browsers (Firefox/Brave)
    
Challenge Description: data is distributed amongst sources but we cannot aggregate it because of:

    - privacy concerns: legal, user discomfort, competitive dynamics
    - engineering: the bandwidth/storage requirements of aggregating the larger dataset
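
As a rough illustration of "bringing the model to the data", here is a sketch in plain Python using a one-parameter model `y = w * x`. The `Client` class, its `local_update` method, and the data values are all hypothetical and chosen only for illustration; none of this is a PySyft API. Each client computes a gradient step on data that never leaves it, and the server only ever receives the resulting model update.

```python
# Toy illustration only: a one-parameter model y = w * x trained without
# the server ever reading a client's raw data.

class Client:
    """Hypothetical device that keeps its data private."""

    def __init__(self, xs, ys):
        self._xs = xs  # private inputs, never uploaded
        self._ys = ys  # private targets, never uploaded

    def local_update(self, w, lr=0.05):
        """Return the change to w suggested by this client's data."""
        # Gradient of sum((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(self._xs, self._ys))
        return -lr * grad


clients = [
    Client([1.0, 2.0], [2.0, 4.0]),  # both clients follow y = 2x
    Client([3.0], [6.0]),
]

w = 0.0  # global model held by the central server
for round_number in range(50):
    for client in clients:
        # The model goes to the data; only the update comes back.
        w += client.local_update(w)

print(round(w, 3))
```

The server recovers a slope close to 2 even though it never touches `_xs` or `_ys` directly.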

# Lesson: Introducing / Installing PySyft

In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. This will require a new set of tools. Specifically, we will use an extension of PyTorch called PySyft.

### Install PySyft

The easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html). Create a new environment, then install the dependencies in that environment. In your terminal:

```bash
conda create -n pysyft python=3
conda activate pysyft # some older versions of conda require "source activate pysyft" instead.
conda install jupyter notebook
pip install syft
pip install numpy
```

If you have any errors relating to zstd - run the following (if everything above installed fine then skip this step):

```bash
pip install --upgrade --force-reinstall zstd
```

and then retry installing syft (pip install syft).

If you are using Windows, I suggest installing [Anaconda and using the Anaconda Prompt](https://docs.anaconda.com/anaconda/user-guide/getting-started/) to work from the command line. 

With this environment activated and in the repo directory, launch Jupyter Notebook:

```bash
jupyter notebook
```

and re-open this notebook on the new Jupyter server.

If any part of this doesn't work for you (or any of the tests fail), first check the [README](https://github.com/OpenMined/PySyft.git) for installation help, and then open a GitHub Issue or ping the #beginner channel in our Slack! [slack.openmined.org](http://slack.openmined.org/)

In [None]:
import torch as th

In [None]:
x = th.tensor([1,2,3,4,5])
x

In [None]:
y = x + x

In [None]:

print(y)

In [None]:
import syft as sy

In [None]:
hook = sy.TorchHook(th)
print(hook)

In [None]:
th.tensor([1,2,3,4,5])

# Lesson: Basic Remote Execution in PySyft

## PySyft => Remote PyTorch

The essence of Federated Learning is the ability to train models in parallel on a large number of machines. To do that, we need the ability to tell remote machines to execute the operations required for Deep Learning.

So, instead of using Torch tensors directly, we're now going to work with **pointers** to tensors. Let me show you what I mean. First, let's create a "pretend" machine owned by a "pretend" person - we'll call him Bob.

In [None]:
bob = sy.VirtualWorker(hook, id="bob")

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5])

In [None]:
x = x.send(bob)
x

In [None]:
bob._objects

In [None]:
x.location

In [None]:
x.id_at_location

In [None]:
x.id

In [None]:
x.owner

In [None]:
hook.local_worker

In [None]:
x

In [None]:
x = x.get()
x

In [None]:
bob._objects
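
To picture what the cells above are doing, here is a small stand-alone model of the send/get cycle in plain Python. `ToyWorker`, `ToyPointer`, and `send` are hypothetical names invented for this sketch; it is a mental model, not how PySyft is actually implemented. The point is that the local side holds only a location and an id, while the value sits in the worker's object store.

```python
import itertools

_ids = itertools.count(1)  # simple id generator for stored objects


class ToyWorker:
    """Hypothetical stand-in for a remote machine with an object store."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # id -> value, similar in spirit to bob._objects


class ToyPointer:
    """Local handle that records where a value lives, not the value itself."""

    def __init__(self, location, id_at_location):
        self.location = location
        self.id_at_location = id_at_location

    def get(self):
        # Fetch the value and remove it from the remote store.
        return self.location.objects.pop(self.id_at_location)


def send(value, worker):
    """Store value on worker and return a pointer to it."""
    object_id = next(_ids)
    worker.objects[object_id] = value
    return ToyPointer(worker, object_id)


bob = ToyWorker("bob")
x = send([1, 2, 3, 4, 5], bob)
print(bob.objects)      # bob now holds the data
print(x.location.name)  # the pointer only knows where the data is

x = x.get()
print(x, bob.objects)   # data is back locally; bob's store is empty
```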

# Project: Playing with Remote Tensors

In this project, I want you to .send() and .get() a tensor to TWO workers by calling .send(bob,alice). This will first require the creation of another VirtualWorker called alice.

In [None]:
# try this project here!
alice = sy.VirtualWorker(hook, id="alice")

In [None]:
x = x.send(alice, bob)
x

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x.id

In [None]:
x.id_at_location

In [None]:
x.owner

In [None]:
x = x.get()
x

# Lesson: Introducing Remote Arithmetic

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(bob)

In [None]:
x

In [None]:
y

In [None]:
z = x + y

In [None]:
z

In [None]:
z = z.get()
z

In [None]:
z = th.add(x,y)
z

In [None]:
z = z.get()
z

In [None]:
x = th.tensor([1.,2,3,4,5], requires_grad=True).send(bob)
y = th.tensor([1.,1,1,1,1], requires_grad=True).send(bob)

In [None]:
z = (x + y).sum()

In [None]:
z.backward()

In [None]:
x = x.get()

In [None]:
x

In [None]:
x.grad
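
One way to think about remote arithmetic: adding two pointers does not move any data. Instead, a command is forwarded to the worker, the worker computes the result, and a new pointer comes back. The sketch below models that in plain Python. `ToyWorker`, `ToyPointer`, and `execute_add` are hypothetical names for this illustration only and say nothing about PySyft's internals.

```python
class ToyWorker:
    """Hypothetical remote machine: stores values and runs commands on them."""

    def __init__(self):
        self.objects = {}
        self._next_id = 0

    def store(self, value):
        self._next_id += 1
        self.objects[self._next_id] = value
        return self._next_id

    def execute_add(self, id_a, id_b):
        # The addition runs where the data lives; only an id is returned.
        a, b = self.objects[id_a], self.objects[id_b]
        return self.store([p + q for p, q in zip(a, b)])


class ToyPointer:
    def __init__(self, worker, object_id):
        self.worker = worker
        self.object_id = object_id

    def __add__(self, other):
        # This toy model rejects operands that are not on the same worker.
        if not isinstance(other, ToyPointer) or other.worker is not self.worker:
            raise TypeError("both operands must be pointers to the same worker")
        result_id = self.worker.execute_add(self.object_id, other.object_id)
        return ToyPointer(self.worker, result_id)

    def get(self):
        return self.worker.objects.pop(self.object_id)


bob = ToyWorker()
x = ToyPointer(bob, bob.store([1, 2, 3, 4, 5]))
y = ToyPointer(bob, bob.store([1, 1, 1, 1, 1]))

z = x + y                # z is a pointer; the sum lives on bob
print(len(bob.objects))  # three objects: x, y and the result
print(z.get())           # the result only becomes local after get()
```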

# Project: Learn a Simple Linear Model

In this project, I'd like you to create a simple linear model that fits the dataset below. You should use only Variables and .backward() to do so (no optimizers or nn.Modules). Furthermore, you must do so with both the data and the model being located on Bob's machine.

In [None]:
# try this project here!
input = th.tensor([[0.,0],[0,1],[1,0],[1,1]], requires_grad = True)
labels = th.tensor([[0.],[0],[1],[1]], requires_grad = True)
print(input.shape, labels.shape)

In [None]:
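# keep the returned pointers; otherwise the tensors below stay local
input = input.send(bob)
labels = labels.send(bob)

In [None]:
weights = th.tensor([[0.],[0]], requires_grad = True).send(bob)

In [None]:
for i in range(10):
    # forward pass runs on Bob's machine; pred and loss are pointers
    pred = input.mm(weights)

    loss = ((pred - labels)**2).sum()

    loss.backward()

    # gradient descent step, then reset the gradient for the next iteration
    weights.data.sub_(weights.grad * 0.1)
    weights.grad *= 0

    # fetch only the scalar loss for display
    print(loss.get())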

In [None]:
weights
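
If you want to check the numbers by hand, here is the same update rule written in plain Python with no tensors and no remote workers. This is only a local sanity check under the same learning rate and iteration count as the project; it does not satisfy the "on Bob's machine" requirement. For a loss of the form `sum((x . w - y)^2)`, the gradient for weight `j` is `2 * sum(error_i * x_ij)`.

```python
# Plain-Python version of the update rule, for checking results by hand.
inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [0.0, 0.0, 1.0, 1.0]
weights = [0.0, 0.0]
lr = 0.1

losses = []
for step in range(10):
    # Forward pass: pred_i = x_i . w
    preds = [sum(x_j * w_j for x_j, w_j in zip(x, weights)) for x in inputs]
    errors = [p - t for p, t in zip(preds, labels)]
    losses.append(sum(e ** 2 for e in errors))

    # Gradient of sum(errors^2) for each weight: 2 * sum(e_i * x_ij)
    grads = [2 * sum(e * x[j] for e, x in zip(errors, inputs)) for j in range(2)]

    # Gradient descent step, matching weights.data.sub_(weights.grad * 0.1)
    weights = [w - lr * g for w, g in zip(weights, grads)]

print(losses[0], losses[-1])
print(weights)
```

The loss should shrink on every step, and the first weight should head towards 1 while the second heads towards 0, since the label simply copies the first input column.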

# Lesson: Garbage Collection and Common Errors


In [None]:
bob = bob.clear_objects()

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects

In [None]:
del x

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects

In [None]:
x = "asdf"

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
x

In [None]:
bob._objects

In [None]:
x = "asdf"

In [None]:
bob._objects

In [None]:
del x

In [None]:
bob._objects

In [None]:
bob = bob.clear_objects()
bob._objects

In [None]:
for i in range(1000):
    x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1])

In [None]:
z = x + y

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)
y = th.tensor([1,1,1,1,1]).send(bob)

In [None]:
z = x + y

In [None]:
z = z.get()
z

# Lesson: Toy Federated Learning

Let's start by training a toy model the centralized way. This is about as simple as models get. We first need:

- a toy dataset
- a model
- some basic training logic for training a model to fit the data.

In [None]:
from torch import nn, optim
import torch

In [None]:
# A Toy Dataset
data = torch.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = torch.tensor([[1.],[1],[0],[0]], requires_grad=True)

In [None]:
# A Toy Model
model = nn.Linear(2,1)

In [None]:
optimizer = optim.SGD(params=model.parameters(), lr=0.1)

In [None]:
def train(iterations=20):
    for i in range(iterations):
        optimizer.zero_grad()

        pred = model(data)
        loss = ((pred - target)**2).sum()

        loss.backward()
        optimizer.step()

        print(loss)
        
train()

In [None]:
data_bob = data[0:2].send(bob)
target_bob = target[0:2].send(bob)

In [None]:
data_alice = data[2:4].send(alice)
target_alice = target[2:4].send(alice)

In [None]:
dataset = [(data_bob, target_bob), (data_alice, target_alice)]

In [None]:
_data, _target = dataset[0]

In [None]:
_data.location

In [None]:
def train(iterations=20):
    
    model = nn.Linear(2,1)
    optimizer = optim.SGD(params=model.parameters(), lr=0.1)
    for i in range(iterations):

        for _data, _target in dataset:
            #send model to local machines
            model = model.send(_data.location)

            #do normal training
            optimizer.zero_grad()
            pred = model(_data)
            loss = ((pred - _target)**2).sum()
            loss.backward()
            optimizer.step()

            #get your model back
            model = model.get()

        print(loss.get())

In [None]:
train()

# Lesson: Advanced Remote Execution Tools

In the last section we trained a toy model using Federated Learning. We did this by calling .send() and .get() on our model, sending it to the location of training data, updating it, and then bringing it back. However, at the end of the example we realized that we needed to go a bit further to protect people's privacy. Namely, we want to average the gradients BEFORE calling .get(). That way, we won't ever see anyone's exact gradient (thus better protecting their privacy!)

But, in order to do this, we need a few more pieces:

- use a pointer to send a Tensor directly to another worker

While we're here, we'll also learn a few more advanced tensor operations that will help us both with this example and with a few in the future!
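
To make the goal concrete, here is a sketch of "average before .get()" in plain Python. The `TrustedAggregator` class and the update values are made up for this illustration and are not part of PySyft. Each worker hands its update directly to the aggregator, and the only thing that ever comes back to us is the element-wise mean.

```python
class TrustedAggregator:
    """Hypothetical third party that only ever releases an average."""

    def __init__(self):
        self._received = []  # individual updates stay inside the aggregator

    def receive(self, update):
        self._received.append(list(update))

    def release_average(self):
        count = len(self._received)
        # Element-wise mean across all received update vectors.
        average = [sum(column) / count for column in zip(*self._received)]
        self._received.clear()
        return average


# Made-up updates produced by local training on two workers.
bob_update = [0.25, -0.5]
alice_update = [0.75, 0.5]

aggregator = TrustedAggregator()
aggregator.receive(bob_update)    # sent worker -> aggregator directly
aggregator.receive(alice_update)

global_update = aggregator.release_average()
print(global_update)
```

This is why we need to be able to send a tensor from one worker to another through a pointer: the individual updates must reach the aggregator without passing through our machine first.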

In [None]:
bob.clear_objects()
alice.clear_objects()

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)
x

In [None]:
x = x.send(alice)

In [None]:
x

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
y = x + x

In [None]:
y

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
jon = sy.VirtualWorker(hook, id="jon")

In [None]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x = x.get()
x

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x = x.get()
x

In [None]:
bob._objects

In [None]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
del x

In [None]:
bob._objects

In [None]:
alice._objects
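
The `.send(bob).send(alice)` pattern above builds a chain: we hold a pointer to something on alice, and that something is itself a pointer to the data on bob. Here is a toy version in plain Python that mirrors the sequence of cells above. `ToyWorker` and `ToyPointer` are hypothetical classes written for this sketch, not PySyft code.

```python
class ToyWorker:
    """Hypothetical remote machine with an object store."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self._next_id = 0

    def store(self, value):
        self._next_id += 1
        self.objects[self._next_id] = value
        return self._next_id


class ToyPointer:
    def __init__(self, worker, object_id):
        self.worker = worker
        self.object_id = object_id

    def send(self, worker):
        # Sending a pointer stores the pointer itself on the next worker.
        return ToyPointer(worker, worker.store(self))

    def get(self):
        # Each get() removes one link of the chain.
        return self.worker.objects.pop(self.object_id)

    def __repr__(self):
        return f"<ToyPointer -> {self.worker.name}:{self.object_id}>"


bob, alice = ToyWorker("bob"), ToyWorker("alice")

x = ToyPointer(bob, bob.store([1, 2, 3, 4, 5])).send(alice)
print(bob.objects)    # the data itself
print(alice.objects)  # a pointer to bob's data

x = x.get()           # first get(): a pointer to bob; alice is now empty
x = x.get()           # second get(): the data; bob is now empty
print(x)
```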

# Lesson: Pointer Chain Operations

In [None]:
bob.clear_objects()
alice.clear_objects()

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x.move(alice)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x.remote_get()

In [None]:
bob._objects

In [None]:
alice._objects

In [None]:
x.move(bob)

In [None]:
x

In [None]:
bob._objects

In [None]:
alice._objects

In [1]:
import torch
from torch import nn
import torch.nn.functional as F
from torchvision import datasets, transforms
#import syft as sy

# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                              transforms.Normalize((0.5,), (0.5,))])

In [2]:
import syft as sy

hook = sy.TorchHook(torch)

bob = sy.VirtualWorker(hook, id='bob')
alice = sy.VirtualWorker(hook, id='alice')
jon = sy.VirtualWorker(hook, id='jon')



In [3]:
# Download and load the training data
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

federated_train_loader = sy.FederatedDataLoader( # <-- this is now a FederatedDataLoader 
    datasets.MNIST('../data', train=True, download=True,
                   transform=transform).federate((bob, alice)),# <-- NEW: we distribute the dataset across all the workers, it's now a FederatedDataset
                   batch_size=64, shuffle=True)

In [None]:
next(iter(federated_train_loader))
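
A rough way to picture a federated data loader: the dataset is split into shards, each shard lives on one worker, and every batch comes from exactly one worker. The sketch below shows that idea in plain Python. The `federate` and `federated_batches` functions are hypothetical helpers written for this illustration, and I'm assuming contiguous, equal-sized shards here; PySyft's actual splitting strategy may differ.

```python
import math


def federate(samples, worker_names):
    """Split samples into contiguous shards, one per (hypothetical) worker."""
    shard_size = math.ceil(len(samples) / len(worker_names))
    return {
        name: samples[i * shard_size:(i + 1) * shard_size]
        for i, name in enumerate(worker_names)
    }


def federated_batches(shards, batch_size):
    """Yield (location, batch) pairs so the caller knows where the batch lives."""
    for name, shard in shards.items():
        for start in range(0, len(shard), batch_size):
            yield name, shard[start:start + batch_size]


samples = list(range(10))  # stand-in for 10 training examples
shards = federate(samples, ["bob", "alice"])

for location, batch in federated_batches(shards, batch_size=2):
    print(location, batch)
```

Because every batch has a single location, a training loop can send the model to that location before running the forward pass on the batch.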

In [4]:
from torch import optim

model = nn.Sequential(nn.Linear(784, 128),
                     nn.ReLU(),
                     nn.Linear(128, 64),
                     nn.ReLU(),
                     nn.Linear(64, 10))

#for param in model.parameters():
#    param.requires_grad = True

optimizer = optim.SGD(params=model.parameters(), lr = 0.001)
criterion = nn.CrossEntropyLoss()

In [23]:
model

Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=64, bias=True)
  (3): ReLU()
  (4): Linear(in_features=64, out_features=10, bias=True)
)

In [22]:
print(model.location)

AttributeError: 'Sequential' object has no attribute 'location'

In [16]:
len(alice._objects), len(bob._objects)

(2, 2)

In [24]:
epochs = 10

#model.get()

for epoch in range(epochs):
    
    running_loss = 0
    i = 0
    
    print(f"epoch #{epoch}")

    for data, target in federated_train_loader:
        
        #sending model    
        model.send(data.location)
        
        #normal training        
        data = data.view(data.shape[0], -1)
        
        optimizer.zero_grad()
        
        preds = model(data)
        
        
        loss = criterion(preds, target)
        
        loss.backward()
        optimizer.step()
        
        c_loss = loss.get()
        running_loss += c_loss
        
        if i%200 == 0:
            print(f"iteraton #{i}, loss: {c_loss} ")
            
        #getting the updated model back from the worker
        model = model.get()
        
        i+=1
    
    else:
        print(f"training loss for epoch #{epoch}: {running_loss/len(federated_train_loader)}")

epoch #0
iteraton #0, loss: 2.0703284740448 
iteraton #200, loss: 1.981971263885498 
iteraton #400, loss: 1.9552336931228638 
iteraton #600, loss: 1.830167293548584 
iteraton #800, loss: 1.661259651184082 
training loss for epoch #0: 1.8869941234588623
epoch #1
iteraton #0, loss: 1.6650855541229248 
iteraton #200, loss: 1.473429799079895 
iteraton #400, loss: 1.3863648176193237 
iteraton #600, loss: 1.2823587656021118 
iteraton #800, loss: 1.0514498949050903 
training loss for epoch #1: 1.3721574544906616
epoch #2
iteraton #0, loss: 1.1837631464004517 
iteraton #200, loss: 0.9677549600601196 
iteraton #400, loss: 1.0327754020690918 
iteraton #600, loss: 0.7608288526535034 
iteraton #800, loss: 0.8710792660713196 
training loss for epoch #2: 0.9555604457855225
epoch #3
iteraton #0, loss: 0.8272402882575989 
iteraton #200, loss: 0.6295068264007568 
iteraton #400, loss: 0.663025975227356 
iteraton #600, loss: 0.7249807715415955 
iteraton #800, loss: 0.6824154257774353 
training loss for e

AttributeError: 'Parameter' object has no attribute 'child'

In [21]:
model.get()

AttributeError: 'Parameter' object has no attribute 'child'