# Part 10: Federated Learning with Encrypted Gradient Aggregation

In the last few sections, we've been learning about encrypted computation by building several simple programs. In this section, we're going to return to the [Federated Learning Demo of Part 4](https://github.com/OpenMined/PySyft/blob/dev/examples/tutorials/Part%204%20-%20Federated%20Learning%20via%20Trusted%20Aggregator.ipynb), where we had a "trusted aggregator" who was responsible for averaging the model updates from multiple workers.

We will now use our new tools for encrypted computation to remove this trusted aggregator. Relying on one is less than ideal, because it assumes we can find someone trustworthy enough to access this sensitive information, which is not always the case.

Thus, in this notebook, we will show how one can use secure multi-party computation (SMPC) to perform secure aggregation, so that we no longer need a "trusted aggregator".
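To make the idea concrete before we touch PySyft, here is a minimal sketch of additive secret sharing, the building block behind this kind of secure aggregation. It is plain Python rather than the PySyft implementation; the modulus `Q`, the helper names `share` and `reconstruct`, and the example values are all assumptions made for illustration.

```python
import secrets

Q = 2**62  # modulus of the ring the shares live in (an arbitrary choice for this sketch)

def share(secret, n_parties=2):
    """Split an integer into n additive shares that sum to it modulo Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Recombine shares; map the upper half of the ring back to negative numbers."""
    value = sum(shares) % Q
    return value - Q if value > Q // 2 else value

# Each worker secret-shares its (integer-encoded) model update.
bob_update, alice_update = 40, -12
bob_shares = share(bob_update)
alice_shares = share(alice_update)

# Every party adds the shares it holds; no single party sees a raw update.
summed_shares = [(b + a) % Q for b, a in zip(bob_shares, alice_shares)]

# Only the sum is ever reconstructed, and the average follows from it.
total = reconstruct(summed_shares)
print(total / 2)  # 14.0
```

Each individual share is a uniformly random number, so a share on its own reveals nothing about the update it came from; only the combined result is opened.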

Authors:
- Theo Ryffel - Twitter: [@theoryffel](https://twitter.com/theoryffel)
- Andrew Trask - Twitter: [@iamtrask](https://twitter.com/iamtrask)

# Section 1: Normal Federated Learning

First, here is some code which performs classic federated learning on the Boston Housing Dataset. The code is broken into several parts.

## Setting Up

In [1]:
import pickle

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

class Parser:
    """Parameters for training"""
    def __init__(self):
        self.epochs = 10
        self.lr = 0.001
        self.test_batch_size = 8
        self.batch_size = 8
        self.log_interval = 10
        self.seed = 1
    
args = Parser()

torch.manual_seed(args.seed)
kwargs = {}

# NOTE: this makes CUDA tensors the default, so the notebook requires a CUDA-capable GPU
torch.set_default_tensor_type(torch.cuda.FloatTensor)

## Loading the Dataset

In [2]:
with open('../data/BostonHousing/boston_housing.pickle','rb') as f:
    ((X, y), (X_test, y_test)) = pickle.load(f)

X = torch.from_numpy(X).float()
y = torch.from_numpy(y).float()
X_test = torch.from_numpy(X_test).float()
y_test = torch.from_numpy(y_test).float()
# preprocessing
mean = X.mean(0, keepdim=True)
dev = X.std(0, keepdim=True)
mean[:, 3] = 0. # the feature at column 3 is binary,
dev[:, 3] = 1.  # so we don't standardize it
X = (X - mean) / dev
X_test = (X_test - mean) / dev
train = TensorDataset(X, y)
test = TensorDataset(X_test, y_test)
train_loader = DataLoader(train, batch_size=args.batch_size, shuffle=True, **kwargs)
test_loader = DataLoader(test, batch_size=args.test_batch_size, shuffle=True, **kwargs)

## Neural Network Structure

In [3]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(13, 32)
        self.fc2 = nn.Linear(32, 24)
        self.fc3 = nn.Linear(24, 1)

    def forward(self, x):
        x = x.view(-1, 13)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = Net()
optimizer = optim.SGD(model.parameters(), lr=args.lr)

## Hooking PyTorch

In [4]:
import syft as sy

hook = sy.TorchHook(torch)
bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")
james = sy.VirtualWorker(hook, id="james")

compute_nodes = [bob, alice]




**Send data to the workers** <br>
Usually the workers would already hold their data; we send it manually here only for demo purposes.

In [5]:
train_distributed_dataset = []

for batch_idx, (data,target) in enumerate(train_loader):
    data = data.send(compute_nodes[batch_idx % len(compute_nodes)])
    target = target.send(compute_nodes[batch_idx % len(compute_nodes)])
    train_distributed_dataset.append((data, target))

## Training Function

In [6]:
def train(epoch):
    model.train()
    for batch_idx, (data,target) in enumerate(train_distributed_dataset):
        worker = data.location
        model.send(worker)

        optimizer.zero_grad()
        # update the model
        pred = model(data)
        loss = F.mse_loss(pred.view(-1), target)
        loss.backward()
        optimizer.step()
        model.get()
            
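        if batch_idx % args.log_interval == 0:
            loss = loss.get()  # fetch the loss value from the remote worker
            # report progress in samples: len(train_loader) counts batches, not samples
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * args.batch_size, len(train_loader.dataset),
                       100. * batch_idx / len(train_loader), loss.item()))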
        


## Testing Function

In [7]:
def test():
    model.eval()
    test_loss = 0
    with torch.no_grad():  # no gradients needed for evaluation
        for data, target in test_loader:
            output = model(data)
            # sum up the squared error over the batch
            test_loss += F.mse_loss(output.view(-1), target, reduction='sum').item()

    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}\n'.format(test_loss))

## Training the Model

In [8]:
import time

In [9]:
#t = time.time()
#
#for epoch in range(1, args.epochs + 1):
#    train(epoch)
#
#    
#total_time = time.time() - t
#print('Total', round(total_time, 2), 's')

## Calculating Performance

In [10]:
#test()

# Section 2: Adding Encrypted Aggregation

Now we're going to slightly modify this example to aggregate gradients using encryption. The main difference is really one or two lines of code in the `train()` function, which we'll point out. For the moment, let's re-process our data and initialize a model for bob and alice.

In [11]:
# one list of (data, target) batches per compute node
remote_dataset = (list(), list())

for batch_idx, (data, target) in enumerate(train_loader):
    # distribute the batches round-robin across the workers
    worker_index = batch_idx % len(compute_nodes)
    data = data.send(compute_nodes[worker_index])
    target = target.send(compute_nodes[worker_index])
    remote_dataset[worker_index].append((data, target))

def update(data, target, model, optimizer):
    model.send(data.location)
    optimizer.zero_grad()
    pred = model(data)
    loss = F.mse_loss(pred.view(-1), target)
    loss.backward()
    optimizer.step()
    return model

bobs_model = Net().to("cuda")
alices_model = Net().to("cuda")

bobs_optimizer = optim.SGD(bobs_model.parameters(), lr=args.lr)
alices_optimizer = optim.SGD(alices_model.parameters(), lr=args.lr)

models = [bobs_model, alices_model]
params = [list(bobs_model.parameters()), list(alices_model.parameters())]
optimizers = [bobs_optimizer, alices_optimizer]


## Building our Training Logic

The only **real** difference is inside of this train method. Let's walk through it step-by-step.

### Part A: Train

In [12]:
# this is selecting which batch to train on
data_index = 0
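In this step each worker takes one local training step on its own batch, which is what `update()` does for the real models in the combined loop later in this section. The idea can be sketched in plain Python, assuming a one-weight linear model and made-up batches; `sgd_step` and `local_batches` are hypothetical names and nothing here calls PySyft.

```python
def sgd_step(weight, batch, lr=0.01):
    """One SGD step on mean squared error for the model y = weight * x."""
    grad = sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)
    return weight - lr * grad

# Hypothetical local batches: the data never leaves its worker.
local_batches = {
    "bob": [(1.0, 2.0), (2.0, 4.0)],
    "alice": [(3.0, 6.0), (4.0, 8.0)],
}

# Every worker starts from the same weight and trains on its own batch.
weights = {name: sgd_step(0.0, batch) for name, batch in local_batches.items()}
print(weights)
```

After this step the two workers hold different weights, and it is exactly these diverging copies that the aggregation step has to average.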



### Part B: Encrypted Aggregation

In [13]:
# create a list where we'll deposit our encrypted model average
new_params = list()
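The aggregation step averages matching parameters from the two models. Secret sharing operates on integers, so floating point parameters are first encoded in fixed precision and decoded again after the sum is opened. Below is a rough, PySyft-free sketch of that round trip. The scale factor `SCALE`, the modulus `Q`, the helper names, and the parameter values are illustrative assumptions, not what the library uses internally.

```python
import secrets

Q = 2**62      # ring size for the shares (an assumption of this sketch)
SCALE = 10**3  # three decimal digits of fixed precision (also an assumption)

def encode(x):
    """Fixed precision: represent a float as a scaled integer."""
    return round(x * SCALE)

def decode(n):
    """Map a ring element back to a signed float."""
    n = n - Q if n > Q // 2 else n
    return n / SCALE

def share(n):
    """Split an integer into two additive shares modulo Q."""
    r = secrets.randbelow(Q)
    return r, (n - r) % Q

# Hypothetical parameters of the two local models (flattened to floats).
bobs_params = [0.5, -1.25, 2.0]
alices_params = [1.5, 0.25, -1.0]

new_params = list()
for b, a in zip(bobs_params, alices_params):
    b_shares, a_shares = share(encode(b)), share(encode(a))
    # add share-wise: each share holder only ever sees random-looking numbers
    summed = [(sb + sa) % Q for sb, sa in zip(b_shares, a_shares)]
    # reconstruct the sum, decode it, then average
    new_params.append(decode(sum(summed) % Q) / 2)

print(new_params)  # [1.0, -0.5, 0.5]
```

The precision lost in `encode` is the price of working over integers; a larger `SCALE` keeps more digits but leaves less headroom before values wrap around the ring.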


### Part C: Cleanup
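Cleanup, as the combined loop below does it, overwrites every model's parameters with the freshly averaged values so that all workers start the next round from the same model. A toy sketch of that in-place update follows, with nested Python lists standing in for parameter tensors; the values are made up.

```python
# Hypothetical stand-ins: each "model" is a list of parameter vectors.
params = [
    [[0.5, -1.25], [2.0]],  # bob's parameters
    [[1.5, 0.25], [-1.0]],  # alice's parameters
]
new_params = [[1.0, -0.5], [0.5]]  # the averaged parameters from aggregation

for model_params in params:
    for param, averaged in zip(model_params, new_params):
        # slice assignment mutates the existing list, like an in-place tensor update
        param[:] = averaged

print(params[0] == params[1])  # True
```

Updating in place matters in the real loop as well: the optimizers hold references to the existing parameter objects, so replacing those objects instead of their contents would leave the optimizers pointing at stale parameters.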

## Let's put it all Together!!

And now that we know each step, we can put it all together into one training loop!

In [14]:
def train(epoch):
    for data_index in range(len(remote_dataset[0])-1):
        print(params[0][0][0])

        # update remote models
        for remote_index in range(len(compute_nodes)):
            data, target = remote_dataset[remote_index][data_index]
            d = data.to("cuda")
            t = target.to("cuda")
            models[remote_index] = update(d, t, models[remote_index], optimizers[remote_index])

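        # encrypted aggregation
        new_params = list()
        for param_i in range(len(params[0])):

            # for each worker
            spdz_params = list()
            for remote_index in range(len(compute_nodes)):

                # select the identical parameter from each worker and copy it
                copy_of_parameter = params[remote_index][param_i].copy()

                # SMPC works on integers, so encode the floats in fixed precision
                fixed_precision_param = copy_of_parameter.fix_precision()

                # secret-share the parameter between bob and alice;
                # james only acts as the crypto provider
                encrypted_param = fixed_precision_param.share(bob, alice, crypto_provider=james)

                # fetch the pointer to the shared (still encrypted) parameter
                param = encrypted_param.get()
                spdz_params.append(param)

            # average params from multiple workers, fetch them to the local machine
            # decrypt and decode (from fixed precision) back into a floating point number
            new_param = (spdz_params[0] + spdz_params[1]).get().float_precision() / 2

            # save the new averaged parameter
            new_params.append(new_param)

        # cleanup
        with torch.no_grad():
            # zero out the old parameters
            for model in params:
                for param in model:
                    param *= 0

            # bring the models back to the local machine
            for model in models:
                model.get()

            # overwrite every model's parameters with the new average
            for remote_index in range(len(compute_nodes)):
                for param_index in range(len(params[remote_index])):
                    params[remote_index][param_index].data = (new_params[param_index])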

In [15]:
def test():
    models[0].eval()
    test_loss = 0
    with torch.no_grad():  # no gradients needed for evaluation
        for data, target in test_loader:
            d = data.to("cuda")
            t = target.to("cuda")
            output = models[0](d)
            # sum up the squared error over the batch
            test_loss += F.mse_loss(output.view(-1), t, reduction='sum').item()

    test_loss /= len(test_loader.dataset)
    print('Test set: Average loss: {:.4f}\n'.format(test_loss))

In [16]:
t = time.time()

for epoch in range(args.epochs):
    print(params[0][0][0])
    print("--")
    print(f"Epoch {epoch + 1}")
    train(epoch)
    test()

    
total_time = time.time() - t
print('Total', round(total_time, 2), 's')

tensor([ 0.1949, -0.2378, -0.0006,  0.0733, -0.2164, -0.0913,  0.1612, -0.0450,
         0.0629,  0.0570, -0.1938, -0.0333,  0.1362], grad_fn=<SelectBackward>)
--
Epoch 1
tensor([ 0.1949, -0.2378, -0.0006,  0.0733, -0.2164, -0.0913,  0.1612, -0.0450,
         0.0629,  0.0570, -0.1938, -0.0333,  0.1362], grad_fn=<SelectBackward>)
tensor([ 0.1700, -0.1786, -0.1206,  0.1590, -0.0383,  0.0700,  0.0935, -0.0672,
         0.1568,  0.0562, -0.0616,  0.0482,  0.1081], grad_fn=<SelectBackward>)
tensor([ 0.1699, -0.1785, -0.1208,  0.1590, -0.0384,  0.0699,  0.0934, -0.0671,
         0.1564,  0.0559, -0.0617,  0.0481,  0.1081], grad_fn=<SelectBackward>)
tensor([ 0.1697, -0.1783, -0.1211,  0.1590, -0.0388,  0.0697,  0.0931, -0.0669,
         0.1559,  0.0554, -0.0620,  0.0483,  0.1079], grad_fn=<SelectBackward>)
tensor([ 0.1697, -0.1783, -0.1211,  0.1590, -0.0390,  0.0695,  0.0929, -0.0668,
         0.1558,  0.0554, -0.0617,  0.0482,  0.1079], grad_fn=<SelectBackward>)
tensor([ 0.1697, -0.1781, -0.

KeyboardInterrupt: 

# Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

### Star PySyft on GitHub

The easiest way to help our community is just by starring the repos! This helps raise awareness of the cool tools we're building.

- [Star PySyft](https://github.com/OpenMined/PySyft)

### Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at [http://slack.openmined.org](http://slack.openmined.org)

### Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of the projects you can join! If you don't want to join a project but would like to do a bit of coding, you can also look for more "one off" mini-projects by searching for GitHub issues marked "good first issue".

- [PySyft Projects](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3AProject)
- [Good First Issue Tickets](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)

### Donate

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

[OpenMined's Open Collective Page](https://opencollective.com/openmined)