<a href="https://colab.research.google.com/github/ewotawa/secure_private_ai/blob/master/Section_2_Federated_Learning_Final_Project_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Federated Learning Final Project

## Overview
* See  <a href="https://classroom.udacity.com/nanodegrees/nd185/parts/3fe1bb10-68d7-4d84-9c99-9539dedffad5/modules/28d685f0-0cb1-4f94-a8ea-2e16614ab421/lessons/c8fe481d-81ea-41be-8206-06d2deeb8575/concepts/a5fb4b4c-e38a-48de-b2a7-4e853c62acbe">video</a> for additional details. 
* Do Federated Learning where the central server is not trusted with the raw gradients.  
* In the final project notebook, you'll receive a dataset.  
* Train on the dataset using Federated Learning.  
* The gradients should not come up to the server in raw form.  
* Instead, use the new .move() command to move all of the gradients to one of the workers, sum them up there, and then bring that batch up to the central server and then bring that batch up 
* Idea: the central server never actually sees the raw gradient for any person.  
* We'll look at secure aggregation in course 3.  
* For now, do a larger-scale Federated Learning case where you handle the gradients in a special way.

## Approach
* Reviewing methods of classmates for Federated Learning. 

## References
*  <a href = "https://github.com/edgarinvillegas/private-ai/blob/master/Section%203%20-%20Final%20project.ipynb/">GitHub Notebook</a>
* <a href = "https://github.com/OpenMined/PySyft/blob/dev/examples/tutorials/Part%2010%20-%20Federated%20Learning%20with%20Secure%20Aggregation.ipynb">Part 10: Federated Learning with Encrypted Gradient Aggregation</a>


### Install libraries and dependencies

In [1]:
# PySyft

!pip install syft

import syft as sy

# PyTorch

!pip install torch
!pip install torchvision

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import torchvision
from torchvision import datasets, transforms

# Numpy

import numpy as np

Collecting syft
[?25l  Downloading https://files.pythonhosted.org/packages/38/2e/16bdefc78eb089e1efa9704c33b8f76f035a30dc935bedd7cbb22f6dabaa/syft-0.1.21a1-py3-none-any.whl (219kB)
[K     |████████████████████████████████| 225kB 9.7MB/s 
[?25hCollecting websockets>=7.0 (from syft)
[?25l  Downloading https://files.pythonhosted.org/packages/c1/d2/bf72435a7d56f94b57efdeae26c76bf0d16f409fd44ff595da745c3fbefd/websockets-8.0.1-cp36-cp36m-manylinux1_x86_64.whl (72kB)
[K     |████████████████████████████████| 81kB 36.6MB/s 
Collecting websocket-client>=0.56.0 (from syft)
[?25l  Downloading https://files.pythonhosted.org/packages/29/19/44753eab1fdb50770ac69605527e8859468f3c0fd7dc5a76dd9c4dbd7906/websocket_client-0.56.0-py2.py3-none-any.whl (200kB)
[K     |████████████████████████████████| 204kB 51.5MB/s 
[?25hCollecting zstd>=1.4.0.0 (from syft)
[?25l  Downloading https://files.pythonhosted.org/packages/22/37/6a7ba746ebddbd6cd06de84367515d6bc239acd94fb3e0b1c85788176ca2/zstd-1.4.1.0.tar

W0721 19:27:25.114496 140153441781632 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'
W0721 19:27:25.129579 140153441781632 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.





###  Recall Toy Federated Learning

Use the data and model from Section 2

- a toy dataset
- a model
- some basic training logic for training a model to fit the data.

In [0]:
# A Toy Dataset
data = torch.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = torch.tensor([[1.],[1], [0], [0]], requires_grad=True)

In [0]:
# A Toy Model
model = nn.Linear(2,1)

In [0]:
# Optimizer
opt = optim.SGD(params=model.parameters(), lr=0.1)

### Federated Learning

Set up hook, virtual workers, and virtual aggregator

In [0]:
hook = sy.TorchHook(torch)  # <-- NEW: hook PyTorch ie add extra functionalities to support Federated Learning
vw00 = sy.VirtualWorker(hook, id="vw00")
vw01 = sy.VirtualWorker(hook, id="vw01")

aggr = sy.VirtualWorker(hook, id="aggr")

In [0]:
vw00.clear_objects()
vw01.clear_objects()
aggr.clear_objects()

compute_nodes = [vw00, vw01]

In [130]:
data_vw00 = data[0:2].send(vw00)
target_vw00 = target[0:2].send(vw00)

vw00._objects

{33131537139: tensor([[1., 1.],
         [0., 1.]], requires_grad=True), 56770717889: tensor([[1.],
         [1.]], requires_grad=True)}

In [131]:
data_vw01 = data[2:4].send(vw01)
target_vw01 = target[2:4].send(vw01)

vw01._objects

{4977808065: tensor([[1., 0.],
         [0., 0.]], requires_grad=True), 20190781258: tensor([[0.],
         [0.]], requires_grad=True)}

In [0]:
datasets = [(data_vw00, target_vw00), (data_vw01, target_vw01)]

In [133]:
vw00_m = nn.Linear(2,1).send('vw00')
vw01_m = nn.Linear(2,1).send('vw01')

models = [vw00_m, vw01_m]
print('models: \n', models)

vw00_o = optim.SGD(params=models[0].parameters(), lr=0.1)
vw01_o = optim.SGD(params=models[1].parameters(), lr=0.1)

opts = [vw00_o, vw01_o]
print('\noptimizers: \n', opts)

vw00_p = list(models[0].parameters())
vw01_p = list(models[1].parameters())

params = [vw00_p, vw01_p]
print('\nparameters: \n', params)

models: 
 [Linear(in_features=2, out_features=1, bias=True), Linear(in_features=2, out_features=1, bias=True)]

optimizers: 
 [SGD (
Parameter Group 0
    dampening: 0
    lr: 0.1
    momentum: 0
    nesterov: False
    weight_decay: 0
), SGD (
Parameter Group 0
    dampening: 0
    lr: 0.1
    momentum: 0
    nesterov: False
    weight_decay: 0
)]

parameters: 
 [[Parameter containing:
Parameter>[PointerTensor | me:16636253281 -> vw00:32115523423], Parameter containing:
Parameter>[PointerTensor | me:32201929280 -> vw00:24923104342]], [Parameter containing:
Parameter>[PointerTensor | me:79356677133 -> vw01:62241726400], Parameter containing:
Parameter>[PointerTensor | me:14087045917 -> vw01:62576430508]]]


In [0]:
def fed_train(iterations=20):

    for iter in range(iterations):
        
        print('iter: \t', iter)

        for i in range(len(compute_nodes)): 
          # locate the data, identify model, optimizer by dataset ids
          data = datasets[i][0]
          target = datasets[i][1]
          
          worker_id = data.location.id
          model = models[i]
          opt = opts[i]
          

          print("data location: ", data.location, "\tworker ID: ", worker_id)

          # do normal training
          opt.zero_grad()
          pred = model(data)
          loss = ((pred - target)**2).sum()
          loss.backward()
          opt.step()

    return models, params
            

In [135]:
models, params = fed_train()

iter: 	 0
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id:vw01 #objects:4> 	worker ID:  vw01
iter: 	 1
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id:vw01 #objects:4> 	worker ID:  vw01
iter: 	 2
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id:vw01 #objects:4> 	worker ID:  vw01
iter: 	 3
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id:vw01 #objects:4> 	worker ID:  vw01
iter: 	 4
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id:vw01 #objects:4> 	worker ID:  vw01
iter: 	 5
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id:vw01 #objects:4> 	worker ID:  vw01
iter: 	 6
data location:  <VirtualWorker id:vw00 #objects:4> 	worker ID:  vw00
data location:  <VirtualWorker id

In [136]:
print(models)

[Linear(in_features=2, out_features=1, bias=True), Linear(in_features=2, out_features=1, bias=True)]


In [137]:
print(params)

[[Parameter containing:
Parameter>[PointerTensor | me:16636253281 -> vw00:32115523423], Parameter containing:
Parameter>[PointerTensor | me:32201929280 -> vw00:24923104342]], [Parameter containing:
Parameter>[PointerTensor | me:79356677133 -> vw01:62241726400], Parameter containing:
Parameter>[PointerTensor | me:14087045917 -> vw01:62576430508]]]


### Encrypted Aggregation

In [0]:
# create list to deposit encrypted averages
new_params = list()

In [140]:
print(len(params[0]))
print(len(compute_nodes))

2
2


In [0]:
# iterate through each parameter
for param_i in range(len(params[0])):
  
  # for each worker
  spdz_params = list()
  for remote_index in range(len(compute_nodes)):
    
    # select the identical parameter from each worker and copy it
    copy_of_parameter = params[remote_index][param_i].copy()
    
    # SMPC can only work with integers (not floats). Use Integers to store decimal information.
    # Use fixed precision encoding.
    fixed_precision_param = copy_of_parameter
    
    # encrypt on the remote machine. 
    # note: fixed_precision_param is already a pointer. 
    # calling share encrypts the data to which it is pointing.
    # returns a pointer to MPC secret shared object; need to fetch.
    encrypted_param = fixed_precision_param.share(vw00, vw01, crypto_provider=aggr)
    
    # fetch the pointer to MPC shared value
    param = encrypted_param.get()
    
    # save the parameter so that can average it with same parameter from other workers
    spdz_params.append(param)
    
  # average params from multiple workers, fetch to local machine
  # decrypt and decode (from fixed precision) back to floating point number
  new_param = (spdz_params[0] + spdz_params[1]).get() / 2
  
  # save the new averaged parameter
  new_params.append(new_param)

In [142]:
print(new_params)

[tensor([[0, 0]]), tensor([0])]


In [145]:
# cleanup
with torch.no_grad():
  for model in params: 
    
    for param in model: 
      param *= 0
      
    for model in models:
      model.get()
    
    for remote_index in range(len(compute_nodes)):
      for param_index in range(len(params[remote_index])):
        params[remote_index][param_index].set_(new_params[param_index])

RuntimeError: ignored