# Section: Federated Learning

# Lesson: Introducing Federated Learning

Federated Learning is a technique for training Deep Learning models on data to which you do not have access. Basically:

Federated Learning: Instead of bringing all the data to one machine and training a model, we bring the model to the data, train it locally, and merely upload "model updates" to a central server.
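
To make the idea concrete, here is a minimal sketch of that loop in plain Python rather than PySyft. The two "clients", their data, the one-parameter linear model, and the helper names (`local_update`, `federated_round`) are all hypothetical; the only point is that the raw data stays with each client and just the updated weight travels to the server.

```python
# Toy federated round: each client fits y = w * x on its own private data.
# Everything here is illustrative; none of it is PySyft API.

def local_update(w, data, lr=0.01, steps=20):
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(steps):
        # Gradient of the mean squared error for the model y_hat = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w  # only this number leaves the client


def federated_round(w_global, client_datasets):
    """Send the global weight to each client, then average what comes back."""
    local_weights = [local_update(w_global, data) for data in client_datasets]
    return sum(local_weights) / len(local_weights)


# Two clients whose private data both follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]

w = 0.0
for _ in range(10):
    w = federated_round(w, clients)
print(round(w, 3))
```

The server never sees an `(x, y)` pair, only the weights returned by `local_update`. Real systems average many parameters instead of one, but the structure of a round is the same.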

Use Cases:

    - app company (Texting prediction app)
    - predictive maintenance (automobiles / industrial engines)
    - wearable medical devices
    - ad blockers / autocomplete in browsers (Firefox/Brave)
    
Challenge Description: data is distributed among sources, but we cannot aggregate it because of:

    - privacy concerns: legal, user discomfort, competitive dynamics
    - engineering: the bandwidth/storage requirements of aggregating the larger dataset

# Lesson: Introducing / Installing PySyft

In order to perform Federated Learning, we need to be able to use Deep Learning techniques on remote machines. This requires a new set of tools. Specifically, we will use an extension of PyTorch called PySyft.

### Install PySyft

The easiest way to install the required libraries is with [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/overview.html). Create a new environment, then install the dependencies in that environment. In your terminal:

```bash
conda create -n pysyft python=3
conda activate pysyft # some older versions of conda require "source activate pysyft" instead.
conda install jupyter notebook
pip install syft
pip install numpy
```

If you see any errors relating to zstd, run the following (skip this step if everything above installed fine):

```
pip install --upgrade --force-reinstall zstd
```

and then retry installing syft (pip install syft).

If you are using Windows, I suggest installing [Anaconda and using the Anaconda Prompt](https://docs.anaconda.com/anaconda/user-guide/getting-started/) to work from the command line. 

With this environment activated and in the repo directory, launch Jupyter Notebook:

```bash
jupyter notebook
```

and re-open this notebook on the new Jupyter server.
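
Before going further, it can help to confirm that the new environment actually sees the packages. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list of required names is an assumption based on the imports used later in this notebook.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed from the imports used later in this notebook.
required = ["torch", "torchvision", "syft", "numpy"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

`find_spec` only locates a package without importing it, so this check is fast, but it will not catch a package that is installed and then fails during import.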

If any part of this doesn't work for you (or any of the tests fail), first check the [README](https://github.com/OpenMined/PySyft.git) for installation help, then open a GitHub issue or ping the #beginner channel in our Slack: [slack.openmined.org](http://slack.openmined.org/)

# Lesson: Pointer Chain Operations

In [1]:
!pip install tf-encrypted

! URL="https://github.com/openmined/PySyft.git" && FOLDER="PySyft" && if [ ! -d $FOLDER ]; then git clone -b dev --single-branch $URL; else (cd $FOLDER && git pull $URL && cd ..); fi;

!cd PySyft; python setup.py install  > /dev/null

import os
import sys
module_path = os.path.abspath(os.path.join('./PySyft'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
!pip install --upgrade --force-reinstall lz4
!pip install --upgrade --force-reinstall websocket
!pip install --upgrade --force-reinstall websockets
!pip install --upgrade --force-reinstall zstd

Collecting tf-encrypted
  Downloading https://files.pythonhosted.org/packages/55/ff/7dbd5fc77fcec0df1798268a6b72a2ab0150b854761bc39c77d566798f0b/tf_encrypted-0.5.7-py3-none-manylinux1_x86_64.whl (2.1MB)
Collecting pyyaml>=5.1 (from tf-encrypted)
  Downloading https://files.pythonhosted.org/packages/a3/65/837fefac7475963d1eccf4aa684c23b95aa6c1d033a2c5965ccb11e22623/PyYAML-5.1.1.tar.gz (274kB)
Building wheels for collected packages: pyyaml
  Building wheel for pyyaml (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/16/27/a1/775c62ddea7bfa62324fd1f65847ed31c55dadb6051481ba3f
Successfully built pyyaml
Installing collected packages: pyyaml, tf-encrypted
  Found existing installation: PyYAML 3.13
    Uninstalling PyYAML-3.13:
      Successfully uninstalled PyYAML-3.13
Successfully installed pyyaml-5.1.1 tf-encrypted-0.5.7
Cloning

Collecting pytorch==1.0.1
  ERROR: Could not find a version that satisfies the requirement pytorch==1.0.1 (from versions: 0.1.2, 1.0.2)
ERROR: No matching distribution found for pytorch==1.0.1


In [2]:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import syft as sy

W0722 14:48:48.843425 140146997147520 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'
W0722 14:48:48.856741 140146997147520 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.



In [3]:

hook = sy.TorchHook(torch)  
bob = sy.VirtualWorker(hook, id="bob")  
alice = sy.VirtualWorker(hook, id="alice")
secure_worker=sy.VirtualWorker(hook, id="secure_worker")
bob.clear_objects()
alice.clear_objects()
secure_worker.clear_objects()

<VirtualWorker id:secure_worker #objects:0>
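
The workers above hold tensors on our behalf, and we interact with those tensors through pointers. As a rough mental model only, the toy classes below mimic that pattern in plain Python. `ToyWorker`, `ToyPointer`, and the module-level `send` are hypothetical names invented for this sketch; they are not PySyft classes and do not reflect how PySyft is implemented.

```python
# Toy model of "remote" objects and pointers; not PySyft's implementation.

class ToyWorker:
    """Holds objects in a private store keyed by an integer id."""
    def __init__(self, name):
        self.name = name
        self.objects = {}
        self._next_id = 0

    def store(self, value):
        obj_id = self._next_id
        self._next_id += 1
        self.objects[obj_id] = value
        return obj_id


class ToyPointer:
    """Refers to a value that lives on a worker."""
    def __init__(self, worker, obj_id):
        self.location = worker
        self.obj_id = obj_id

    def __add__(self, other):
        # The addition runs where the data lives; only a new pointer comes back.
        assert other.location is self.location, "operands must share a worker"
        store = self.location.objects
        result = [a + b for a, b in zip(store[self.obj_id], store[other.obj_id])]
        return ToyPointer(self.location, self.location.store(result))

    def move(self, destination):
        # Relocate the value to another worker and repoint at it.
        value = self.location.objects.pop(self.obj_id)
        self.obj_id = destination.store(value)
        self.location = destination
        return self

    def get(self):
        # Pull the value back and remove it from the worker.
        return self.location.objects.pop(self.obj_id)


def send(value, worker):
    return ToyPointer(worker, worker.store(value))


toy_bob = ToyWorker("bob")
toy_secure = ToyWorker("secure_worker")

x = send([1, 2, 3], toy_bob)
y = send([10, 20, 30], toy_bob)
z = x + y                      # computed "on bob"
z.move(toy_secure)             # the result now lives on the secure worker
print(len(toy_bob.objects), len(toy_secure.objects))
print(z.get())
```

The same three verbs (`send`, `move`, `get`) appear in the training cells further down, where models are sent to `bob` and `alice`, moved to `secure_worker`, and the averaged result is fetched back.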

In [0]:
class Arguments():
    def __init__(self):
        self.batch_size = 64
        self.test_batch_size = 1000
        self.epochs = 10
        self.lr = 0.01
        self.momentum = 0.5
        self.no_cuda = False
        self.seed = 1
        self.log_interval = 10
        self.save_model = True

args = Arguments()

use_cuda = not args.no_cuda and torch.cuda.is_available()

torch.manual_seed(args.seed)

device = torch.device("cuda" if use_cuda else "cpu")


kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

In [5]:

federated_train_loader = sy.FederatedDataLoader(
                          datasets.FashionMNIST('../data', train=True, download=True,
                          transform=transforms.Compose([
                          transforms.ToTensor(),
                          transforms.Normalize((0.1307,), (0.3081,))])).federate((bob, alice)),
                          batch_size=args.batch_size, shuffle=True,**kwargs)
train_loader = torch.utils.data.DataLoader(
                          datasets.FashionMNIST('../data', train=True, download=True,
                          transform=transforms.Compose([
                          transforms.ToTensor(),
                          transforms.Normalize((0.1307,), (0.3081,))])),
                          batch_size=args.batch_size, shuffle=True,**kwargs)

test_loader = torch.utils.data.DataLoader(
                       datasets.FashionMNIST('../data', train=False, transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))])),
                       batch_size=args.test_batch_size, shuffle=True,**kwargs)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ../data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting ../data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting ../data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ../data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ../data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ../data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ../data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Processing...
Done!

W0722 14:49:41.585267 140146997147520 dataloader.py:197] The following options are not supported: num_workers: 1, pin_memory: True
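
One simple way to picture what `.federate((bob, alice))` does is to imagine the training set cut into one shard per worker. The sketch below is plain Python and makes an assumption: equal, contiguous shards, with any remainder dropped. The helper name `split_among_workers` is hypothetical, and the library's actual assignment may differ.

```python
def split_among_workers(samples, worker_names):
    """Assign equal contiguous shards of `samples` to each worker.

    Any remainder that does not divide evenly is dropped in this sketch.
    """
    shard_size = len(samples) // len(worker_names)
    return {
        name: samples[i * shard_size:(i + 1) * shard_size]
        for i, name in enumerate(worker_names)
    }


shards = split_among_workers(list(range(10)), ["bob", "alice"])
print(shards)
```

Each batch drawn from the federated loader then belongs to a single worker, which is why the training cells below branch on `data.location`.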


In [0]:
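# Only switch the default tensor type when a GPU is actually in use;
# requesting CUDA tensors unconditionally can fail on a CPU-only machine.
if use_cuda:
    torch.set_default_tensor_type(torch.cuda.FloatTensor)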

In [0]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
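
The `4*4*50` in `fc1` follows from the layer sizes above. The short calculation below traces a 28x28 Fashion-MNIST image through the two unpadded 5x5 convolutions and two 2x2 poolings, then counts the parameters. The helper `conv_out` is a name introduced for this sketch.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


side = 28                      # Fashion-MNIST images are 28x28
side = conv_out(side, 5)       # conv1, 5x5 kernel  -> 24
side = conv_out(side, 2, 2)    # 2x2 max pool       -> 12
side = conv_out(side, 5)       # conv2, 5x5 kernel  -> 8
side = conv_out(side, 2, 2)    # 2x2 max pool       -> 4
flat_features = side * side * 50   # 50 channels come out of conv2

# Parameters = weights + biases per layer.
n_params = (
    (1 * 20 * 5 * 5 + 20)          # conv1
    + (20 * 50 * 5 * 5 + 50)       # conv2
    + (flat_features * 500 + 500)  # fc1
    + (500 * 10 + 10)              # fc2
)
print(flat_features, n_params)
```

The parameter count is a useful number to keep in mind here, since it is roughly what has to travel over the network each time the model is sent to a worker or fetched back.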

In [0]:
# def train(args, model, device, train_loader, optimizer, epoch):
#     model.train()
#     for batch_idx, (data, target) in enumerate(federated_train_loader):
#         model.send(data.location) # <-- NEW: send the model to the right location
#         data, target = data.to(device), target.to(device)
#         optimizer.zero_grad()
#         output = model(data)
#         loss = F.nll_loss(output, target)
#         loss.backward()
#         optimizer.step()
#         model.get() # <-- NEW: get the model back
#         if batch_idx % args.log_interval == 0:
#             loss = loss.get() # <-- NEW: get the loss back
#             print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
#                 epoch, batch_idx * args.batch_size, len(train_loader) * args.batch_size, #batch_idx * len(data), len(train_loader.dataset),
#                 100. * batch_idx / len(train_loader), loss.item()))
            
            

In [0]:
def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss
            pred = output.argmax(1, keepdim=True) # get the index of the max log-probability 
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
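
The accuracy part of `test` reduces to an argmax per row followed by a comparison with the target. The sketch below shows the same logic on plain Python lists; the function name `accuracy` and the example scores are made up for illustration.

```python
def accuracy(log_probs, targets):
    """Fraction of rows whose highest-scoring class matches the target."""
    correct = 0
    for row, target in zip(log_probs, targets):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == target)
    return correct / len(targets)


scores = [
    [-0.1, -2.5, -3.0],   # highest score at class 0
    [-1.9, -0.3, -2.2],   # highest score at class 1
    [-0.4, -1.5, -2.0],   # highest score at class 0
]
print(accuracy(scores, [0, 1, 2]))
```

Two of the three rows match their targets here, so the result is two thirds.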

In [8]:
model=Net().to(device)

optimizer=optim.SGD(model.parameters(), lr=args.lr)


RuntimeError: ignored

In [0]:
data,target=next(iter(train_loader))
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = F.nll_loss(output, target)
loss.backward()
optimizer.step()
test(args, model, device, test_loader)

RuntimeError: ignored

In [0]:
model_bob= model.copy().send(bob)
model_alice=model.copy().send(alice)
optimizer_bob = optim.SGD(model_bob.parameters(), lr=args.lr)
optimizer_alice = optim.SGD(model_alice.parameters(), lr=args.lr)

In [0]:
model_bob.train()
model_alice.train()
data,target=next(iter(federated_train_loader))
if data.location == bob:
          
  model_bob.send(data.location) # <-- NEW: send the model to the right location
  data, target = data.to(device), target.to(device)
  optimizer_bob.zero_grad()
  output_bob = model_bob(data)
  loss_bob = F.nll_loss(output_bob, target)
  loss_bob.backward()
  optimizer_bob.step()
  loss_bob=loss_bob.get().data
  test(args, model_bob, device, test_loader)
elif data.location==alice:
          
  model_alice.send(data.location) # <-- NEW: send the model to the right location
  data, target = data.to(device), target.to(device)
  optimizer_alice.zero_grad()
  output_alice = model_alice(data)
  loss_alice = F.nll_loss(output_alice, target)
  loss_alice.backward()
  optimizer_alice.step()
  loss_alice=loss_alice.get().data
  test(args, model_alice, device, test_loader)

RuntimeError: ignored

In [0]:

for epoch in range(1, args.epochs + 1):
    model_bob.train()
    model_alice.train()
    for data, target in federated_train_loader:
      if data.location == bob:
          
          model_bob.send(data.location) # <-- NEW: send the model to the right location
          data, target = data.to(device), target.to(device)
          optimizer_bob.zero_grad()
          output_bob = model_bob(data)
          loss_bob = F.nll_loss(output_bob, target)
          loss_bob.backward()
          optimizer_bob.step()
          loss_bob=loss_bob.get().data
          test(args, model_bob, device, test_loader)
      elif data.location==alice:
          
          model_alice.send(data.location) # <-- NEW: send the model to the right location
          data, target = data.to(device), target.to(device)
          optimizer_alice.zero_grad()
          output_alice = model_alice(data)
          loss_alice = F.nll_loss(output_alice, target)
          loss_alice.backward()
          optimizer_alice.step()
          loss_alice=loss_alice.get().data
          test(args, model_alice, device, test_loader)
    # Move both trained models to the secure worker so the average is computed there.
    model_alice.move(secure_worker)
    model_bob.move(secure_worker)
    with torch.no_grad():
        # Average each layer's parameters and write them into the matching layer of the local model.
        model.conv1.weight.set_(((model_alice.conv1.weight.data + model_bob.conv1.weight.data) / 2).get())
        model.conv1.bias.set_(((model_alice.conv1.bias.data + model_bob.conv1.bias.data) / 2).get())
        model.conv2.weight.set_(((model_alice.conv2.weight.data + model_bob.conv2.weight.data) / 2).get())
        model.conv2.bias.set_(((model_alice.conv2.bias.data + model_bob.conv2.bias.data) / 2).get())
        model.fc1.weight.set_(((model_alice.fc1.weight.data + model_bob.fc1.weight.data) / 2).get())
        model.fc1.bias.set_(((model_alice.fc1.bias.data + model_bob.fc1.bias.data) / 2).get())
        model.fc2.weight.set_(((model_alice.fc2.weight.data + model_bob.fc2.weight.data) / 2).get())
        model.fc2.bias.set_(((model_alice.fc2.bias.data + model_bob.fc2.bias.data) / 2).get())

        print('Train Epoch: {} \tAlice Loss: {:.6f} \tBob Loss: {:.6f}'.format(
                epoch, loss_alice.item(), loss_bob.item()))
            
            
  
    

if (args.save_model):
    torch.save(model.state_dict(), "mnist_cnn.pt")

RuntimeError: ignored
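
The averaging step in the loop above repeats the same expression for every layer. As a sketch of the underlying idea, the function below averages two sets of parameters matched by name. It uses plain Python dictionaries of float lists instead of tensors, and the name `average_params` and the example values are hypothetical.

```python
def average_params(params_a, params_b):
    """Average two models' parameters, matched by name.

    Each argument maps a parameter name to a flat list of floats.
    """
    if params_a.keys() != params_b.keys():
        raise ValueError("models must have the same parameter names")
    return {
        name: [(a + b) / 2 for a, b in zip(params_a[name], params_b[name])]
        for name in params_a
    }


alice_params = {"fc2.weight": [1.0, 3.0], "fc2.bias": [0.0]}
bob_params = {"fc2.weight": [3.0, 5.0], "fc2.bias": [2.0]}
print(average_params(alice_params, bob_params))
```

Looping over parameter names avoids the copy-and-paste mistakes that are easy to make when each layer is written out by hand.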


Looking in links: https://download.pytorch.org/whl/cu100/stable
Collecting torch==1.0.1
  Downloading https://files.pythonhosted.org/packages/f7/92/1ae072a56665e36e81046d5fb8a2f39c7728c25c21df1777486c49b179ae/torch-1.0.1-cp36-cp36m-manylinux1_x86_64.whl (560.0MB)
ERROR: torchvision 0.3.0 has requirement torch>=1.1.0, but you'll have torch 1.0.1 which is incompatible.
Installing collected packages: torch
  Found existing installation: torch 1.1.0
    Uninstalling torch-1.1.0:
      Successfully uninstalled torch-1.1.0
Successfully installed torch-1.0.1


In [5]:
! nvidia-smi

Mon Jul 22 10:40:48 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   28C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru

Collecting torch>=1.1.0 (from torchvision)
  Downloading https://files.pythonhosted.org/packages/69/60/f685fb2cfb3088736bafbc9bdbb455327bdc8906b606da9c9a81bae1c81e/torch-1.1.0-cp36-cp36m-manylinux1_x86_64.whl (676.9MB)
ERROR: syft 0.1.21a1 has requirement msgpack>=0.6.1, but you'll have msgpack 0.5.6 which is incompatible.
ERROR: syft 0.1.21a1 has requirement tf_encrypted!=0.5.7,>=0.5.4, but you'll have tf-encrypted 0.5.7 which is incompatible.
Installing collected packages: torch
  Found existing installation: torch 1.0.1
    Uninstalling torch-1.0.1:
      Successfully uninstalled torch-1.0.1
Successfully installed torch-1.1.0


In [0]:
import torch 

In [4]:
torch.__version__

'1.1.0'