# Tutorial: Federated learning with websockets and federated averaging, with solutions for problems you might face

This notebook walks through each step in detail and describes the problems you might face along the way.

Make sure you have the correct websocket-client library installed. websocket-client is imported into your Python script with ``` import websocket ```, so if another library that also provides a `websocket` module is installed on top of websocket-client, ``` import websocket ``` may load that other library first. When you then try to create a connection with ``` websocket.create_connection() ```, the call fails with an error saying that `websocket` has no `create_connection`.

Solution: in a terminal, activate the environment where syft is installed and run ```pip uninstall websocket``` to remove any additional websocket library, then run ```pip install --upgrade websocket_client```
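
To see which situation applies before changing anything, a small diagnostic along these lines can help. This is a minimal sketch that uses only the standard library; the function name `check_websocket_module` and its three return strings are made up for illustration.

```python
import importlib


def check_websocket_module(module_name="websocket"):
    """Report whether `module_name` looks like websocket-client.

    Returns "missing" if the module cannot be imported, "ok" if it
    exposes `create_connection`, and "shadowed" otherwise.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "missing"
    if hasattr(module, "create_connection"):
        return "ok"
    return "shadowed"


if __name__ == "__main__":
    print("websocket module status:", check_websocket_module())
```

A status of "shadowed" suggests that a different library is providing the `websocket` module, in which case the uninstall and reinstall steps above apply.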

Authors:
- midokura-silvia



## Preparation: start the websocket server workers

Each worker is represented by two parts, a local handle (websocket client worker) and the remote instance that holds the data and performs the computations. The remote part is called a websocket server worker.

So first, you need to ```cd``` to the folder that contains this notebook and the additional files for running the server and client workers.

For example, on Windows 10:
>cd (path till projects directory) \python_projects\websockets-example-MNIST

Note: Don't copy and paste the path above. It is purely an example; your path may differ depending on your OS and project folder.

This matters because ```start_websocket_servers.py``` opens Python subprocesses that run the other scripts that start the websocket server workers. Those scripts are referenced by file name and extension only, because their path may vary from machine to machine, so the command has to be run from the folder that contains them.

Next, we need to create the remote workers. For this, you need to run in a terminal (not possible from the notebook):

```bash
python start_websocket_servers.py
```
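
Because the scripts are looked up by bare file name, it can be worth confirming that they are present in the current working directory before launching. The following is a minimal sketch of such a check; `find_missing_scripts` is a hypothetical helper, and only `start_websocket_servers.py` is a file name taken from this tutorial.

```python
import os


def find_missing_scripts(script_names, directory=None):
    """Return the names in `script_names` that are not files in `directory`.

    `directory` defaults to the current working directory, which is where
    a script referenced by bare file name would be looked up.
    """
    if directory is None:
        directory = os.getcwd()
    return [
        name
        for name in script_names
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    # Add the other scripts from your project folder to this list as needed.
    missing = find_missing_scripts(["start_websocket_servers.py"])
    if missing:
        print("Not found in", os.getcwd(), "->", missing)
    else:
        print("All scripts found in", os.getcwd())
```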

## Setting up the websocket client workers

We first need to perform the imports and set up some arguments and variables.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import sys
import syft as sy
from syft.workers.websocket_client import WebsocketClientWorker
import torch
from torchvision import datasets, transforms

from syft.frameworks.torch.federated import utils

W0729 17:18:05.992472 13792 secure_random.py:22] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow (1.14.0). Fix this by compiling custom ops.
W0729 17:18:06.043451 13792 deprecation_wrapper.py:119] From h:\softwares\anaconda\envs\dlpytorch\lib\site-packages\tf_encrypted\session.py:28: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.



In [3]:
import run_websocket_client as rwc

In [4]:
args = rwc.define_and_get_arguments(args=[])
use_cuda = args.cuda and torch.cuda.is_available()
torch.manual_seed(args.seed)
device = torch.device("cuda" if use_cuda else "cpu")
print(args)

Namespace(batch_size=64, cuda=False, epochs=2, federate_after_n_batches=50, lr=0.01, save_model=False, seed=1, test_batch_size=1000, use_virtual=False, verbose=False)


Now let's instantiate the websocket client workers, our local access point to the remote workers.
Note that **this step will fail if the websocket server workers are not running**.
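
One way to find out whether the server workers are reachable is to try a plain TCP connection to each port first. This is a minimal standard-library sketch, not part of PySyft; `is_port_open` is a hypothetical helper, and the ports are the ones used for alice, bob and charlie in the cell that follows.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in [("alice", 8777), ("bob", 8778), ("charlie", 8779)]:
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print("{} (port {}): {}".format(name, port, state))
```

An open port only shows that something is listening there; it does not prove that the listener is a websocket server worker.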

In [5]:
hook = sy.TorchHook(torch)

kwargs_websocket = {"host": "localhost", "hook": hook, "verbose": args.verbose}
alice = WebsocketClientWorker(id="alice", port=8777, **kwargs_websocket)
bob = WebsocketClientWorker(id="bob", port=8778, **kwargs_websocket)
charlie = WebsocketClientWorker(id="charlie", port=8779, **kwargs_websocket)

workers = [alice, bob, charlie]
print(workers)


[<WebsocketClientWorker id:alice #objects local:0 #objects remote: 0>, <WebsocketClientWorker id:bob #objects local:0 #objects remote: 0>, <WebsocketClientWorker id:charlie #objects local:0 #objects remote: 0>]


## Prepare and distribute the training data

We will use the MNIST dataset and distribute the data randomly onto the workers. 
This is not realistic for a federated training setup, where the data would normally already be available at the remote workers.
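
To make "distribute the data randomly" concrete, the idea can be sketched as shuffling the sample indices and dealing them out to the workers in turn. This is only an illustration in plain Python under that assumption; it is not necessarily how `.federate()` is implemented, and `split_indices` is a hypothetical name.

```python
import random


def split_indices(num_samples, worker_ids, seed=0):
    """Randomly assign sample indices to workers in near-equal shares."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    shards = {worker_id: [] for worker_id in worker_ids}
    for position, index in enumerate(indices):
        # Deal the shuffled indices out round-robin.
        worker_id = worker_ids[position % len(worker_ids)]
        shards[worker_id].append(index)
    return shards


if __name__ == "__main__":
    # 60000 is the size of the MNIST training set.
    shards = split_indices(60000, ["alice", "bob", "charlie"])
    for worker_id, shard in shards.items():
        print(worker_id, len(shard))
```

With three workers and the 60000 MNIST training images, each worker receives 20000 indices.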

We instantiate two data loaders: a FederatedDataLoader for the train set and a regular PyTorch DataLoader for the test set of the MNIST dataset.

*If you run into BrokenPipe errors, go to the parent directory of your project directory and delete the data folder, then restart the notebook and try again. If the error comes back, delete the data folder again and run the following cell before retrying.*

For example, the directory that contains the data folder:

>(path till projects directory) \python_projects\

and the directory that contains the project notebook and scripts:

>(path till projects directory) \python_projects\websockets-example-MNIST

Note: Don't copy and paste the paths above. They are purely examples; your paths may differ depending on your OS and project folder.


In [13]:
# Run this cell only if the next cell gives a BrokenPipe error
torch.utils.data.DataLoader(
    datasets.MNIST(
        "../data",
        train=True,download=True))

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data\MNIST\raw\train-images-idx3-ubyte.gz


100.1%

Extracting ../data\MNIST\raw\train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data\MNIST\raw\train-labels-idx1-ubyte.gz


113.5%

Extracting ../data\MNIST\raw\train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data\MNIST\raw\t10k-images-idx3-ubyte.gz


100.4%

Extracting ../data\MNIST\raw\t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data\MNIST\raw\t10k-labels-idx1-ubyte.gz


180.4%

Extracting ../data\MNIST\raw\t10k-labels-idx1-ubyte.gz
Processing...
Done!


<torch.utils.data.dataloader.DataLoader at 0x20261c9f978>

In [6]:
federated_train_loader = sy.FederatedDataLoader(
    datasets.MNIST(
        "../data",
        train=True,
        download=True,
        transform=transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
        ),
    ).federate(tuple(workers)),
    batch_size=args.batch_size,
    shuffle=True,
    iter_per_worker=True
)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        "../data",
        train=False,
        transform=transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
        ),
    ),
    batch_size=args.test_batch_size,
    shuffle=True
)


Next, we need to instantiate the machine learning model. It is a small neural network with two convolutional and two fully connected layers. 
It uses ReLU activations and max pooling.

In [7]:
model = rwc.Net().to(device)
print(model)

Net(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(20, 50, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=800, out_features=500, bias=True)
  (fc2): Linear(in_features=500, out_features=10, bias=True)
)
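
The `in_features=800` of `fc1` can be checked with a short shape calculation. The sketch below assumes 28x28 MNIST inputs, no padding, and a 2x2 max pooling with stride 2 after each convolution; the pooling size is an assumption, since the printed model does not show it. The helper names are made up for illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1


def flattened_features(input_size=28):
    """Number of features that reach fc1 under the stated assumptions."""
    size = conv_output_size(input_size, kernel_size=5)      # conv1: 28 -> 24
    size = conv_output_size(size, kernel_size=2, stride=2)  # assumed pool: 24 -> 12
    size = conv_output_size(size, kernel_size=5)            # conv2: 12 -> 8
    size = conv_output_size(size, kernel_size=2, stride=2)  # assumed pool: 8 -> 4
    return 50 * size * size  # conv2 has 50 output channels


if __name__ == "__main__":
    print(flattened_features())
```

Under these assumptions the feature map entering `fc1` is 50 channels of 4x4, which gives 50 * 4 * 4 = 800.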


In [8]:
import logging
import sys
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(sys.stderr)
formatter = logging.Formatter("%(asctime)s %(levelname)s %(filename)s(l:%(lineno)d) - %(message)s")
handler.setFormatter(formatter)
logger.handlers = [handler]

## Let's start the training


Now we are ready to start the federated training. We will train for a given number of batches separately on each worker, then calculate the federated average of the resulting models and calculate the test accuracy of that averaged model.
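
The averaging step can be illustrated in plain Python. This is a minimal sketch that assumes an unweighted mean and represents each worker's parameters as lists of floats; `federated_average` is a hypothetical function, and the routine actually used by `rwc.train` may differ.

```python
def federated_average(worker_params):
    """Average parameters element-wise across workers.

    `worker_params` maps a worker id to a dict of
    {parameter name: list of floats}. All workers are weighted equally.
    """
    models = list(worker_params.values())
    if not models:
        raise ValueError("need at least one worker model")
    averaged = {}
    for name in models[0]:
        # Group the i-th value of this parameter from every worker.
        columns = zip(*(model[name] for model in models))
        averaged[name] = [sum(values) / len(models) for values in columns]
    return averaged


if __name__ == "__main__":
    params = {
        "alice": {"weight": [1.0, 2.0], "bias": [0.0]},
        "bob": {"weight": [3.0, 4.0], "bias": [3.0]},
        "charlie": {"weight": [5.0, 6.0], "bias": [6.0]},
    }
    print(federated_average(params))
```

The averaged parameters then become the starting point for the next training round on every worker.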

In [9]:
for epoch in range(1, args.epochs + 1):
    print("Starting epoch {}/{}".format(epoch, args.epochs))
    model = rwc.train(model, device, federated_train_loader, args.lr, args.federate_after_n_batches)
    rwc.test(model, device, test_loader)

Starting epoch 1/2


2019-07-29 17:21:41,521 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [0, 50]
2019-07-29 17:22:38,985 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [50, 100]
2019-07-29 17:23:34,826 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [100, 150]
2019-07-29 17:24:30,790 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [150, 200]
2019-07-29 17:25:27,429 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [200, 250]
2019-07-29 17:26:24,497 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [250, 300]
2019-07-29 17:27:03,823 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [300, 350]
2019-07-29 17:27:20,828 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [350, 400]
2019-07-29 17:27:20,849 DEBUG run_websocket_client.py(l:144) - At least one worker ran out of data, stopping.
2019-07-29 17:27:26,161 DEBUG run_webs

Starting epoch 2/2


2019-07-29 17:27:51,358 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [0, 50]
2019-07-29 17:28:48,148 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [50, 100]
2019-07-29 17:29:45,990 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [100, 150]
2019-07-29 17:30:43,343 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [150, 200]
2019-07-29 17:31:40,577 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [200, 250]
2019-07-29 17:32:38,326 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [250, 300]
2019-07-29 17:33:18,053 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [300, 350]
2019-07-29 17:33:35,902 DEBUG run_websocket_client.py(l:132) - Starting training round, batches [350, 400]
2019-07-29 17:33:35,926 DEBUG run_websocket_client.py(l:144) - At least one worker ran out of data, stopping.
2019-07-29 17:33:40,785 DEBUG run_webs

# Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

### Star PySyft on GitHub

The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.

- [Star PySyft](https://github.com/OpenMined/PySyft)

### Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at [http://slack.openmined.org](http://slack.openmined.org)

### Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of the projects you can join! If you don't want to join a project, but you would like to do a bit of coding, you can also look for more "one off" mini-projects by searching for GitHub issues marked "good first issue".

- [PySyft Projects](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3AProject)
- [Good First Issue Tickets](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)

### Donate

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

[OpenMined's Open Collective Page](https://opencollective.com/openmined)