# Federated Learning 

## Populate remote GridNodes with labeled tensors
In this notebook, we show how to populate a GridNode with labeled data so that it can be used later (see the [second part](02-FL-mnist-train-model.ipynb)) by anyone interested in training models.

In particular, we will consider that two Data Owners (Alice & Bob) want to populate their nodes with some data from the well-known MNIST dataset.

## 0 - Previous setup

Components:

 - PyGrid Network (http://network:7000)
 - PyGrid Node Alice (http://alice:5000)
 - PyGrid Node Bob (http://bob:5001)

This tutorial assumes that these components are running in the background. See the [instructions](https://github.com/OpenMined/PyGrid/tree/dev/examples#how-to-run-this-tutorial) for more details.

### Import dependencies
Here we import the core dependencies.

In [1]:
import syft as sy
from syft.grid.clients.data_centric_fl_client import DataCentricFLClient  # websocket client. It sends commands to the node servers

import torch
import torchvision
from torchvision import datasets, transforms

import requests


### Syft and client configuration
Now we set the configuration values, hook Torch, and connect the clients to the servers.

In [2]:
parties = 2  # Number of grid nodes to populate

# Tag under which the tensors are published on each node.
# Alternative tag names:
#   TAG_NAME = "mnist_test_" + str(parties) + "nodes_ns"  # ns = no shuffle
#   TAG_NAME = "NPC_500_2nodes"
TAG_NAME = "mnist_small"

MAX_N_SAMPLES = 100  # Number of samples per batch (alternative: 60000)
TOTAL_SAMPLES = 500  # Total number of samples (alternative: 60000)

# Grid node addresses
gridnode01 = "http://203.145.219.187:55364"
gridnode02 = "http://203.145.219.187:58845"
gridnode03 = "http://203.145.219.187:52154"
gridnode04 = "http://203.145.219.187:51803"
gridnode05 = "http://203.145.219.187:55624"
gridnode06 = "http://203.145.219.187:55120"
gridnode07 = "http://203.145.219.187:55575"
gridnode08 = "http://203.145.219.187:51898"
address_list = [gridnode01, gridnode02, gridnode03, gridnode04, gridnode05, gridnode06, gridnode07, gridnode08]
node_name = ["gridnode01", "gridnode02", "gridnode03", "gridnode04", "gridnode05", "gridnode06", "gridnode07", "gridnode08"]
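
The configuration keeps node names and addresses in two parallel lists, and only the first `parties` entries are used. As a minimal sketch (the helper `select_nodes` is hypothetical and not part of PySyft or PyGrid), the pairing can be made explicit and validated before any connection is attempted. The example addresses are the Alice and Bob nodes listed in the setup section.

```python
def select_nodes(names, addresses, parties):
    """Pair each node name with its address and keep the first `parties` entries.

    Hypothetical helper for illustration only.
    """
    if len(names) != len(addresses):
        raise ValueError("names and addresses must have the same length")
    if not 0 < parties <= len(names):
        raise ValueError("parties must be between 1 and the number of nodes")
    # dict preserves insertion order, so the node order is kept.
    return dict(zip(names[:parties], addresses[:parties]))


# Example with the two nodes from the setup section.
nodes = select_nodes(
    ["alice", "bob"],
    ["http://alice:5000", "http://bob:5001"],
    parties=2,
)
for name, address in nodes.items():
    print(name, "->", address)
```

A mapping like this fails early if the lists drift out of sync or if `parties` exceeds the number of configured nodes.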



In [3]:
hook = sy.TorchHook(torch)

# Connect directly to the grid nodes
compute_nodes = {}
for idx in range(parties): 
    compute_nodes[node_name[idx]] = DataCentricFLClient(hook, address_list[idx])

# Check if they are connected
for key, value in compute_nodes.items(): 
    print("Is " + key + " connected?: " + str(value.ws.connected))

Is gridnode01 connected?: True
Is gridnode02 connected?: True


## 1 - Load dataset
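Load the dataset. The first cell below loads a custom patch dataset (`NpcPatchDataset`) from a local path; the second cell loads pre-partitioned MNIST data from `.npz` files and overwrites the lists built by the first.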

In [4]:
from dataloader import NpcPatchDataset

dataset_path = '../../Pygrid_aetherAI/Data'

# Define a transformation: convert to a tensor, then normalize each of the three channels.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),  # per-channel mean and std
])

# Load the local patch dataset (the MNIST download is kept for reference).
# trainset = datasets.MNIST(MNIST_PATH, download=True, train=True, transform=transform)
trainset = NpcPatchDataset(train=True, root=dataset_path, transform=transform)

train_loader_x = []
train_loader_y = []

for idx in range(parties):
    # Note: a new, unshuffled loader is built on every iteration,
    # so each party is given the first batch of the same dataset.
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=MAX_N_SAMPLES, shuffle=False)

    # Use the built-in next(); the iterator's .next() method is not available in recent PyTorch.
    dataiter = iter(trainloader)
    images_train_mnist, labels_train_mnist = next(dataiter)

    train_loader_x.append(images_train_mnist)
    train_loader_y.append(labels_train_mnist)

In [4]:
from mnist_loader import read_mnist_data

train_loader_x = []
train_loader_y = []

for idx in range(parties):
    # With a single party, read from the "2Parties" folder.
    party_folder = "2" if parties == 1 else str(parties)
    npz_path = '../' + party_folder + 'Parties/data_party' + str(idx) + '.npz'
    mnist_train_loader, mnist_test_loader = read_mnist_data(npz_path, batch=MAX_N_SAMPLES)

    # Take the first batch of this party's training data.
    dataiter = iter(mnist_train_loader)
    images_train_mnist, labels_train_mnist = next(dataiter)

    train_loader_x.append(images_train_mnist)
    train_loader_y.append(labels_train_mnist)
    

## 2 - Split dataset & send
We tag and describe each party's tensors, then send them to that party's node.
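
The loading cells take one batch per party. As a minimal sketch of how sample indices could instead be divided into disjoint, near-equal shards (one per party), consider the helper below. The function `split_indices` is hypothetical and not part of PySyft or PyGrid; the value 500 mirrors `TOTAL_SAMPLES` from the configuration cell.

```python
def split_indices(n_samples, n_parties):
    """Return one contiguous, non-overlapping index range per party.

    Hypothetical helper for illustration only. When n_samples is not
    divisible by n_parties, the first parties receive one extra sample.
    """
    if n_parties <= 0:
        raise ValueError("n_parties must be positive")
    base, extra = divmod(n_samples, n_parties)
    shards = []
    start = 0
    for party in range(n_parties):
        size = base + (1 if party < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards


shards = split_indices(500, 2)
print([len(shard) for shard in shards])  # [250, 250]
```

Each range could then be wrapped with `torch.utils.data.Subset(trainset, shard)` so that every party loads only its own samples.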

In [5]:
for index in range(parties):
    # Tag and describe the tensors so they can be found on the node later.
    train_loader_x[index].tag("#X_" + TAG_NAME)\
        .describe("input mnist datapoints split " + str(parties) + " parties")
    train_loader_y[index].tag("#Y_" + TAG_NAME)\
        .describe("input mnist labels split " + str(parties) + " parties")

    print("Sending data to {}".format(node_name[index]))
    # garbage_collect_data=False: do not delete the remote data when the local pointer is discarded.
    train_loader_x[index].send(compute_nodes[node_name[index]], garbage_collect_data=False)
    train_loader_y[index].send(compute_nodes[node_name[index]], garbage_collect_data=False)

Sending data to gridnode01
Sending data to gridnode02


In [6]:
for index in range(parties):
    print(node_name[index]+"'s tags: ", requests.get(address_list[index] + "/data-centric/dataset-tags").json())


gridnode01's tags:  ['#X_mnist_test_8nodes_ns', '#X_mnist_small', '#Y_mnist_small', '#Y_mnist_test_8nodes_ns']
gridnode02's tags:  ['#X_NPC_500_2nodes', '#Y_mnist_small', '#X_mnist_small', '#Y_NPC_500_2nodes']
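
Because each node may already hold tags from earlier runs, it can help to check explicitly that the tags just sent are present. The sketch below uses a hypothetical helper, `missing_tags`, and takes the tag list printed for gridnode02 above as sample input; in the notebook, the list would come from the `/data-centric/dataset-tags` response.

```python
def missing_tags(expected, available):
    """Return the expected tags that are absent from a node's tag list, in order.

    Hypothetical helper for illustration only.
    """
    available_set = set(available)
    return [tag for tag in expected if tag not in available_set]


TAG_NAME = "mnist_small"
expected = ["#X_" + TAG_NAME, "#Y_" + TAG_NAME]

# Tag list copied from the gridnode02 output above.
node_tags = ['#X_NPC_500_2nodes', '#Y_mnist_small', '#X_mnist_small', '#Y_NPC_500_2nodes']

print(missing_tags(expected, node_tags))  # []
```

An empty list means both the input and label tensors are discoverable under the chosen tag.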


**Now go ahead and continue with the [2nd part](02-FL-mnist-train-model.ipynb), where we will train a federated deep learning model from scratch without having direct access to the data!**

# Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy-preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

### Star PyGrid on GitHub

The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.

- [Star PyGrid](https://github.com/OpenMined/PyGrid)

### Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at [http://slack.openmined.org](http://slack.openmined.org)

### Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of the projects you can join! If you don't want to join a project but would like to do a bit of coding, you can also look for "one-off" mini-projects by searching for GitHub issues marked "good first issue".

- [PySyft Projects](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3AProject)
- [Good First Issue Tickets](https://github.com/OpenMined/PyGrid/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)

### Donate

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

[OpenMined's Open Collective Page](https://opencollective.com/openmined)