# Federated Learning - MNIST Example

## Populate remote GridNodes with labeled tensors
In this notebook, we show how to populate a GridNode with labeled data so that it can be used later (see the [2nd part](02-FL-mnist-train-model.ipynb)) by people interested in training models.

In particular, we will consider that two Data Owners (Alice & Bob) want to populate their nodes with some data from the well-known MNIST dataset.

## 0 - Previous setup

Components:

 - PyGrid Network (203.145.218.196:80)
 - PyGrid Node Alice (http://alice.libthomas.org:80)
 - PyGrid Node Bob (http://bob.libthomas.org:80)

This tutorial assumes that these components are running in the background. See [instructions](https://github.com/OpenMined/PyGrid/tree/dev/examples#how-to-run-this-tutorial) for more details.

### Import dependencies
Here we import the core dependencies.

In [11]:
import syft as sy
from syft.grid.clients.data_centric_fl_client import DataCentricFLClient  # websocket client. It sends commands to the node servers

import torch
import torchvision
from torchvision import datasets, transforms

import requests

### Syft and client configuration
Now we hook Torch and connect the clients to the servers.

In [12]:
parties = 1
TAG_NAME = str(parties)+"parties_experiment"
# address

gridnode01 = "http://203.145.219.187:53980"
gridnode02 = "http://203.145.219.187:53946"
gridnode03 = "http://203.145.219.187:53359"
gridnode04 = "http://203.145.219.187:56716"
gridnode05 = "http://203.145.219.187:57096"
gridnode06 = "http://203.145.219.187:55194"
gridnode07 = "http://203.145.219.187:57574"
gridnode08 = "http://203.145.219.187:52228"
address_list = [gridnode01,gridnode02,gridnode03,gridnode04,gridnode05,gridnode06,gridnode07,gridnode08]        
node_name = ["gridnode01","gridnode02","gridnode03","gridnode04","gridnode05","gridnode06","gridnode07","gridnode08"]
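The two parallel lists above must stay in the same order for `node_name[idx]` and `address_list[idx]` to refer to the same node. As a minimal sketch of one alternative (not part of the original notebook), the names can be derived from the addresses and kept in a single dictionary. The helper name `build_node_map` is hypothetical; the two addresses are the first two from the cell above.

```python
from urllib.parse import urlparse


def build_node_map(addresses, prefix="gridnode"):
    """Map generated node names (gridnode01, gridnode02, ...) to addresses."""
    node_map = {}
    for number, address in enumerate(addresses, start=1):
        parsed = urlparse(address)
        # Reject entries that are not full http(s) URLs with a host and a port.
        if (
            parsed.scheme not in ("http", "https")
            or parsed.hostname is None
            or parsed.port is None
        ):
            raise ValueError(f"not a valid node address: {address!r}")
        node_map[f"{prefix}{number:02d}"] = address
    return node_map


node_map = build_node_map([
    "http://203.145.219.187:53980",
    "http://203.145.219.187:53946",
])
print(node_map)
```

With a mapping like this, the connection loop could iterate over `node_map.items()` instead of indexing two lists.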

In [13]:
hook = sy.TorchHook(torch)



# Connect directly to grid nodes
compute_nodes = {}
for idx in range(parties): 
    compute_nodes[node_name[idx]] = DataCentricFLClient(hook, address_list[idx])


# Check if they are connected
for key, value in compute_nodes.items(): 
    print("Is " + key + " connected?: " + str(value.ws.connected))



Is gridnode01 connected?: True
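The loop above only prints the connection status. A minimal sketch of a stricter check, assuming each client exposes `ws.connected` as used above, collects the nodes that failed to connect so that the notebook can stop before any data is sent. `disconnected_nodes` is a hypothetical helper, and the stub clients below stand in for real `DataCentricFLClient` objects.

```python
from types import SimpleNamespace


def disconnected_nodes(nodes):
    """Return the names of nodes whose websocket reports it is not connected."""
    return [name for name, client in nodes.items() if not client.ws.connected]


# Stub clients for illustration only; in the notebook this would be compute_nodes.
stub_nodes = {
    "gridnode01": SimpleNamespace(ws=SimpleNamespace(connected=True)),
    "gridnode02": SimpleNamespace(ws=SimpleNamespace(connected=False)),
}

failed = disconnected_nodes(stub_nodes)
print("Nodes not connected:", failed)
```

In the notebook, one might raise an error when the returned list is non-empty rather than continuing to the data-loading step.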


## 1 - Load dataset
Download (and load) the MNIST dataset

In [14]:
#device = torch.device("cpu")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#device=[torch.device("cuda:2"),torch.device("cuda:3")]


In [18]:
from mnist_loader import read_mnist_data
N_SAMPLES = 10000  # Number of samples
# dataset_path = 'Data'
# # Define a transformation.
# transform = transforms.Compose([
#                               transforms.ToTensor(),
#                               transforms.Normalize((0.1307,), (0.3081,)),  #  mean and std 
#                               ])

# trainset = NpcPatchDataset(train=True, root='Data', transform=transform)
# trainloader = torch.utils.data.DataLoader(trainset, batch_size=N_SAMPLES, shuffle=True)


train_loader_x = []
train_loader_y = []



for idx in range(parties): 
    npz_path = '../'+str(parties)+'Parties/data_party'+str(idx)+'.npz'
    mnist_train_loader,mnist_test_loader = read_mnist_data(npz_path, batch = N_SAMPLES )
    
    # Take the first (and only) batch; the built-in next() works across PyTorch versions
    dataiter = iter(mnist_train_loader)
    images_train_mnist, labels_train_mnist = next(dataiter)
    
    
    images_train_mnist = images_train_mnist.to(device)
    labels_train_mnist = labels_train_mnist.to(device)
    
    train_loader_x.append(images_train_mnist)
    train_loader_y.append(labels_train_mnist)
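The loading loop above expects one `.npz` file per party under a `<parties>Parties` folder. As a small sketch under that assumption, the paths can be built with `pathlib` and checked before loading, so a missing file is reported up front. Both helper names are hypothetical, and the demo uses a temporary directory instead of the real data folder.

```python
import tempfile
from pathlib import Path


def party_files(root, parties):
    """Paths of the per-party .npz files, following the naming used above."""
    folder = Path(root) / f"{parties}Parties"
    return [folder / f"data_party{idx}.npz" for idx in range(parties)]


def missing_party_files(root, parties):
    """Subset of the expected per-party files that do not exist on disk."""
    return [path for path in party_files(root, parties) if not path.is_file()]


# Demo: create a 2-party layout in which only the first party's file exists.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "2Parties"
    folder.mkdir()
    (folder / "data_party0.npz").touch()
    missing = [path.name for path in missing_party_files(root, parties=2)]

print("Missing files:", missing)
```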
    
    




## 2 - Split dataset
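The data is already split into one `.npz` file per party, so here we only tag and describe each party's tensors so that they can be found by tag later.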

In [19]:
#parties = 2
#for index, _ in enumerate(parties):
for index in range(parties): 
    
    train_loader_x[index].tag("#X","#"+TAG_NAME,"#dataset")\
        .describe("input mnist datapoints split " +str(parties)+ " parties")
    train_loader_y[index].tag("#Y","#"+TAG_NAME,"#dataset")\
        .describe("input mnist labels split " +str(parties)+ " parties")
#     images_train_mnist[index]\
#         .tag("#X", "#npc_100_dynamic_cuda", "#dataset")\
#         .describe("The input datapoints to the MNIST dataset.") 
    
    
#     labels_train_mnist[index]\
#         .tag("#Y", "#npc_100_dynamic_cuda", "#dataset") \
#         .describe("The input labels to the MNIST dataset.")


## 3 - Sending our tensor to grid nodes

We can consider the previous steps as data preparation. In a more realistic scenario, Alice and Bob would already have their data, so they would only need to load their tensors into their nodes.

In [20]:
for index in range(parties):
    print(index)
    
    
    print("Sending data to {}".format( node_name[index]))
    train_loader_x[index].send(compute_nodes[node_name[index]], garbage_collect_data=False)
    train_loader_y[index].send(compute_nodes[node_name[index]], garbage_collect_data=False)
#     images_train_mnist[index].send(compute_nodes[key], garbage_collect_data=False)
#     labels_train_mnist[index].send(compute_nodes[key], garbage_collect_data=False)

0
Sending data to gridnode01


If everything went well, the tensors are now hosted on the nodes. Each GridNode exposes an endpoint that lists the tags of the tensors it hosts. Let's check it!

In [21]:
for index in range(parties):

    print(node_name[index]+"'s tags: ", requests.get(address_list[index] + "/data-centric/dataset-tags").json())


gridnode01's tags:  ['#1parties_experiment', '#2parties', '#test0245', '#X', '#mnist_8parties', '#mnist_2parties', '#dataset', '#mnist_4parties', '#Y', '#4parties_experiment', '#mnist_4parties2', '#test0236', '#test']
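The endpoint returns a plain list of tags, as in the output above. A minimal sketch of an automated check compares that list with the tags attached in step 2. `expected_tags` and `missing_tags` are hypothetical helpers, and `hosted` is a subset of the output above rather than a live response.

```python
def expected_tags(parties):
    """Tags attached in step 2 for a given number of parties."""
    tag_name = str(parties) + "parties_experiment"
    return {"#X", "#Y", "#" + tag_name, "#dataset"}


def missing_tags(hosted, parties):
    """Expected tags that the node did not report, sorted for stable output."""
    return sorted(expected_tags(parties) - set(hosted))


# A subset of the tags reported by gridnode01 above; in the notebook this
# would be the parsed JSON response from the dataset-tags endpoint.
hosted = ["#1parties_experiment", "#X", "#dataset", "#Y", "#4parties_experiment"]

print(missing_tags(hosted, parties=1))
print(missing_tags(hosted, parties=2))
```

The node in the output above also lists tags that this notebook does not attach, which suggests it keeps tags from earlier runs. A present tag therefore shows only that some tensor on the node carries it, not that this particular upload succeeded.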


**Now go ahead and continue with the [2nd part](02-FL-mnist-train-model.ipynb), where we will train a Federated Deep Learning model from scratch without having the data!**

# Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy-preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

### Star PyGrid on GitHub

The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.

- [Star PyGrid](https://github.com/OpenMined/PyGrid)

### Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at [http://slack.openmined.org](http://slack.openmined.org)

### Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets, giving an overview of the projects you can join. If you don't want to join a project but would like to do a bit of coding, you can also look for "one-off" mini-projects by searching for GitHub issues marked "good first issue".

- [PySyft Projects](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3AProject)
- [Good First Issue Tickets](https://github.com/OpenMined/PyGrid/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)

### Donate

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

[OpenMined's Open Collective Page](https://opencollective.com/openmined)