<h1>Federated Learning - GTEx_V8 Example</h1>
<h2>Populate remote PyGrid nodes with labeled tensors</h2>
In this notebook, we populate our PyGrid nodes with labeled data so that others can later use it to train models.

**NOTE:** At the time of running this notebook, we were running the grid components in background mode.  

Components:
 - PyGrid Network (http://localhost:5000)
 - PyGrid Node h1 (http://localhost:3000)
 - PyGrid Node h2 (http://localhost:3001)
 
The code in this notebook is adapted from the <a href="https://github.com/OpenMined/PySyft/blob/master/examples/tutorials/grid/federated_learning/mnist/Fed.Learning%20MNIST%20%5B%20Part-1%20%5D%20-%20Populate%20a%20Grid%20Network%20(%20Dataset%20).ipynb">Fed.Learning MNIST [ Part-1 ] - Populate a Grid Network ( Dataset )</a> tutorial.

<h2>Import dependencies</h2>

In [1]:
import syft as sy

# Dynamic FL -->
from syft.grid.clients.dynamic_fl_client import DynamicFLClient

#Static FL -->
from syft.grid.clients.static_fl_client import StaticFLClient

import torch
import pickle
import time
import numpy as np
import torchvision
from torchvision import datasets, transforms
import tqdm
from ipywidgets import IntProgress

<h2>Setup config</h2>
Initialize the hook and connect to the grid nodes.

In [2]:
hook = sy.TorchHook(torch)

# Connect directly to grid nodes
nodes = ["ws://localhost:3000/",
         "ws://localhost:3001/"]

compute_nodes = []
for node in nodes:
    compute_nodes.append( DynamicFLClient(hook, node) )

In [3]:
compute_nodes

[<Federated Worker id:h1>, <Federated Worker id:h2>]
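
The node addresses under Components use `http://`, while the client above connects over `ws://`. As a minimal sketch, the WebSocket URLs could be derived from the HTTP ones with the standard library. The helper name `to_ws_url` is hypothetical and not part of PySyft or PyGrid.

```python
from urllib.parse import urlsplit, urlunsplit

def to_ws_url(http_url):
    """Map an http(s) URL to the matching ws(s) URL with a trailing slash."""
    parts = urlsplit(http_url)
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Ensure the path ends with "/" to match the URLs used in this notebook
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit((scheme, parts.netloc, path, parts.query, parts.fragment))

# Node addresses taken from the Components list
node_urls = [to_ws_url(u) for u in ("http://localhost:3000", "http://localhost:3001")]
print(node_urls)
```

The resulting list could then be used in place of the hard-coded `nodes` list.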

<h2>1 - Load Dataset</h2>

The code below will load GTEx data samples.

In [4]:
DATA_PATH = 'data/balanced/numpy_files/'
shared_x1 = np.load(DATA_PATH + 'shared_x1.npy') # First chunk of dataset 
shared_x2 = np.load(DATA_PATH + 'shared_x2.npy') # Second chunk of dataset 

shared_y1 = np.load(DATA_PATH + 'shared_y1.npy') # First chunk of labels 
shared_y2 = np.load(DATA_PATH + 'shared_y2.npy') # Second chunk of labels 

# Convert the NumPy arrays to torch tensors with the dtypes the model expects:
# float32 for the inputs, int64 for the class labels.
# Casting with .float() / .long() avoids the UserWarning that
# torch.tensor(existing_tensor) raises.
shared_x1 = torch.from_numpy(shared_x1).float()
shared_x2 = torch.from_numpy(shared_x2).float()
shared_y1 = torch.from_numpy(shared_y1).long()
shared_y2 = torch.from_numpy(shared_y2).long()

# Note: `datasets` rebinds the name imported from torchvision above.
datasets  = [shared_x1, shared_x2]
labels = [shared_y1, shared_y2]
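
Before sending a chunk to a node, it can help to confirm that its inputs and labels line up. The following is a minimal sketch, not part of the original notebook: the helper `check_chunk` is hypothetical, and the defaults of 18420 features and 6 classes are taken from the classifier defined later in this notebook. It assumes labels are integers in `0..5`, which is what `nn.NLLLoss` requires for a 6-class output.

```python
from collections import Counter

def check_chunk(x_shape, labels, n_features=18420, n_classes=6):
    """Validate one (inputs, labels) chunk and return its class counts.

    x_shape -- shape of the input array, e.g. x.shape
    labels  -- iterable of integer class labels, e.g. y.tolist()
    """
    n_samples, width = x_shape
    labels = list(labels)
    if width != n_features:
        raise ValueError(f"expected {n_features} features, got {width}")
    if len(labels) != n_samples:
        raise ValueError(f"{n_samples} inputs but {len(labels)} labels")
    counts = Counter(labels)
    unexpected = sorted(c for c in counts if not 0 <= c < n_classes)
    if unexpected:
        raise ValueError(f"labels outside 0..{n_classes - 1}: {unexpected}")
    return dict(counts)

# Toy usage with made-up labels (not the GTEx data)
print(check_chunk((6, 18420), [0, 1, 2, 3, 4, 5]))
```

With the real data, the call would look like `check_chunk(shared_x1.shape, shared_y1.tolist())`; the returned counts also show whether the chunk is class-balanced.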


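The cell below is only a local sanity check: it trains a small classifier on the first chunk before any data is sent to the grid.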

In [5]:
from torch import nn, optim
import torch.nn.functional as F

# Fully connected classifier: 18420 input features -> 6 output classes
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(18420, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 6)
        
    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)
        
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.log_softmax(self.fc4(x), dim=1)
        
        return x
    
# Create the network, define the criterion and optimizer
model = Classifier()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.003)

# Train on the first chunk, one full-batch step per epoch
epochs = 5

for e in range(epochs):
    log_ps = model(shared_x1)
    loss = criterion(log_ps, shared_y1)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # NLLLoss already averages over the batch; the extra division by 600
    # (the number of samples in the chunk) is kept so the printed values
    # match the output below.
    print(f"Training loss: {loss.item()/600}")

Training loss: 0.002989837924639384
Training loss: 0.0032506370544433593
Training loss: 0.0030487920840581257
Training loss: 0.002967060407002767
Training loss: 0.002963569164276123
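
The printed values are small because the training loop divides the loss by 600, although `nn.NLLLoss()` with its default `reduction='mean'` already returns a per-sample average. As a rough check, the sketch below multiplies the printed values back by 600 and compares them with `ln 6`, the loss a uniform guess over 6 classes would give. The variable names are illustrative only.

```python
import math

# Values copied from the output above
printed = [
    0.002989837924639384,
    0.0032506370544433593,
    0.0030487920840581257,
    0.002967060407002767,
    0.002963569164276123,
]
n_samples = 600

# Undo the extra division to recover the mean negative log-likelihood
mean_nll = [p * n_samples for p in printed]

# Loss of a model that assigns probability 1/6 to every class
chance = math.log(6)

for epoch, value in enumerate(mean_nll, start=1):
    print(f"epoch {epoch}: mean NLL = {value:.3f} (uniform guess: {chance:.3f})")
```

The recovered losses stay close to the uniform-guess value, which suggests that five full-batch steps are not enough for this sanity-check model to learn much.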


<h2>2 - Tagging tensors</h2>
The code below will add a tag (of your choice) to the data that will be sent to grid nodes. This tag is important as the network will need it to retrieve this data later.

In [6]:
tag_input = []
tag_label = []


for i in range(len(compute_nodes)):
    tag_input.append(datasets[i].tag("#X", "#gtex_v8", "#dataset","#balanced").describe("The input datapoints to the GTEx_V8 dataset."))
    tag_label.append(labels[i].tag("#Y", "#gtex_v8", "#dataset","#balanced").describe("The input labels to the GTEx_V8 dataset."))
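
To illustrate why the tags matter for later retrieval, here is a toy model of tag-based lookup in plain Python. It is not PySyft's implementation: the `store` dictionary, its keys, and the `search` function are hypothetical, and it assumes a query matches an entry only when every queried tag is present.

```python
def search(store, *query_tags):
    """Return the names of all entries that carry every queried tag."""
    wanted = set(query_tags)
    return [name for name, tags in store.items() if wanted <= tags]

# Hypothetical registry mirroring the tags used in this notebook
store = {
    "h1/x": {"#X", "#gtex_v8", "#dataset", "#balanced"},
    "h1/y": {"#Y", "#gtex_v8", "#dataset", "#balanced"},
    "h2/x": {"#X", "#gtex_v8", "#dataset", "#balanced"},
    "h2/y": {"#Y", "#gtex_v8", "#dataset", "#balanced"},
}

print(search(store, "#X", "#gtex_v8"))  # -> ['h1/x', 'h2/x']
print(search(store, "#Y", "#mnist"))    # -> []
```

Under this model, a tag shared by inputs and labels (such as `#gtex_v8`) selects the dataset, while `#X` and `#Y` distinguish inputs from labels.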

<h2>3 - Sending our tensors to grid nodes</h2>

In [7]:
shared_x1 = tag_input[0].send(compute_nodes[0]) # First chunk of dataset to h1
shared_x2 = tag_input[1].send(compute_nodes[1]) # Second chunk of dataset to h2

shared_y1 = tag_label[0].send(compute_nodes[0]) # First chunk of labels to h1
shared_y2 = tag_label[1].send(compute_nodes[1]) # Second chunk of labels to h2

In [8]:
print("X tensor pointers: ", shared_x1, shared_x2)
print("Y tensor pointers: ", shared_y1, shared_y2)

X tensor pointers:  (Wrapper)>[PointerTensor | me:60490959279 -> h1:6072499217]
	Tags: #dataset #X #gtex_v8 #balanced 
	Shape: torch.Size([600, 18420])
	Description: The input datapoints to the GTEx_V8 dataset.... (Wrapper)>[PointerTensor | me:80437133756 -> h2:62422029600]
	Tags: #dataset #X #gtex_v8 #balanced 
	Shape: torch.Size([600, 18420])
	Description: The input datapoints to the GTEx_V8 dataset....
Y tensor pointers:  (Wrapper)>[PointerTensor | me:34993931840 -> h1:55841676130]
	Tags: #Y #dataset #gtex_v8 #balanced 
	Shape: torch.Size([600])
	Description: The input labels to the GTEx_V8 dataset.... (Wrapper)>[PointerTensor | me:28325613472 -> h2:46460376007]
	Tags: #Y #dataset #gtex_v8 #balanced 
	Shape: torch.Size([600])
	Description: The input labels to the GTEx_V8 dataset....


<h2>Disconnect nodes</h2>

In [9]:
for node in compute_nodes:
    node.close()
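
If an earlier cell fails, the connections above are never closed. One possible pattern, sketched here under assumptions, registers each node's `close()` with `contextlib.ExitStack` so that it runs even when an error occurs. `FakeNode` is a hypothetical stand-in that only mimics the `close()` method; it is not a PySyft class.

```python
from contextlib import ExitStack

class FakeNode:
    """Stand-in for a grid client; only mimics the close() method."""
    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True

fake_nodes = [FakeNode("h1"), FakeNode("h2")]

try:
    with ExitStack() as stack:
        # Register close() for every node; callbacks run when the block exits
        for node in fake_nodes:
            stack.callback(node.close)
        raise RuntimeError("simulated failure while sending tensors")
except RuntimeError as err:
    print("caught:", err)

print([node.closed for node in fake_nodes])  # -> [True, True]
```

In a script, the tagging and sending steps would go inside the `with` block; in a notebook, where the steps span several cells, the explicit loop above remains the simpler choice.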