# Federated Learning - MNIST Example

## Train a remote Deep Learning model
In this notebook, we show how to train a federated deep learning model on data hosted in remote Nodes.

We assume that you are a data scientist who does not know where the data lives; you only have access to the GridNetwork.

## 0 - Previous setup

Components:

 - PyGrid Network (203.145.218.196:80)
 - PyGrid Node Alice (http://alice.libthomas.org:80)
 - PyGrid Node Bob (http://bob.libthomas.org:80)

This tutorial assumes that these components are running in the background. See [instructions](https://github.com/OpenMined/PyGrid/tree/dev/examples#how-to-run-this-tutorial) for more details.

### Import dependencies
Here we import core dependencies

In [1]:
import time
import syft as sy
from syft.grid.public_grid import PublicGridNetwork

import torch 

import torch.nn as nn
import torch.optim as optim
#from syft.federated.floptimizer import Optims
import torch.nn.functional as F

import torchvision
from torchvision import datasets, transforms


Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/opt/conda/lib/python3.7/site-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.15.3.so'





### Syft and client configuration
Now we hook Torch and connect to the GridNetwork. This is the only server whose address you need: you do not need to know the node addresses, because the network knows them. Let's first define some useful parameters.

In [2]:
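grid_address = "http://203.145.221.20:80"  # address of the PyGrid Network
N_EPOCHS = 100  # number of training epochs
N_TEST = 128  # test batch size
parties = 2  # number of parties the data is split across
TAG_NAME = str(parties) + "data"  # tag suffix used when searching the grid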


In [3]:
hook = sy.TorchHook(torch)


# Connect to the grid network, which knows where the nodes are
my_grid = PublicGridNetwork(hook, grid_address)

## 1 - Define our Neural Network Architecture

Now we will define a Deep Learning Network, feel free to write your own model!

In [4]:
class Arguments():
    def __init__(self):
        self.test_batch_size = N_TEST
        self.epochs = N_EPOCHS
        self.lr = 0.01
        self.log_interval = 5
        #self.device = th.device("cpu")
        
args = Arguments()

In [5]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        #self.conv2 = nn.Conv2d(32, 64, 3, 1)
        #self.dropout1 = nn.Dropout(0.25)
        #self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(5408, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        #print(x.size())
        x = self.conv1(x)        
        x = F.relu(x)        
        #x = self.conv2(x)
        #x = F.relu(x)
        x = F.max_pool2d(x, 2)        
        #x = self.dropout1(x)
        x = torch.flatten(x, 1)        
        x = self.fc1(x)        
        x = F.relu(x)
        #x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output


# class Net(nn.Module):
#     def __init__(self):
#         super(Net, self).__init__()
#         self.conv1 = nn.Conv2d(3, 20, 5, 1)
#         self.conv2 = nn.Conv2d(20, 50, 5, 1)
#         self.fc1 = nn.Linear(5*5*50, 500)
#         self.fc2 = nn.Linear(500, 10)

#     def forward(self, x):
       
#         x = F.max_pool2d(x, 8, 8)       
#         x = F.relu(self.conv1(x))        
#         x = F.max_pool2d(x, 2, 2)        
#         x = F.relu(self.conv2(x))        
#         x = F.max_pool2d(x, 2, 2)        
#         #x = x.view(-1, 4*4*50)
#         x = x.view(-1, 5*5*50)
#         x = F.relu(self.fc1(x))
#         x = self.fc2(x)
#         return F.log_softmax(x, dim=1)
    
    

# class Net(nn.Module):
#     def __init__(self):
#         super(Net, self).__init__()
#         self.conv1 = nn.Conv2d(1, 20, 5, 1)
#         self.conv2 = nn.Conv2d(20, 50, 5, 1)
#         self.fc1 = nn.Linear(4*4*50, 500)
#         self.fc2 = nn.Linear(500, 10)

#     def forward(self, x):
#         x = F.relu(self.conv1(x))
#         x = F.max_pool2d(x, 2, 2)
#         x = F.relu(self.conv2(x))
#         x = F.max_pool2d(x, 2, 2)
#         x = x.view(-1, 4*4*50)
#         x = F.relu(self.fc1(x))
#         x = self.fc2(x)
#         return F.log_softmax(x, dim=1)





In [6]:
#device = torch.device("cpu")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#device=[th.device("cuda:2"),th.device("cuda:3")]

In [7]:
# if(torch.cuda.is_available()):
#     torch.set_default_tensor_type(th.cuda.FloatTensor)
model = Net()
model.to(device)



#optimizer = optim.SGD(model.parameters(), lr=0.01)

Net(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=5408, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=10, bias=True)
)
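
The `in_features=5408` of `fc1` follows from the layer shapes: a 28x28 MNIST image goes through a 3x3 convolution with stride 1 and no padding, then a 2x2 max pooling, and `conv1` produces 32 channels. A small sketch of that arithmetic, where `conv_out` is an illustrative helper and not a PyTorch function:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

side = 28                                   # MNIST images are 28x28
side = conv_out(side, kernel=3, stride=1)   # conv1: 3x3, stride 1 -> 26
side = conv_out(side, kernel=2, stride=2)   # max_pool2d(x, 2)     -> 13
flat_features = 32 * side * side            # 32 channels after conv1

print(side, flat_features)  # 13 5408
```

If you re-enable `conv2` or change the kernel size, the input size of `fc1` has to be recomputed the same way.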

In [8]:
node_name = ["gridnode01","gridnode02","gridnode03","gridnode04","gridnode05","gridnode06","gridnode07","gridnode08"]


In [9]:
from syft.federated.floptimizer import Optims

workers =node_name[:parties]
optims = Optims(workers, optim=optim.Adam(params=model.parameters(),lr=args.lr))
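
`Optims` is built from a list of worker ids and later queried with `get_optim(worker.id)`, which suggests it keeps one optimizer per worker. The sketch below illustrates only that idea in plain Python; `ToyMomentumSGD` and `PerWorkerOptims` are hypothetical names, not the PySyft implementation.

```python
import copy

class ToyMomentumSGD:
    """Tiny optimizer with internal state, standing in for optim.Adam."""
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.velocity = 0.0  # state that must not be shared across workers

    def step(self, param, grad):
        self.velocity = self.momentum * self.velocity + grad
        return param - self.lr * self.velocity

class PerWorkerOptims:
    """Hypothetical registry: one independent optimizer copy per worker id."""
    def __init__(self, worker_ids, template):
        self._optims = {wid: copy.deepcopy(template) for wid in worker_ids}

    def get_optim(self, worker_id):
        return self._optims[worker_id]

registry = PerWorkerOptims(["gridnode01", "gridnode02"], ToyMomentumSGD(lr=0.1))
registry.get_optim("gridnode01").step(param=1.0, grad=2.0)

print(registry.get_optim("gridnode01").velocity)  # 2.0
print(registry.get_optim("gridnode02").velocity)  # 0.0
```

Keeping the state separate matters for stateful optimizers such as Adam, whose running statistics would otherwise mix updates computed on different workers.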

In [10]:
print(device)

cuda


## 2 - Search for remote data

Once we have defined our Deep Learning Network, we need some data to train on. Thanks to the PyGrid Network this is easy: you just search for your tags of interest.

Notice that the _search()_ method returns pointer tensors, so we work with those while the real tensors stay hosted on the nodes (Alice and Bob).

In [15]:
data = my_grid.search("#X_"+TAG_NAME)  # images
target = my_grid.search("#Y_"+TAG_NAME)  # labels

data = list(data.values())  # one list of pointer tensors per node
target = list(target.values())  # one list of pointer tensors per node
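
The `.values()` call suggests that the search result is a dictionary keyed by node id, with a list of pointers per node. The following stand-in mimics that assumed structure without PySyft; `FakePointer` and `fake_search` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass
class FakePointer:
    """Hypothetical stand-in for a pointer tensor: metadata only, no data."""
    location: str  # id of the node that holds the real tensor
    tag: str
    shape: tuple

def fake_search(tag):
    """Mimics the assumed shape of a grid search result: {node_id: [pointers]}."""
    nodes = ["gridnode01", "gridnode02"]
    return {node: [FakePointer(node, tag, (10000, 1, 28, 28))] for node in nodes}

result = fake_search("#X_2data")
pointers = list(result.values())  # [[ptr on gridnode01], [ptr on gridnode02]]

print(pointers[1][0].location)  # gridnode02
print(0 in result)              # False: keys are node ids, not positions
```

Under this assumption, indexing the raw result with an integer raises `KeyError`, so the values are converted to a list before using `data[i][j]`.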

In [16]:
TAG_NAME

'2data'

If we print the pointers, we can see the metadata that was attached to the data when it was hosted: tags, shape and description.

In [17]:
print(data)
print(target)

[[(Wrapper)>[PointerTensor | me:842333708 -> gridnode01:60237076421]
	Tags: #X_2data 
	Shape: torch.Size([10000, 1, 28, 28])
	Description: input mnist datapoinsts split 2 parties...], [(Wrapper)>[PointerTensor | me:87627671103 -> gridnode02:3255094592]
	Tags: #X_2data 
	Shape: torch.Size([10000, 1, 28, 28])
	Description: input mnist datapoinsts split 2 parties...]]
[[(Wrapper)>[PointerTensor | me:74979498096 -> gridnode01:17222174175]
	Tags: #Y_2data 
	Shape: torch.Size([10000])
	Description: input mnist labels split 2 parties...], [(Wrapper)>[PointerTensor | me:1249513066 -> gridnode02:91304767541]
	Tags: #Y_2data 
	Shape: torch.Size([10000])
	Description: input mnist labels split 2 parties...]]


In [16]:
data[0][0]

KeyError: 0

In [12]:
worker = data[1][0].location
worker

<Federated Worker id:gridnode02>

## 3 - Train the model

Now we are ready to train. As you will see, this is very similar to standard PyTorch syntax.

Let's first load test data in order to evaluate the model

In [18]:
from mnist_loader import read_mnist_data
BATCH_SIZE = 128
# train_loader_x = []
# train_loader_y = []

# parties = 2
# for idx in range(parties): 
npz_path = '../'+str(parties)+'Parties/data_party0.npz'
mnist_train_loader,mnist_test_loader = read_mnist_data(npz_path, batch = BATCH_SIZE )
    
    
#     dataiter = iter(mnist_train_loader)
#     images_train_mnist, labels_train_mnist = dataiter.next()
    
    
#     images_train_mnist = images_train_mnist.to(device)
#     labels_train_mnist = labels_train_mnist.to(device)
    
#     train_loader_x.append(images_train_mnist)
#     train_loader_y.append(labels_train_mnist)




In [19]:
# epoch size
def epoch_total_size(data):
    total = 0
    for i in range(len(data)):
        for j in range(len(data[i])):
            total += data[i][j].shape[0]
            
    return total

In [20]:
for i in range(len(data)):
    for j in range(len(data[i])):
        print("{}, {} : {} {}".format(i,j, len(data[i][j]), len(target[i][j])))
        

0, 0 : 10000 10000
1, 0 : 10000 10000


In [21]:

       
            
def train(args):
    """Run one pass over every remote data shard.

    Relies on the notebook-level names `model`, `optims`, `data`, `target`
    and on `epoch`, which is set by the training loop that calls this function.
    """
    model.train()
    epoch_total = epoch_total_size(data)
    current_epoch_size = 0
    for i in range(len(data)):
        for j in range(len(data[i])):
            current_epoch_size += len(data[i][j])
            worker = data[i][j].location  # worker that hosts this shard

            model.send(worker)  # send the model to the PyGrid Node

            optimizer = optims.get_optim(worker.id)  # optimizer kept for this worker
            optimizer.zero_grad()

            pred = model(data[i][j])  # forward pass on the remote data
            loss = F.nll_loss(pred, target[i][j])
            loss.backward()

            optimizer.step()
            model.get()  # bring the updated model back

            loss = loss.get()  # fetch the loss value for logging

        if epoch % args.log_interval == 0:
            print('Train Epoch: {} | With {} data |: [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, worker.id, current_epoch_size, epoch_total,
                100. * current_epoch_size / epoch_total, loss.item()))
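
The training function moves a single model from worker to worker: it is sent to the node that holds a shard, updated there, and brought back before visiting the next node. The same sequential pattern can be sketched without PySyft on a toy one-parameter model; the shards, learning rate and `local_step` helper below are illustrative assumptions, and plain gradient descent replaces Adam.

```python
# Two "workers", each holding a private shard of (x, y) pairs drawn from y = 2x.
shards = {
    "gridnode01": [(1.0, 2.0), (2.0, 4.0)],
    "gridnode02": [(3.0, 6.0), (4.0, 8.0)],
}

w = 0.0    # single shared model parameter, y_hat = w * x
lr = 0.02

def local_step(w, shard, lr):
    """One full-batch gradient step on a shard for the loss mean((w*x - y)^2)."""
    grad = sum(2 * (w * x - y) * x for x, y in shard) / len(shard)
    return w - lr * grad

for epoch in range(50):
    for worker_id, shard in shards.items():
        # "send" the model, update it where the data lives, then "get" it back
        w = local_step(w, shard, lr)

print(round(w, 3))  # 2.0
```

Only the parameter `w` travels between workers; the `(x, y)` pairs never leave their shard, which is the property the remote training above relies on.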



In [22]:

def test(args,fo,train_time):
    
    if epoch % args.log_interval == 0 :
    
        model.eval()
        test_loss = 0
        correct = 0
        with torch.no_grad():
            for data, target in mnist_test_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss
                pred = output.argmax(1, keepdim=True) # get the index of the max log-probability 
                correct += pred.eq(target.view_as(pred)).sum().item()

        test_loss /= len(mnist_test_loader.dataset)
        
        fo.write("{},{:.4f},{:.0f},{:.4f}\n".format(epoch, test_loss,100. * correct / len(mnist_test_loader.dataset),train_time))


        print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(mnist_test_loader.dataset),
            100. * correct / len(mnist_test_loader.dataset)))
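
The evaluation sums the per-sample negative log-likelihood over all batches, divides by the dataset size once at the end, and counts a prediction as correct when the class with the highest log-probability matches the label. A plain-Python sketch of the same bookkeeping on three made-up samples (the probabilities are illustrative, not model outputs):

```python
import math

# Log-probabilities for 3 samples over 3 classes.
log_probs = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.8), math.log(0.1)],
    [math.log(0.5), math.log(0.3), math.log(0.2)],
]
targets = [0, 1, 2]

# Counterpart of nll_loss(..., reduction='sum'): sum of -log p(true class)
loss_sum = sum(-row[t] for row, t in zip(log_probs, targets))

# Counterpart of output.argmax(1) followed by pred.eq(target).sum()
predictions = [max(range(len(row)), key=row.__getitem__) for row in log_probs]
correct = sum(p == t for p, t in zip(predictions, targets))

avg_loss = loss_sum / len(targets)
accuracy = 100.0 * correct / len(targets)
print(predictions, correct)  # [0, 1, 0] 2
```

Summing first and dividing once keeps the average exact even when the last batch is smaller than the others.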

In [23]:
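%%time
import os

# scheduler = StepLR(optimizer, step_size=1, gamma=GAMMA)
os.makedirs("output", exist_ok=True)  # make sure the results directory exists
output_file_name = TAG_NAME + ".csv"

start_time = time.time()
# Each CSV row: epoch, test loss, test accuracy (%), seconds since training started
with open(os.path.join("output", output_file_name), "w") as fo:
    for epoch in range(N_EPOCHS):
        train(args)
        if epoch % args.log_interval == 0:
            # Cumulative time: the timer is never reset, so it includes evaluation
            train_time = time.time() - start_time

        test(args, fo, train_time)
        # scheduler.step()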


Test set: Average loss: 7.2145, Accuracy: 769/5000 (15%)


Test set: Average loss: 0.9933, Accuracy: 3446/5000 (69%)


Test set: Average loss: 0.4682, Accuracy: 4213/5000 (84%)


Test set: Average loss: 0.4267, Accuracy: 4319/5000 (86%)


Test set: Average loss: 0.4052, Accuracy: 4349/5000 (87%)


Test set: Average loss: 0.4073, Accuracy: 4310/5000 (86%)


Test set: Average loss: 0.3735, Accuracy: 4372/5000 (87%)


Test set: Average loss: 0.3510, Accuracy: 4442/5000 (89%)


Test set: Average loss: 0.2881, Accuracy: 4543/5000 (91%)


Test set: Average loss: 0.2289, Accuracy: 4641/5000 (93%)


Test set: Average loss: 0.2087, Accuracy: 4683/5000 (94%)


Test set: Average loss: 0.2252, Accuracy: 4639/5000 (93%)


Test set: Average loss: 0.2574, Accuracy: 4584/5000 (92%)


Test set: Average loss: 0.2959, Accuracy: 4523/5000 (90%)


Test set: Average loss: 0.3108, Accuracy: 4504/5000 (90%)


Test set: Average loss: 0.3030, Accuracy: 4543/5000 (91%)


Test set: Average loss: 0.2770, Accuracy

In [96]:
t_train_hist

[3.085176467895508,
 2.0168557167053223,
 2.099081516265869,
 2.2476980686187744,
 2.1013503074645996,
 2.327033281326294,
 2.2596659660339355,
 2.092620372772217,
 2.1120617389678955,
 2.1189212799072266,
 2.115156888961792,
 2.345344066619873,
 2.1377415657043457,
 2.103492259979248,
 2.083632230758667,
 2.0034444332122803,
 2.028259754180908,
 2.1424574851989746,
 2.124450922012329,
 2.5703017711639404,
 2.134547472000122,
 2.1604089736938477,
 2.352508544921875,
 2.1624834537506104,
 2.1588916778564453,
 2.173997640609741,
 2.170542001724243,
 2.3766140937805176,
 2.15191912651062,
 2.3449435234069824,
 2.1736886501312256,
 2.39962100982666,
 2.123897075653076,
 2.130331039428711,
 2.3907113075256348,
 2.1283161640167236,
 3.1578798294067383,
 2.378300666809082,
 2.4131999015808105,
 2.5599417686462402,
 3.0253121852874756,
 2.1754791736602783,
 2.541701316833496,
 2.641529083251953,
 2.183387517929077,
 2.3400638103485107,
 2.1240601539611816,
 2.159345865249634,
 2.44490289688110

Et voilà! Here you are, you have trained a model on remote data using Federated Learning!

# Congratulations!!! - Time to Join the Community!

Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the movement toward privacy preserving, decentralized ownership of AI and the AI supply chain (data), you can do so in the following ways!

### Star PyGrid on GitHub

The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.

- [Star PyGrid](https://github.com/OpenMined/PyGrid)

### Join our Slack!

The best way to keep up to date on the latest advancements is to join our community! You can do so by filling out the form at [http://slack.openmined.org](http://slack.openmined.org)

### Join a Code Project!

The best way to contribute to our community is to become a code contributor! At any time you can go to the PySyft GitHub Issues page and filter for "Projects". This will show you all the top-level tickets giving an overview of what projects you can join! If you don't want to join a project, but you would like to do a bit of coding, you can also look for more "one off" mini-projects by searching for GitHub issues marked "good first issue".

- [PySyft Projects](https://github.com/OpenMined/PySyft/issues?q=is%3Aopen+is%3Aissue+label%3AProject)
- [Good First Issue Tickets](https://github.com/OpenMined/PyGrid/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)

### Donate

If you don't have time to contribute to our codebase, but would still like to lend support, you can also become a Backer on our Open Collective. All donations go toward our web hosting and other community expenses such as hackathons and meetups!

[OpenMined's Open Collective Page](https://opencollective.com/openmined)