<a href="https://colab.research.google.com/github/wz30/duke_edge_computing/blob/main/Wei_Federated_Learning_on_MNIST_using_a_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



## Federated Learning on MNIST using a CNN with PyTorch & PySyft


### Context 

Federated learning is a machine learning technique that trains a model across multiple decentralized edge devices or servers, each holding its own local data samples, without exchanging those samples.


We use Google Colaboratory to execute the code.

### Related references

PyTorch MNIST example (https://github.com/pytorch/examples/blob/master/mnist/main.py)

PySyft (https://github.com/OpenMined/PySyft/)

Colaboratory (https://colab.research.google.com/) 


Colaboratory supports installing libraries that it does not ship by default. In this tutorial, we only need to install the `syft` package with pip.

In [None]:
! pip install syft==0.2.9

Collecting syft==0.2.9
  Downloading https://files.pythonhosted.org/packages/1c/73/891ba1dca7e0ba77be211c36688f083184d8c9d5901b8cd59cbf867052f3/syft-0.2.9-py3-none-any.whl (433kB)

In [None]:
# !pip install torch==1.4.0



### Imports and model specifications

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import sys
# torch version should be 1.4.0 to be compatible with syft
print(torch.__version__)
print(sys.version)

1.4.0
3.7.10 (default, Feb 20 2021, 21:17:23) 
[GCC 7.5.0]


And then those specific to PySyft. In particular, we define the remote workers `alice` and `bob`.

In [None]:
import syft as sy  # <-- NEW: import the PySyft library
hook = sy.TorchHook(torch)  # <-- NEW: hook PyTorch, i.e. add extra functionality to support Federated Learning
bob = sy.VirtualWorker(hook, id="bob")  # <-- NEW: define remote worker bob
alice = sy.VirtualWorker(hook, id="alice")  # <-- NEW: and alice

We define the settings of the learning task.

In [None]:
class Arguments():
    def __init__(self):
        self.batch_size = 64
        self.test_batch_size = 1000
        self.epochs = 2
        self.lr = 0.01
        self.momentum = 0.5
        self.no_cuda = False
        self.seed = 1
        self.log_interval = 30
        self.save_model = False

args = Arguments()

use_cuda = not args.no_cuda and torch.cuda.is_available()

torch.manual_seed(args.seed)

device = torch.device("cuda" if use_cuda else "cpu")

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

### Data loading and sending to workers
We first load the data and transform the training Dataset into a Federated Dataset split across the workers using the `.federate` method. This federated dataset is now given to a Federated DataLoader. The test dataset remains unchanged.
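
To make the idea of a dataset split across workers concrete, here is a minimal pure-Python sketch that shards sample indices into contiguous, near-equal chunks, one per worker. It is an illustration only: the helper `shard_indices` is hypothetical, and this notebook does not show how `.federate` actually assigns samples to `bob` and `alice`, so this should not be read as PySyft's implementation.

```python
def shard_indices(num_samples, worker_ids):
    """Assign contiguous index ranges to workers (hypothetical helper, not a PySyft API)."""
    num_workers = len(worker_ids)
    base, extra = divmod(num_samples, num_workers)
    shards = {}
    start = 0
    for i, worker_id in enumerate(worker_ids):
        # The first `extra` workers receive one additional sample each.
        size = base + (1 if i < extra else 0)
        shards[worker_id] = range(start, start + size)
        start += size
    return shards


# The MNIST training set has 60000 samples; here they are split between two workers.
shards = shard_indices(60000, ["bob", "alice"])
for worker_id, idx in shards.items():
    print(worker_id, len(idx), idx[0], idx[-1])
```

With such a split, every batch drawn by a federated loader belongs to exactly one worker, which is why the training loop can ask a batch for its location.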

In [None]:
mnist_train = datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))

In [None]:
federated_train_loader = sy.FederatedDataLoader( # <-- this is now a FederatedDataLoader 
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))
    .federate((bob, alice)), # <-- NEW: we distribute the dataset across all the workers, it's now a FederatedDataset
    batch_size=args.batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)

In [None]:
print(len((test_loader.dataset)[0][0][0]))


28


In [None]:

t1 = torch.tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
t2 = torch.tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
t1.size()
t3 = torch.cat(
    (t1,t2)
    ,dim=0
)
t4 = torch.cat(
    (t3,t2)
    ,dim=0
)
t4 


tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])

In [None]:
from tree import tree_benchmark
print(tree_benchmark.TIME_UNITS)
print(type("str"))

[(1, 's'), (0.001, 'ms'), (1e-06, 'us'), (1e-09, 'ns')]
<class 'str'>


In [None]:
tup = [(1,2),(2,3),(4,1),(1,2),(2,3),(4,1),(1,2),(2,3),(4,1),(7,5)]

[x for (x,y) in tup][int(len(tup)*0.3):]

[1, 2, 4, 1, 2, 4, 7]

In [None]:
torch.utils.data.DataLoader?

### CNN specification
Here we use exactly the same CNN as in the official example.

In [None]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
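
The `4*4*50` input size of `fc1` follows from the layer shapes. The short sketch below traces the side length of the feature map through the two convolutions and two max-pools, using the standard output-size formula for unpadded, undilated layers. The helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output side length of a square max-pool with no padding."""
    return (size - kernel) // stride + 1


side = 28                      # MNIST images are 28x28
side = conv_out(side, 5)       # conv1: 28 -> 24
side = pool_out(side, 2, 2)    # max_pool2d: 24 -> 12
side = conv_out(side, 5)       # conv2: 12 -> 8
side = pool_out(side, 2, 2)    # max_pool2d: 8 -> 4

flat_features = side * side * 50   # conv2 produces 50 channels
print(side, flat_features)         # 4 800
```

This is why `forward` reshapes with `x.view(-1, 4*4*50)` before the first fully connected layer.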

### Define the train and test functions
For the train function, because the data batches are distributed across `alice` and `bob`, you need to send the model to the right location for each batch. You then perform all the operations remotely, using the same syntax as for local PyTorch. When you're done, you get the updated model back, along with the loss so that you can monitor improvement.
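
The send, train remotely, get back pattern can be mimicked without PySyft. The toy sketch below is an assumption-laden stand-in, not the PySyft mechanism: `ToyWorker` is a hypothetical class, the "model" is a single weight `w` for `y = w * x`, and "sending" is just a method call. What it preserves is the key property that the coordinator only ever sees the weight and the loss, never the workers' samples.

```python
class ToyWorker:
    """Stand-in for a remote worker holding private (x, y) pairs. Not a PySyft class."""

    def __init__(self, name, samples):
        self.name = name
        self._samples = samples  # stays on the worker

    def train_step(self, w, lr):
        """Run one gradient step for the model y = w * x on the local data."""
        n = len(self._samples)
        loss = sum((w * x - y) ** 2 for x, y in self._samples) / n
        grad = sum(2 * (w * x - y) * x for x, y in self._samples) / n
        return w - lr * grad, loss


# Both workers hold samples of y = 3x; the coordinator never reads them.
workers = [
    ToyWorker("bob", [(1.0, 3.0), (2.0, 6.0)]),
    ToyWorker("alice", [(3.0, 9.0), (4.0, 12.0)]),
]

w = 0.0
for step in range(50):
    worker = workers[step % 2]               # the "location" of this batch
    w, loss = worker.train_step(w, lr=0.05)  # "send" w, train remotely, "get" w back

print(round(w, 3))
```

The weight alternates between the two workers, much as the model alternates between `bob` and `alice` from batch to batch in the real training loop.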

In [None]:
def train(args, model, device, federated_train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(federated_train_loader): # <-- now it is a distributed dataset
        model.send(data.location) # <-- NEW: send the model to the right location
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        model.get() # <-- NEW: get the model back
        if batch_idx % args.log_interval == 0:
            loss = loss.get() # <-- NEW: get the loss back
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * args.batch_size, len(federated_train_loader) * args.batch_size,
                100. * batch_idx / len(federated_train_loader), loss.item()))

The test function does not change!

In [None]:
def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() # sum up batch loss
            pred = output.argmax(1, keepdim=True) # get the index of the max log-probability 
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
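
The test function sums the per-sample losses and divides by the dataset size once at the end, which gives the exact per-sample mean even if the last batch is smaller than the others. The pure-Python sketch below reproduces that bookkeeping on made-up logits. The helper names and the data are hypothetical, and it imitates the arithmetic only, not the PyTorch calls.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def evaluate(batches):
    """batches: list of (logits_batch, targets); returns (mean NLL, accuracy)."""
    total_loss, correct, n = 0.0, 0, 0
    for logits_batch, targets in batches:
        for logits, target in zip(logits_batch, targets):
            log_probs = log_softmax(logits)
            total_loss += -log_probs[target]  # summed, as with reduction='sum'
            pred = max(range(len(log_probs)), key=log_probs.__getitem__)  # argmax
            correct += int(pred == target)
            n += 1
    return total_loss / n, correct / n


# Two made-up batches of different sizes (3 classes, hypothetical logits).
batches = [
    ([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]], [0, 1]),
    ([[0.0, 0.0, 2.0]], [1]),
]
mean_loss, accuracy = evaluate(batches)
print(round(mean_loss, 4), round(accuracy, 4))
```

Averaging the per-batch means instead would overweight the single sample in the second batch, which is the situation the summed reduction avoids.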

### Launch the training!

In [None]:
%%time
model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=args.lr) # TODO momentum is not supported at the moment

for epoch in range(1, args.epochs + 1):
    train(args, model, device, federated_train_loader, optimizer, epoch)
    test(args, model, device, test_loader)

if args.save_model:
    torch.save(model.state_dict(), "mnist_cnn.pt")




Test set: Average loss: 0.1526, Accuracy: 9546/10000 (95%)


Test set: Average loss: 0.0985, Accuracy: 9703/10000 (97%)

CPU times: user 4min 40s, sys: 17.9 s, total: 4min 58s
Wall time: 4min 59s
