https://blog.openmined.org/upgrade-to-federated-learning-in-10-lines/

https://github.com/OpenMined/PySyft/blob/master/examples/tutorials/Part%206%20-%20Federated%20Learning%20on%20MNIST%20using%20a%20CNN.ipynb

In [3]:
#Imports
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

In [4]:
# PySyft: define the remote workers alice and bob
import syft as sy  # import the PySyft library
hook = sy.TorchHook(torch)  # hook PyTorch to add the extra functionality needed for federated learning
bob = sy.VirtualWorker(hook, id="bob")  # first worker
alice = sy.VirtualWorker(hook, id="alice")  # second worker

W0731 14:29:26.089576  6332 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was 'C:\Users\Vilas_2\Anaconda3\envs\pysyft\lib\site-packages\tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'
W0731 14:29:26.189540  6332 deprecation_wrapper.py:119] From C:\Users\Vilas_2\Anaconda3\envs\pysyft\lib\site-packages\tf_encrypted\session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.



Define the settings of the learning task

In [6]:
class Arguments():
    def __init__(self):
        self.batch_size = 64
        self.test_batch_size = 1000
        self.epochs = 10
        self.lr = 0.01
        self.momentum = 0.5
        self.no_cuda = False
        self.seed = 1
        self.log_interval = 30
        self.save_model = False
        
args = Arguments()

use_cuda = not args.no_cuda and torch.cuda.is_available()

torch.manual_seed(args.seed)

device = torch.device("cuda" if use_cuda else "cpu")

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}

Data loading and sending to workers

In [7]:
federated_train_loader = sy.FederatedDataLoader(
    datasets.MNIST('~/.pytorch/MNIST_data/', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))
    .federate((bob, alice)),  # distribute the dataset across all workers; this makes it a federated dataset
    batch_size=args.batch_size, shuffle=True, **kwargs)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('~/.pytorch/MNIST_data/', train=False, download=True,
                   transform= transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)
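
Both loaders apply the same Normalize((0.1307,), (0.3081,)) transform. As a minimal sketch of what that does to a single pixel (plain Python, no torchvision; the helper name normalize is made up for illustration): ToTensor scales pixels to the range [0, 1], and Normalize then subtracts the mean and divides by the standard deviation, here the commonly quoted statistics of the MNIST training pixels.

```python
# Per-pixel effect of transforms.Normalize((0.1307,), (0.3081,)).
# `normalize` is an illustrative helper, not a torchvision function.
MEAN, STD = 0.1307, 0.3081

def normalize(pixel):
    """Map a pixel in [0, 1] to (pixel - mean) / std."""
    return (pixel - MEAN) / STD

black = normalize(0.0)  # darkest possible pixel after ToTensor
white = normalize(1.0)  # brightest possible pixel after ToTensor
print(black, white)
```

The same constants must be used for the training and test sets, which is why the transform appears unchanged in both loaders.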



NN architecture

In [8]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1,20,5,1)
        self.conv2 = nn.Conv2d(20,50,5,1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)
        
    def forward(self,x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x,2,2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x,2,2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
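
The 4*4*50 input size of fc1 follows from the layer shapes. A small sketch of that arithmetic in plain Python, assuming 28x28 MNIST inputs and no padding (the helper names conv_out and pool_out are illustrative only):

```python
# Trace the spatial size of a 28x28 MNIST image through Net's layers.
# Helper names are illustrative; the formula is the standard one for
# convolutions and pooling without dilation.

def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel, stride):
    """Output width/height of a max-pooling layer."""
    return (size - kernel) // stride + 1

size = 28
size = conv_out(size, kernel=5)           # conv1: 28 -> 24
size = pool_out(size, kernel=2, stride=2) # pool:  24 -> 12
size = conv_out(size, kernel=5)           # conv2: 12 -> 8
size = pool_out(size, kernel=2, stride=2) # pool:  8 -> 4

flat_features = size * size * 50          # 50 output channels from conv2
print(size, flat_features)
```

If the kernel sizes or the input resolution change, the in_features of fc1 and the view call in forward have to change with them.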

Define the train and test functions
In the train function, the data batches are distributed across alice and bob, so you need to send the model to the right location for each batch. You then perform all the operations remotely, with the same syntax as in local PyTorch. When the batch is done, you get the updated model back and, when logging, the loss, so that you can monitor the improvement.

In [9]:
def train(args, model, device, federated_train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data,target) in enumerate(federated_train_loader): #it's a distributed dataset
        model.send(data.location) #send the model to the right location
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        model.get() #get the new model back
        if batch_idx % args.log_interval == 0:
            loss = loss.get() # get the loss back
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * args.batch_size, len(federated_train_loader) * args.batch_size,
                100. * batch_idx / len(federated_train_loader), loss.item()))
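
To make the send / train / get cycle described above concrete, here is a toy simulation in plain Python. It is only a sketch under stated assumptions: the workers are ordinary dictionary entries rather than PySyft VirtualWorker objects, the "model" is a single weight w fitting y = w * x, and local_sgd_step is a made-up helper. It does not call any PySyft API.

```python
# Toy simulation of the per-batch cycle: send model -> update remotely -> get model.
# Everything here is illustrative; no PySyft objects are involved.

workers = {
    "bob":   [(1.0, 2.0), (2.0, 4.0)],  # (x, y) pairs held by bob
    "alice": [(3.0, 6.0), (4.0, 8.0)],  # (x, y) pairs held by alice
}

def local_sgd_step(w, batch, lr):
    """One SGD step on mean squared error for the model y = w * x."""
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad

w = 0.0       # the model parameter, owned by the central machine
visited = []  # which worker the model travelled to at each step

for epoch in range(50):
    for name, batch in workers.items():
        # "send": the model goes to the worker that holds this batch,
        # and the update is computed where the data lives.
        w = local_sgd_step(w, batch, lr=0.02)
        # "get": the updated model returns before the next batch is visited.
        visited.append(name)

print(w, visited[:4])
```

As in the train function, the model alternates between workers batch by batch, so the raw data never has to leave the worker that holds it; only the model moves.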

In [12]:
# The test function doesn't change
def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item() #sum up batch loss
            pred = output.argmax(1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()
            
        test_loss /= len(test_loader.dataset)
        
        print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(test_loader.dataset),
            100. * correct / len(test_loader.dataset)))
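
The two numbers that test reports can be reproduced by hand on a tiny batch. The sketch below, in plain Python with made-up log-probabilities and a hypothetical helper evaluate, mirrors the logic above: the summed negative log-likelihood of the true class is divided by the number of samples, and a prediction counts as correct when the argmax matches the target.

```python
import math

def evaluate(log_probs, targets):
    """Return (average NLL loss, number of correct predictions)."""
    total_loss = 0.0
    correct = 0
    for row, target in zip(log_probs, targets):
        total_loss += -row[target]                        # NLL of the true class
        pred = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        correct += int(pred == target)
    return total_loss / len(targets), correct

# Three samples, three classes; the values are invented for illustration.
log_probs = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.3), math.log(0.6)],
    [math.log(0.5), math.log(0.4), math.log(0.1)],
]
targets = [0, 2, 1]

avg_loss, correct = evaluate(log_probs, targets)
print(avg_loss, correct)
```

The third sample is misclassified (the argmax is class 0, the target is class 1), so it lowers the accuracy and contributes the largest term to the loss.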

Training!!

In [13]:
%%time
model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=args.lr)  # momentum is not supported here, so args.momentum is unused

for epoch in range(1, args.epochs + 1):
    train(args, model, device, federated_train_loader, optimizer, epoch)
    test(args, model, device, test_loader)

if args.save_model:
    torch.save(model.state_dict(), "mnist_cnn.pt")


Test set: Average loss: 0.1596, Accuracy: 9517/10000 (95%)


Test set: Average loss: 0.0927, Accuracy: 9717/10000 (97%)


Test set: Average loss: 0.0701, Accuracy: 9786/10000 (98%)


Test set: Average loss: 0.0582, Accuracy: 9820/10000 (98%)


Test set: Average loss: 0.0565, Accuracy: 9824/10000 (98%)




Test set: Average loss: 0.0441, Accuracy: 9867/10000 (99%)


Test set: Average loss: 0.0425, Accuracy: 9861/10000 (99%)


Test set: Average loss: 0.0410, Accuracy: 9852/10000 (99%)


Test set: Average loss: 0.0388, Accuracy: 9873/10000 (99%)




Test set: Average loss: 0.0430, Accuracy: 9862/10000 (99%)

Wall time: 42min 58s
