<a href="https://colab.research.google.com/github/mostafa-ja/Data-Privacy/blob/main/Preserving_Data_Privacy_in_Deep_Learning2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [115]:
import os
import random
from tqdm import tqdm
import numpy as np
import torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data.dataset import Dataset 
torch.backends.cudnn.benchmark=True

In [116]:
##### Hyperparameters for federated learning #########
num_clients = 20
num_selected = 12
num_rounds = 200
num_samples = 3
batch_size = 32

**3. Loading and dividing CIFAR 10 into clients**

The CIFAR-10 dataset is used in this tutorial. It consists of 60,000 color images of 32x32 pixels in 10 classes, split into 50,000 training images and 10,000 test images. The training set contains 5,000 images from each class.

In this tutorial, the training images are divided equally and at random among the clients, which represents the balanced (IID) case.



```
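from torch.utils.data import random_split

generator = torch.Generator().manual_seed(42)
random_split(range(10), [3, 7], generator=generator)  # reproducible 3/7 split

train_data.data.shape[0]  # number of training images (50,000)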
```
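
The equal split described above can be sketched without PyTorch. The snippet below is a minimal illustration, assuming a hypothetical helper `iid_partition` (not part of this notebook) that shuffles sample indices with a fixed seed and gives each client an equally sized shard. The notebook itself relies on `torch.utils.data.random_split` for this step.

```python
import random

def iid_partition(num_items, num_clients, seed=42):
    """Hypothetical helper: shuffle indices and cut them into equal shards."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)      # seeded shuffle for reproducibility
    shard = num_items // num_clients          # equal shard size; any remainder is dropped
    return [indices[i * shard:(i + 1) * shard] for i in range(num_clients)]

shards = iid_partition(num_items=50_000, num_clients=20)
print(len(shards), len(shards[0]))            # 20 2500
```

Because the indices are shuffled before being cut, every shard is a uniform random sample of the training set, which is what makes this split approximately IID.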



In [117]:

#############################################################
##### Creating desired data distribution among clients  #####
#############################################################

# Image augmentation 
transform_train = transforms.Compose([
    transforms.RandomCrop(32,padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
])

# Loading CIFAR10 using torchvision.datasets
train_data = datasets.CIFAR10('./train_data',train=True,download=True,
                              transform=transform_train)

# Dividing the training data into num_clients, with each client having equal number of images
train_data_split = torch.utils.data.random_split(train_data,[int(train_data.data.shape[0]/num_clients) for _ in range(num_clients) ])

# Creating a pytorch loader for a Deep Learning model
train_loader = [torch.utils.data.DataLoader(x,batch_size=batch_size,shuffle=True) for x in train_data_split ]

# Normalizing the test images
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

# Loading the test images and wrapping them in a test_loader
test_data = datasets.CIFAR10('./test_data',train=False,download=True,transform=transform_test)
test_loader = torch.utils.data.DataLoader(test_data,batch_size=batch_size,shuffle=True)

Files already downloaded and verified
Files already downloaded and verified


**4. Building the Neural Network (Model Architecture)**

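VGG19 (16 convolution layers, 3 fully connected layers, 5 MaxPool layers, and 1 SoftMax layer) is used in this tutorial. Other VGG variants, such as VGG11, VGG13, and VGG16, are also defined in the configuration below. This implementation adds batch normalization after every convolution and returns log-probabilities via log_softmax.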
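
The layer counts quoted above can be checked against the `'VGG19'` configuration list used in the next cell. The sketch below is only an illustration. It counts the convolution and max-pool entries and tracks how five 2x2 poolings shrink a 32x32 CIFAR-10 image, which suggests why the classifier's first `nn.Linear` takes 512 inputs.

```python
# 'VGG19' entry copied from the cfg dictionary:
# integers are conv output channels, 'M' marks a max-pool layer
vgg19 = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
         512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']

num_conv = sum(1 for x in vgg19 if x != 'M')
num_pool = vgg19.count('M')

side = 32                       # CIFAR-10 images are 32x32
for x in vgg19:
    if x == 'M':
        side //= 2              # each 2x2 max-pool with stride 2 halves height and width
    # 3x3 convolutions with padding=1 keep the spatial size unchanged

last_channels = [x for x in vgg19 if x != 'M'][-1]
flat_features = last_channels * side * side
print(num_conv, num_pool, side, flat_features)   # 16 5 1 512
```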

In [118]:

#################################
##### Neural Network model #####
#################################

cfg = {
    'VGG11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'VGG13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'VGG16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'VGG19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}

class VGG(nn.Module):
  def __init__(self,vgg_name):
    super(VGG,self).__init__()
    self.features = self.make_layers(cfg[vgg_name])
    self.classifier = nn.Sequential(
        nn.Linear(512,512),
        nn.ReLU(True),
        nn.Linear(512,512),
        nn.ReLU(True),
        nn.Linear(512,10)
    )

  def forward(self,x):
    out = self.features(x)
    out = out.view(out.size(0),-1)
    out = self.classifier(out)
    output = F.log_softmax(out,dim=1)
    return output

  def make_layers(self,cfg):
    layers = []
    in_channels = 3
    for x in cfg:
      if x == 'M':
        layers += [nn.MaxPool2d(kernel_size=2,stride=2)]
      else:
        layers += [nn.Conv2d(in_channels,x,kernel_size=3,padding=1),
                   nn.BatchNorm2d(x),
                   nn.ReLU(inplace=True)]
        in_channels = x
    layers += [nn.AvgPool2d(kernel_size=1,stride=1)]
    return nn.Sequential(*layers)


**5. Helper functions for Federated training**

The client_update function trains the client model on private client data. This is the local training round that takes place at each of the num_selected clients, i.e. 12 in our case.

In [119]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [120]:
data,target = next(iter(train_loader[2]))
print(target)

tensor([1, 7, 1, 0, 5, 8, 9, 0, 2, 3, 7, 1, 9, 2, 2, 8, 9, 4, 7, 5, 5, 8, 2, 5,
        8, 8, 5, 5, 1, 3, 5, 3])


The training set has 50,000 images, so each of the 20 clients holds 2,500 images. With the hyperparameters set earlier:
```
num_clients = 20
num_selected = 12
num_rounds = 200
num_samples = 3
batch_size = 32
```
Each client update trains on 3 * 32 = 96 images (num_samples = 3 batches of batch_size = 32), so a client needs about 26 updates (2500 / 96) to process as many images as one pass over its data. Because only 12 of the 20 clients are selected in each round, not all of the data is used in a round, so roughly every 25 rounds corresponds to about one epoch over the data that is actually accessed.
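
The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch using the hyperparameter values from this notebook. The variable names other than the hyperparameters are introduced here for illustration only.

```python
num_clients, num_selected = 20, 12
num_samples, batch_size = 3, 32
train_images = 50_000

images_per_client = train_images // num_clients          # 2500
images_per_round = num_samples * batch_size              # 96 images per selected client per round
rounds_per_epoch = images_per_client / images_per_round  # about 26 updates per local pass
fraction_selected = num_selected / num_clients           # share of clients taking part in a round

print(images_per_client, images_per_round, round(rounds_per_epoch, 2), fraction_selected)
# 2500 96 26.04 0.6
```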

In [121]:
def client_update(client_model,optimizer,train_loader,epoch=5):
    """
    Trains the client model on num_samples batches of client data
    and returns the loss of the last batch.
    Note: the epoch argument is currently unused.
    """
    client_model.train()
    for s in range(num_samples):
      data,target = next(iter(train_loader))
      data,target = data.to(device), target.to(device)
      optimizer.zero_grad()
      output = client_model(data)
      loss = F.nll_loss(output,target)
      loss.backward()
      optimizer.step()
    return loss.item()
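
One detail of `client_update` is worth illustrating: it calls `next(iter(train_loader))` inside the loop, which builds a fresh iterator on every step. The sketch below uses a plain Python list as a stand-in for the loader (an assumption for illustration only) to show the difference between a fresh iterator and a reused one.

```python
batches = ["batch0", "batch1", "batch2"]

# Pattern used in client_update: a fresh iterator on every step
fresh = [next(iter(batches)) for _ in range(3)]

# Alternative: create the iterator once and advance it
it = iter(batches)
reused = [next(it) for _ in range(3)]

print(fresh)    # ['batch0', 'batch0', 'batch0']
print(reused)   # ['batch0', 'batch1', 'batch2']
```

With a `DataLoader` created with `shuffle=True`, each fresh iterator reshuffles the data, so the notebook's pattern draws a random batch on every step rather than the same one. The batches drawn across steps may overlap, which is one reason the rounds-per-epoch figure is only approximate.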

The server_aggregate function aggregates the model weights received from every client and updates the global model with the updated weights. In this tutorial, the mean of the weights is taken and aggregated into the global weights.
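
The mean aggregation described above can be sketched without any framework. The snippet assumes plain Python lists stand in for weight tensors and uses a hypothetical helper `average_state_dicts`. The notebook's `server_aggregate` performs the same element-wise mean with `torch.stack(...).mean(0)` over each `state_dict` entry.

```python
def average_state_dicts(client_states):
    """Element-wise mean over a list of {name: list-of-floats} dictionaries."""
    n = len(client_states)
    averaged = {}
    for key in client_states[0]:
        # zip groups the i-th entry of this parameter across all clients
        columns = zip(*(state[key] for state in client_states))
        averaged[key] = [sum(col) / n for col in columns]
    return averaged

# Toy "weights" for three clients (hypothetical values)
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
global_state = average_state_dicts(clients)
print(global_state)   # {'w': [3.0, 4.0], 'b': [1.0]}
```

Every client gets the same weight in this mean, which matches the balanced split used here, where all clients hold the same number of images.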

In [122]:
global_model =  VGG('VGG19')
global_model.state_dict().keys()

odict_keys(['features.0.weight', 'features.0.bias', 'features.1.weight', 'features.1.bias', 'features.1.running_mean', 'features.1.running_var', 'features.1.num_batches_tracked', 'features.3.weight', 'features.3.bias', 'features.4.weight', 'features.4.bias', 'features.4.running_mean', 'features.4.running_var', 'features.4.num_batches_tracked', 'features.7.weight', 'features.7.bias', 'features.8.weight', 'features.8.bias', 'features.8.running_mean', 'features.8.running_var', 'features.8.num_batches_tracked', 'features.10.weight', 'features.10.bias', 'features.11.weight', 'features.11.bias', 'features.11.running_mean', 'features.11.running_var', 'features.11.num_batches_tracked', 'features.14.weight', 'features.14.bias', 'features.15.weight', 'features.15.bias', 'features.15.running_mean', 'features.15.running_var', 'features.15.num_batches_tracked', 'features.17.weight', 'features.17.bias', 'features.18.weight', 'features.18.bias', 'features.18.running_mean', 'features.18.running_var', 

In [123]:
global_model.state_dict()['features.0.weight']

tensor([[[[ 0.1430, -0.0403, -0.0992],
          [ 0.0137, -0.0334,  0.1496],
          [ 0.0715,  0.0444,  0.0633]],

         [[ 0.1898,  0.1119,  0.1620],
          [ 0.0805,  0.0311, -0.0513],
          [ 0.0409, -0.1489,  0.1449]],

         [[-0.0779, -0.0619,  0.0919],
          [-0.0861,  0.1827,  0.0919],
          [-0.0201, -0.0285,  0.0824]]],


        [[[ 0.0686, -0.1898, -0.0114],
          [ 0.0943,  0.1052,  0.0698],
          [-0.0813, -0.0034,  0.1622]],

         [[-0.0069,  0.0103,  0.0844],
          [ 0.0920, -0.1111,  0.1512],
          [ 0.0348,  0.1457,  0.1795]],

         [[ 0.0803,  0.0080,  0.1231],
          [ 0.0350, -0.0426, -0.1458],
          [ 0.1622,  0.0887,  0.1495]]],


        [[[ 0.0567, -0.0378,  0.0183],
          [ 0.1905, -0.1672,  0.1884],
          [ 0.0363, -0.0400,  0.1214]],

         [[-0.1369,  0.1198,  0.0906],
          [-0.0950,  0.1273, -0.1112],
          [ 0.1137, -0.0226,  0.1269]],

         [[-0.1528,  0.0923, -0.1496],
     

In [124]:
def server_aggregate(global_model, client_models):
  """
  This function has aggregation method 'mean'
  """
  ### This will take simple mean of the weights of models ###
  global_dict = global_model.state_dict()
  for k in global_dict.keys():
    global_dict[k] = torch.stack([client_models[i].state_dict()[k].float() for i in range(len(client_models))],0).mean(0)
  
  # update the server model and clients model
  global_model.load_state_dict(global_dict)
  for model in client_models:
    model.load_state_dict(global_model.state_dict())


The test function is the standard function, which takes the global model along with the test loader as the input and returns the test loss and accuracy.
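
The two quantities returned by the test function can be illustrated on toy data. The sketch below uses made-up log-probabilities for three samples and two classes (hypothetical values, not from CIFAR-10). It computes the summed negative log-likelihood divided by the dataset size, and the accuracy from the index of the largest log-probability.

```python
import math

# Toy log-probabilities for 3 samples over 2 classes (hypothetical values)
log_probs = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.2), math.log(0.8)],
    [math.log(0.6), math.log(0.4)],
]
targets = [0, 1, 1]

# Summed negative log-likelihood, then divided by the dataset size
loss_sum = -sum(row[t] for row, t in zip(log_probs, targets))
avg_loss = loss_sum / len(targets)

# Prediction = index of the largest log-probability
preds = [max(range(len(row)), key=row.__getitem__) for row in log_probs]
accuracy = sum(p == t for p, t in zip(preds, targets)) / len(targets)

print(preds, round(accuracy, 3))   # [0, 1, 0] 0.667
```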

In [125]:
def test(global_model,test_loader):
  """This function tests the global model on test data and returns test loss and test accuracy """
  
  global_model.eval()
  test_loss = 0
  correct = 0
  with torch.no_grad():
      for data, target in test_loader:
          data, target = data.to(device), target.to(device)
          output = global_model(data)
          test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
          pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
          correct += pred.eq(target.view_as(pred)).sum().item()

  test_loss /= len(test_loader.dataset)
  acc = correct / len(test_loader.dataset)

  return test_loss, acc


**6. Training the model**

One global model and the individual client_models are initialized with VGG19 and moved to the GPU when one is available. In this tutorial, SGD is used as the optimizer for all the client models.

In [126]:
############################################
#### Initializing models and optimizer  ####
############################################

#### global model ##########
global_model =  VGG('VGG19').to(device)

############## client models ##############
client_models = [ VGG('VGG19').to(device) for _ in range(num_selected)]
for model in client_models:
    model.load_state_dict(global_model.state_dict()) ### initial synchronizing with global model 

############### optimizers ################
opt = [optim.SGD(model.parameters(), lr=0.1) for model in client_models]

Instead of VGG19, one can also use VGG11, VGG13, or VGG16. Other optimizers are available in torch.optim.

In [127]:
np.random.permutation(num_clients)

array([ 4, 15, 16,  1,  8,  0,  9, 18, 12, 19, 11,  2,  6,  7,  5, 17, 10,
        3, 14, 13])

In [128]:
for i in tqdm(range(3)):
  print(i)

100%|██████████| 3/3 [00:00<00:00, 18808.54it/s]

0
1
2





In [129]:
50%26

24

In [None]:

###### List containing info about learning #########
losses_train = []
losses_test = []
acc_train = []
acc_test = []
# Running FL

for r in range(num_rounds):
    # select random clients
    client_idx = np.random.permutation(num_clients)[:num_selected]
    # client update
    loss = 0
    for i in tqdm(range(num_selected)):
        # epochs is not defined at this point and client_update ignores it, so it is not passed
        loss += client_update(client_models[i], opt[i], train_loader[client_idx[i]])
    
    losses_train.append(loss)
    # server aggregate
    server_aggregate(global_model, client_models)
    
    test_loss, acc = test(global_model, test_loader)
    losses_test.append(test_loss)
    acc_test.append(acc)
    if (r % 10 == 0) or (r == (num_rounds-1)):
      print('%d-th round' % r)
      print('average train loss %0.3g | test loss %0.3g | test acc: %0.3f' % (loss / num_selected, test_loss, acc))
    

100%|██████████| 12/12 [00:01<00:00, 10.67it/s]


0-th round
average train loss 2.33 | test loss 2.3 | test acc: 0.100


100%|██████████| 12/12 [00:01<00:00, 11.58it/s]
100%|██████████| 12/12 [00:01<00:00, 11.67it/s]
100%|██████████| 12/12 [00:01<00:00, 11.57it/s]
100%|██████████| 12/12 [00:01<00:00, 11.60it/s]
100%|██████████| 12/12 [00:01<00:00, 11.55it/s]
100%|██████████| 12/12 [00:01<00:00, 11.43it/s]
100%|██████████| 12/12 [00:01<00:00, 11.51it/s]
100%|██████████| 12/12 [00:01<00:00,  9.92it/s]
100%|██████████| 12/12 [00:01<00:00, 11.55it/s]
100%|██████████| 12/12 [00:01<00:00,  8.92it/s]


10-th round
average train loss 2.13 | test loss 2.14 | test acc: 0.184


100%|██████████| 12/12 [00:01<00:00, 11.65it/s]
100%|██████████| 12/12 [00:01<00:00, 10.37it/s]
100%|██████████| 12/12 [00:01<00:00, 11.66it/s]
100%|██████████| 12/12 [00:01<00:00, 11.64it/s]
  0%|          | 0/12 [00:00<?, ?it/s]

In [None]:
epochs = 5
batch_size = 32

train_loader = torch.utils.data.DataLoader(train_data,batch_size=batch_size,shuffle=True)
model =  VGG('VGG19')
model.to(device)
optimizer = optim.SGD(model.parameters(), lr=0.1)

for e in range(epochs):
  for batch_idx, (data,target) in enumerate(train_loader):
    data,target = data.to(device), target.to(device)
    optimizer.zero_grad()
    output = model(data)
    loss = F.nll_loss(output,target)
    loss.backward()
    optimizer.step()

  model.eval()
  test_loss = 0
  correct = 0
  with torch.no_grad():
      for data, target in test_loader:
          data, target = data.to(device), target.to(device)  # also works on CPU-only machines
          output = model(data)
          test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
          pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
          correct += pred.eq(target.view_as(pred)).sum().item()
  test_loss /= len(test_loader.dataset)
  acc = correct / len(test_loader.dataset)
  print('%d-th epoch' % e)
  print('last-batch train loss %0.3g | test loss %0.3g | test acc: %0.3f' % (loss.item(), test_loss, acc))
  model.train()