# Fedbiomed Image Classifier with Differential Privacy with CIFAR10

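In this tutorial we will show how an image classifier with differential privacy can be trained on the CIFAR10 dataset with Fedbiomed, using the Opacus library.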

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from fedbiomed.researcher.requests import Requests
req = Requests()
req.list(verbose=True)

2022-01-19 15:37:17,917 fedbiomed INFO - Component environment:
2022-01-19 15:37:17,919 fedbiomed INFO - - type = ComponentType.RESEARCHER
2022-01-19 15:37:19,672 fedbiomed INFO - Messaging researcher_7b720652-ae3f-487c-9395-4cbebdc044b9 successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x10adbaa30>
2022-01-19 15:37:19,741 fedbiomed INFO - Listing available datasets in all nodes... 
2022-01-19 15:37:19,779 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / DEBUG - Message received: {'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'command': 'list'}
2022-01-19 15:37:19,790 fedbiomed INFO - log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / DEBUG - Message received: {'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'command': 'list'}
2022-01-19 15:37:29,797 fedbiomed INFO - 
 Node: node_7db84ba7-4c77-42e2-9811-3520b0777bdd | Number of Datasets: 1 
+---------+-------------+--

{'node_7db84ba7-4c77-42e2-9811-3520b0777bdd': [{'name': 'CIFAR10',
   'data_type': 'default',
   'tags': ['#CIFAR10', '#dataset'],
   'description': 'CIFAR10 database',
   'shape': [50000, 3, 32, 32]}],
 'node_60254e5d-4cc9-4627-a768-b0710d5c4b7d': [{'name': 'CIFAR10',
   'data_type': 'default',
   'tags': ['#CIFAR10', '#dataset'],
   'description': 'CIFAR10 database',
   'shape': [50000, 3, 32, 32]}]}
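
The dictionary returned by `req.list` can be inspected with plain Python, for example to count how many samples each node serves. The sketch below assumes the return value keeps the shape shown in the output above (node ID mapped to a list of dataset entries). `datasets_by_node` is a hand-copied stand-in for that value, and `samples_per_node` is a hypothetical helper, not part of Fedbiomed.

```python
# Stand-in for the dictionary printed above (copied by hand, not a live call).
datasets_by_node = {
    'node_7db84ba7-4c77-42e2-9811-3520b0777bdd': [
        {'name': 'CIFAR10', 'data_type': 'default',
         'tags': ['#CIFAR10', '#dataset'],
         'description': 'CIFAR10 database',
         'shape': [50000, 3, 32, 32]}],
    'node_60254e5d-4cc9-4627-a768-b0710d5c4b7d': [
        {'name': 'CIFAR10', 'data_type': 'default',
         'tags': ['#CIFAR10', '#dataset'],
         'description': 'CIFAR10 database',
         'shape': [50000, 3, 32, 32]}],
}

def samples_per_node(listing):
    """Sum the first dimension of every dataset shape, per node."""
    return {node_id: sum(entry['shape'][0] for entry in entries)
            for node_id, entries in listing.items()}

counts = samples_per_node(datasets_by_node)
total_samples = sum(counts.values())
for node_id, n in counts.items():
    print(f"{node_id}: {n} samples")
print(f"total: {total_samples} samples")
```

Both nodes expose the full CIFAR10 training set here, so the federation sees the same 50000 images twice.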

## Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

## Setting the node up
Before running the experiment, a node must be configured:
1. `./scripts/fedbiomed_run node add`
  * Select option 2 (default), and write CIFAR10 to add CIFAR to the node through `torchvision.datasets.CIFAR10`
  * Confirm default tags by hitting "y" and ENTER
  * Pick the folder where CIFAR is downloaded
  * The data should now be added (a warning saying that the data must be unique means it has already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node list`
3. Run the node using `./scripts/fedbiomed_run node run`. Wait until you get `Starting task manager`, which means the node is online.

## Create an experiment to train a model on the data found

Declare a `TorchTrainingPlan` subclass (here `CIFAR10DPPlan`) to send to the nodes for training.

In [3]:
from fedbiomed.researcher.environ import environ
import tempfile
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+'/')
model_file = tmp_dir_model.name + '/Cifar_opacus.py'

In the cell below, we define the model using Opacus for differential privacy. For this example, we use the `ModuleValidator` class from `opacus.validators` to validate the model and, where needed, fix it so that it is compatible with the Opacus engine, and the `make_private_with_epsilon` method of `opacus.PrivacyEngine`.

When training a model with `make_private_with_epsilon` from the Opacus library, the following privacy-specific hyper-parameters come into play:

* `max_grad_norm`: The maximum L2 norm of per-sample gradients before they are aggregated by the averaging step.
* `target_epsilon` and `target_delta`: The target ϵ and δ of the (ϵ,δ)-differential privacy guarantee.
* `noise_multiplier`: The standard deviation of the Gaussian noise added to the clipped gradients, expressed as a multiple of `max_grad_norm`. With `make_private_with_epsilon` this value is not set by hand: Opacus derives it from `target_epsilon`, `target_delta`, the number of epochs and the sampling rate.
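
The roles of `max_grad_norm` and `noise_multiplier` can be illustrated with a minimal, pure-Python sketch of one differentially private gradient step: clip every per-sample gradient, sum, add Gaussian noise, then average. This is not the Opacus implementation; `clip_gradient` and `private_mean_gradient` are hypothetical helpers, and the gradients are toy lists rather than tensors.

```python
import math
import random

def clip_gradient(grad, max_grad_norm):
    """Rescale one per-sample gradient so its L2 norm is at most max_grad_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-6))
    return [g * scale for g in grad]

def private_mean_gradient(per_sample_grads, max_grad_norm, noise_multiplier, rng):
    """Clip each gradient, sum them, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, max_grad_norm) for g in per_sample_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # The noise scale is expressed relative to the clipping bound.
    std = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-sample gradients with norms 5.0 and 0.5
no_noise = private_mean_gradient(grads, max_grad_norm=1.2, noise_multiplier=0.0, rng=rng)
with_noise = private_mean_gradient(grads, max_grad_norm=1.2, noise_multiplier=1.0, rng=rng)
print("clipped mean, no noise:  ", no_noise)
print("clipped mean, with noise:", with_noise)
```

Only the first gradient is rescaled, since its norm exceeds 1.2. A smaller `max_grad_norm` bounds each sample's influence more tightly but discards more signal, and a larger `noise_multiplier` gives a smaller ϵ at the cost of noisier updates.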

Note that the Opacus `PrivacyEngine` needs a model, an optimizer and a data loader, so the training plan below defines all three explicitly (`self.model`, `self.optimizer` and the loader returned by `training_data`) before making them private.

In [4]:
%%writefile "$model_file"

import torch
import torch.nn as nn
from fedbiomed.common.torchnn import TorchTrainingPlan
from fedbiomed.common.logger import logger
from torch.utils.data import DataLoader
import torch.optim as optim
from torchvision import datasets, transforms, models
from opacus import PrivacyEngine 
from opacus.validators import ModuleValidator
from typing import Union, List
from tqdm import tqdm

# Here we define the model to be used. 
# You can use any class name (here 'CIFAR10DPPlan')
class CIFAR10DPPlan(TorchTrainingPlan):
    def __init__(self, model_args):
        super(CIFAR10DPPlan, self).__init__()
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on CIFAR10, we need datasets and transforms from torchvision
        deps = ["from torchvision import datasets, transforms, models",
                "from torch.utils.data import DataLoader",
                "import torch.optim as optim",
                "from fedbiomed.common.logger import logger",
                "from typing import Union, List",
                "from tqdm import tqdm",
                "from opacus import PrivacyEngine",
                "from opacus.validators import ModuleValidator",]
        self.add_dependency(deps)
        
        self.model = models.resnet18(num_classes=model_args['num_classes'])
        self.model = ModuleValidator.fix(self.model)
        ModuleValidator.validate(self.model, strict=False)
        
        self.loss = nn.CrossEntropyLoss()
        
        self.max_grad_norm = model_args['max_grad_norm']
        self.epsilon = model_args['target_epsilon']
        self.delta = model_args['target_delta']

    def forward(self, x):
        return self.model(x)

    def training_data(self, batch_size = 48):
        CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
        CIFAR10_STD_DEV = (0.2023, 0.1994, 0.2010)
        # Custom torch Dataloader for CIFAR data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD_DEV),
                                       ])
        dataset1 = datasets.CIFAR10(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        return data_loader
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = self.loss(output, target)
        return loss
    
    def training_routine(self,
                         epochs: int = 2,
                         log_interval: int = 10,
                         lr: Union[int, float] = 1e-3,
                         batch_size: int = 48,
                         batch_maxnum: int = 0,
                         dry_run: bool = False,
                         monitor=None):
        
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        
        self.optimizer = optim.RMSprop(self.model.parameters(), lr=lr)
        
        training_data = self.training_data(batch_size=batch_size)
        
        # enter PrivacyEngine
        privacy_engine = PrivacyEngine()
        self.model, self.optimizer, training_data = privacy_engine.make_private_with_epsilon(
                                                                    module=self.model,
                                                                    optimizer=self.optimizer,
                                                                    data_loader=training_data,
                                                                    epochs=epochs,
                                                                    target_epsilon=self.epsilon,
                                                                    target_delta=self.delta,
                                                                    max_grad_norm=self.max_grad_norm,
                                                                )

        for epoch in range(1, epochs + 1):
            self.model.train()
            # (below) sampling data (with `training_data` method defined on
            # researcher's notebook)
            for batch_idx, (data, target) in enumerate(tqdm(training_data)):
                #self.model.train()  # model training
                data, target = data.to(self.device), target.to(self.device)
                self.optimizer.zero_grad()
                # (below) calling method `training_step` defined on
                # researcher's notebook
                res = self.training_step(data, target)
                res.backward()
                self.optimizer.step()

                # do not take into account more than batch_maxnum
                # batches from the dataset
                if (batch_maxnum > 0) and (batch_idx >= batch_maxnum):
                    #print('Reached {} batches for this epoch, ignore remaining data'.format(batch_maxnum))
                    logger.debug('Reached {} batches for this epoch, ignore remaining data'.format(batch_maxnum))
                    break

                if batch_idx % log_interval == 0:
                    logger.info('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                        epoch,
                        batch_idx * len(data),
                        len(training_data.dataset),
                        100 * batch_idx / len(training_data),
                        res.item()))
                    eps = privacy_engine.get_epsilon(self.delta)
                    logger.info('Epsilon={:.2f}, Delta={}'.format(eps,self.delta))

                    # Send scalar values via general/feedback topic
                    if monitor is not None:
                        monitor.add_scalar('Loss', res.item(), batch_idx, epoch)

                    if dry_run:
                        return

    def save(self, filename, params: dict = None) -> None:
        if params is not None:

            # params keys are changed by the privacy engine (as _module.param_key): should be re-named
            params_keys = list(params.keys())
            for key in params_keys:
                if '_module' in key:
                    newkey = key.replace('_module.', '')
                    params[newkey] = params.pop(key)
                    
            return torch.save(params, filename)
        else:
            return torch.save(self.state_dict(), filename)

Writing /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/tmp/tmpsfuj28gd/Cifar_opacus.py
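
The key renaming done in `save` can be tried on a plain dictionary. As the comment in the training plan says, the privacy engine changes parameter names by adding a `_module.` segment, so they no longer match the names the unwrapped model expects. The sketch below uses strings as stand-ins for tensors; `strip_module_prefix` is a hypothetical helper and the key names are illustrative.

```python
def strip_module_prefix(params):
    """Return a copy of params whose keys have the '_module.' segment removed."""
    return {key.replace('_module.', ''): value for key, value in params.items()}

# Toy stand-in for a state dict; real values would be tensors.
wrapped = {
    'model._module.conv1.weight': 'w0',
    'model._module.fc.bias': 'b0',
    'model.fc.weight': 'w1',  # a key without the segment is left untouched
}
renamed = strip_module_prefix(wrapped)
print(sorted(renamed))
```

Building a new dictionary avoids mutating `params` while iterating over it, which is why the original code first copies the keys into a list.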


The two groups of arguments defined in the next cell correspond, respectively, to:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side. For instance, the privacy parameters should be passed here.
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.

**NOTE:** typos and/or missing positional (required) arguments will raise an error. 🤓

In [5]:
model_args = {'num_classes': 10, 'max_grad_norm': 1.2, 'target_epsilon': 50.0, 'target_delta': 1e-5}

training_args = {
    'batch_size': 48, 
    'lr': 1e-3, 
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
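
Since a misspelled key only fails once the training plan is instantiated on the node, it can help to check the dictionary locally first. This is a minimal sketch under the assumption that the four keys read in `CIFAR10DPPlan.__init__` are the required ones; `missing_keys` and `REQUIRED_MODEL_ARGS` are hypothetical names, not part of Fedbiomed.

```python
# Keys read by CIFAR10DPPlan.__init__ in this tutorial.
REQUIRED_MODEL_ARGS = ('num_classes', 'max_grad_norm', 'target_epsilon', 'target_delta')

def missing_keys(args, required):
    """Return the required keys that are absent from args, in order."""
    return [key for key in required if key not in args]

good_args = {'num_classes': 10, 'max_grad_norm': 1.2,
             'target_epsilon': 50.0, 'target_delta': 1e-5}
typo_args = {'num_classes': 10, 'max_grad_norm': 1.2,
             'target_epsilson': 50.0, 'target_delta': 1e-5}  # deliberate typo

print("good_args missing:", missing_keys(good_args, REQUIRED_MODEL_ARGS))
print("typo_args missing:", missing_keys(typo_args, REQUIRED_MODEL_ARGS))
```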

# Train the federated model

Define an experiment:
- search nodes serving data for these `tags`, optionally filtering on a list of node IDs with `nodes`
- run a round of local training on nodes with the model defined in `model_path`, followed by federation with `aggregator`
- run for `rounds` rounds, applying the `node_selection_strategy` between the rounds

In [6]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['#CIFAR10', '#dataset']
rounds = 2

exp = Experiment(tags=tags,
                 #nodes=None,
                 model_path=model_file,
                 model_args=model_args,
                 model_class='CIFAR10DPPlan',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

2022-01-19 15:38:07,078 fedbiomed INFO - Searching dataset with data tags: ['#CIFAR10', '#dataset'] for all nodes
2022-01-19 15:38:07,096 fedbiomed INFO - log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / DEBUG - Message received: {'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'tags': ['#CIFAR10', '#dataset'], 'command': 'search'}
2022-01-19 15:38:07,101 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / DEBUG - Message received: {'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'tags': ['#CIFAR10', '#dataset'], 'command': 'search'}
2022-01-19 15:38:17,085 fedbiomed INFO - Node selected for training -> node_60254e5d-4cc9-4627-a768-b0710d5c4b7d
2022-01-19 15:38:17,087 fedbiomed INFO - Node selected for training -> node_7db84ba7-4c77-42e2-9811-3520b0777bdd
2022-01-19 15:38:17,089 fedbiomed INFO - Checking data quality of federated datasets...
2022-01-19 15:38:18,589 fedbiomed DEBUG - torchnn saved model filename: /Users/bal
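
The `FedAverage` aggregator passed to the experiment combines the parameters returned by the nodes after each round. As an illustration of the idea only, here is a pure-Python sketch of weighted federated averaging; it is not Fedbiomed's implementation, `federated_average` is a hypothetical helper, and weighting by sample count is an assumption.

```python
def federated_average(node_params, node_weights):
    """Weighted average of per-node parameter dicts (values are lists of floats)."""
    total = sum(node_weights)
    averaged = {}
    for name in node_params[0]:
        # Group the i-th entry of this parameter across all nodes.
        columns = zip(*(params[name] for params in node_params))
        averaged[name] = [
            sum(w * v for w, v in zip(node_weights, column)) / total
            for column in columns
        ]
    return averaged

node_a = {'fc.weight': [1.0, 2.0], 'fc.bias': [0.0]}
node_b = {'fc.weight': [3.0, 6.0], 'fc.bias': [1.0]}
# Both nodes hold 50000 training samples in this tutorial, so the weights are equal.
aggregated = federated_average([node_a, node_b], node_weights=[50000, 50000])
print(aggregated)
```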

Let's start the experiment.

By default, this function doesn't return until all the `rounds` are done for all the nodes.

In [7]:
exp.run()

2022-01-19 15:38:29,083 fedbiomed INFO - Sampled nodes in round 0 ['node_60254e5d-4cc9-4627-a768-b0710d5c4b7d', 'node_7db84ba7-4c77-42e2-9811-3520b0777bdd']
01/19/2022 15:38:29:INFO:Sampled nodes in round 0 ['node_60254e5d-4cc9-4627-a768-b0710d5c4b7d', 'node_7db84ba7-4c77-42e2-9811-3520b0777bdd']
2022-01-19 15:38:29,085 fedbiomed INFO - Send message to node node_60254e5d-4cc9-4627-a768-b0710d5c4b7d - {'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'job_id': '1fc15b9f-1853-4f9f-adc0-f14f55a98ffa', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 100}, 'model_args': {'num_classes': 10, 'max_grad_norm': 1.2, 'target_epsilon': 50.0, 'target_delta': 1e-05}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/19/my_model_824834d9-5723-4162-b1dd-441be5880e51.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/19/aggregated_params_init_d899660b-3e10-48bc-9594-b817e992efc1.pt', 'model_class

2022-01-19 15:38:29,159 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'job_id': '1fc15b9f-1853-4f9f-adc0-f14f55a98ffa', 'params_url': 'http://localhost:8844/media/uploads/2022/01/19/aggregated_params_init_d899660b-3e10-48bc-9594-b817e992efc1.pt', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 100}, 'training_data': {'node_7db84ba7-4c77-42e2-9811-3520b0777bdd': ['dataset_a8713194-9f39-4e93-9b6d-b0d527b0dea6']}, 'model_args': {'num_classes': 10, 'max_grad_norm': 1.2, 'target_epsilon': 50.0, 'target_delta': 1e-05}, 'model_url': 'http://localhost:8844/media/uploads/2022/01/19/my_model_824834d9-5723-4162-b1dd-441be5880e51.py', 'model_class': 'CIFAR10DPPlan', 'command': 'train'}
01/19/2022 15:38:29:INFO:log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_7b720652-ae3

2022-01-19 15:44:46,345 fedbiomed INFO - log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=23.25, Delta=1e-05
01/19/2022 15:44:46:INFO:log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=23.25, Delta=1e-05
2022-01-19 15:46:14,611 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=23.77, Delta=1e-05
01/19/2022 15:46:14:INFO:log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=23.77, Delta=1e-05
2022-01-19 15:46:46,621 fedbiomed INFO - log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=23.77, Delta=1e-05
01/19/2022 15:46:46:INFO:log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=23.77, Delta=1e-05
2022-01-19 15:48:09,169 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=24.29, Delta=1e-05
01/19/2022 15:48:09:INFO:log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=24.29, Delta=1e-05
2022-01-19 15:48:36,573 fedbiome

2022-01-19 15:54:32,350 fedbiomed INFO - Sampled nodes in round 1 ['node_60254e5d-4cc9-4627-a768-b0710d5c4b7d', 'node_7db84ba7-4c77-42e2-9811-3520b0777bdd']
01/19/2022 15:54:32:INFO:Sampled nodes in round 1 ['node_60254e5d-4cc9-4627-a768-b0710d5c4b7d', 'node_7db84ba7-4c77-42e2-9811-3520b0777bdd']
2022-01-19 15:54:32,354 fedbiomed INFO - Send message to node node_60254e5d-4cc9-4627-a768-b0710d5c4b7d - {'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'job_id': '1fc15b9f-1853-4f9f-adc0-f14f55a98ffa', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 100}, 'model_args': {'num_classes': 10, 'max_grad_norm': 1.2, 'target_epsilon': 50.0, 'target_delta': 1e-05}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/19/my_model_824834d9-5723-4162-b1dd-441be5880e51.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/19/aggregated_params_dd8b9bda-13e6-4792-9dae-fb906d0d76ee.pt', 'model_class': 'C

2022-01-19 15:54:32,477 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_7b720652-ae3f-487c-9395-4cbebdc044b9', 'job_id': '1fc15b9f-1853-4f9f-adc0-f14f55a98ffa', 'params_url': 'http://localhost:8844/media/uploads/2022/01/19/aggregated_params_dd8b9bda-13e6-4792-9dae-fb906d0d76ee.pt', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 100}, 'training_data': {'node_7db84ba7-4c77-42e2-9811-3520b0777bdd': ['dataset_a8713194-9f39-4e93-9b6d-b0d527b0dea6']}, 'model_args': {'num_classes': 10, 'max_grad_norm': 1.2, 'target_epsilon': 50.0, 'target_delta': 1e-05}, 'model_url': 'http://localhost:8844/media/uploads/2022/01/19/my_model_824834d9-5723-4162-b1dd-441be5880e51.py', 'model_class': 'CIFAR10DPPlan', 'command': 'train'}
01/19/2022 15:54:32:INFO:log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_7b720652-ae3f-487

2022-01-19 16:00:09,621 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=23.25, Delta=1e-05
01/19/2022 16:00:09:INFO:log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=23.25, Delta=1e-05
2022-01-19 16:01:01,181 fedbiomed INFO - log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=23.77, Delta=1e-05
01/19/2022 16:01:01:INFO:log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=23.77, Delta=1e-05
2022-01-19 16:01:26,266 fedbiomed INFO - log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=23.77, Delta=1e-05
01/19/2022 16:01:26:INFO:log from: node_7db84ba7-4c77-42e2-9811-3520b0777bdd / INFO - Epsilon=23.77, Delta=1e-05
2022-01-19 16:02:15,885 fedbiomed INFO - log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=24.29, Delta=1e-05
01/19/2022 16:02:15:INFO:log from: node_60254e5d-4cc9-4627-a768-b0710d5c4b7d / INFO - Epsilon=24.29, Delta=1e-05
2022-01-19 16:02:41,821 fedbiome

Different timings (in seconds) are reported for each dataset of a node participating in a round:
- `rtime_training` real time (clock time) spent in the training function on the node
- `ptime_training` process time (user and system CPU) spent in the training function on the node
- `rtime_total` real time (clock time) spent in the researcher between sending the request and handling the response, at the `Job()` layer

In [8]:
print("\nList the training rounds : ", exp.training_replies.keys())

print("\nList the nodes for the last training round and their timings : ")
round_data = exp.training_replies[rounds - 1].data
for c in range(len(round_data)):
    print("\t- {id} :\
    \n\t\trtime_training={rtraining:.2f} seconds\
    \n\t\tptime_training={ptraining:.2f} seconds\
    \n\t\trtime_total={rtotal:.2f} seconds".format(id = round_data[c]['node_id'],
        rtraining = round_data[c]['timing']['rtime_training'],
        ptraining = round_data[c]['timing']['ptime_training'],
        rtotal = round_data[c]['timing']['rtime_total']))
print('\n')


List the training rounds :  dict_keys([0, 1])

List the nodes for the last training round and their timings : 
	- node_60254e5d-4cc9-4627-a768-b0710d5c4b7d :    
		rtime_training=743.95 seconds    
		ptime_training=789.76 seconds    
		rtime_total=766.36 seconds
	- node_7db84ba7-4c77-42e2-9811-3520b0777bdd :    
		rtime_training=765.06 seconds    
		ptime_training=806.46 seconds    
		rtime_total=785.81 seconds
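
The difference between real time and process time can be reproduced with the standard library. This sketch assumes nothing about how Fedbiomed measures its timings; it only uses two timers that match the definitions above, and `timed` and `busy_sum` are hypothetical helpers.

```python
import time

def timed(func, *args):
    """Run func and return (result, real_seconds, process_seconds)."""
    real_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    real_elapsed = time.perf_counter() - real_start
    cpu_elapsed = time.process_time() - cpu_start
    return result, real_elapsed, cpu_elapsed

def busy_sum(n):
    """CPU-bound toy workload."""
    return sum(i * i for i in range(n))

result, rtime, ptime = timed(busy_sum, 200_000)
print(f"rtime={rtime:.4f} s, ptime={ptime:.4f} s")
```

Process time adds up CPU time across all threads of the process, which is one plausible reason why `ptime_training` is larger than `rtime_training` in the output above.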




# Test Model

We define a small testing routine to extract the accuracy metrics on the testing dataset.
## Important
The model can be tested this way only because the data is accessible in a development environment.  
In production, the data won't be accessible on the nodes, so a test dataset is needed on the server, or must be accessible from the server.

In [9]:

import torch
import torch.nn as nn            # used by the `test` function defined further down
import torch.nn.functional as F
import numpy as np               # used by the `test` function defined further down

def testing_Accuracy(model, data_loader):
    model.eval()
    test_loss = 0
    correct = 0
    seen = 0  # number of samples actually evaluated

    # evaluate on whichever device the model already lives on
    device = next(model.parameters()).device

    loader_size = len(data_loader)
    with torch.no_grad():
        for idx, (data, target) in enumerate(data_loader):
            # only uses about 10% of the dataset, results are similar but faster
            if idx >= loader_size / 10:
                break
            data, target = data.to(device), target.to(device)
            output = model(data)
            # the model returns raw logits, so use cross-entropy rather than NLL loss
            test_loss += F.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max logit
            correct += pred.eq(target.view_as(pred)).sum().item()
            seen += len(target)

    # normalize by the number of samples evaluated, not by the full dataset size
    test_loss /= seen
    accuracy = 100 * correct / seen

    return test_loss, accuracy

Test dataset

In [12]:
import torch
import torchvision
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10
import os

# These values, specific to the CIFAR10 dataset, are assumed to be known.
# If necessary, they can be computed with modest privacy budget.
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD_DEV = (0.2023, 0.1994, 0.2010)

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD_DEV),
])

base_dir = tmp_dir_model.name 
if not os.path.isdir(os.path.join(base_dir, "cifar10")):
    os.makedirs(os.path.join(base_dir, "cifar10"))
test_data_dir = os.path.join(base_dir, "cifar10")

test_dataset = CIFAR10(
    root=test_data_dir, train=False, download=True, transform=transform)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=48,
    shuffle=False,
)

Files already downloaded and verified
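
To make the `transforms.Normalize` step concrete, here is a minimal sketch of the per-channel arithmetic it applies, written in plain Python for a single pixel. `normalize_pixel` is a hypothetical helper, not a torchvision function.

```python
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD_DEV = (0.2023, 0.1994, 0.2010)

def normalize_pixel(rgb, mean=CIFAR10_MEAN, std=CIFAR10_STD_DEV):
    """Apply (value - mean) / std independently to each colour channel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))

mid_grey = (0.5, 0.5, 0.5)  # a pixel after ToTensor(), which scales values to [0, 1]
normalized = normalize_pixel(mid_grey)
print(normalized)
```

The same constants must be used for training and testing, which is why this cell repeats the values defined in the training plan.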


We define a util function to calculate the accuracy:

In [13]:
def accuracy(preds, labels):
    return (preds == labels).mean()

We define the model, and we assign to it the model parameters estimated at the last federated optimization round.

In [14]:
# `exp.model_instance` is the training plan instance built by the experiment.
# Its wrapped ResNet18 was already fixed by `ModuleValidator` in `__init__`,
# so there is no need to build a separate network here.
model = exp.model_instance
model.load_state_dict(exp.aggregated_params[rounds - 1]['params'])

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

We define a function to validate our model on our test dataset.

In [15]:
def test(model, test_loader, device):
    model.eval()
    criterion = nn.CrossEntropyLoss()
    losses = []
    top1_acc = []

    with torch.no_grad():
        for images, target in test_loader:
            images = images.to(device)
            target = target.to(device)

            output = model(images)
            loss = criterion(output, target)
            preds = np.argmax(output.detach().cpu().numpy(), axis=1)
            labels = target.detach().cpu().numpy()
            acc = accuracy(preds, labels)

            losses.append(loss.item())
            top1_acc.append(acc)

    top1_avg = np.mean(top1_acc)

    print(
        f"\tTest set:"
        f"Loss: {np.mean(losses):.6f} "
        f"Acc: {top1_avg * 100:.6f} "
    )
    return np.mean(top1_acc)
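
Note that `test` reports the mean of per-batch accuracies, which differs slightly from the overall accuracy when the last batch is smaller than the others. With 10000 test images and a batch size of 48, there are 208 full batches and a final batch of 16. The sketch below shows the difference on toy numbers; both helper functions are hypothetical.

```python
def mean_of_batch_accuracies(correct_per_batch, batch_sizes):
    """Average the accuracy of each batch, giving every batch the same weight."""
    accs = [c / n for c, n in zip(correct_per_batch, batch_sizes)]
    return sum(accs) / len(accs)

def overall_accuracy(correct_per_batch, batch_sizes):
    """Fraction of correctly classified samples over the whole dataset."""
    return sum(correct_per_batch) / sum(batch_sizes)

batch_sizes = [4, 4, 2]  # the last batch is smaller, as with CIFAR10 and batch_size=48
correct = [4, 2, 0]

per_batch = mean_of_batch_accuracies(correct, batch_sizes)
overall = overall_accuracy(correct, batch_sizes)
print(f"mean of batch accuracies: {per_batch:.2f}")
print(f"overall accuracy:         {overall:.2f}")
```

For the CIFAR10 test set the effect is small, since only one batch out of 209 is shorter, but the overall accuracy is the unbiased figure.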

And we finally test our model!

In [16]:
top1_acc = test(model, test_loader, device)

	Test set:Loss: 2.196303 Acc: 18.042265 
