# Fed-BioMed Researcher base example with Opacus

Use for developing (autoreloads changes made across packages)

In this notebook we show how `opacus` (https://opacus.ai/) can be used in Fed-BioMed. Opacus is a library for training PyTorch models with differential privacy. We will train the basic MNIST example using two nodes.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from fedbiomed.researcher.requests import Requests
req = Requests()
req.list(verbose=True)

2022-01-21 09:03:08,447 fedbiomed INFO - Component environment:
2022-01-21 09:03:08,449 fedbiomed INFO - - type = ComponentType.RESEARCHER
2022-01-21 09:03:09,336 fedbiomed INFO - Messaging researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9 successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x106e86a30>
2022-01-21 09:03:09,381 fedbiomed INFO - Listing available datasets in all nodes... 
2022-01-21 09:03:09,413 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'command': 'list'}
2022-01-21 09:03:09,419 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'command': 'list'}
2022-01-21 09:03:19,397 fedbiomed INFO - 
 Node: node_edb44109-8e5f-4741-adfb-5e68136b3bab | Number of Datasets: 1 
+--------+-------------+---

{'node_edb44109-8e5f-4741-adfb-5e68136b3bab': [{'name': 'MNIST',
   'data_type': 'default',
   'tags': ['#MNIST', '#dataset'],
   'description': '50.0 percent of MNIST database',
   'shape': [30000, 1, 28, 28]}],
 'node_8dce5575-7403-48f0-91b6-e07e58ae5c47': [{'name': 'MNIST',
   'data_type': 'default',
   'tags': ['#MNIST', '#dataset'],
   'description': '50.0 percent of MNIST database',
   'shape': [30000, 1, 28, 28]}]}

## Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

## Setting the nodes up
A node must be configured before running the experiment:
1. `./scripts/fedbiomed_run node add`
  * Select option 2 (default)
  * Type MNIST to add MNIST to the node through `torchvision.datasets.MNIST`
  * Select the desired ratio of the MNIST dataset to be added to the current node
  * Confirm the default tags by typing "y" and pressing ENTER
  * Pick the folder where MNIST is downloaded (this is due to torch issue https://github.com/pytorch/vision/issues/3549)
  * The data should now have been added (a warning saying that data must be unique means it had already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node list`
3. Run the node using `./scripts/fedbiomed_run node run`. Wait until you get `Starting task manager`, which means the node is online.

4. Following the same procedure, create another node with MNIST.

## Define an experiment model and parameters

Declare a `MyTrainingPlan` class, built on `torch.nn`, to be sent to the nodes for training.

In [3]:
from fedbiomed.researcher.environ import environ
import tempfile
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+'/')
model_file = tmp_dir_model.name + '/class_export_mnist.py'

In the cell below, we define the model and enable differential privacy with Opacus. For this example, we use the `make_private` method of the `PrivacyEngine` class from `opacus.privacy_engine`. Two hyperparameters must be defined:
* `noise_multiplier`: the ratio of the standard deviation of the Gaussian noise to the L2-sensitivity of the function to which the noise is added (i.e. how much noise to add).
* `max_grad_norm`: the maximum L2 norm of the per-sample gradients. Any gradient with a larger norm is clipped to this value.

Note that, in order to use the Opacus `PrivacyEngine` class, the training plan must define a `model`, a `dataloader` and an `optimizer` as attributes.
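To make the role of the two hyperparameters concrete, here is a minimal standard-library sketch of one noisy gradient computation in the style of DP-SGD: clip each per-sample gradient, sum, add Gaussian noise, average. It is an illustration only, not the code Opacus or Fed-BioMed runs; the helper names `clip_by_l2_norm` and `private_mean_gradient` are hypothetical.

```python
import math
import random


def clip_by_l2_norm(grad, max_grad_norm):
    """Scale a per-sample gradient so its L2 norm is at most max_grad_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def private_mean_gradient(per_sample_grads, noise_multiplier, max_grad_norm, rng):
    """Clip every per-sample gradient, sum them, add Gaussian noise, then average."""
    clipped = [clip_by_l2_norm(g, max_grad_norm) for g in per_sample_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # After clipping, one sample can change the sum by at most max_grad_norm,
    # so the noise standard deviation is noise_multiplier * max_grad_norm.
    std = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
clipped = [clip_by_l2_norm(g, max_grad_norm=1.0) for g in grads]
noisy_mean = private_mean_gradient(grads, noise_multiplier=1.0, max_grad_norm=1.0, rng=rng)
noiseless_mean = private_mean_gradient(grads, noise_multiplier=0.0, max_grad_norm=1.0, rng=rng)
```

With `max_grad_norm=1.0`, the first gradient is rescaled to unit norm while the second is left untouched. A larger `noise_multiplier` gives stronger privacy at the cost of noisier updates.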

In [4]:
%%writefile "$model_file"

import torch
import torch.nn as nn
from fedbiomed.common.torchnnDP import TorchTrainingDPPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Here we define the training plan to be used. 
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingDPPlan):
    def __init__(self, model_args):
        super(MyTrainingPlan, self).__init__()
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on MNIST, we need datasets and transform from torchvision
        deps = ["from torchvision import datasets, transforms",
                "from torch.utils.data import DataLoader",]
        self.add_dependency(deps)
        
        self.diff_privacy = model_args['diff_privacy']
        self.privacy_func = model_args['privacy_func']
        self.noise_multiplier = model_args['noise_multiplier']
        self.max_grad_norm = model_args['max_grad_norm']
        
        self.model = self.make_model()
        
    def make_model(self):
        model = nn.Sequential(nn.Conv2d(1, 32, 3, 1),
                              nn.ReLU(),
                              nn.Conv2d(32, 64, 3, 1),
                              nn.ReLU(),
                              nn.MaxPool2d(2),
                              nn.Dropout(0.25),
                              nn.Flatten(),
                              nn.Linear(9216, 128),
                              nn.ReLU(),
                              nn.Dropout(0.5),
                              nn.Linear(128, 10),
                              nn.LogSoftmax(dim=1))
        return model

    def forward(self, x):
        return self.model(x)

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        return data_loader
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss

Writing /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/tmp/tmpwrt0_h02/class_export_mnist.py
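The input size of `nn.Linear(9216, 128)` in `make_model` follows from the layers before it. A short worked computation, using the standard output-size formula for convolution and pooling layers without dilation (the helper `conv2d_out` is hypothetical, written only for this check):

```python
def conv2d_out(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a Conv2d or MaxPool2d layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28                                         # MNIST images are 28 x 28
size = conv2d_out(size, kernel_size=3)            # Conv2d(1, 32, 3, 1)  -> 26
size = conv2d_out(size, kernel_size=3)            # Conv2d(32, 64, 3, 1) -> 24
size = conv2d_out(size, kernel_size=2, stride=2)  # MaxPool2d(2)         -> 12
flat_features = 64 * size * size                  # 64 channels after the second conv
print(flat_features)  # 9216
```

If the convolutional part of the model is changed, the first `nn.Linear` input size has to be recomputed the same way.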


The following two groups of arguments are passed to the experiment:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side. For instance, the privacy parameters should be passed here.
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.

**NOTE:** typos and/or missing required (positional) arguments will raise an error. 🤓
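Since `MyTrainingPlan.__init__` reads four keys from `model_args` with plain dictionary indexing, a misspelled key only fails once the plan is instantiated. A small researcher-side check can catch this earlier. This is a sketch under that assumption; `REQUIRED_MODEL_ARGS` and `missing_keys` are hypothetical helpers, not part of Fed-BioMed.

```python
# Keys read by MyTrainingPlan.__init__ in the training plan above.
REQUIRED_MODEL_ARGS = {"diff_privacy", "privacy_func", "noise_multiplier", "max_grad_norm"}


def missing_keys(args, required):
    """Return the required keys that are absent from args, sorted by name."""
    return sorted(required - args.keys())


good_args = {"diff_privacy": True, "privacy_func": "make_private",
             "noise_multiplier": 1.0, "max_grad_norm": 1.0}
typo_args = {"diff_privacy": True, "privacy_func": "make_private",
             "noise_multiplier": 1.0, "max_grad_nrom": 1.0}  # note the typo

print(missing_keys(good_args, REQUIRED_MODEL_ARGS))  # []
print(missing_keys(typo_args, REQUIRED_MODEL_ARGS))  # ['max_grad_norm']
```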

In [5]:
model_args = {'diff_privacy': True, 'privacy_func': 'make_private', 'noise_multiplier':1., 'max_grad_norm':1.0}

training_args = {
    'batch_size': 48, 
    'lr': 1e-3, 
    'epochs': 3, 
    'dry_run': False,  
    'batch_maxnum': 250 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
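As a quick sanity check on `batch_maxnum`, the arithmetic below works out how much data each node sees per epoch with the values above. The node dataset size of 30000 comes from the dataset listing at the top of the notebook; the variable names are chosen for this sketch only.

```python
import math

batch_size = 48
batch_maxnum = 250
epochs = 3
node_dataset_size = 30000  # 'shape': [30000, 1, 28, 28] in the dataset listing

# Batches in a full pass over the node's data, then the cap applied by batch_maxnum.
full_pass_batches = math.ceil(node_dataset_size / batch_size)
batches_per_epoch = min(batch_maxnum, full_pass_batches)

samples_per_epoch = min(batches_per_epoch * batch_size, node_dataset_size)
steps_per_round = epochs * batches_per_epoch
fraction_seen = samples_per_epoch / node_dataset_size

print(full_pass_batches)  # 625
print(samples_per_epoch)  # 12000
print(steps_per_round)    # 750
print(fraction_seen)      # 0.4
```

Each epoch therefore covers 40% of a node's data, which matches the "Reached 250 batches for this epoch" messages in the training logs.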

## Declare and run the experiment

- search nodes serving data for these `tags`, optionally filter on a list of node IDs with `nodes`
- run a round of local training on nodes with model defined in `model_path` + federation with `aggregator`
- run for `rounds` rounds, applying the `node_selection_strategy` between the rounds
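For intuition about the federation step, here is a minimal sketch of the classic federated averaging rule, in which each node's parameters are weighted by its number of training samples. How `FedAverage` weights the nodes internally is not shown in this notebook, so treat the weighting here as an assumption; `federated_average` is a hypothetical helper working on plain lists instead of tensors.

```python
def federated_average(node_params, n_samples):
    """Weighted average of per-node parameter dicts (weights = sample counts)."""
    total = sum(n_samples)
    averaged = {}
    for name in node_params[0]:
        layers = [params[name] for params in node_params]
        averaged[name] = [
            sum(n * values[i] for n, values in zip(n_samples, layers)) / total
            for i in range(len(layers[0]))
        ]
    return averaged


node_a = {"weight": [1.0, 2.0], "bias": [0.0]}
node_b = {"weight": [3.0, 4.0], "bias": [1.0]}

# Both nodes hold 30000 samples, so this reduces to a plain mean.
avg = federated_average([node_a, node_b], n_samples=[30000, 30000])
print(avg)  # {'weight': [2.0, 3.0], 'bias': [0.5]}
```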

In [6]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['#MNIST', '#dataset']
rounds = 3

exp = Experiment(tags=tags,
                 #nodes=None,
                 model_path=model_file,
                 model_args=model_args,
                 model_class='MyTrainingPlan',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

2022-01-21 09:04:07,119 fedbiomed INFO - Searching dataset with data tags: ['#MNIST', '#dataset'] for all nodes
2022-01-21 09:04:07,134 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2022-01-21 09:04:07,140 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2022-01-21 09:04:17,130 fedbiomed INFO - Node selected for training -> node_8dce5575-7403-48f0-91b6-e07e58ae5c47
2022-01-21 09:04:17,136 fedbiomed INFO - Node selected for training -> node_edb44109-8e5f-4741-adfb-5e68136b3bab
2022-01-21 09:04:17,143 fedbiomed INFO - Checking data quality of federated datasets...
2022-01-21 09:04:17,852 fedbiomed DEBUG - torchnn saved model filename: /Users/balelli/o

Let's start the experiment.

By default, `exp.run()` does not return until all the `rounds` are done for all the nodes.

In [7]:
exp.run()

2022-01-21 09:04:27,259 fedbiomed INFO - Sampled nodes in round 0 ['node_8dce5575-7403-48f0-91b6-e07e58ae5c47', 'node_edb44109-8e5f-4741-adfb-5e68136b3bab']
2022-01-21 09:04:27,261 fedbiomed INFO - Send message to node node_8dce5575-7403-48f0-91b6-e07e58ae5c47 - {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'f899a968-6dad-4fe9-9f87-778158c56298', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_maxnum': 250}, 'model_args': {'diff_privacy': True, 'privacy_func': 'make_private', 'noise_multiplier': 1.0, 'max_grad_norm': 1.0}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/21/my_model_aa71d89a-5986-4efd-8b3c-7a8d190a7e2f.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/21/aggregated_params_init_d1333807-f435-49a1-8653-1910fb64e68c.pt',

01/21/2022 09:04:27:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'f899a968-6dad-4fe9-9f87-778158c56298', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_maxnum': 250}, 'model_args': {'diff_privacy': True, 'privacy_func': 'make_private', 'noise_multiplier': 1.0, 'max_grad_norm': 1.0}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/21/my_model_aa71d89a-5986-4efd-8b3c-7a8d190a7e2f.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/21/aggregated_params_init_d1333807-f435-49a1-8653-1910fb64e68c.pt', 'model_class': 'MyTrainingPlan', 'training_data': {'node_edb44109-8e5f-4741-adfb-5e68136b3bab': ['dataset_7f491de7-9375-4f96-a07d-f58bca25deb5']}}
2022-01-21 09:04:27,357 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher



2022-01-21 09:07:58,365 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:08:00,776 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data




2022-01-21 09:11:03,033 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:11:04,685 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data


2022-01-21 09:14:16,967 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:14:18,367 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 
2022-01-21 09:14:19,450 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:14:20,956 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 

2022-01-21 09:14:28,057 fedbiomed INFO - Downloading model params after training on node_edb44109-8e5f-4741-adfb-5e68136b3bab - from http://localhost:8844/media/uploads/2022/01/21/node_params_b77ec1a5-3667-4dcc-afe0-001c34ef18a8.pt
2022-01-21 09:14:28,321 fedbiomed INFO - Nodes that successfully reply in round 0 ['node_8dce5575-7403-48f0-91b6-e07e58ae5c47', 'node_edb44109-8e5f-4741-adfb-5e68136b3bab']
2022-01-21 09:14:29,198 fedbiomed INFO - Saved aggregated params for round 0 in /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0002/aggregated_params_a06a614c-4906-4a76-bc08-4ba3b68d7e3

2022-01-21 09:14:29,293 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'f899a968-6dad-4fe9-9f87-778158c56298', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_maxnum': 250}, 'model_args': {'diff_privacy': True, 'privacy_func': 'make_private', 'noise_multiplier': 1.0, 'max_grad_norm': 1.0}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/21/my_model_aa71d89a-5986-4efd-8b3c-7a8d190a7e2f.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/21/aggregated_params_a06a614c-4906-4a76-bc08-4ba3b68d7e3b.pt', 'model_class': 'MyTrainingPlan', 'training_data': {'node_edb44109-8e5f-4741-adfb-5e68136b3bab': ['dataset_7f491de7-9375-4f96-a07d-f58bca25deb5']}}



2022-01-21 09:17:41,158 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:17:41,184 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data




2022-01-21 09:20:44,296 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:20:45,553 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data


2022-01-21 09:23:35,114 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:23:36,506 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 
2022-01-21 09:23:38,079 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data


2022-01-21 09:23:39,430 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
2022-01-21 09:23:44,643 fedbiomed INFO - Downloading model params after training on node_8dce5575-7403-48f0-91b6-e07e58ae5c47 - from http://localhost:8844/media/uploads/2022/01/21/node_params_5e4e9ba3-7467-49c3-bf6b-103ad21174f4.pt
2022-01-21 09:23:44,892 fedbiomed INFO - Downloading model params after training on node_edb44109-8e5f-4741-adfb-5e68136b3bab - from http://localhost:8844/me

01/21/2022 09:23:46:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'f899a968-6dad-4fe9-9f87-778158c56298', 'params_url': 'http://localhost:8844/media/uploads/2022/01/21/aggregated_params_8c9f2031-11db-48d1-bd4e-95f6ae8ad137.pt', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_maxnum': 250}, 'training_data': {'node_8dce5575-7403-48f0-91b6-e07e58ae5c47': ['dataset_7adabeb5-d317-4167-a9e5-71b3cd07ab42']}, 'model_args': {'diff_privacy': True, 'privacy_func': 'make_private', 'noise_multiplier': 1.0, 'max_grad_norm': 1.0}, 'model_url': 'http://localhost:8844/media/uploads/2022/01/21/my_model_aa71d89a-5986-4efd-8b3c-7a8d190a7e2f.py', 'model_class': 'MyTrainingPlan', 'command': 'train'}
2022-01-21 09:23:46,019 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc57



2022-01-21 09:26:45,167 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:26:48,004 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data




2022-01-21 09:30:00,610 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:30:03,262 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data




2022-01-21 09:33:06,682 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:33:07,819 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
2022-01-21 09:33:09,978 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-21 09:33:11,116 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 

Local training results for each round and each node are available in `exp.training_replies` (index 0 to (`rounds` - 1) ).

For example you can view the training results for the last round below.

Different timings (in seconds) are reported for each dataset of a node participating in a round :
- `rtime_training` real time (clock time) spent in the training function on the node
- `ptime_training` process time (user and system CPU) spent in the training function on the node
- `rtime_total` real time (clock time) spent in the researcher between sending the request and handling the response, at the `Job()` layer
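The difference between real time and process time can be reproduced with the standard library. The sketch below assumes `time.perf_counter()` for wall-clock time and `time.process_time()` for CPU time; which functions Fed-BioMed uses internally is not shown in this notebook, and `timed` is a hypothetical helper.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and report wall-clock (rtime) and CPU (ptime) durations in seconds."""
    r_start, p_start = time.perf_counter(), time.process_time()
    result = fn(*args)
    rtime = time.perf_counter() - r_start
    ptime = time.process_time() - p_start
    return result, {"rtime": rtime, "ptime": ptime}


# Sleeping consumes wall-clock time but almost no CPU time.
_, timing = timed(time.sleep, 0.2)
print("rtime={rtime:.2f} s, ptime={ptime:.2f} s".format(**timing))
```

Process time is summed over all threads of the process, so on a multi-core machine it can exceed the real time, as it does in the training output below.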

In [8]:
print("\nList the training rounds : ", exp.training_replies.keys())

print("\nList the nodes for the last training round and their timings : ")
round_data = exp.training_replies[rounds - 1].data
for c in range(len(round_data)):
    print("\t- {id} :\
    \n\t\trtime_training={rtraining:.2f} seconds\
    \n\t\tptime_training={ptraining:.2f} seconds\
    \n\t\trtime_total={rtotal:.2f} seconds".format(id = round_data[c]['node_id'],
        rtraining = round_data[c]['timing']['rtime_training'],
        ptraining = round_data[c]['timing']['ptime_training'],
        rtotal = round_data[c]['timing']['rtime_total']))
print('\n')
    
exp.training_replies[rounds - 1].dataframe


List the training rounds :  dict_keys([0, 1, 2])

List the nodes for the last training round and their timings : 
	- node_edb44109-8e5f-4741-adfb-5e68136b3bab :    
		rtime_training=560.13 seconds    
		ptime_training=675.14 seconds    
		rtime_total=570.46 seconds
	- node_8dce5575-7403-48f0-91b6-e07e58ae5c47 :    
		rtime_training=563.43 seconds    
		ptime_training=680.31 seconds    
		rtime_total=570.74 seconds




| | success | msg | dataset_id | node_id | params_path | params | timing |
|---|---|---|---|---|---|---|---|
| 0 | True | | dataset_7f491de7-9375-4f96-a07d-f58bca25deb5 | node_edb44109-8e5f-4741-adfb-5e68136b3bab | /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed... | {'model._module.0.weight': [[tensor([[-0.3079,... | {'rtime_training': 560.1312185659999, 'ptime_t... |
| 1 | True | | dataset_7adabeb5-d317-4167-a9e5-71b3cd07ab42 | node_8dce5575-7403-48f0-91b6-e07e58ae5c47 | /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed... | {'model._module.0.weight': [[tensor([[-0.3063,... | {'rtime_training': 563.4319361919997, 'ptime_t... |


Federated parameters for each round are available in `exp.aggregated_params` (index 0 to (`rounds` - 1) ).

For example you can view the federated parameters for the last round of the experiment :

In [9]:
print("\nList the training rounds : ", exp.aggregated_params.keys())

print("\nAccess the federated params for the last training round :")
print("\t- params_path: ", exp.aggregated_params[rounds - 1]['params_path'])
print("\t- parameter data: ", exp.aggregated_params[rounds - 1]['params'].keys())



List the training rounds :  dict_keys([0, 1, 2])

Access the federated params for the last training round :
	- params_path:  /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0002/aggregated_params_13813343-e815-493f-9130-b5d3ee09521d.pt
	- parameter data:  odict_keys(['model.0.weight', 'model.0.bias', 'model.2.weight', 'model.2.bias', 'model.7.weight', 'model.7.bias', 'model.10.weight', 'model.10.bias'])


## Testing

We define a little testing routine to extract the accuracy metrics on the testing dataset

In [10]:
import torch
import torch.nn.functional as F


def testing_Accuracy(model, data_loader):
    """Return (average NLL loss, accuracy in percent) of `model` on `data_loader`."""
    model.eval()
    test_loss = 0
    correct = 0
    device = 'cpu'

    with torch.no_grad():
        for data, target in data_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    # Average the summed loss and convert the hit count to a percentage
    test_loss /= len(data_loader.dataset)
    accuracy = 100 * correct / len(data_loader.dataset)

    return test_loss, accuracy

In [11]:
from torchvision import datasets, transforms
import os

local_mnist = os.path.join(environ['TMP_DIR'], 'local_mnist')

transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))
        ])

test_set = datasets.MNIST(root = local_mnist, download = True, train = False, transform = transform)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=64, shuffle=True)

fed_model = exp.model_instance
fed_model.load_state_dict(exp.aggregated_params[rounds - 1]['params'])

acc_federated = testing_Accuracy(fed_model, test_loader)

print('\nAccuracy federated training:  {:.4f}'.format(acc_federated[1]))

print('\nError federated training:  {:.4f}'.format(acc_federated[0]))


Accuracy federated training:  86.4100

Error federated training:  0.5747


2022-01-21 09:43:05,118 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
01/21/2022 09:43:05:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2022-01-21 09:43:05,134 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
01/21/2022 09:43:05:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2022-01-21 09:43:33,430 fedbiomed INFO - log from: node_edb4

2022-01-21 09:43:34,805 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Dataset_path/Users/balelli/data
01/21/2022 09:43:34:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Dataset_path/Users/balelli/data
2022-01-21 09:44:07,028 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:44:07:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:44:07,079 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:44:07:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data


2022-01-21 09:44:38,466 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:44:38:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:44:38,863 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:44:38:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data


2022-01-21 09:45:02,886 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:45:02:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:45:03,556 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:45:03:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:45:05,283 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 
01/21/2022 09:45:05:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 
2022-01-21 09:45:05,536 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
01/21

01/21/2022 09:45:15:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'c9be5f7d-533e-4224-b0df-7a1169c0e542', 'params_url': 'http://localhost:8844/media/uploads/2022/01/21/aggregated_params_f5259ecc-10f4-4358-a2d9-a3253052357b.pt', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_maxnum': 100}, 'training_data': {'node_8dce5575-7403-48f0-91b6-e07e58ae5c47': ['dataset_7adabeb5-d317-4167-a9e5-71b3cd07ab42']}, 'model_args': {}, 'model_url': 'http://localhost:8844/media/uploads/2022/01/21/my_model_80387a84-c8eb-4f87-aaae-d6accbef52ef.py', 'model_class': 'MyTrainingPlan', 'command': 'train'}
2022-01-21 09:45:16,851 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - {'monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x130c8fb50>, 'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_

2022-01-21 09:45:51,282 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:45:51:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:46:23,416 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:46:23:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:46:25,400 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:46:25:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data

2022-01-21 09:46:48,072 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:46:48:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:46:49,719 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
01/21/2022 09:46:49:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
2022-01-21 09:46:50,563 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:46:50:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:46:52,579 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 
01/21

01/21/2022 09:47:02:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Message received: {'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'c9be5f7d-533e-4224-b0df-7a1169c0e542', 'training_args': {'batch_size': 48, 'lr': 0.001, 'epochs': 3, 'dry_run': False, 'batch_maxnum': 100}, 'model_args': {}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/21/my_model_80387a84-c8eb-4f87-aaae-d6accbef52ef.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/21/aggregated_params_e52a3826-ace9-4217-9ee5-f02dba48c9a6.pt', 'model_class': 'MyTrainingPlan', 'training_data': {'node_8dce5575-7403-48f0-91b6-e07e58ae5c47': ['dataset_7adabeb5-d317-4167-a9e5-71b3cd07ab42']}}
2022-01-21 09:47:02,509 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_bc5777a3-5027-465f-8742-3d3aa04a29c9', 'job_id': 'c9be5f7d-533e-4224-b0df-7a1169c0e542', 'params_url': 

2022-01-21 09:47:26,462 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:47:26:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:47:27,103 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:47:27:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:47:54,195 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:47:54:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:47:57,394 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e0

2022-01-21 09:48:18,682 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:48:18:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:48:20,469 fedbiomed INFO - log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
01/21/2022 09:48:20:INFO:log from: node_edb44109-8e5f-4741-adfb-5e68136b3bab / INFO - results uploaded successfully 
2022-01-21 09:48:21,156 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
01/21/2022 09:48:21:INFO:log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / DEBUG - Reached 100 batches for this epoch, ignore remaining data
2022-01-21 09:48:23,012 fedbiomed INFO - log from: node_8dce5575-7403-48f0-91b6-e07e58ae5c47 / INFO - results uploaded successfully 
01/21