# Missing data imputation with Fed-BioMed using MIWAE

In this notebook we show how to impute missing at random (MAR) data in a federated setting using MIWAE (https://arxiv.org/abs/2006.12871).

In [1]:
%load_ext autoreload
%autoreload 2

## Prepare the data

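For this experiment we will use the Boston housing data, downloaded from the CMU StatLib archive. The `breast_cancer` tag used later in this notebook is only the label attached to the datasets on the nodes.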

In [2]:
import pandas as pd
import numpy as np

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)  # raw string avoids an invalid escape sequence
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

In [3]:
from sklearn.model_selection import train_test_split

data_train, data_test, labels_train, labels_test = train_test_split(data, target, test_size=0.20, random_state=42)
df_data_train = pd.DataFrame(data_train)
N_train = len(df_data_train)
client_1, client_2, client_3 = np.split(df_data_train.sample(frac=1),
                                        [int(.33*N_train), int(.66*N_train)])

Clients_data=[client_1, client_2, client_3]

# from each client's dataset we randomly remove 50% of the entries
np.random.seed(1234)

perc_miss = 0.5 # 50% of missing data

Clients_missing = []
for c in Clients_data:
    n = c.shape[0] # number of observations
    p = c.shape[1] # number of features
    xmiss = np.copy(c)
    xmiss = (xmiss - np.mean(xmiss,0))/np.std(xmiss,0)
    xmiss_flat = xmiss.flatten()
    miss_pattern = np.random.choice(n*p, np.floor(n*p*perc_miss).astype(np.int_),\
                                    replace=False)
    xmiss_flat[miss_pattern] = np.nan 
    xmiss = xmiss_flat.reshape([n,p]) # in xmiss, the missing values are represented by nans
    mask = np.isfinite(xmiss) # binary mask: True where a value is observed, False where it is missing
    Clients_missing.append(xmiss)

import os 
os.makedirs('clients_data', exist_ok=True) 
for i in range(len(Clients_missing)):
    pd.DataFrame(Clients_missing[i]).to_csv('clients_data/client_'+str(i+1)+'.csv',index=False)
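
The masking step above can be checked in isolation on synthetic data. The following is a minimal sketch under stated assumptions: the helper name `mask_completely_at_random` is hypothetical, the 134 x 13 shape is only illustrative, and NumPy's `Generator` API is used in place of `np.random.seed`.

```python
import numpy as np

def mask_completely_at_random(x, perc_miss, rng):
    """Standardize x column-wise, then set a fraction perc_miss of its entries to NaN."""
    n, p = x.shape
    xmiss = (x - x.mean(axis=0)) / x.std(axis=0)
    n_missing = int(np.floor(n * p * perc_miss))
    # draw flat indices without replacement so exactly n_missing entries are removed
    flat_idx = rng.choice(n * p, size=n_missing, replace=False)
    xmiss = xmiss.flatten()  # flatten() returns a copy
    xmiss[flat_idx] = np.nan
    return xmiss.reshape(n, p)

rng = np.random.default_rng(1234)
x = rng.normal(size=(134, 13))   # synthetic stand-in for one client's data
xmiss = mask_completely_at_random(x, perc_miss=0.5, rng=rng)
observed = np.isfinite(xmiss)    # True where a value is observed
n_missing = int((~observed).sum())
print(n_missing, "missing entries out of", xmiss.size)
```

Because the indices are drawn uniformly over all entries, the missingness here does not depend on the data values.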

## Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

## Setting the nodes up
A node must be configured before running the cells below:
1. `./scripts/fedbiomed_run node add`
  * Select option 1 (csv) to add the client_1 dataset to the first node
  * Provide the correct tag by entering: breast_cancer
  * Pick the folder where the client_1 dataset has been saved
  * The data should now be added (a warning saying that data must be unique means that it has already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node list`
3. Run the node using `./scripts/fedbiomed_run node start`. Wait until you get `Starting task manager`, which means the node is online.
4. Following the same procedure, create additional nodes for clients 2 and 3.

Check available clients:

In [4]:
from fedbiomed.researcher.requests import Requests
req = Requests()
req.list(verbose=True)
xx = req.list()
dataset_size = [xx[i][0]['shape'][1] for i in xx]
assert min(dataset_size)==max(dataset_size)
data_size = dataset_size[0]

2022-04-20 16:34:54,351 fedbiomed INFO - Component environment:
2022-04-20 16:34:54,353 fedbiomed INFO - type = ComponentType.RESEARCHER
2022-04-20 16:34:54,386 fedbiomed INFO - Messaging researcher_aaf86456-e652-46b0-8054-b7bb516705db successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x11a2070a0>
2022-04-20 16:34:54,502 fedbiomed INFO - Listing available datasets in all nodes... 
2022-04-20 16:35:04,521 fedbiomed INFO - 
 Node: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 | Number of Datasets: 1 
+---------------+-------------+-------------------+---------------+-----------+
| name          | data_type   | tags              | description   | shape     |
| breast_cancer | csv         | ['breast_cancer'] | breast_cancer | [134, 13] |
+---------------+-------------+-------------------+---------------+-----------+

2022-04-20 16:35:04,523 fedbiomed INFO - 
 Node: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 | Number of Datasets: 1 
+---------

## Define an experiment model and parameters

Declare a `MIWAETrainingPlan` class, built from `torch.nn` modules, to send to the nodes for training.

Note: write **only** the code to export in the following cell.

In [5]:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
import numpy as np
import torch.distributions as td
import pandas as pd

from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from fedbiomed.common.constants import ProcessTypes

# Here we define the training plan to be used.
# You can use any class name (here 'MIWAETrainingPlan')
class MIWAETrainingPlan(TorchTrainingPlan):
    def __init__(self, model_args: dict = {}):
        super(MIWAETrainingPlan, self).__init__(model_args)
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        deps = ["from torchvision import datasets, transforms",
               "import torch.distributions as td",
               "import pandas as pd",
               "import numpy as np"]
        
        self.n_features=model_args['n_features']
        self.n_latent=model_args['n_latent']
        self.n_hidden=model_args['n_hidden']
        self.n_samples=model_args['n_samples']
        
        self.add_dependency(deps)
        
        # the encoder will output both the mean and the diagonal covariance
        self.encoder=nn.Sequential(
                        torch.nn.Linear(self.n_features, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, 2*self.n_latent),  
                        )
        # the decoder will output the mean, the scale,
        # and the number of degrees of freedom (hence the 3*n_features)
        self.decoder = nn.Sequential(
                        torch.nn.Linear(self.n_latent, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, 3*self.n_features),  
                        )
        
        self.optimizer = torch.optim.Adam(list(self.encoder.parameters()) \
                                    + list(self.decoder.parameters()),lr=1e-3)
              
        self.encoder.apply(self.weights_init)
        self.decoder.apply(self.weights_init)
    
    def weights_init(self,layer):
        if type(layer) == nn.Linear: torch.nn.init.orthogonal_(layer.weight)
    
    def miwae_loss(self,iota_x,mask):
        batch_size = iota_x.shape[0]
        out_encoder = self.encoder(iota_x)
        # prior
        p_z = td.Independent(td.Normal(loc=torch.zeros(self.n_latent).to(self.device)\
                                       ,scale=torch.ones(self.n_latent).to(self.device)),1)
        
        q_zgivenxobs = td.Independent(td.Normal(loc=out_encoder[..., :self.n_latent],\
                                                scale=torch.nn.Softplus()\
                                                (out_encoder[..., self.n_latent:\
                                                             (2*self.n_latent)])),1)

        zgivenx = q_zgivenxobs.rsample([self.n_samples])
        zgivenx_flat = zgivenx.reshape([self.n_samples*batch_size,self.n_latent])

        out_decoder = self.decoder(zgivenx_flat)
        all_means_obs_model = out_decoder[..., :self.n_features]
        all_scales_obs_model = torch.nn.Softplus()(out_decoder[..., self.n_features:\
                                                               (2*self.n_features)]) + 0.001
        all_degfreedom_obs_model = torch.nn.Softplus()\
        (out_decoder[..., (2*self.n_features):(3*self.n_features)]) + 3

        data_flat = torch.Tensor.repeat(iota_x,[self.n_samples,1]).reshape([-1,1])
        tiledmask = torch.Tensor.repeat(mask,[self.n_samples,1])

        all_log_pxgivenz_flat = torch.distributions.StudentT\
        (loc=all_means_obs_model.reshape([-1,1]),\
         scale=all_scales_obs_model.reshape([-1,1]),\
         df=all_degfreedom_obs_model.reshape([-1,1])).log_prob(data_flat)
        all_log_pxgivenz = all_log_pxgivenz_flat.reshape([self.n_samples*batch_size,self.n_features])

        logpxobsgivenz = torch.sum(all_log_pxgivenz*tiledmask,1).reshape([self.n_samples,batch_size])
        logpz = p_z.log_prob(zgivenx)
        logq = q_zgivenxobs.log_prob(zgivenx)

        neg_bound = -torch.mean(torch.logsumexp(logpxobsgivenz + logpz - logq,0))

        return neg_bound

    def training_data(self,  batch_size = 48):
        
        df = pd.read_csv(self.dataset_path, sep=',', index_col=False)
        x_train = df.values
        x_mask = np.isfinite(x_train)
        # xhat_0: missing values are replaced by zeros.
        # This xhat_0 is what will be fed to our encoder.
        xhat_0 = np.copy(x_train)
        xhat_0[np.isnan(x_train)] = 0
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        
        data_manager = DataManager(dataset=xhat_0 , target=x_mask , **train_kwargs)
        
        return data_manager
    
    def training_step(self, data, mask):
        self.encoder.zero_grad()
        self.decoder.zero_grad()
        loss = self.miwae_loss(iota_x = data,mask = mask)
        return loss
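
To make the last lines of `miwae_loss` easier to follow, here is a minimal NumPy sketch of the importance-weighted bound computed from a `[K, batch]` array of log importance weights (log p(x_obs|z) + log p(z) - log q(z|x_obs)). The function names `log_mean_exp` and `miwae_bound` are hypothetical and the weights are synthetic. The training plan above uses `logsumexp` without subtracting log K; the two versions differ only by that constant, which does not change the gradients.

```python
import numpy as np

def log_mean_exp(log_w, axis=0):
    """Numerically stable log of the mean of exp(log_w) along axis."""
    m = np.max(log_w, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(log_w - m), axis=axis))

def miwae_bound(log_w):
    """Average over the batch of log((1/K) * sum_k w_k); log_w has shape [K, batch]."""
    return float(np.mean(log_mean_exp(log_w, axis=0)))

K, batch = 20, 48
# With identical weights the bound equals the common log-weight.
bound_const = miwae_bound(np.full((K, batch), -3.0))

rng = np.random.default_rng(0)
log_w_rand = rng.normal(size=(K, batch))      # synthetic log-weights
bound_rand = miwae_bound(log_w_rand)
elbo_rand = float(np.mean(log_w_rand))        # plain average of log-weights
# Same quantity as in the training plan: -mean(logsumexp), without the log K term.
notebook_loss = -float(np.mean(np.log(np.sum(np.exp(log_w_rand), axis=0))))
print(bound_const, bound_rand, elbo_rand, notebook_loss)
```

By Jensen's inequality the importance-weighted bound is never smaller than the plain average of the log-weights, which is why using K > 1 samples tightens the bound.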

These arguments are, respectively:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side. 
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.
* `tags`: the data tags used to search for nodes to train on.
* `rounds`: the total number of training rounds.

If FedProx optimisation is requested, the `fedprox_mu` parameter must be defined in `training_args`. It must also be a float between XX and YY.

**NOTE:** typos and/or missing positional (required) arguments will raise an error. 🤓
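
For readers unfamiliar with FedProx, the role of `fedprox_mu` can be illustrated with the generic proximal term (mu/2) * ||w - w_global||^2 that FedProx adds to each node's local loss. The sketch below is an assumption-laden illustration of that formula only, not Fed-BioMed's implementation; `fedprox_penalty` and the toy parameter arrays are hypothetical.

```python
import numpy as np

def fedprox_penalty(local_params, global_params, mu):
    """Proximal term (mu / 2) * ||w - w_global||^2 summed over all parameter arrays."""
    sq_dist = sum(float(np.sum((w - w_g) ** 2))
                  for w, w_g in zip(local_params, global_params))
    return 0.5 * mu * sq_dist

# Toy parameters: one 2x2 weight matrix and one bias vector per model.
global_params = [np.zeros((2, 2)), np.zeros(2)]
local_params = [np.ones((2, 2)), np.ones(2)]

penalty = fedprox_penalty(local_params, global_params, mu=0.01)
no_drift = fedprox_penalty(global_params, global_params, mu=0.01)
print(penalty, no_drift)
```

A larger `mu` keeps each node's parameters closer to the global model received at the start of the round.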

In [6]:
h = 128 # number of hidden units (same for all MLPs)
d = 10 # dimension of the latent space
K = 20 # number of importance samples (IS) during training

n_epochs=5

model_args = {'n_features':data_size, 'n_latent':d,'n_hidden':h,'n_samples':K}

training_args = {
    'batch_size': 48, 
    'lr': 1e-3, 
    #'fedprox_mu': 0.01, 
    'log_interval' : 1,
    'epochs': n_epochs, 
    'dry_run': False,  
    'batch_maxnum': 200 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}

tags =  ['breast_cancer']
rounds = 15

## Declare and run the experiment

- search nodes serving data for these `tags`, optionally filtering on a list of node IDs with `nodes`
- run a round of local training on nodes with the model defined in `model_class` + federation with `aggregator`
- run for `round_limit` rounds, applying the `node_selection_strategy` between the rounds
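
The federation step in the second bullet can be sketched as a sample-size-weighted average of the parameters returned by the nodes. This illustrates the generic FedAvg rule and is not Fed-BioMed's `FedAverage` code; the function name `federated_average`, the parameter key and the node sizes are hypothetical.

```python
import numpy as np

def federated_average(node_params, node_sizes):
    """Average per-node parameter dicts, weighting each node by its sample count."""
    total = float(sum(node_sizes))
    weights = [n / total for n in node_sizes]
    keys = node_params[0].keys()
    return {k: sum(w * params[k] for w, params in zip(weights, node_params))
            for k in keys}

# Illustrative setup: three nodes, each holding one parameter array.
node_sizes = [133, 133, 138]
node_params = [{"encoder.0.weight": np.full((2, 3), v)} for v in (1.0, 2.0, 3.0)]

aggregated = federated_average(node_params, node_sizes)
expected = (133 * 1.0 + 133 * 2.0 + 138 * 3.0) / 404
print(aggregated["encoder.0.weight"][0, 0], expected)
```

Weighting by sample count means that nodes with more local data contribute proportionally more to the aggregated model.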

In [7]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

exp = Experiment(tags=tags,
                 model_args=model_args,
                 model_class=MIWAETrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

2022-04-20 16:35:40,758 fedbiomed INFO - Searching dataset with data tags: ['breast_cancer'] for all nodes
2022-04-20 16:35:50,780 fedbiomed INFO - Node selected for training -> node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
2022-04-20 16:35:50,781 fedbiomed INFO - Node selected for training -> node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
2022-04-20 16:35:50,781 fedbiomed INFO - Node selected for training -> node_8a14aca2-59e6-45fd-b00a-4c74206b334f
2022-04-20 16:35:50,792 fedbiomed INFO - Checking data quality of federated datasets...
2022-04-20 16:35:50,936 fedbiomed DEBUG - Model file has been saved: /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py
2022-04-20 16:35:51,033 fedbiomed DEBUG - upload (HTTP POST request) of file /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py successful, with status code 201
2022-04-20 

Let's start the experiment.

By default, this function does not return until all `round_limit` rounds have been completed on all the nodes.

In [8]:
exp.run()

2022-04-20 16:35:56,506 fedbiomed INFO - Sampled nodes in round 0 ['node_13d7233c-daad-49e1-8f1c-c8dbac2aa845', 'node_1ff16015-8a76-43a9-a0c9-9d9f9167f500', 'node_8a14aca2-59e6-45fd-b00a-4c74206b334f']
2022-04-20 16:35:56,507 fedbiomed INFO - Sending request
					 To: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/20/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py', 'param

2022-04-20 16:35:58,014 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 48/138 (33%) 
 					 Loss: 6.500494
					 ---------
2022-04-20 16:35:58,043 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 96/138 (67%) 
 					 Loss: 5.707023
					 ---------
2022-04-20 16:35:58,046 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss: 5.928631
					 ---------
2022-04-20 16:35:58,078 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 126/138 (100%) 
 					 Loss: 5.895255
					 ---------
2022-04-20 16:35:58,108 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 3 | Completed: 48/138 (33%) 
 					 Loss: 

2022-04-20 16:36:06,600 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_8b53ae96-c1b0-4325-8983-f7bc6194d772.pt successful, with status code 200
2022-04-20 16:36:06,621 fedbiomed INFO - Downloading model params after training on node_8a14aca2-59e6-45fd-b00a-4c74206b334f - from http://localhost:8844/media/uploads/2022/04/20/node_params_fed05b37-73d8-428d-990a-9a0b0a65e219.pt
2022-04-20 16:36:06,661 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_128d3722-dee2-42ec-9b9f-27e40de2b99e.pt successful, with status code 200
2022-04-20 16:36:06,667 fedbiomed INFO - Downloading model params after training on node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 - from http://localhost:8844/media/uploads/2022/04/20/node_params_780ac2cf-f87a-4b75-a1fc-ae9c9c078f1a.pt
2022-04-20 16:36:06,698 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_74825ca7-7a2f-4d0f-a745-50efe779a121.pt successful, with status code 200
2022-04-20 16:36:06,709 fedbiomed INFO - Nodes that s

2022-04-20 16:36:07,142 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 1 | Completed: 111/133 (100%) 
 					 Loss: 5.217207
					 ---------
2022-04-20 16:36:07,160 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss: 6.215672
					 ---------
2022-04-20 16:36:07,162 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss: 5.323886
					 ---------
					 NODE node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					 MESSAGE: There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored
-----------------------------------------------------------------
2022-04-20 16:36:07,167 fedbiomed INFO - INFO
				

2022-04-20 16:36:07,605 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 5 | Completed: 111/133 (100%) 
 					 Loss: 6.421310
					 ---------
2022-04-20 16:36:07,628 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 48/138 (33%) 
 					 Loss: 3.849918
					 ---------
2022-04-20 16:36:07,670 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 96/138 (67%) 
 					 Loss: 5.242220
					 ---------
2022-04-20 16:36:07,728 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 126/138 (100%) 
 					 Loss: 5.606287
					 ---------
2022-04-20 16:36:07,924 fedbiomed INFO - INFO
					 NODE node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					 MESSAGE: results uploaded successfully

2022-04-20 16:36:17,371 fedbiomed INFO - INFO
					 NODE node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x13206fc70>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
2022-04-20 16:36:17,381 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 1 | Completed: 48/138 (33%) 
 					 Loss: 3.974263
					 ---------
2022-04-20 16:36:17,512 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 1 | Completed: 96/138 (67%) 
 					 Loss: 3.410810
					 ---------
2022-04-20 16:36:17,514 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-

2022-04-20 16:36:17,961 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 48/138 (33%) 
 					 Loss: 3.696109
					 ---------
2022-04-20 16:36:17,964 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 4 | Completed: 111/133 (100%) 
 					 Loss: 5.416394
					 ---------
2022-04-20 16:36:17,984 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 111/133 (100%) 
 					 Loss: 4.856227
					 ---------
2022-04-20 16:36:17,994 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 96/138 (67%) 
 					 Loss: 4.743807
					 ---------
2022-04-20 16:36:18,024 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 5 | Completed: 48/133 (33%) 
 					 Loss: 

2022-04-20 16:36:27,441 fedbiomed DEBUG - researcher_aaf86456-e652-46b0-8054-b7bb516705db
					 NODE node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					 MESSAGE: There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored
-----------------------------------------------------------------
2022-04-20 16:36:27,574 fedbiomed INFO - INFO
					 NODE node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x12f418ca0>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
					 NODE node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					 MESSAGE: There is no te

2022-04-20 16:36:27,929 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 48/133 (33%) 
 					 Loss: 3.523148
					 ---------
2022-04-20 16:36:27,937 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 4 | Completed: 48/138 (33%) 
 					 Loss: 3.230213
					 ---------
2022-04-20 16:36:27,961 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 96/133 (67%) 
 					 Loss: 4.586354
					 ---------
2022-04-20 16:36:27,965 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 4 | Completed: 48/133 (33%) 
 					 Loss: 4.369085
					 ---------
2022-04-20 16:36:27,994 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 4 | Completed: 96/138 (67%) 
 					 Loss: 

2022-04-20 16:36:37,695 fedbiomed DEBUG - researcher_aaf86456-e652-46b0-8054-b7bb516705db
2022-04-20 16:36:37,701 fedbiomed INFO - Sending request
					 To: node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/20/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/20/aggregated_params_a59dc0ae-50fa-49a6-b9b7-48a537f734c4.p

2022-04-20 16:36:38,169 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: 4.322546
					 ---------
2022-04-20 16:36:38,187 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: 3.885866
					 ---------
2022-04-20 16:36:38,192 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 3 | Completed: 96/138 (67%) 
 					 Loss: 2.473432
					 ---------
2022-04-20 16:36:38,215 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 111/133 (100%) 
 					 Loss: 3.626667
					 ---------
2022-04-20 16:36:38,227 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 3 | Completed: 126/138 (100%) 
 					 Loss: 

2022-04-20 16:36:48,232 fedbiomed DEBUG - researcher_aaf86456-e652-46b0-8054-b7bb516705db
2022-04-20 16:36:48,236 fedbiomed INFO - Sending request
					 To: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/20/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/20/aggregated_params_b3b6db8d-834e-46c2-b294-366b07480290.p

2022-04-20 16:36:48,693 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 96/138 (67%) 
 					 Loss: 2.054846
					 ---------
2022-04-20 16:36:48,717 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 2 | Completed: 111/133 (100%) 
 					 Loss: 2.540046
					 ---------
2022-04-20 16:36:48,721 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss: 3.299503
					 ---------
2022-04-20 16:36:48,731 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 126/138 (100%) 
 					 Loss: 0.726768
					 ---------
2022-04-20 16:36:48,750 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss: 

2022-04-20 16:36:58,489 fedbiomed DEBUG - upload (HTTP POST request) of file /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/aggregated_params_ba6a56ac-f039-4b03-90aa-fcf2d555f956.pt successful, with status code 201
2022-04-20 16:36:58,491 fedbiomed INFO - Saved aggregated params for round 5 in /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/aggregated_params_ba6a56ac-f039-4b03-90aa-fcf2d555f956.pt
2022-04-20 16:36:58,492 fedbiomed INFO - Sampled nodes in round 6 ['node_13d7233c-daad-49e1-8f1c-c8dbac2aa845', 'node_1ff16015-8a76-43a9-a0c9-9d9f9167f500', 'node_8a14aca2-59e6-45fd-b00a-4c74206b334f']
2022-04-20 16:36:58,497 fedbiomed INFO - Sending request
					 To: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'train

2022-04-20 16:36:58,762 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss: 2.512949
					 ---------
2022-04-20 16:36:58,764 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 48/138 (33%) 
 					 Loss: 1.541566
					 ---------
2022-04-20 16:36:58,775 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 1 | Completed: 48/133 (33%) 
 					 Loss: 2.421355
					 ---------
2022-04-20 16:36:58,795 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 2 | Completed: 96/133 (67%) 
 					 Loss: 2.361355
					 ---------
2022-04-20 16:36:58,797 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 96/138 (67%) 
 					 Loss: 

2022-04-20 16:36:59,804 fedbiomed INFO - INFO
					 NODE node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					 MESSAGE: results uploaded successfully
-----------------------------------------------------------------
2022-04-20 16:36:59,930 fedbiomed INFO - INFO
					 NODE node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					 MESSAGE: results uploaded successfully
-----------------------------------------------------------------
2022-04-20 16:37:08,534 fedbiomed INFO - Downloading model params after training on node_8a14aca2-59e6-45fd-b00a-4c74206b334f - from http://localhost:8844/media/uploads/2022/04/20/node_params_0063e729-f5a1-4888-9a5a-da6b616fe537.pt
2022-04-20 16:37:08,562 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_a37e69ac-e255-4616-acde-9d075b00b05b.pt successful, with status code 200
2022-04-20 16:37:08,575 fedbiomed INFO - Downloading model params after training on node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 - f

2022-04-20 16:37:08,974 fedbiomed INFO - INFO
					 NODE node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x1307e2a90>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
2022-04-20 16:37:08,987 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 1 | Completed: 48/133 (33%) 
 					 Loss: 2.888330
					 ---------
2022-04-20 16:37:09,002 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 1 | Completed: 48/133 (33%) 
 					 Loss: 2.214565
					 ---------
2022-04-20 16:37:09,022 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-

2022-04-20 16:37:09,460 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 5 | Completed: 96/133 (67%) 
 					 Loss: 2.051786
					 ---------
2022-04-20 16:37:09,467 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 5 | Completed: 48/133 (33%) 
 					 Loss: 1.448137
					 ---------
2022-04-20 16:37:09,481 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 96/138 (67%) 
 					 Loss: 1.658111
					 ---------
2022-04-20 16:37:09,498 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 5 | Completed: 96/133 (67%) 
 					 Loss: 1.644996
					 ---------
2022-04-20 16:37:09,517 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 5 | Completed: 111/133 (100%) 
 					 Loss: 

2022-04-20 16:37:19,155 fedbiomed INFO - INFO
					 NODE node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x1324ac460>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
					 NODE node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					 MESSAGE: There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored
-----------------------------------------------------------------
2022-04-20 16:37:19,226 fedbiomed INFO - INFO
					 NODE node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					 MESSAGE: training with arguments {'history_monitor': <fedbi

2022-04-20 16:37:19,648 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 4 | Completed: 96/133 (67%) 
 					 Loss: 1.866271
					 ---------
2022-04-20 16:37:19,670 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 4 | Completed: 111/133 (100%) 
 					 Loss: 1.831112
					 ---------
2022-04-20 16:37:19,678 fedbiomed INFO - TRAINING
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 4 | Completed: 96/138 (67%) 
 					 Loss: 1.426189
					 ---------
2022-04-20 16:37:19,692 fedbiomed INFO - TRAINING
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 48/133 (33%) 
 					 Loss: 3.495399
					 ---------
2022-04-20 16:37:19,697 fedbiomed INFO - TRAINING
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 5 | Completed: 48/133 (33%) 
 					 Loss: 

2022-04-20 16:37:29,302 fedbiomed DEBUG - researcher_aaf86456-e652-46b0-8054-b7bb516705db
2022-04-20 16:37:29,307 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/20/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/20/aggregated_params_1c9dd0f1-bf9a-46aa-9640-f8f44ebdabb0.p

2022-04-20 16:37:29,730 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: [1m2.429948[0m 
					 ---------
2022-04-20 16:37:29,746 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: [1m2.044652[0m 
					 ---------
2022-04-20 16:37:29,751 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 3 | Completed: 96/138 (67%) 
 					 Loss: [1m0.813055[0m 
					 ---------
2022-04-20 16:37:29,762 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 111/133 (100%) 
 					 Loss: [1m1.477690[0m 
					 ---------
2022-04-20 16:37:29,771 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 3 | Completed: 111/133 (100%) 
 					 Loss: 

2022-04-20 16:37:39,554 fedbiomed DEBUG - researcher_aaf86456-e652-46b0-8054-b7bb516705db
2022-04-20 16:37:39,561 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/20/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/20/aggregated_params_75f075be-5ae6-42f6-a7d7-1e16a089e8d7.p

2022-04-20 16:37:39,914 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 2 | Completed: 111/133 (100%) 
 					 Loss: [1m1.583808[0m 
					 ---------
2022-04-20 16:37:39,933 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 2 | Completed: 111/133 (100%) 
 					 Loss: [1m1.415156[0m 
					 ---------
2022-04-20 16:37:39,940 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 2 | Completed: 126/138 (100%) 
 					 Loss: [1m1.406034[0m 
					 ---------
2022-04-20 16:37:39,948 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss: [1m1.571013[0m 
					 ---------
2022-04-20 16:37:39,967 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss

2022-04-20 16:37:49,824 fedbiomed DEBUG - upload (HTTP POST request) of file /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/aggregated_params_a3bd77c8-9442-4ff4-b7b7-bc7ebd3050d5.pt successful, with status code 201
2022-04-20 16:37:49,826 fedbiomed INFO - Saved aggregated params for round 10 in /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/aggregated_params_a3bd77c8-9442-4ff4-b7b7-bc7ebd3050d5.pt
2022-04-20 16:37:49,829 fedbiomed INFO - Sampled nodes in round 11 ['node_13d7233c-daad-49e1-8f1c-c8dbac2aa845', 'node_1ff16015-8a76-43a9-a0c9-9d9f9167f500', 'node_8a14aca2-59e6-45fd-b00a-4c74206b334f']
2022-04-20 16:37:49,831 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'tra

2022-04-20 16:37:50,114 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 1 | Completed: 111/133 (100%) 
 					 Loss: [1m0.839825[0m 
					 ---------
2022-04-20 16:37:50,120 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 1 | Completed: 111/133 (100%) 
 					 Loss: [1m2.105926[0m 
					 ---------
2022-04-20 16:37:50,167 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 1 | Completed: 126/138 (100%) 
 					 Loss: [1m1.403836[0m 
					 ---------
2022-04-20 16:37:50,169 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss: [1m2.608248[0m 
					 ---------
2022-04-20 16:37:50,174 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss

2022-04-20 16:37:51,027 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_8a14aca2-59e6-45fd-b00a-4c74206b334f
					[1m MESSAGE:[0m results uploaded successfully [0m
-----------------------------------------------------------------
2022-04-20 16:37:51,205 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					[1m MESSAGE:[0m results uploaded successfully [0m
-----------------------------------------------------------------
2022-04-20 16:37:59,863 fedbiomed INFO - Downloading model params after training on node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 - from http://localhost:8844/media/uploads/2022/04/20/node_params_107edec6-87d6-4e5e-9ced-caaf85ce873e.pt
2022-04-20 16:37:59,900 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_596d7cee-66b0-40bc-893e-0e6c0c3c3ae9.pt successful, with status code 200
2022-04-20 16:37:59,922 fedbiomed INFO - Downloading model params after training on node_8a14aca2-59e6-45fd-b00a-4c74206b334f - f

2022-04-20 16:38:00,339 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					[1m MESSAGE:[0m training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x1324ac460>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}[0m
-----------------------------------------------------------------
2022-04-20 16:38:00,341 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 1 | Completed: 48/133 (33%) 
 					 Loss: [1m1.398825[0m 
					 ---------
2022-04-20 16:38:00,344 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 1 | Completed: 48/138 (33%) 
 					 Loss: [1m1.785488[0m 
					 ---------
2022-04-20 16:38:00,346 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-

2022-04-20 16:38:00,792 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 5 | Completed: 48/133 (33%) 
 					 Loss: [1m1.339352[0m 
					 ---------
2022-04-20 16:38:00,803 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 5 | Completed: 96/133 (67%) 
 					 Loss: [1m1.139385[0m 
					 ---------
2022-04-20 16:38:00,822 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 5 | Completed: 96/133 (67%) 
 					 Loss: [1m1.732152[0m 
					 ---------
2022-04-20 16:38:00,829 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 5 | Completed: 111/133 (100%) 
 					 Loss: [1m2.895575[0m 
					 ---------
2022-04-20 16:38:00,843 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 5 | Completed: 96/138 (67%) 
 					 Loss: [

					[1m NODE[0m node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					[1m MESSAGE:[0m There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored[0m
-----------------------------------------------------------------
					[1m NODE[0m node_13d7233c-daad-49e1-8f1c-c8dbac2aa845
					[1m MESSAGE:[0m There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored[0m
-----------------------------------------------------------------
2022-04-20 16:38:10,543 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_1ff16015-8a76-43a9-a0c9-9d9f9167f500
					[1m MESSAGE:[0m training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x1320fbca0>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_in

2022-04-20 16:38:10,916 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 48/133 (33%) 
 					 Loss: [1m0.630263[0m 
					 ---------
2022-04-20 16:38:10,968 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 4 | Completed: 96/133 (67%) 
 					 Loss: [1m1.370957[0m 
					 ---------
2022-04-20 16:38:10,970 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 4 | Completed: 96/138 (67%) 
 					 Loss: [1m0.782645[0m 
					 ---------
2022-04-20 16:38:10,977 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 96/133 (67%) 
 					 Loss: [1m1.932192[0m 
					 ---------
2022-04-20 16:38:11,007 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 4 | Completed: 111/133 (100%) 
 					 Loss: [

2022-04-20 16:38:20,645 fedbiomed DEBUG - researcher_aaf86456-e652-46b0-8054-b7bb516705db
2022-04-20 16:38:20,654 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_aaf86456-e652-46b0-8054-b7bb516705db', 'job_id': 'f7104d85-2919-4baf-b5f6-871d7f1c2f10', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/20/my_model_6088de6e-6e59-40c6-8f72-66832340ebbf.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/20/aggregated_params_da947bed-0381-4ea4-90af-5f82e90622dc.p

2022-04-20 16:38:21,075 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss: [1m1.230116[0m 
					 ---------
2022-04-20 16:38:21,099 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 
					 Epoch: 3 | Completed: 111/133 (100%) 
 					 Loss: [1m0.709811[0m 
					 ---------
2022-04-20 16:38:21,113 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 3 | Completed: 96/138 (67%) 
 					 Loss: [1m0.134534[0m 
					 ---------
2022-04-20 16:38:21,121 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: [1m1.654163[0m 
					 ---------
2022-04-20 16:38:21,146 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_8a14aca2-59e6-45fd-b00a-4c74206b334f 
					 Epoch: 3 | Completed: 126/138 (100%) 
 					 Loss: 

15

Local training results for each round and each node are available via `exp.training_replies()` (indexed from 0 to `rounds` - 1).

For example you can view the training results for the last round below.

Several timings (in seconds) are reported for each dataset of a node participating in a round:
- `rtime_training` real time (clock time) spent in the training function on the node
- `ptime_training` process time (user and system CPU) spent in the training function on the node
- `rtime_total` real time (clock time) spent in the researcher between sending the request and handling the response, at the `Job()` layer
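
The difference between real time and process time can be illustrated with a minimal sketch based on Python's standard `time` module. Whether Fed-BioMed measures its timings with these exact functions is an assumption; `timed` and `fake_training_step` are hypothetical helpers introduced only for this illustration.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, real_time, process_time) in seconds."""
    r0 = time.perf_counter()   # wall-clock timer
    p0 = time.process_time()   # user + system CPU time of this process
    result = fn(*args, **kwargs)
    rtime = time.perf_counter() - r0
    ptime = time.process_time() - p0
    return result, rtime, ptime

def fake_training_step():
    # Hypothetical stand-in for a training function: some CPU work, then idle waiting.
    total = sum(i * i for i in range(200_000))
    time.sleep(0.2)  # sleeping counts toward real time but not toward CPU time
    return total

_, rtime_training, ptime_training = timed(fake_training_step)
print(f"rtime_training={rtime_training:.2f} s, ptime_training={ptime_training:.2f} s")
```

Time spent waiting (sleep, I/O, network) shows up in the real time but not in the process time, which is one reason the two values can differ.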

In [9]:
print("\nList the training rounds : ", exp.training_replies().keys())

print("\nList the nodes for the last training round and their timings : ")
round_data = exp.training_replies()[rounds - 1].data()
for c in range(len(round_data)):
    print("\t- {id} :\
    \n\t\trtime_training={rtraining:.2f} seconds\
    \n\t\tptime_training={ptraining:.2f} seconds\
    \n\t\trtime_total={rtotal:.2f} seconds".format(id = round_data[c]['node_id'],
        rtraining = round_data[c]['timing']['rtime_training'],
        ptraining = round_data[c]['timing']['ptime_training'],
        rtotal = round_data[c]['timing']['rtime_total']))
print('\n')
    
exp.training_replies()[rounds - 1].dataframe()


List the training rounds :  dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

List the nodes for the last training round and their timings : 
	- node_13d7233c-daad-49e1-8f1c-c8dbac2aa845 :    
		rtime_training=0.63 seconds    
		ptime_training=0.34 seconds    
		rtime_total=10.05 seconds
	- node_1ff16015-8a76-43a9-a0c9-9d9f9167f500 :    
		rtime_training=0.62 seconds    
		ptime_training=0.35 seconds    
		rtime_total=10.10 seconds
	- node_8a14aca2-59e6-45fd-b00a-4c74206b334f :    
		rtime_training=0.59 seconds    
		ptime_training=0.35 seconds    
		rtime_total=10.13 seconds




Unnamed: 0,success,msg,dataset_id,node_id,params_path,params,timing
0,True,,dataset_4ce1b3d7-a0a7-4d62-a278-f5f23ce224e5,node_13d7233c-daad-49e1-8f1c-c8dbac2aa845,/Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed...,"{'encoder.0.weight': [[tensor(0.1300), tensor(...","{'rtime_training': 0.6258554260000153, 'ptime_..."
1,True,,dataset_673a4c43-8bbf-4594-9f24-abac93b1b8c2,node_1ff16015-8a76-43a9-a0c9-9d9f9167f500,/Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed...,"{'encoder.0.weight': [[tensor(0.1271), tensor(...","{'rtime_training': 0.6227430650000088, 'ptime_..."
2,True,,dataset_60be5906-be33-4075-91d8-f47637a40228,node_8a14aca2-59e6-45fd-b00a-4c74206b334f,/Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed...,"{'encoder.0.weight': [[tensor(0.1370), tensor(...","{'rtime_training': 0.5941135950000103, 'ptime_..."


Federated parameters for each round are available via `exp.aggregated_params()` (indexed from 0 to `rounds` - 1).

For example, you can view the federated parameters for the last round of the experiment:

In [10]:
print("\nList the training rounds : ", exp.aggregated_params().keys())

print("\nAccess the federated params for the last training round :")
print("\t- params_path: ", exp.aggregated_params()[rounds - 1]['params_path'])
print("\t- parameter data: ", exp.aggregated_params()[rounds - 1]['params'].keys())


List the training rounds :  dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

Access the federated params for the last training round :
	- params_path:  /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0016/aggregated_params_794085f9-2910-489c-909c-1fc053f04b87.pt
	- parameter data:  odict_keys(['encoder.0.weight', 'encoder.0.bias', 'encoder.2.weight', 'encoder.2.bias', 'encoder.4.weight', 'encoder.4.bias', 'decoder.0.weight', 'decoder.0.bias', 'decoder.2.weight', 'decoder.2.bias', 'decoder.4.weight', 'decoder.4.bias'])
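
The aggregation strategy is not shown in this part of the notebook. As a rough sketch, and assuming a sample-size-weighted average of the node parameters (FedAvg-style), the aggregated parameters could be computed as follows. The function `weighted_average` and the toy values are hypothetical; only the sample counts (133, 133, 138) are taken from the training logs above.

```python
import numpy as np

def weighted_average(node_params, node_sizes):
    """Average per-node parameter dicts, weighting each node by its sample count."""
    total = float(sum(node_sizes))
    keys = node_params[0].keys()
    return {
        k: sum(n / total * params[k] for params, n in zip(node_params, node_sizes))
        for k in keys
    }

# Toy parameters for three hypothetical nodes; the key mimics the listing above.
node_params = [
    {'encoder.0.bias': np.array([0.0, 1.0])},
    {'encoder.0.bias': np.array([1.0, 1.0])},
    {'encoder.0.bias': np.array([2.0, 4.0])},
]
node_sizes = [133, 133, 138]   # sample counts as reported in the training logs
aggregated = weighted_average(node_params, node_sizes)
print(aggregated['encoder.0.bias'])
```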


# Test and comparison to local training

## 1. Testing on an external dataset

We first test how well the final federated model imputes missing data on a held-out test dataset. To this end, we randomly remove 50% of the entries of the test dataset, `data_test`, defined at the beginning of this notebook.

In [11]:
# from the test dataset, we will remove randomly 50% of data
np.random.seed(1234)

perc_miss = 0.5 # 50% of missing data

n = data_test.shape[0] # number of observations
p = data_test.shape[1] # number of features
xfull = np.copy(data_test)
xfull = (xfull - np.mean(xfull,0))/np.std(xfull,0)
xmiss = np.copy(xfull)
xmiss_flat = xmiss.flatten()
miss_pattern = np.random.choice(n*p, np.floor(n*p*perc_miss).astype(np.int_),\
                                replace=False)
xmiss_flat[miss_pattern] = np.nan 
xmiss = xmiss_flat.reshape([n,p]) # in xmiss, the missing values are represented by nans
mask = np.isfinite(xmiss) # binary mask: True where a value is observed, False where it is missing
xhat_0 = np.copy(xmiss)
xhat_0[np.isnan(xmiss)] = 0 # zero-fill the missing entries before feeding the encoder
xhat = np.copy(xhat_0) # This will be our imputed data matrix

We instantiate the model using last updated federated parameters:

In [12]:
L = 500

# extract federated model into PyTorch framework
model = exp.model_instance()
model.load_state_dict(exp.aggregated_params()[rounds - 1]['params'])

encoder = model.encoder
decoder = model.decoder

We define the MIWAE imputation routine:

In [13]:
p_z = td.Independent(td.Normal(loc=torch.zeros(d),scale=torch.ones(d)),1)

def miwae_impute(iota_x,mask,L):
    batch_size = iota_x.shape[0]
    out_encoder = encoder(iota_x)
    q_zgivenxobs = td.Independent(td.Normal(loc=out_encoder[..., :d],scale=torch.nn.Softplus()(out_encoder[..., d:(2*d)])),1)

    zgivenx = q_zgivenxobs.rsample([L])
    zgivenx_flat = zgivenx.reshape([L*batch_size,d])

    out_decoder = decoder(zgivenx_flat)
    all_means_obs_model = out_decoder[..., :p]
    all_scales_obs_model = torch.nn.Softplus()(out_decoder[..., p:(2*p)]) + 0.001
    all_degfreedom_obs_model = torch.nn.Softplus()(out_decoder[..., (2*p):(3*p)]) + 3

    data_flat = torch.Tensor.repeat(iota_x,[L,1]).reshape([-1,1])
    tiledmask = torch.Tensor.repeat(mask,[L,1])

    all_log_pxgivenz_flat = torch.distributions.StudentT(loc=all_means_obs_model.reshape([-1,1]),scale=all_scales_obs_model.reshape([-1,1]),df=all_degfreedom_obs_model.reshape([-1,1])).log_prob(data_flat)
    all_log_pxgivenz = all_log_pxgivenz_flat.reshape([L*batch_size,p])

    logpxobsgivenz = torch.sum(all_log_pxgivenz*tiledmask,1).reshape([L,batch_size])
    logpz = p_z.log_prob(zgivenx)
    logq = q_zgivenxobs.log_prob(zgivenx)

    xgivenz = td.Independent(td.StudentT(loc=all_means_obs_model, scale=all_scales_obs_model, df=all_degfreedom_obs_model),1)

    imp_weights = torch.nn.functional.softmax(logpxobsgivenz + logpz - logq,0) # these are w_1,....,w_L for all observations in the batch
    xms = xgivenz.mean.reshape([L,batch_size,p])  # mean of p(x|z_l) for each of the L latent samples
    xm=torch.einsum('ki,kij->ij', imp_weights, xms) 

    return xm
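
The last two steps of `miwae_impute` amount to a self-normalised importance-weighted average: the log-weights are turned into weights with a softmax over the `L` samples, and the conditional means are averaged with those weights. The NumPy sketch below reproduces only that step on toy data; `snis_impute` and the toy sizes are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def snis_impute(log_w, cond_means):
    """Self-normalised importance-weighted average of conditional means.

    log_w:      (L, batch) unnormalised log-weights, one per latent sample
    cond_means: (L, batch, p) mean of p(x | z_l) for each latent sample
    """
    # Softmax over the L axis, shifted by the max for numerical stability.
    shifted = log_w - log_w.max(axis=0, keepdims=True)
    w = np.exp(shifted)
    w /= w.sum(axis=0, keepdims=True)
    # Same contraction as in miwae_impute: sum over the L samples.
    return np.einsum('ki,kij->ij', w, cond_means), w

rng = np.random.default_rng(0)
L_demo, batch, p_demo = 5, 3, 2          # toy sizes, not the notebook's values
log_w = rng.normal(size=(L_demo, batch))
cond_means = rng.normal(size=(L_demo, batch, p_demo))
xm_demo, w = snis_impute(log_w, cond_means)
print(w.sum(axis=0))     # the weights of each observation sum to 1
print(xm_demo.shape)     # (3, 2)
```

When all log-weights are equal, the result reduces to the plain average of the conditional means over the `L` samples.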

In [14]:
def mse(xhat,xtrue,mask): # MSE function for imputations
    xhat = np.array(xhat)
    xtrue = np.array(xtrue)
    return np.mean(np.power(xhat-xtrue,2)[~mask])

And we finally do the imputation and evaluate the corresponding imputation error through MSE:

In [15]:
xhat[~mask] = miwae_impute(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float(),L= L).cpu().data.numpy()[~mask]
err_test_data = np.array([mse(xhat,xfull,mask)])
print('Imputation MSE on testing data  %g' %err_test_data)
print('-----')

Imputation MSE on testing data  0.617355
-----
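
To put an imputation MSE such as 0.617 in context, it can help to compare it with a trivial baseline. Because every feature is standardised to unit variance, filling the gaps with zeros (the feature mean) should give an MSE of roughly 1. The sketch below checks this on synthetic data; the data, `mse_missing`, and the variable names are assumptions made for the illustration and do not come from the notebook.

```python
import numpy as np

def mse_missing(xhat, xtrue, mask):
    # Same convention as the notebook: mask is True where a value is observed.
    return np.mean((xhat - xtrue)[~mask] ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))                       # synthetic stand-in data
x = (x - x.mean(0)) / x.std(0)                      # standardise per feature
mask_demo = rng.random(x.shape) > 0.5               # about 50% observed
x_obs = np.where(mask_demo, x, np.nan)

col_means = np.nanmean(x_obs, axis=0)               # means of observed values only
x_mean_imp = np.where(mask_demo, x, col_means)      # fill gaps with column means
x_zero_imp = np.where(mask_demo, x, 0.0)            # fill gaps with zeros

print('mean imputation MSE %g' % mse_missing(x_mean_imp, x, mask_demo))
print('zero imputation MSE %g' % mse_missing(x_zero_imp, x, mask_demo))
```

An imputation model is useful to the extent that its MSE falls clearly below this baseline.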


## 2. Testing on a client's dataset

We now use the final federated model to impute missing data for client 1, whose data was used for training. Note that a new random missingness pattern is drawn here:

In [16]:
data_client_1 = Clients_data[0]
n = data_client_1.shape[0] # number of observations
p = data_client_1.shape[1] # number of features

xfull = np.copy(data_client_1)
xfull = (xfull - np.mean(xfull,0))/np.std(xfull,0)
xmiss = np.copy(xfull)
xmiss_flat = xmiss.flatten()
miss_pattern = np.random.choice(n*p, np.floor(n*p*perc_miss).astype(np.int_),\
                                replace=False)
xmiss_flat[miss_pattern] = np.nan 
xmiss = xmiss_flat.reshape([n,p]) # in xmiss, the missing values are represented by nans
mask = np.isfinite(xmiss) # binary mask: True where a value is observed, False where it is missing
xhat_0 = np.copy(xmiss)
xhat_0[np.isnan(xmiss)] = 0 # zero-fill the missing entries before feeding the encoder
xhat = np.copy(xhat_0) # This will be our imputed data matrix

### Now we do the imputation

xhat[~mask] = miwae_impute(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float(),L= L).cpu().data.numpy()[~mask]
err_cl1_data = np.array([mse(xhat,xfull,mask)])
print('Imputation MSE on data from client 1  %g' %err_cl1_data)
print('-----')

Imputation MSE on data from client 1  0.528037
-----


## 3. Local training and testing on a client

Finally, we test the performance of the same model trained locally and tested on the dataset from client 1. We will use a total of `epochs`x`rounds` local epochs.

In [17]:
p_z = td.Independent(td.Normal(loc=torch.zeros(d),scale=torch.ones(d)),1)

def miwae_loss(iota_x,mask):
    batch_size = iota_x.shape[0]
    out_encoder = encoder(iota_x)
    q_zgivenxobs = td.Independent(td.Normal(loc=out_encoder[..., :d],scale=torch.nn.Softplus()(out_encoder[..., d:(2*d)])),1)

    zgivenx = q_zgivenxobs.rsample([K])
    zgivenx_flat = zgivenx.reshape([K*batch_size,d])

    out_decoder = decoder(zgivenx_flat)
    all_means_obs_model = out_decoder[..., :p]
    all_scales_obs_model = torch.nn.Softplus()(out_decoder[..., p:(2*p)]) + 0.001
    all_degfreedom_obs_model = torch.nn.Softplus()(out_decoder[..., (2*p):(3*p)]) + 3

    data_flat = torch.Tensor.repeat(iota_x,[K,1]).reshape([-1,1])
    tiledmask = torch.Tensor.repeat(mask,[K,1])

    all_log_pxgivenz_flat = torch.distributions.StudentT(loc=all_means_obs_model.reshape([-1,1]),scale=all_scales_obs_model.reshape([-1,1]),df=all_degfreedom_obs_model.reshape([-1,1])).log_prob(data_flat)
    all_log_pxgivenz = all_log_pxgivenz_flat.reshape([K*batch_size,p])

    logpxobsgivenz = torch.sum(all_log_pxgivenz*tiledmask,1).reshape([K,batch_size])
    logpz = p_z.log_prob(zgivenx)
    logq = q_zgivenxobs.log_prob(zgivenx)

    neg_bound = -torch.mean(torch.logsumexp(logpxobsgivenz + logpz - logq,0))

    return neg_bound
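
As a reading of the code above (not a statement taken from the notebook), `miwae_loss` returns the negative of a log-sum-exp over the `K` importance samples, averaged over the batch. The MIWAE bound itself uses the average, not the sum, of the `K` importance ratios, so the two differ by `log K`. Here $x^{o}$ denotes the observed part of a data point and $i$ indexes the observations in the batch:

```latex
\mathcal{L}_K(x^{o})
  = \mathbb{E}_{z_1,\dots,z_K \sim q(z \mid x^{o})}
    \left[ \log \frac{1}{K} \sum_{k=1}^{K}
      \frac{p(x^{o} \mid z_k)\, p(z_k)}{q(z_k \mid x^{o})} \right],
\qquad
\widehat{\mathcal{L}}_K
  = \underbrace{\operatorname{mean}_i \,\operatorname{logsumexp}_k
      \bigl( \log p(x_i^{o} \mid z_{ik}) + \log p(z_{ik}) - \log q(z_{ik} \mid x_i^{o}) \bigr)}_{-\,\texttt{neg\_bound}}
    \;-\; \log K .
```

This is why the likelihood bound printed during local training is computed as `-np.log(K) - miwae_loss(...)`. The constant `log K` does not affect the gradients, so it can be left out of the training loss.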

We perform the local training:

In [18]:
n_epochs_local = n_epochs*rounds
bs = 48 # batch size

encoder = nn.Sequential(
    torch.nn.Linear(p, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, 2*d),  # the encoder will output both the mean and the diagonal covariance
)

decoder = nn.Sequential(
    torch.nn.Linear(d, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, 3*p),  # the decoder outputs the mean, the scale, and the degrees of freedom of each feature (hence the 3*p)
)

optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),lr=1e-3)

def weights_init(layer):
    # Orthogonal initialisation for every linear layer
    if isinstance(layer, nn.Linear):
        torch.nn.init.orthogonal_(layer.weight)
        
encoder.apply(weights_init)
decoder.apply(weights_init)

for ep in range(1, n_epochs_local + 1): # run exactly n_epochs_local epochs
    perm = np.random.permutation(n) # We use the "random reshuffling" version of SGD
    batches_data = np.array_split(xhat_0[perm,], n // bs)
    batches_mask = np.array_split(mask[perm,], n // bs)
    for it in range(len(batches_data)):
        optimizer.zero_grad()
        encoder.zero_grad()
        decoder.zero_grad()
        b_data = torch.from_numpy(batches_data[it]).float()
        b_mask = torch.from_numpy(batches_mask[it]).float()
        loss = miwae_loss(iota_x = b_data,mask = b_mask)
        loss.backward()
        optimizer.step() # Gradient step
    if ep % rounds == 1:
        print('Epoch %g' %ep)
        # The bound is evaluated on the full client dataset
        print('MIWAE likelihood bound  %g' %(-np.log(K)-miwae_loss(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float()).cpu().data.numpy()))

Epoch 1
MIWAE likelihood bound  -9.84834
Epoch 16
MIWAE likelihood bound  -8.41023
Epoch 31
MIWAE likelihood bound  -5.86908
Epoch 46
MIWAE likelihood bound  -4.97335
Epoch 61
MIWAE likelihood bound  -4.69564


And we do the imputation on the same dataset:

In [19]:
xhat[~mask] = miwae_impute(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float(),L= L).cpu().data.numpy()[~mask]
err_local_cl1_data = np.array([mse(xhat,xfull,mask)])
print('Imputation MSE of local training on data from client 1  %g' %err_local_cl1_data)
print('-----')

Imputation MSE of local training on data from client 1  0.54522
-----


## Comparison of obtained results:

In [20]:
print('Imputation MSE on testing data  %g' %err_test_data)
print('Imputation MSE on data from client 1  %g' %err_cl1_data)
print('Imputation MSE of local training on data from client 1  %g' %err_local_cl1_data)

Imputation MSE on testing data  0.617355
Imputation MSE on data from client 1  0.528037
Imputation MSE of local training on data from client 1  0.54522


On client 1's data, the federated model reaches a lower imputation MSE (0.528) than the model trained locally on that client alone (0.545).