# Missing data imputation with Fed-BioMed using MIWAE

In this notebook we show how to impute data that are missing at random (MAR) in a federated setting using MIWAE (https://arxiv.org/abs/2006.12871).

In [1]:
%load_ext autoreload
%autoreload 2

## Prepare the data

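For this experiment we will use the Boston housing data, downloaded from the CMU StatLib archive. Note that the nodes below register it under the tag `breast_cancer`; the tag is only a label and does not describe the data.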

In [2]:
import pandas as pd
import numpy as np

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

In [3]:
from sklearn.model_selection import train_test_split

data_train, data_test, labels_train, labels_test = train_test_split(data, target, test_size=0.20, random_state=42)
df_data_train = pd.DataFrame(data_train)
N_train = len(df_data_train)
# shuffle the training rows, then split them across three clients (about 33% / 33% / 34%)
client_1, client_2, client_3 = np.split(df_data_train.sample(frac=1),
                                        [int(.33*N_train), int(.66*N_train)])

Clients_data=[client_1, client_2, client_3]

# from each client's dataset we will randomly remove a fraction of the entries
np.random.seed(1234)

# 50% of missing data for client 1, 30% for client 2, 60% for client 3
perc_miss_list = [0.5,0.3,0.6] 

Clients_missing = []
for perc_miss, c in zip(perc_miss_list, Clients_data):
    n = c.shape[0] # number of observations
    p = c.shape[1] # number of features
    xmiss = np.copy(c)
    xmiss = (xmiss - np.mean(xmiss,0))/np.std(xmiss,0)
    xmiss_flat = xmiss.flatten()
    miss_pattern = np.random.choice(n*p, np.floor(n*p*perc_miss).astype(np.int_),\
                                    replace=False)
    xmiss_flat[miss_pattern] = np.nan 
    xmiss = xmiss_flat.reshape([n,p]) # in xmiss, the missing values are represented by nans
    mask = np.isfinite(xmiss) # binary mask: True where a value is observed, False where it is missing
    Clients_missing.append(xmiss)

import os 
os.makedirs('clients_data', exist_ok=True) 
for i in range(len(Clients_missing)):
    pd.DataFrame(Clients_missing[i]).to_csv('clients_data/client_'+str(i+1)+'.csv',index=False)
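
As a quick sanity check on the masking step above, the following sketch applies the same uniform removal to a small synthetic array (not the real client data) and then builds the observation mask and zero-filled input that the training plan later derives from each CSV. The helper name `mask_entries` and the array sizes are assumptions made for illustration.

```python
import numpy as np

def mask_entries(x, perc_miss, rng):
    """Return a copy of x with a fraction perc_miss of its entries set to NaN."""
    n, p = x.shape
    x_flat = x.astype(float).flatten()            # flatten() returns a copy
    n_miss = int(np.floor(n * p * perc_miss))     # number of entries to remove
    miss_idx = rng.choice(n * p, n_miss, replace=False)
    x_flat[miss_idx] = np.nan
    return x_flat.reshape(n, p)

rng = np.random.default_rng(1234)
x = rng.normal(size=(20, 5))                      # synthetic stand-in for one client
xmiss = mask_entries(x, 0.3, rng)

mask = np.isfinite(xmiss)                         # True where a value is observed
xhat_0 = np.where(mask, xmiss, 0.0)               # missing entries replaced by zeros

print(f"fraction missing: {1 - mask.mean():.2f}")
```

The observed entries of `xhat_0` are unchanged, so the mask is the only place where the model can tell a true zero from a removed value.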

## Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

## Setting the nodes up
You first need to configure a node:
1. `./scripts/fedbiomed_run node add`
  * Select option 1 (csv) to add the client_1 dataset to the first node
  * Provide the tag by entering: breast_cancer
  * Pick the location where the client_1 dataset has been saved (`clients_data/client_1.csv`)
  * The data should now be added (a warning saying that the data must be unique means it has already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node list`
3. Run the node using `./scripts/fedbiomed_run node start`. Wait until you get `Starting task manager`; this means the node is online.
4. Following the same procedure, you can create additional nodes for clients 2 and 3.

Check available clients:

In [4]:
from fedbiomed.researcher.requests import Requests
req = Requests()
req.list(verbose=True)
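xx = req.list()
# number of features (columns) of the first dataset registered on each node
dataset_size = [xx[i][0]['shape'][1] for i in xx]
# all nodes must expose the same number of features
assert min(dataset_size)==max(dataset_size)
data_size = dataset_size[0]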

2022-04-25 15:52:41,367 fedbiomed INFO - Component environment:
2022-04-25 15:52:41,369 fedbiomed INFO - type = ComponentType.RESEARCHER
2022-04-25 15:52:41,416 fedbiomed INFO - Messaging researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x124261460>
2022-04-25 15:52:41,472 fedbiomed INFO - Listing available datasets in all nodes... 
2022-04-25 15:52:51,516 fedbiomed INFO - 
 Node: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 | Number of Datasets: 1 
+---------------+-------------+-------------------+---------------+-----------+
| name          | data_type   | tags              | description   | shape     |
| breast_cancer | csv         | ['breast_cancer'] | breast_cancer | [134, 13] |
+---------------+-------------+-------------------+---------------+-----------+

2022-04-25 15:52:51,518 fedbiomed INFO - 
 Node: node_34845d75-1213-4389-b875-6f070482d764 | Number of Datasets: 1 
+---------

## Define an experiment model and parameters

Declare a `MIWAETrainingPlan` class (a `TorchTrainingPlan` subclass) to send to the nodes for training.

Note: write **only** the code to export in the following cell

In [5]:
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, transforms
import numpy as np
import torch.distributions as td
import pandas as pd

from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from fedbiomed.common.constants import ProcessTypes

# Here we define the model to be used. 
# You can use any class name (here 'MIWAETrainingPlan')
class MIWAETrainingPlan(TorchTrainingPlan):
    def __init__(self, model_args: dict = {}):
        super(MIWAETrainingPlan, self).__init__(model_args)
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        deps = ["from torchvision import datasets, transforms",
               "import torch.distributions as td",
               "import pandas as pd",
               "import numpy as np"]
        
        self.n_features=model_args['n_features']
        self.n_latent=model_args['n_latent']
        self.n_hidden=model_args['n_hidden']
        self.n_samples=model_args['n_samples']
        
        self.add_dependency(deps)
        
        # the encoder will output both the mean and the diagonal covariance
        self.encoder=nn.Sequential(
                        torch.nn.Linear(self.n_features, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, 2*self.n_latent),  
                        )
        # the decoder will output the mean, the scale, 
        # and the number of degrees of freedom (hence the 3*n_features)
        self.decoder = nn.Sequential(
                        torch.nn.Linear(self.n_latent, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, self.n_hidden),
                        torch.nn.ReLU(),
                        torch.nn.Linear(self.n_hidden, 3*self.n_features),  
                        )
        
        self.optimizer = torch.optim.Adam(list(self.encoder.parameters()) \
                                    + list(self.decoder.parameters()),lr=1e-3)
              
        self.encoder.apply(self.weights_init)
        self.decoder.apply(self.weights_init)
    
    def weights_init(self,layer):
        if isinstance(layer, nn.Linear):
            torch.nn.init.orthogonal_(layer.weight)
    
    def miwae_loss(self,iota_x,mask):
        batch_size = iota_x.shape[0]
        out_encoder = self.encoder(iota_x)
        # prior
        p_z = td.Independent(td.Normal(loc=torch.zeros(self.n_latent).to(self.device)\
                                       ,scale=torch.ones(self.n_latent).to(self.device)),1)
        
        q_zgivenxobs = td.Independent(td.Normal(loc=out_encoder[..., :self.n_latent],\
                                                scale=torch.nn.Softplus()\
                                                (out_encoder[..., self.n_latent:\
                                                             (2*self.n_latent)])),1)

        zgivenx = q_zgivenxobs.rsample([self.n_samples])
        zgivenx_flat = zgivenx.reshape([self.n_samples*batch_size,self.n_latent])

        out_decoder = self.decoder(zgivenx_flat)
        all_means_obs_model = out_decoder[..., :self.n_features]
        all_scales_obs_model = torch.nn.Softplus()(out_decoder[..., self.n_features:\
                                                               (2*self.n_features)]) + 0.001
        all_degfreedom_obs_model = torch.nn.Softplus()\
        (out_decoder[..., (2*self.n_features):(3*self.n_features)]) + 3

        data_flat = torch.Tensor.repeat(iota_x,[self.n_samples,1]).reshape([-1,1])
        tiledmask = torch.Tensor.repeat(mask,[self.n_samples,1])

        all_log_pxgivenz_flat = torch.distributions.StudentT\
        (loc=all_means_obs_model.reshape([-1,1]),\
         scale=all_scales_obs_model.reshape([-1,1]),\
         df=all_degfreedom_obs_model.reshape([-1,1])).log_prob(data_flat)
        all_log_pxgivenz = all_log_pxgivenz_flat.reshape([self.n_samples*batch_size,self.n_features])

        logpxobsgivenz = torch.sum(all_log_pxgivenz*tiledmask,1).reshape([self.n_samples,batch_size])
        logpz = p_z.log_prob(zgivenx)
        logq = q_zgivenxobs.log_prob(zgivenx)

        neg_bound = -torch.mean(torch.logsumexp(logpxobsgivenz + logpz - logq,0))

        return neg_bound

    def training_data(self,  batch_size = 48):
        
        df = pd.read_csv(self.dataset_path, sep=',', index_col=False)
        x_train = df.values
        x_mask = np.isfinite(x_train)
        # xhat_0: missing values are replaced by zeros.
        # This xhat_0 is what will be fed to our encoder.
        xhat_0 = np.copy(x_train)
        xhat_0[np.isnan(x_train)] = 0
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        
        data_manager = DataManager(dataset=xhat_0 , target=x_mask , **train_kwargs)
        
        return data_manager
    
    def training_step(self, data, mask):
        self.encoder.zero_grad()
        self.decoder.zero_grad()
        loss = self.miwae_loss(iota_x = data,mask = mask)
        return loss
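
For reference, `miwae_loss` above appears to implement the importance-weighted bound of MIWAE restricted to the observed entries. The following is a sketch in the notation of the code, under these assumptions: $x_i^{\mathrm{o}}$ denotes the observed part of sample $i$, $\iota(x_i^{\mathrm{o}})$ its zero-filled version (`iota_x`), $K$ is `n_samples`, $q_\gamma$ is the encoder distribution (`q_zgivenxobs`), $p_\theta$ is the Student-t decoder and $p(z)$ is the standard normal prior (`p_z`).

$$
\mathcal{L}_K(\theta,\gamma) = \sum_{i=1}^{n} \mathbb{E}_{z_{i1},\dots,z_{iK} \sim q_\gamma(z \mid \iota(x_i^{\mathrm{o}}))}
\left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p_\theta(x_i^{\mathrm{o}} \mid z_{ik})\, p(z_{ik})}{q_\gamma(z_{ik} \mid \iota(x_i^{\mathrm{o}}))} \right]
$$

In the code, `logpxobsgivenz + logpz - logq` is the log of the ratio inside the sum, and multiplying by `tiledmask` keeps only the observed features in $\log p_\theta(x_i^{\mathrm{o}} \mid z_{ik})$. `neg_bound` is the negative of this quantity averaged over the batch, up to the additive constant $\log K$ (the code takes `logsumexp` without subtracting $\log K$), which does not change the gradients.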

These arguments correspond, respectively, to:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side. 
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.
* `tags`: the data tags used to search for nodes to train on.
* `rounds`: the total number of training rounds.

If FedProx optimisation is requested, the `fedprox_mu` parameter must be defined in `training_args`. It must also be a float between XX and YY.

**NOTE:** typos and/or missing positional (required) arguments will raise an error. 🤓
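
To give an idea of what `fedprox_mu` controls, here is a minimal sketch of the proximal term used in the standard FedProx formulation, where each node adds `(mu / 2) * ||w - w_global||^2` to its local loss. This illustrates the general technique and is not taken from the Fed-BioMed source; the function name `fedprox_penalty` and the toy parameter arrays are assumptions.

```python
import numpy as np

def fedprox_penalty(local_params, global_params, mu):
    """Proximal term (mu / 2) * ||w - w_global||^2, summed over all parameter arrays."""
    sq_dist = sum(np.sum((w - w_g) ** 2)
                  for w, w_g in zip(local_params, global_params))
    return 0.5 * mu * sq_dist

# toy parameters: the global model received at the start of the round,
# and the local model after a few local updates
global_params = [np.zeros((2, 2)), np.zeros(2)]
local_params = [np.ones((2, 2)), np.array([1.0, -1.0])]

task_loss = 3.0   # placeholder for the local MIWAE loss on one batch
total_loss = task_loss + fedprox_penalty(local_params, global_params, mu=0.01)
print(total_loss)
```

A larger `mu` keeps each node's parameters closer to the global model, which can help when the clients' data differ, as with the three missingness rates used here.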

In [6]:
h = 128 # number of hidden units (same for all MLPs)
d = 10 # dimension of the latent space
K = 20 # number of importance samples (IS) used during training

n_epochs=5

model_args = {'n_features':data_size, 'n_latent':d,'n_hidden':h,'n_samples':K}

training_args = {
    'batch_size': 48, 
    'lr': 1e-3, 
    #'fedprox_mu': 0.01, 
    'log_interval' : 1,
    'epochs': n_epochs, 
    'dry_run': False,  
    'batch_maxnum': 200 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}

tags =  ['breast_cancer']
rounds = 15

## Declare and run the experiment

- search for nodes serving data with these `tags`, optionally filtering on a list of node IDs with `nodes`
- run a round of local training on the nodes with the model defined in `model_class`, followed by federation with `aggregator`
- run for `round_limit` rounds, applying the `node_selection_strategy` between rounds
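
The federation step can be pictured with the following sketch of federated averaging. It assumes that each node's parameters are weighted by its share of the total number of samples; it illustrates the idea only and is not the Fed-BioMed `FedAverage` implementation. The function name `federated_average`, the parameter values and the sample counts are illustrative assumptions.

```python
import numpy as np

def federated_average(node_params, node_sizes):
    """Average per-node parameter dicts, weighting each node by its sample count."""
    total = float(sum(node_sizes))
    weights = [n / total for n in node_sizes]
    return {
        name: sum(w * params[name] for w, params in zip(weights, node_params))
        for name in node_params[0]
    }

# one (toy) parameter tensor per node after a round of local training
node_params = [
    {"encoder.0.weight": np.full((2, 2), 1.0)},
    {"encoder.0.weight": np.full((2, 2), 2.0)},
    {"encoder.0.weight": np.full((2, 2), 4.0)},
]
node_sizes = [100, 100, 200]   # illustrative sample counts, not the real ones

aggregated = federated_average(node_params, node_sizes)
print(aggregated["encoder.0.weight"])
```

The aggregated parameters are then sent back to the nodes as the starting point of the next round.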

In [7]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

exp = Experiment(tags=tags,
                 model_args=model_args,
                 model_class=MIWAETrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

2022-04-25 15:53:07,983 fedbiomed INFO - Searching dataset with data tags: ['breast_cancer'] for all nodes
2022-04-25 15:53:17,996 fedbiomed INFO - Node selected for training -> node_72061288-2fe4-40b2-85db-34c2624c12bb
2022-04-25 15:53:17,998 fedbiomed INFO - Node selected for training -> node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
2022-04-25 15:53:17,999 fedbiomed INFO - Node selected for training -> node_34845d75-1213-4389-b875-6f070482d764
2022-04-25 15:53:18,004 fedbiomed INFO - Checking data quality of federated datasets...
2022-04-25 15:53:18,111 fedbiomed DEBUG - Model file has been saved: /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py
2022-04-25 15:53:18,459 fedbiomed DEBUG - upload (HTTP POST request) of file /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py successful, with status code 201
2022-04-25 

Let's start the experiment.

By default, this function does not return until all `round_limit` rounds have been completed on all the nodes.

In [8]:
exp.run()

2022-04-25 15:53:29,962 fedbiomed INFO - Sampled nodes in round 0 ['node_72061288-2fe4-40b2-85db-34c2624c12bb', 'node_9bb93b0b-515c-4b9b-9907-54018a3c2e84', 'node_34845d75-1213-4389-b875-6f070482d764']
2022-04-25 15:53:29,963 fedbiomed INFO - Sending request
					 To: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/25/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py', 'param

2022-04-25 15:53:31,799 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 2 | Completed: 48/133 (33%)
 					 Loss: 10.077735
					 ---------
2022-04-25 15:53:31,838 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 2 | Completed: 96/133 (67%)
 					 Loss: 6.128199
					 ---------
2022-04-25 15:53:31,854 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 1 | Completed: 126/138 (100%)
 					 Loss: 3.868602
					 ---------
2022-04-25 15:53:31,856 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 2 | Completed: 96/133 (67%)
 					 Loss: 11.076214
					 ---------
2022-04-25 15:53:31,887 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 2 | Completed: 111/133 (100%)
 					 Loss

2022-04-25 15:53:40,048 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_b28ddb97-0fc5-4750-810c-772d003aa96a.pt successful, with status code 200
2022-04-25 15:53:40,065 fedbiomed INFO - Downloading model params after training on node_34845d75-1213-4389-b875-6f070482d764 - from http://localhost:8844/media/uploads/2022/04/25/node_params_2c71c2f3-b685-4a4d-9f90-4397754f40be.pt
2022-04-25 15:53:40,094 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_51c2316d-9c3c-4ad8-9038-2e5150603ce1.pt successful, with status code 200
2022-04-25 15:53:40,101 fedbiomed INFO - Downloading model params after training on node_72061288-2fe4-40b2-85db-34c2624c12bb - from http://localhost:8844/media/uploads/2022/04/25/node_params_a8da16d3-0760-4aa0-8606-1d3859321f8d.pt
2022-04-25 15:53:40,145 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_15cff7c9-85c4-4ec2-b8a7-442fd41f8f52.pt successful, with status code 200
2022-04-25 15:53:40,160 fedbiomed INFO - Nodes that s

2022-04-25 15:53:40,648 fedbiomed INFO - INFO
					 NODE node_34845d75-1213-4389-b875-6f070482d764
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x13bde34c0>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
2022-04-25 15:53:40,655 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 1 | Completed: 48/138 (33%)
 					 Loss: 3.464229
					 ---------
2022-04-25 15:53:40,692 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 1 | Completed: 96/138 (67%)
 					 Loss: 4.490770
					 ---------
2022-04-25 15:53:40,723 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-

2022-04-25 15:53:41,869 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 5 | Completed: 96/133 (67%)
 					 Loss: 8.266540
					 ---------
2022-04-25 15:53:41,878 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 5 | Completed: 96/138 (67%)
 					 Loss: 3.585779
					 ---------
2022-04-25 15:53:41,907 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 5 | Completed: 111/133 (100%)
 					 Loss: 8.263949
					 ---------
2022-04-25 15:53:41,975 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 5 | Completed: 126/138 (100%)
 					 Loss: 3.715469
					 ---------
2022-04-25 15:53:42,189 fedbiomed INFO - INFO
					 NODE node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 MESSAGE: results uploaded successfully

2022-04-25 15:53:50,908 fedbiomed INFO - INFO
					 NODE node_72061288-2fe4-40b2-85db-34c2624c12bb
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x137c40d00>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
2022-04-25 15:53:51,122 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 1 | Completed: 48/133 (33%)
 					 Loss: 4.233840
					 ---------
2022-04-25 15:53:51,124 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 1 | Completed: 48/138 (33%)
 					 Loss: 2.722369
					 ---------
2022-04-25 15:53:51,125 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-

2022-04-25 15:53:51,814 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 5 | Completed: 48/133 (33%)
 					 Loss: 3.735688
					 ---------
2022-04-25 15:53:51,818 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 4 | Completed: 48/133 (33%)
 					 Loss: 6.244238
					 ---------
2022-04-25 15:53:51,847 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 4 | Completed: 96/133 (67%)
 					 Loss: 6.254208
					 ---------
2022-04-25 15:53:51,865 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 5 | Completed: 96/133 (67%)
 					 Loss: 4.169677
					 ---------
2022-04-25 15:53:51,899 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 4 | Completed: 111/133 (100%)
 					 Loss:

2022-04-25 15:54:01,029 fedbiomed DEBUG - researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d
					 NODE node_34845d75-1213-4389-b875-6f070482d764
					 MESSAGE: There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored
-----------------------------------------------------------------
2022-04-25 15:54:01,222 fedbiomed INFO - INFO
					 NODE node_34845d75-1213-4389-b875-6f070482d764
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x13b2d4190>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
2022-04-25 15:54:01,292 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-

2022-04-25 15:54:01,875 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 3 | Completed: 111/133 (100%)
 					 Loss: 2.484135
					 ---------
2022-04-25 15:54:01,895 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 4 | Completed: 48/138 (33%)
 					 Loss: 2.179721
					 ---------
2022-04-25 15:54:01,900 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 4 | Completed: 96/133 (67%)
 					 Loss: 4.705926
					 ---------
2022-04-25 15:54:01,932 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 4 | Completed: 48/133 (33%)
 					 Loss: 2.431350
					 ---------
2022-04-25 15:54:01,953 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 4 | Completed: 111/133 (100%)
 					 Loss:

2022-04-25 15:54:11,326 fedbiomed DEBUG - researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d
2022-04-25 15:54:11,328 fedbiomed INFO - Sending request
					 To: node_34845d75-1213-4389-b875-6f070482d764
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/25/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/25/aggregated_params_c42ed61b-2c78-4470-9a90-4a170e581000.p

2022-04-25 15:54:11,928 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 3 | Completed: 96/138 (67%)
 					 Loss: 1.726970
					 ---------
2022-04-25 15:54:11,939 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 3 | Completed: 111/133 (100%)
 					 Loss: 2.136501
					 ---------
2022-04-25 15:54:11,954 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 3 | Completed: 48/133 (33%)
 					 Loss: 3.610759
					 ---------
2022-04-25 15:54:11,961 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 3 | Completed: 126/138 (100%)
 					 Loss: 1.164123
					 ---------
2022-04-25 15:54:11,972 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 4 | Completed: 48/133 (33%)
 					 Loss:

2022-04-25 15:54:21,730 fedbiomed DEBUG - researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d
2022-04-25 15:54:21,735 fedbiomed INFO - Sending request
					 To: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/25/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/25/aggregated_params_c3a12503-f774-4105-9cdc-1207e94ed420.p

2022-04-25 15:54:22,244 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 2 | Completed: 48/133 (33%)
 					 Loss: 3.697309
					 ---------
2022-04-25 15:54:22,264 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 3 | Completed: 48/138 (33%)
 					 Loss: -0.483384
					 ---------
2022-04-25 15:54:22,283 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 3 | Completed: 48/133 (33%)
 					 Loss: 2.294437
					 ---------
2022-04-25 15:54:22,302 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 2 | Completed: 96/133 (67%)
 					 Loss: 3.553247
					 ---------
2022-04-25 15:54:22,318 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 3 | Completed: 96/133 (67%)
 					 Loss:

2022-04-25 15:54:32,074 fedbiomed DEBUG - upload (HTTP POST request) of file /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/aggregated_params_88010d9f-3848-44d5-a4f4-6c73eb2b035b.pt successful, with status code 201
2022-04-25 15:54:32,076 fedbiomed INFO - Saved aggregated params for round 5 in /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/aggregated_params_88010d9f-3848-44d5-a4f4-6c73eb2b035b.pt
2022-04-25 15:54:32,085 fedbiomed INFO - Sampled nodes in round 6 ['node_72061288-2fe4-40b2-85db-34c2624c12bb', 'node_9bb93b0b-515c-4b9b-9907-54018a3c2e84', 'node_34845d75-1213-4389-b875-6f070482d764']
2022-04-25 15:54:32,087 fedbiomed INFO - Sending request
					 To: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Request: : Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'train

2022-04-25 15:54:32,410 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 1 | Completed: 111/133 (100%)
 					 Loss: 1.070515
					 ---------
2022-04-25 15:54:32,413 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 1 | Completed: 126/138 (100%)
 					 Loss: 0.800889
					 ---------
2022-04-25 15:54:32,435 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 1 | Completed: 111/133 (100%)
 					 Loss: 4.592980
					 ---------
2022-04-25 15:54:32,443 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 2 | Completed: 48/138 (33%)
 					 Loss: 0.275570
					 ---------
2022-04-25 15:54:32,469 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 2 | Completed: 48/133 (33%)
 					 Loss

2022-04-25 15:54:33,527 fedbiomed INFO - INFO
					 NODE node_34845d75-1213-4389-b875-6f070482d764
					 MESSAGE: results uploaded successfully
-----------------------------------------------------------------
2022-04-25 15:54:33,674 fedbiomed INFO - INFO
					 NODE node_72061288-2fe4-40b2-85db-34c2624c12bb
					 MESSAGE: results uploaded successfully
-----------------------------------------------------------------
2022-04-25 15:54:42,129 fedbiomed INFO - Downloading model params after training on node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 - from http://localhost:8844/media/uploads/2022/04/25/node_params_7de1992d-6034-479c-9c19-bb4672f06507.pt
2022-04-25 15:54:42,178 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_8ec61215-e1ef-4dea-878c-022fa5a111a4.pt successful, with status code 200
2022-04-25 15:54:42,191 fedbiomed INFO - Downloading model params after training on node_34845d75-1213-4389-b875-6f070482d764 - f

					 NODE node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 MESSAGE: There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored
-----------------------------------------------------------------
2022-04-25 15:54:42,769 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 1 | Completed: 96/133 (67%)
 					 Loss: 4.596344
					 ---------
2022-04-25 15:54:42,771 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 1 | Completed: 96/138 (67%)
 					 Loss: 1.255809
					 ---------
2022-04-25 15:54:42,774 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 1 | Completed: 111/133 (100%)
 					 Loss: 2.453826
					 ---------
2022-04-25 15:54:42,777 fedbiomed INFO - TRAINING

2022-04-25 15:54:43,366 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 5 | Completed: 96/138 (67%)
 					 Loss: 0.468686
					 ---------
2022-04-25 15:54:43,386 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 5 | Completed: 96/133 (67%)
 					 Loss: 3.351791
					 ---------
2022-04-25 15:54:43,402 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 5 | Completed: 126/138 (100%)
 					 Loss: -0.527020
					 ---------
2022-04-25 15:54:43,425 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 5 | Completed: 111/133 (100%)
 					 Loss: 2.816196
					 ---------
2022-04-25 15:54:43,453 fedbiomed INFO - TRAINING
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 Epoch: 5 | Completed: 48/133 (33%)
 					 Loss:

					 NODE node_72061288-2fe4-40b2-85db-34c2624c12bb
					 MESSAGE: There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored
-----------------------------------------------------------------
2022-04-25 15:54:52,995 fedbiomed INFO - INFO
					 NODE node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					 MESSAGE: training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x13f7d44c0>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}
-----------------------------------------------------------------
2022-04-25 15:54:53,013 fedbiomed INFO - INFO
					 NODE node_72061288-2fe4-40b2-85db-34c2624c12bb
					 MESSAGE: training with arguments {'history_monitor': <fedbi

2022-04-25 15:54:53,594 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 4 | Completed: 126/138 (100%)
 					 Loss: 0.891807
					 ---------
2022-04-25 15:54:53,620 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 3 | Completed: 111/133 (100%)
 					 Loss: 2.620686
					 ---------
2022-04-25 15:54:53,641 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 5 | Completed: 48/138 (33%)
 					 Loss: 0.746450
					 ---------
2022-04-25 15:54:53,675 fedbiomed INFO - TRAINING
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb
					 Epoch: 5 | Completed: 96/138 (67%)
 					 Loss: 0.956461
					 ---------
2022-04-25 15:54:53,683 fedbiomed INFO - TRAINING
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764
					 Epoch: 4 | Completed: 48/133 (33%)
 					 Loss:

2022-04-25 15:55:03,154 fedbiomed DEBUG - researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d
2022-04-25 15:55:03,157 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_34845d75-1213-4389-b875-6f070482d764 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/25/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/25/aggregated_params_34afd6ee-d731-483a-b1f3-0fdf95ef74a8.p

2022-04-25 15:55:03,744 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 3 | Completed: 96/138 (67%) 
 					 Loss: [1m-0.048885[0m 
					 ---------
2022-04-25 15:55:03,760 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: [1m1.554569[0m 
					 ---------
2022-04-25 15:55:03,778 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 3 | Completed: 126/138 (100%) 
 					 Loss: [1m0.829475[0m 
					 ---------
2022-04-25 15:55:03,787 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: [1m3.626635[0m 
					 ---------
2022-04-25 15:55:03,805 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 3 | Completed: 111/133 (100%) 
 					 Loss:

2022-04-25 15:55:13,492 fedbiomed DEBUG - researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d
2022-04-25 15:55:13,495 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/25/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/25/aggregated_params_4704ca5b-1ed7-43b6-821d-5d67b9b2df41.p

2022-04-25 15:55:13,982 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 2 | Completed: 126/138 (100%) 
 					 Loss: [1m0.362217[0m 
					 ---------
2022-04-25 15:55:14,017 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 3 | Completed: 48/138 (33%) 
 					 Loss: [1m1.058595[0m 
					 ---------
2022-04-25 15:55:14,019 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 2 | Completed: 96/133 (67%) 
 					 Loss: [1m3.576982[0m 
					 ---------
2022-04-25 15:55:14,051 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 3 | Completed: 96/138 (67%) 
 					 Loss: [1m0.659602[0m 
					 ---------
2022-04-25 15:55:14,062 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss: [

2022-04-25 15:55:23,881 fedbiomed DEBUG - upload (HTTP POST request) of file /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/aggregated_params_8a6ad915-80f8-4355-a9d8-f3529a795d96.pt successful, with status code 201
2022-04-25 15:55:23,883 fedbiomed INFO - Saved aggregated params for round 10 in /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/aggregated_params_8a6ad915-80f8-4355-a9d8-f3529a795d96.pt
2022-04-25 15:55:23,885 fedbiomed INFO - Sampled nodes in round 11 ['node_72061288-2fe4-40b2-85db-34c2624c12bb', 'node_9bb93b0b-515c-4b9b-9907-54018a3c2e84', 'node_34845d75-1213-4389-b875-6f070482d764']
2022-04-25 15:55:23,888 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'tra

2022-04-25 15:55:24,238 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 1 | Completed: 111/133 (100%) 
 					 Loss: [1m2.216051[0m 
					 ---------
2022-04-25 15:55:24,263 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 1 | Completed: 111/133 (100%) 
 					 Loss: [1m4.324576[0m 
					 ---------
2022-04-25 15:55:24,297 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Loss: [1m1.819958[0m 
					 ---------
2022-04-25 15:55:24,300 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 1 | Completed: 126/138 (100%) 
 					 Loss: [1m-0.908472[0m 
					 ---------
2022-04-25 15:55:24,302 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 2 | Completed: 48/133 (33%) 
 					 Los

2022-04-25 15:55:25,355 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_34845d75-1213-4389-b875-6f070482d764
					[1m MESSAGE:[0m results uploaded successfully [0m
-----------------------------------------------------------------
2022-04-25 15:55:25,418 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_72061288-2fe4-40b2-85db-34c2624c12bb
					[1m MESSAGE:[0m results uploaded successfully [0m
-----------------------------------------------------------------
2022-04-25 15:55:33,928 fedbiomed INFO - Downloading model params after training on node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 - from http://localhost:8844/media/uploads/2022/04/25/node_params_080f9393-f657-4439-b202-56b542c8e3a4.pt
2022-04-25 15:55:33,970 fedbiomed DEBUG - upload (HTTP GET request) of file node_params_ff2752a0-0935-4e92-89aa-47fbdd641bb6.pt successful, with status code 200
2022-04-25 15:55:33,977 fedbiomed INFO - Downloading model params after training on node_34845d75-1213-4389-b875-6f070482d764 - f

2022-04-25 15:55:34,558 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 1 | Completed: 111/133 (100%) 
 					 Loss: [1m2.157709[0m 
					 ---------
					[1m NODE[0m node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					[1m MESSAGE:[0m There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored[0m
-----------------------------------------------------------------
2022-04-25 15:55:34,587 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					[1m MESSAGE:[0m training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x13e8cad30>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}[0m
----------------------------------------------

2022-04-25 15:55:35,164 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 5 | Completed: 48/138 (33%) 
 					 Loss: [1m0.520248[0m 
					 ---------
2022-04-25 15:55:35,167 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 5 | Completed: 96/133 (67%) 
 					 Loss: [1m1.416206[0m 
					 ---------
2022-04-25 15:55:35,184 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 5 | Completed: 96/133 (67%) 
 					 Loss: [1m3.358291[0m 
					 ---------
2022-04-25 15:55:35,210 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 5 | Completed: 111/133 (100%) 
 					 Loss: [1m1.811840[0m 
					 ---------
2022-04-25 15:55:35,247 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 5 | Completed: 96/138 (67%) 
 					 Loss: [

					[1m NODE[0m node_72061288-2fe4-40b2-85db-34c2624c12bb
					[1m MESSAGE:[0m There is no test activated for the round. Please set flag for `test_on_global_updates`, `test_on_local_updates`, or both. Splitting dataset for testing will be ignored[0m
-----------------------------------------------------------------
2022-04-25 15:55:44,959 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_9bb93b0b-515c-4b9b-9907-54018a3c2e84
					[1m MESSAGE:[0m training with arguments {'history_monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x13e914bb0>, 'node_args': {'gpu': False, 'gpu_num': None, 'gpu_only': False}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}[0m
-----------------------------------------------------------------
2022-04-25 15:55:44,961 fedbiomed INFO - [1mINFO[0m
					[1m NODE[0m node_72061288-2fe4-40b2-85db-34c2624c12bb
					[1m MESSAGE:[0m training with arguments {'history_monitor': <fedbi

2022-04-25 15:55:45,316 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 4 | Completed: 48/138 (33%) 
 					 Loss: [1m1.220340[0m 
					 ---------
2022-04-25 15:55:45,349 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 4 | Completed: 96/133 (67%) 
 					 Loss: [1m1.608760[0m 
					 ---------
2022-04-25 15:55:45,373 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 4 | Completed: 96/138 (67%) 
 					 Loss: [1m0.229422[0m 
					 ---------
2022-04-25 15:55:45,377 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 4 | Completed: 96/133 (67%) 
 					 Loss: [1m2.402742[0m 
					 ---------
2022-04-25 15:55:45,389 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 4 | Completed: 111/133 (100%) 
 					 Loss: [

2022-04-25 15:55:54,968 fedbiomed DEBUG - researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d
2022-04-25 15:55:54,971 fedbiomed INFO - [1mSending request[0m 
					[1m To[0m: node_34845d75-1213-4389-b875-6f070482d764 
					[1m Request: [0m: Perform training with the arguments: {'researcher_id': 'researcher_ef0cd244-d377-4a9e-9f48-f04499b7e67d', 'job_id': 'f968d1c4-1d5a-47cf-9048-f942afc6f39b', 'training_args': {'test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_global_updates': False, 'test_metric': None, 'test_metric_args': {}, 'batch_size': 48, 'lr': 0.001, 'log_interval': 1, 'epochs': 5, 'dry_run': False, 'batch_maxnum': 200}, 'training': True, 'model_args': {'n_features': 13, 'n_latent': 10, 'n_hidden': 128, 'n_samples': 20}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/04/25/my_model_d2bf3bcf-56f9-412c-944c-461db1cea595.py', 'params_url': 'http://localhost:8844/media/uploads/2022/04/25/aggregated_params_3116da6b-9b2a-4a75-b8c9-685e8e5e6c34.p

2022-04-25 15:55:55,566 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 3 | Completed: 48/133 (33%) 
 					 Loss: [1m2.354410[0m 
					 ---------
2022-04-25 15:55:55,586 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 3 | Completed: 111/133 (100%) 
 					 Loss: [1m1.189783[0m 
					 ---------
2022-04-25 15:55:55,598 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_72061288-2fe4-40b2-85db-34c2624c12bb 
					 Epoch: 3 | Completed: 96/138 (67%) 
 					 Loss: [1m1.168085[0m 
					 ---------
2022-04-25 15:55:55,611 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_34845d75-1213-4389-b875-6f070482d764 
					 Epoch: 3 | Completed: 96/133 (67%) 
 					 Loss: [1m2.626155[0m 
					 ---------
2022-04-25 15:55:55,639 fedbiomed INFO - [1mTRAINING[0m 
					 NODE_ID: node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 
					 Epoch: 4 | Completed: 48/133 (33%) 
 					 Loss: [

15

Local training results for each round and each node are available via `exp.training_replies()` (indexed from 0 to `rounds` - 1).

For example, you can view the training results for the last round below.

Several timings (in seconds) are reported for each dataset of a node participating in a round:
- `rtime_training` real time (clock time) spent in the training function on the node
- `ptime_training` process time (user and system CPU) spent in the training function on the node
- `rtime_total` real time (clock time) spent in the researcher between sending the request and handling the response, at the `Job()` layer
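
As a rough illustration of how these three timings relate, the sketch below computes, for each node, the time not spent inside the training function (`rtime_total - rtime_training`) and the ratio of CPU time to wall-clock time during training. The list `round_timings` is a hypothetical stand-in for the entries of `round_data`, with shortened node IDs and the rounded values printed for the last round of this notebook; `timing_summary` is an illustrative helper, not part of Fed-BioMed.

```python
# Hypothetical stand-in for the per-node timing entries of the last round
# (rounded values as printed in this notebook, node IDs shortened).
round_timings = [
    {"node_id": "node_9bb93b0b", "rtime_training": 0.80, "ptime_training": 0.38, "rtime_total": 10.02},
    {"node_id": "node_72061288", "rtime_training": 0.75, "ptime_training": 0.37, "rtime_total": 10.08},
    {"node_id": "node_34845d75", "rtime_training": 0.81, "ptime_training": 0.37, "rtime_total": 10.12},
]

def timing_summary(entries):
    """Return, per node, the non-training time and the CPU/wall-clock ratio."""
    summary = {}
    for e in entries:
        overhead = e["rtime_total"] - e["rtime_training"]        # seconds outside the training function
        cpu_share = e["ptime_training"] / e["rtime_training"]    # CPU time per second of wall-clock training
        summary[e["node_id"]] = {"overhead_s": overhead, "cpu_share": cpu_share}
    return summary

summary = timing_summary(round_timings)
for node_id, s in summary.items():
    print(f"{node_id}: overhead={s['overhead_s']:.2f}s, cpu_share={s['cpu_share']:.2f}")
```

With these numbers, the training function itself accounts for less than one second of the roughly ten seconds measured on the researcher side for each node.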

In [9]:
print("\nList the training rounds : ", exp.training_replies().keys())

print("\nList the nodes for the last training round and their timings : ")
round_data = exp.training_replies()[rounds - 1].data()
for c in range(len(round_data)):
    print("\t- {id} :\
    \n\t\trtime_training={rtraining:.2f} seconds\
    \n\t\tptime_training={ptraining:.2f} seconds\
    \n\t\trtime_total={rtotal:.2f} seconds".format(id = round_data[c]['node_id'],
        rtraining = round_data[c]['timing']['rtime_training'],
        ptraining = round_data[c]['timing']['ptime_training'],
        rtotal = round_data[c]['timing']['rtime_total']))
print('\n')
    
exp.training_replies()[rounds - 1].dataframe()


List the training rounds :  dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

List the nodes for the last training round and their timings : 
	- node_9bb93b0b-515c-4b9b-9907-54018a3c2e84 :    
		rtime_training=0.80 seconds    
		ptime_training=0.38 seconds    
		rtime_total=10.02 seconds
	- node_72061288-2fe4-40b2-85db-34c2624c12bb :    
		rtime_training=0.75 seconds    
		ptime_training=0.37 seconds    
		rtime_total=10.08 seconds
	- node_34845d75-1213-4389-b875-6f070482d764 :    
		rtime_training=0.81 seconds    
		ptime_training=0.37 seconds    
		rtime_total=10.12 seconds




Unnamed: 0,success,msg,dataset_id,node_id,params_path,params,timing
0,True,,dataset_f85df4f8-f797-41d7-8673-784b0d90668c,node_9bb93b0b-515c-4b9b-9907-54018a3c2e84,/Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed...,"{'encoder.0.weight': [[tensor(0.0841), tensor(...","{'rtime_training': 0.7971322799999996, 'ptime_..."
1,True,,dataset_989ac5b3-b653-499a-a18a-27af7d809d06,node_72061288-2fe4-40b2-85db-34c2624c12bb,/Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed...,"{'encoder.0.weight': [[tensor(0.0831), tensor(...","{'rtime_training': 0.7534455420000086, 'ptime_..."
2,True,,dataset_a3f9753f-44c4-4714-a76b-ccb0685e84b1,node_34845d75-1213-4389-b875-6f070482d764,/Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed...,"{'encoder.0.weight': [[tensor(0.0829), tensor(...","{'rtime_training': 0.8075077310000154, 'ptime_..."


Federated parameters for each round are available via `exp.aggregated_params()` (indexed from 0 to `rounds` - 1).

For example, you can view the federated parameters for the last round of the experiment:

In [10]:
print("\nList the training rounds : ", exp.aggregated_params().keys())

print("\nAccess the federated params for the last training round :")
print("\t- params_path: ", exp.aggregated_params()[rounds - 1]['params_path'])
print("\t- parameter data: ", exp.aggregated_params()[rounds - 1]['params'].keys())


List the training rounds :  dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])

Access the federated params for the last training round :
	- params_path:  /Users/balelli/ownCloud/INRIA_EPIONE/FedBioMed/fedbiomed/var/experiments/Experiment_0000/aggregated_params_f04be204-0564-4320-aa99-534655f8deac.pt
	- parameter data:  odict_keys(['encoder.0.weight', 'encoder.0.bias', 'encoder.2.weight', 'encoder.2.bias', 'encoder.4.weight', 'encoder.4.bias', 'decoder.0.weight', 'decoder.0.bias', 'decoder.2.weight', 'decoder.2.bias', 'decoder.4.weight', 'decoder.4.bias'])


# Test and comparison to local training

## 1. Testing on an external dataset

First, we test how well the final federated model imputes missing data on a held-out test dataset. To this end, we standardize the test dataset, `data_test`, defined at the beginning of this notebook, and randomly remove 50% of its entries.
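
The same standardize-then-mask steps are repeated several times in this notebook, so they could be wrapped in a small helper. The sketch below is one possible way to do it; the name `mask_at_random` and its signature are assumptions made for illustration. It uses NumPy's `default_rng` rather than the global `np.random.seed`, so it will not reproduce the exact missingness pattern drawn in the cells of this notebook.

```python
import numpy as np

def mask_at_random(x, perc_miss, seed=1234):
    """Standardize x column-wise, then set a fraction perc_miss of its entries to NaN.

    Returns (x_full, x_miss, mask), where mask is True for observed entries.
    """
    rng = np.random.default_rng(seed)
    n, p = x.shape
    x_full = (x - x.mean(axis=0)) / x.std(axis=0)     # column-wise standardization
    n_miss = int(np.floor(n * p * perc_miss))         # number of entries to remove
    miss_idx = rng.choice(n * p, n_miss, replace=False)
    x_miss = x_full.flatten()                         # flatten() returns a copy
    x_miss[miss_idx] = np.nan
    x_miss = x_miss.reshape(n, p)
    mask = np.isfinite(x_miss)                        # True = observed, False = missing
    return x_full, x_miss, mask

# Toy usage on a small synthetic matrix (not the notebook's data)
x_toy = np.arange(40, dtype=float).reshape(10, 4)
x_full, x_miss, mask = mask_at_random(x_toy, perc_miss=0.5)
print((~mask).sum(), "entries removed out of", mask.size)
```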

In [11]:
# from the test dataset, we will remove randomly 50% of data
np.random.seed(1234)

perc_miss = 0.5 # 50% of missing data

n = data_test.shape[0] # number of observations
p = data_test.shape[1] # number of features
xfull = np.copy(data_test)
xfull = (xfull - np.mean(xfull,0))/np.std(xfull,0)
xmiss = np.copy(xfull)
xmiss_flat = xmiss.flatten()
miss_pattern = np.random.choice(n*p, np.floor(n*p*perc_miss).astype(np.int_),\
                                replace=False)
xmiss_flat[miss_pattern] = np.nan 
xmiss = xmiss_flat.reshape([n,p]) # in xmiss, the missing values are represented by nans
mask = np.isfinite(xmiss) # binary mask: True where a value is observed, False where it is missing
xhat_0 = np.copy(xmiss)
xhat_0[np.isnan(xmiss)] = 0
xhat = np.copy(xhat_0) # This will be our imputed data matrix

We instantiate the model using the federated parameters from the last round:

In [12]:
L = 100

# extract federated model into PyTorch framework
model = exp.model_instance()
model.load_state_dict(exp.aggregated_params()[rounds - 1]['params'])

encoder = model.encoder
decoder = model.decoder

We define the MIWAE imputation routine:

In [13]:
p_z = td.Independent(td.Normal(loc=torch.zeros(d),scale=torch.ones(d)),1)

def miwae_impute(iota_x,mask,L):
    batch_size = iota_x.shape[0]
    out_encoder = encoder(iota_x)
    q_zgivenxobs = td.Independent(td.Normal(loc=out_encoder[..., :d],scale=torch.nn.Softplus()(out_encoder[..., d:(2*d)])),1)

    zgivenx = q_zgivenxobs.rsample([L])
    zgivenx_flat = zgivenx.reshape([L*batch_size,d])

    out_decoder = decoder(zgivenx_flat)
    all_means_obs_model = out_decoder[..., :p]
    all_scales_obs_model = torch.nn.Softplus()(out_decoder[..., p:(2*p)]) + 0.001
    all_degfreedom_obs_model = torch.nn.Softplus()(out_decoder[..., (2*p):(3*p)]) + 3

    data_flat = torch.Tensor.repeat(iota_x,[L,1]).reshape([-1,1])
    tiledmask = torch.Tensor.repeat(mask,[L,1])

    all_log_pxgivenz_flat = torch.distributions.StudentT(loc=all_means_obs_model.reshape([-1,1]),scale=all_scales_obs_model.reshape([-1,1]),df=all_degfreedom_obs_model.reshape([-1,1])).log_prob(data_flat)
    all_log_pxgivenz = all_log_pxgivenz_flat.reshape([L*batch_size,p])

    logpxobsgivenz = torch.sum(all_log_pxgivenz*tiledmask,1).reshape([L,batch_size])
    logpz = p_z.log_prob(zgivenx)
    logq = q_zgivenxobs.log_prob(zgivenx)

    xgivenz = td.Independent(td.StudentT(loc=all_means_obs_model, scale=all_scales_obs_model, df=all_degfreedom_obs_model),1)

    imp_weights = torch.nn.functional.softmax(logpxobsgivenz + logpz - logq,0) # these are w_1,....,w_L for all observations in the batch
    xms = xgivenz.mean.reshape([L,batch_size,p])  # conditional means E[x | z_l] for each of the L latent samples
    xm=torch.einsum('ki,kij->ij', imp_weights, xms) 

    return xm
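
The last three lines of `miwae_impute` form a self-normalized importance sampling estimate: the log-weights `logpxobsgivenz + logpz - logq` are passed through a softmax over the `L` samples, and the imputation is the weighted average of the decoder means. The NumPy sketch below isolates that step on toy arrays; `snis_impute`, `log_w`, and `cond_means` are names introduced here for illustration only.

```python
import numpy as np

def snis_impute(log_w, cond_means):
    """Self-normalized importance-weighted average of conditional means.

    log_w:      array of shape (L, batch), unnormalized log importance weights
    cond_means: array of shape (L, batch, p), decoder mean for each latent sample
    """
    log_w = log_w - log_w.max(axis=0, keepdims=True)    # stabilize the softmax
    w = np.exp(log_w)
    w = w / w.sum(axis=0, keepdims=True)                # weights sum to 1 over the L samples
    return np.einsum('ki,kij->ij', w, cond_means)       # same contraction as in miwae_impute

# Toy check: with equal log-weights the estimate is the plain average over samples
rng = np.random.default_rng(0)
toy_means = rng.normal(size=(3, 2, 4))                  # L=3, batch=2, p=4
uniform = snis_impute(np.zeros((3, 2)), toy_means)

# If one sample dominates the weights, the estimate collapses onto its mean
peaked_log_w = np.array([[0.0, 0.0], [50.0, 50.0], [0.0, 0.0]])
peaked = snis_impute(peaked_log_w, toy_means)
```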

In [14]:
def mse(xhat,xtrue,mask): # MSE function for imputations
    xhat = np.array(xhat)
    xtrue = np.array(xtrue)
    return np.mean(np.power(xhat-xtrue,2)[~mask])

Finally, we impute the missing values and evaluate the imputation error as the MSE over the missing entries:

In [15]:
xhat[~mask] = miwae_impute(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float(),L= L).cpu().data.numpy()[~mask]
err_test_data = np.array([mse(xhat,xfull,mask)])
print('Imputation MSE on testing data  %g' %err_test_data)
print('-----')

Imputation MSE on testing data  0.621936
-----
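
To put this error in context: because each column is standardized before masking, filling every missing entry with 0 (the column mean) should give an MSE close to 1 on the missing entries when values are removed uniformly at random. The sketch below checks this on synthetic Gaussian data, which is an assumption made for illustration and not the notebook's dataset; `mse_missing` mirrors the `mse` function defined above.

```python
import numpy as np

def mse_missing(xhat, xtrue, mask):
    """MSE restricted to the missing entries (mask is True where observed)."""
    xhat = np.asarray(xhat)
    xtrue = np.asarray(xtrue)
    return np.mean(np.power(xhat - xtrue, 2)[~mask])

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 13))                  # synthetic data with 13 features
x = (x - x.mean(axis=0)) / x.std(axis=0)        # standardize, as in the notebook
mask = rng.random(x.shape) > 0.5                # roughly 50% of entries observed
x_zero = np.where(mask, x, 0.0)                 # zero (mean) imputation baseline

baseline = mse_missing(x_zero, x, mask)
print("Zero-imputation MSE on missing entries: %g" % baseline)
```

An imputation MSE clearly below 1 therefore indicates that the model recovers some information about the missing values beyond the column means.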


## 2. Testing on a client's dataset

We now use the final federated model to impute missing data for client 1, whose data was used for training:

In [16]:
data_client_1 = Clients_data[0]
n = data_client_1.shape[0] # number of observations
p = data_client_1.shape[1] # number of features

xfull = np.copy(data_client_1)
xfull = (xfull - np.mean(xfull,0))/np.std(xfull,0)
xmiss = np.copy(xfull)
xmiss_flat = xmiss.flatten()
miss_pattern = np.random.choice(n*p, np.floor(n*p*perc_miss).astype(np.int_),\
                                replace=False)
xmiss_flat[miss_pattern] = np.nan 
xmiss = xmiss_flat.reshape([n,p]) # in xmiss, the missing values are represented by nans
mask = np.isfinite(xmiss) # binary mask: True where a value is observed, False where it is missing
xhat_0 = np.copy(xmiss)
xhat_0[np.isnan(xmiss)] = 0
xhat = np.copy(xhat_0) # This will be our imputed data matrix

### Now we do the imputation

xhat[~mask] = miwae_impute(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float(),L= L).cpu().data.numpy()[~mask]
err_cl1_data = np.array([mse(xhat,xfull,mask)])
print('Imputation MSE on data from client 1  %g' %err_cl1_data)
print('-----')

Imputation MSE on data from client 1  0.556671
-----


## 3. Local training and testing on a client

Finally, we evaluate the same model architecture trained locally on client 1's data and tested on that same dataset. We use a total of `n_epochs` × `rounds` local epochs.
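
As a quick sanity check on the training budget, the sketch below works out the number of local epochs and the mini-batch sizes produced by the `np.array_split` call used in the local training loop. The values `n_epochs = 5`, `rounds = 15`, and `bs = 48` are read from the logs and cells of this notebook; `n = 133` assumes that client 1 is one of the nodes reporting 133 samples in the logs.

```python
import numpy as np

n_epochs, rounds = 5, 15     # assumed to match the notebook variables (values read from the logs)
n, bs = 133, 48              # assumed number of rows for client 1, and the batch size

n_epochs_local = n_epochs * rounds                  # total number of local epochs
rng = np.random.default_rng(0)
perm = rng.permutation(n)                           # random reshuffling of the rows
batches = np.array_split(perm, max(1, n // bs))     # n // bs = 2 sections
batch_sizes = [len(b) for b in batches]

print(n_epochs_local, batch_sizes)   # prints: 75 [67, 66]
```

Under these assumptions, splitting into `n // bs` sections yields two batches of about 66 rows each rather than batches of exactly 48 rows, so the effective batch size in the local loop is larger than `bs`.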

In [17]:
p_z = td.Independent(td.Normal(loc=torch.zeros(d),scale=torch.ones(d)),1)

def miwae_loss(iota_x,mask):
    batch_size = iota_x.shape[0]
    out_encoder = encoder(iota_x)
    q_zgivenxobs = td.Independent(td.Normal(loc=out_encoder[..., :d],scale=torch.nn.Softplus()(out_encoder[..., d:(2*d)])),1)

    zgivenx = q_zgivenxobs.rsample([K])
    zgivenx_flat = zgivenx.reshape([K*batch_size,d])

    out_decoder = decoder(zgivenx_flat)
    all_means_obs_model = out_decoder[..., :p]
    all_scales_obs_model = torch.nn.Softplus()(out_decoder[..., p:(2*p)]) + 0.001
    all_degfreedom_obs_model = torch.nn.Softplus()(out_decoder[..., (2*p):(3*p)]) + 3

    data_flat = torch.Tensor.repeat(iota_x,[K,1]).reshape([-1,1])
    tiledmask = torch.Tensor.repeat(mask,[K,1])

    all_log_pxgivenz_flat = torch.distributions.StudentT(loc=all_means_obs_model.reshape([-1,1]),scale=all_scales_obs_model.reshape([-1,1]),df=all_degfreedom_obs_model.reshape([-1,1])).log_prob(data_flat)
    all_log_pxgivenz = all_log_pxgivenz_flat.reshape([K*batch_size,p])

    logpxobsgivenz = torch.sum(all_log_pxgivenz*tiledmask,1).reshape([K,batch_size])
    logpz = p_z.log_prob(zgivenx)
    logq = q_zgivenxobs.log_prob(zgivenx)

    neg_bound = -torch.mean(torch.logsumexp(logpxobsgivenz + logpz - logq,0))

    return neg_bound

We perform the local training:

In [18]:
n_epochs_local = n_epochs*rounds
bs = 48 # batch size

encoder = nn.Sequential(
    torch.nn.Linear(p, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, 2*d),  # the encoder will output both the mean and the diagonal covariance
)

decoder = nn.Sequential(
    torch.nn.Linear(d, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, h),
    torch.nn.ReLU(),
    torch.nn.Linear(h, 3*p),  # the decoder outputs the mean, the scale, and the degrees of freedom (hence the 3*p)
)

optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),lr=1e-3)

def weights_init(layer):
    if type(layer) == nn.Linear: torch.nn.init.orthogonal_(layer.weight)
        
encoder.apply(weights_init)
decoder.apply(weights_init)

for ep in range(1,n_epochs_local+1): # run exactly n_epochs_local epochs
    perm = np.random.permutation(n) # We use the "random reshuffling" version of SGD
    batches_data = np.array_split(xhat_0[perm,], max(1, n//bs)) # integer number of batches
    batches_mask = np.array_split(mask[perm,], max(1, n//bs))
    for it in range(len(batches_data)):
        optimizer.zero_grad()
        encoder.zero_grad()
        decoder.zero_grad()
        b_data = torch.from_numpy(batches_data[it]).float()
        b_mask = torch.from_numpy(batches_mask[it]).float()
        loss = miwae_loss(iota_x = b_data,mask = b_mask)
        loss.backward()
        optimizer.step() # Gradient step
    if ep % rounds == 1:
        print('Epoch %g' %ep)
        print('MIWAE likelihood bound  %g' %(-np.log(K)-miwae_loss(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float()).cpu().data.numpy()))

Epoch 1
MIWAE likelihood bound  -9.2405
Epoch 16
MIWAE likelihood bound  -7.46782
Epoch 31
MIWAE likelihood bound  -5.70482
Epoch 46
MIWAE likelihood bound  -5.12237
Epoch 61
MIWAE likelihood bound  -5.24706


We then impute the same dataset with the locally trained model:

In [19]:
xhat[~mask] = miwae_impute(iota_x = torch.from_numpy(xhat_0).float(),mask = torch.from_numpy(mask).float(),L= L).cpu().data.numpy()[~mask]
err_local_cl1_data = np.array([mse(xhat,xfull,mask)])
print('Imputation MSE of local training on data from client 1  %g' %err_local_cl1_data)
print('-----')

Imputation MSE of local training on data from client 1  0.617914
-----


## Comparison of obtained results:

In [20]:
print('Imputation MSE on testing data  %g' %err_test_data)
print('Imputation MSE on data from client 1  %g' %err_cl1_data)
print('Imputation MSE of local training on data from client 1  %g' %err_local_cl1_data)

Imputation MSE on testing data  0.621936
Imputation MSE on data from client 1  0.556671
Imputation MSE of local training on data from client 1  0.617914


In this run, the federated model achieves a lower imputation MSE on client 1's data (0.557) than the model trained locally on that client alone (0.618).
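
The size of the gap can be quantified as a relative MSE reduction. The short sketch below uses the three values printed above; the dictionary keys are labels chosen here for readability. Since the numbers come from a single run with one random missingness pattern, the result should be read as indicative only.

```python
# Imputation MSE values copied from the output above
results = {
    "federated, test data": 0.621936,
    "federated, client 1": 0.556671,
    "local, client 1": 0.617914,
}

fed = results["federated, client 1"]
loc = results["local, client 1"]
rel_reduction = (loc - fed) / loc     # fraction of the local MSE removed by federated training

print(f"Relative MSE reduction on client 1: {rel_reduction:.1%}")   # prints 9.9%
```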