# Federated 2d image classification with MONAI using GPU

## Introduction

This tutorial shows how to deploy in Fed-BioMed the 2D image classification example provided by the MONAI project (https://monai.io/), **using an NVIDIA GPU on the node to speed up training**:

https://github.com/Project-MONAI/tutorials/blob/master/2d_classification/mednist_tutorial.ipynb

Since MONAI is based on PyTorch, deployment within Fed-BioMed seamlessly follows the same structure as for general PyTorch models.

Following the MONAI example, this tutorial is based on the MedNIST dataset.

## Creating MedNIST nodes

MedNIST provides an artificial 2d classification dataset created by gathering different medical imaging datasets from TCIA, the RSNA Bone Age Challenge, and the NIH Chest X-ray dataset. The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license.

To proceed with the tutorial, we created an IID partitioning of the MedNIST dataset across 3 clients. Each client has 3000 image samples for each class. The training partitions are available at the following link:

https://drive.google.com/file/d/1vLIcBdtdAhh6K-vrgCFy_0Y55dxOWZwf/view

The dataset owned by each client has the following structure:

    └── client_*/
        ├── AbdomenCT/
        ├── BreastMRI/
        ├── CXR/
        ├── ChestCT/
        ├── Hand/
        └── HeadCT/
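
Before registering a folder with a node, it can help to confirm that it has the expected layout. The sketch below is a minimal check, assuming the six class folder names shown above; `check_client_folder` is a hypothetical helper and not part of Fed-BioMed.

```python
import os

# Class folders expected in each client partition (taken from the tree above).
EXPECTED_CLASSES = ["AbdomenCT", "BreastMRI", "CXR", "ChestCT", "Hand", "HeadCT"]


def check_client_folder(path, expected=EXPECTED_CLASSES):
    """Return the sorted list of expected class folders missing from `path`."""
    present = {
        name for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    }
    return sorted(set(expected) - present)
```

An empty list means every class folder is present; anything else names the folders to fix before running `node add`.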

To create the federated dataset, we follow the standard procedure for node creation/population of Fed-BioMed. 
After activating the fedbiomed network with the commands

`source ./scripts/fedbiomed_environment network`

and 

`./scripts/fedbiomed_run network`

we create a first node by using the commands

`source ./scripts/fedbiomed_environment node`

`./scripts/fedbiomed_run node start --gpu`

We then populate the node with the data of the first client:

`./scripts/fedbiomed_run node add`

We select option 3 (images) to add the MedNIST partition of client 1, by picking the folder of client 1.
Assign the tag `mednist` to the data when asked.

We can further check that the data has been added by executing `./scripts/fedbiomed_run node list`.

Following the same procedure, we create the other two nodes with the datasets of client 2 and client 3 respectively.

**The number of nodes that can be launched depends on GPU memory**. If you run all nodes on the same machine, using the same GPU, a 4GB GPU that also serves as the laptop's graphics card will usually be able to **run 1 or 2 nodes with this model**. If too many nodes are launched, they will fail with an **out of memory error** at training time.
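
To get a rough idea of how much GPU memory is free before launching another node, one option is to query PyTorch directly. The sketch below is not part of Fed-BioMed: `gpu_memory_gib` is a hypothetical helper, and it returns `None` when PyTorch or CUDA is unavailable.

```python
def gpu_memory_gib(device_index=0):
    """Return (free, total) GPU memory in GiB, or None without PyTorch/CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info reports free and total memory in bytes for the device.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    return free_bytes / 2**30, total_bytes / 2**30


print(gpu_memory_gib())
```

The memory actually needed per node depends on the model and batch size, so the reported figure is only a starting point for deciding how many nodes to launch.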


## Running Fed-BioMed Researcher

We are now ready to start the researcher environment with the command `source ./scripts/fedbiomed_environment researcher`, and open the Jupyter notebook.

We can first query the network for the mednist dataset. In this case, the nodes share their respective partitions using the same tag `mednist`:

In [1]:
from fedbiomed.researcher.requests import Requests
req = Requests()
req.list(verbose=True)


2022-01-11 14:13:15,760 fedbiomed INFO - Component environment:
2022-01-11 14:13:15,761 fedbiomed INFO - - type = ComponentType.RESEARCHER
2022-01-11 14:13:15,938 fedbiomed INFO - Messaging researcher_73a42523-e23d-4d9f-955b-8450849207fb successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x7f5b780b1f40>
2022-01-11 14:13:15,969 fedbiomed INFO - Listing available datasets in all nodes... 
2022-01-11 14:13:15,971 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'command': 'list'}
2022-01-11 14:13:15,971 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'command': 'list'}
2022-01-11 14:13:25,984 fedbiomed INFO - 
 Node: node_f92a07ae-353a-47e5-a14e-1c2fd160670f | Number of Datasets: 2 
+---------+-------------

{'node_f92a07ae-353a-47e5-a14e-1c2fd160670f': [{'name': 'MNIST',
   'data_type': 'default',
   'tags': ['#MNIST', '#dataset'],
   'description': 'MNIST database',
   'shape': [60000, 1, 28, 28]},
  {'name': 'mednist',
   'data_type': 'images',
   'tags': ['mednist'],
   'description': 'mednist',
   'shape': [18000, 3, 64, 64]}],
 'node_b8e09523-9f97-4481-9fae-079f623b984b': [{'name': 'mednist',
   'data_type': 'images',
   'tags': ['mednist'],
   'description': 'mednist',
   'shape': [18000, 3, 64, 64]}]}
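
The listing can also be filtered programmatically. The sketch below assumes the call returns a dictionary shaped like the output above (node id mapped to a list of dataset entries); `nodes_with_tag` is a hypothetical helper, and `listing` is an abridged copy of that output.

```python
def nodes_with_tag(datasets_by_node, tag):
    """Map node id -> dataset entries carrying `tag`, skipping nodes without it."""
    matches = {}
    for node_id, datasets in datasets_by_node.items():
        tagged = [d for d in datasets if tag in d.get("tags", [])]
        if tagged:
            matches[node_id] = tagged
    return matches


# Abridged copy of the listing printed above.
listing = {
    "node_f92a07ae-353a-47e5-a14e-1c2fd160670f": [
        {"name": "MNIST", "tags": ["#MNIST", "#dataset"], "shape": [60000, 1, 28, 28]},
        {"name": "mednist", "tags": ["mednist"], "shape": [18000, 3, 64, 64]},
    ],
    "node_b8e09523-9f97-4481-9fae-079f623b984b": [
        {"name": "mednist", "tags": ["mednist"], "shape": [18000, 3, 64, 64]},
    ],
}

found = nodes_with_tag(listing, "mednist")
for node_id, datasets in found.items():
    print(node_id, [d["shape"] for d in datasets])
```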

## Create an experiment to train a model on the data found

The code for the network and data loader of the MONAI tutorial can now be deployed in Fed-BioMed.
We first import the necessary modules from the `fedbiomed` and `monai` libraries:

In [2]:
from fedbiomed.researcher.environ import environ
import tempfile
import os

tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+os.sep)
model_file = os.path.join(tmp_dir_model.name, 'class_export_mednist.py')

In [3]:
from monai.apps import download_and_extract
from monai.config import print_config
from monai.data import decollate_batch
from monai.metrics import ROCAUCMetric
from monai.networks.nets import DenseNet121
from monai.transforms import (
    Activations,
    AddChannel,
    AsDiscrete,
    Compose,
    LoadImage,
    RandFlip,
    RandRotate,
    RandZoom,
    ScaleIntensity,
    EnsureType,
)
from monai.utils import set_determinism

We can now define the training plan. Note that we can simply use the standard `TorchTrainingPlan` natively provided in Fed-BioMed. We reuse the `MedNISTDataset` class defined in the original MONAI tutorial; the method `training_data` parses the data found in the node's `dataset_path`, wraps it in a data loader, and returns it. Following the MONAI tutorial, the model is a `DenseNet121`.

In [4]:
%%writefile "$model_file"

import os
import numpy as np
import torch
import torch.nn as nn
from fedbiomed.common.torchnn import TorchTrainingPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from monai.apps import download_and_extract
from monai.config import print_config
from monai.data import decollate_batch
from monai.metrics import ROCAUCMetric
from monai.networks.nets import DenseNet121
from monai.transforms import (
    Activations,
    AddChannel,
    AsDiscrete,
    Compose,
    LoadImage,
    RandFlip,
    RandRotate,
    RandZoom,
    ScaleIntensity,
    EnsureType,
)
from monai.utils import set_determinism



# Here we define the training plan, which wraps the model to be used (a DenseNet121).
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self, kwargs):
        super(MyTrainingPlan, self).__init__()
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader class
        # Since we will train on MedNIST, we need the network, transforms and utilities from monai
        deps = ["import numpy as np",
                "import os",
                "from torch.utils.data import DataLoader",
                "from monai.apps import download_and_extract",
                "from monai.config import print_config",
                "from monai.data import decollate_batch",
                "from monai.metrics import ROCAUCMetric",
                "from monai.networks.nets import DenseNet121",
                "from monai.transforms import ( Activations, AddChannel, AsDiscrete, Compose, LoadImage, RandFlip, RandRotate, RandZoom, ScaleIntensity, EnsureType, )",
                "from monai.utils import set_determinism",]
        self.add_dependency(deps)
         
        self.num_class =  kwargs['num_class']
        
        self.model = DenseNet121(spatial_dims=2, in_channels=1,
                    out_channels = self.num_class)
        
        self.loss_function = torch.nn.CrossEntropyLoss()
        
        # Request GPU training: the GPU is used only if it is available on the node and the node offers it
        self.use_gpu = True

    def forward(self, x):
        return self.model(x)

    class MedNISTDataset(torch.utils.data.Dataset):
            def __init__(self, image_files, labels, transforms):
                self.image_files = image_files
                self.labels = labels
                self.transforms = transforms

            def __len__(self):
                return len(self.image_files)

            def __getitem__(self, index):
                return self.transforms(self.image_files[index]), self.labels[index]
    
    def parse_data(self, path):
        print(self.dataset_path)
        class_names = sorted(x for x in os.listdir(path)
                     if os.path.isdir(os.path.join(path, x)))
        num_class = len(class_names)
        image_files = [
                        [
                            os.path.join(path, class_names[i], x)
                            for x in os.listdir(os.path.join(path, class_names[i]))
                        ]
                        for i in range(num_class)
                      ]
        
        return image_files, num_class
    
    def training_data(self, batch_size = 48):
        self.image_files, num_class = self.parse_data(self.dataset_path)
        
        if self.num_class!=num_class:
                raise Exception('number of available classes does not match declared classes')
        
        num_each = [len(self.image_files[i]) for i in range(self.num_class)]
        image_files_list = []
        image_class = []
        
        for i in range(self.num_class):
            image_files_list.extend(self.image_files[i])
            image_class.extend([i] * num_each[i])
        num_total = len(image_class)
        
        
        length = len(image_files_list)
        indices = np.arange(length)
        np.random.shuffle(indices)

        val_split = int(1. * length) 
        train_indices = indices[:val_split]

        train_x = [image_files_list[i] for i in train_indices]
        train_y = [image_class[i] for i in train_indices]


        train_transforms = Compose(
            [
                LoadImage(image_only=True),
                AddChannel(),
                ScaleIntensity(),
                RandRotate(range_x=np.pi / 12, prob=0.5, keep_size=True),
                RandFlip(spatial_axis=0, prob=0.5),
                RandZoom(min_zoom=0.9, max_zoom=1.1, prob=0.5),
                EnsureType(),
            ]
        )

        val_transforms = Compose(
            [LoadImage(image_only=True), AddChannel(), ScaleIntensity(), EnsureType()])

        y_pred_trans = Compose([EnsureType(), Activations(softmax=True)])
        y_trans = Compose([EnsureType(), AsDiscrete(to_onehot=num_class)])

        print(
            f"Training count: {len(train_x)}")
        
        
        train_ds = self.MedNISTDataset(train_x, train_y, train_transforms)
        train_loader = torch.utils.data.DataLoader(
            train_ds, batch_size, shuffle=True)
        
        return train_loader
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = self.loss_function(output, target)
        return loss


Writing /home/mvesin/GIT/fedbiomed/fedbiomed/var/tmp/tmpvxmo4v_n/class_export_mednist.py


We now set the model and training parameters. Note that we use only 1 epoch for this experiment, and train on roughly 28% of the locally available training data: `batch_maxnum * batch_size` = 250 × 20 = 5000 of the 18000 samples held by each node.

In [5]:
model_args = {'num_class':6,}

training_args = {
    'batch_size': 20, 
    'lr': 1e-5, 
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum':250 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}

2022-01-11 14:13:45,408 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / INFO - Starting task manager
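
The share of local data used per epoch follows directly from the training arguments. The short check below assumes 18000 local samples per node, the first entry of the `shape` reported in the dataset listing earlier.

```python
batch_size = 20
batch_maxnum = 250
local_samples = 18000  # assumption: first entry of 'shape' reported by each node

# batch_maxnum caps the number of batches, so at most this many samples are seen.
samples_per_epoch = min(batch_size * batch_maxnum, local_samples)
fraction = samples_per_epoch / local_samples

print(f"{samples_per_epoch} samples per epoch, {fraction:.1%} of local data")
# prints: 5000 samples per epoch, 27.8% of local data
```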


The experiment can now be defined by providing the `mednist` tag, and running the local training on the nodes with the model defined in `model_path`, the standard `aggregator` (FedAvg) and `node_selection_strategy` (all nodes used). Federated learning is going to be performed through 3 optimization rounds.
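
As a reminder of what the aggregation step does conceptually, here is a minimal sketch of federated averaging on plain Python lists. It uses the usual FedAvg formulation, weighting each node by its sample count; it is not Fed-BioMed's `FedAverage` code, and `fed_avg` and the toy parameters are hypothetical.

```python
def fed_avg(node_params, node_sizes):
    """Weighted average of per-node parameter dicts.

    node_params: list of dicts mapping parameter name -> list of floats
    node_sizes:  list of sample counts, one per node
    """
    total = sum(node_sizes)
    averaged = {}
    for name in node_params[0]:
        length = len(node_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(node_params, node_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Three nodes with equal sample counts: the result is the plain mean.
params = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}, {"w": [5.0, 6.0]}]
sizes = [5000, 5000, 5000]
print(fed_avg(params, sizes))  # prints: {'w': [3.0, 4.0]}
```

With unequal sample counts, nodes holding more data pull the average towards their parameters.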

## WARNING:

**To run this experiment, you need a computer with the following specifications:**
- an NVIDIA GPU with enough memory for the number of MedNIST nodes launched. A 4GB GPU that also serves as the laptop's graphics card will usually be able to **run 1 or 2 nodes with this model**. If too many nodes are launched, they will fail with an **out of memory error** at training time.


In [6]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['mednist']
rounds = 3

exp = Experiment(tags=tags,
                 model_path=model_file,
                 model_args=model_args,
                 model_class='MyTrainingPlan',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None
                )

2022-01-11 14:13:48,635 fedbiomed INFO - Searching dataset with data tags: ['mednist'] for all nodes
2022-01-11 14:13:48,637 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'tags': ['mednist'], 'command': 'search'}
2022-01-11 14:13:48,638 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'tags': ['mednist'], 'command': 'search'}
2022-01-11 14:13:48,639 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'tags': ['mednist'], 'command': 'search'}
2022-01-11 14:13:58,647 fedbiomed INFO - Node selected for training -> node_b8e09523-9f97-4481-9fae-079f623b984b
2022-01-11 14:13:58,648 fedbiomed INFO - Node selected for training -> node_f92a07ae-353a-

Let's start the experiment.

By default, `exp.run()` does not return until all the `rounds` are done for all the nodes.



In [7]:
exp.run()

2022-01-11 14:15:00,965 fedbiomed INFO - Sampled nodes in round 0 ['node_b8e09523-9f97-4481-9fae-079f623b984b', 'node_f92a07ae-353a-47e5-a14e-1c2fd160670f', 'node_40002d89-e02b-45ff-a959-fcdb888f454f']
2022-01-11 14:15:00,966 fedbiomed INFO - Send message to node node_b8e09523-9f97-4481-9fae-079f623b984b - {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'job_id': 'e81ec5de-7cfb-49de-b1e7-a62b7ace445c', 'training_args': {'batch_size': 20, 'lr': 1e-05, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 250}, 'model_args': {'num_class': 6}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/11/my_model_6f199215-985d-4c9a-a96e-342fda231090.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/11/aggregated_params_init_0c6d1e32-b0cd-48ea-9cbc-e7eeb478bb68.pt', 'model_class': 'MyTrainingPlan', 'training_data': {'node_b8e09523-9f97-4481-9fae-079f623b984b': ['dataset_c90efc22-02df-4bc3-ade4-57b63ecba562']}}
2022-01-11 14:15:00,967 fedbiomed

2022-01-11 14:15:01,797 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Dataset_path/data/mvesin/data/MedNIST/client_1
2022-01-11 14:15:01,806 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Using device cuda for training (cuda_available=True, node_gpu=True, use_gpu=True, node_gpu_num=None)
2022-01-11 14:15:01,873 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / INFO - {'monitor': <fedbiomed.node.history_monitor.HistoryMonitor object at 0x7f9068b0a790>, 'node_gpu': False, 'node_gpu_num': None, 'batch_size': 20, 'lr': 1e-05, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 250}
2022-01-11 14:15:01,876 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Dataset_path/data/mvesin/data/MedNIST/client_2
2022-01-11 14:15:01,890 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Using device cpu for training (cuda_available=True, node_gpu=False, use_gpu=True,

2022-01-11 14:17:51,032 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:17:51,596 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / INFO - results uploaded successfully 
2022-01-11 14:17:52,445 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:17:52,858 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / INFO - results uploaded successfully 
2022-01-11 14:18:01,303 fedbiomed INFO - Downloading model params after training on node_f92a07ae-353a-47e5-a14e-1c2fd160670f - from http://localhost:8844/media/uploads/2022/01/11/node_params_666138df-796d-4df1-9969-274832224195.pt
2022-01-11 14:18:01,384 fedbiomed INFO - Downloading model params after training on node_40002d89-e02b-45ff-a959-fcdb888f454f - from http://localhost:8844/media/uploads/2022/01/11/

2022-01-11 14:18:01,942 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'job_id': 'e81ec5de-7cfb-49de-b1e7-a62b7ace445c', 'training_args': {'batch_size': 20, 'lr': 1e-05, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 250}, 'model_args': {'num_class': 6}, 'command': 'train', 'model_url': 'http://localhost:8844/media/uploads/2022/01/11/my_model_6f199215-985d-4c9a-a96e-342fda231090.py', 'params_url': 'http://localhost:8844/media/uploads/2022/01/11/aggregated_params_699c8ed5-b912-4417-828b-4b61f865d12a.pt', 'model_class': 'MyTrainingPlan', 'training_data': {'node_b8e09523-9f97-4481-9fae-079f623b984b': ['dataset_c90efc22-02df-4bc3-ade4-57b63ecba562']}}
2022-01-11 14:18:01,942 fedbiomed DEBUG - researcher_73a42523-e23d-4d9f-955b-8450849207fb
2022-01-11 14:18:01,943 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - [TASKS QUEUE] Item:{'researcher_

2022-01-11 14:19:07,257 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:19:07,908 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / INFO - results uploaded successfully 
2022-01-11 14:19:17,018 fedbiomed INFO - Downloading model params after training on node_b8e09523-9f97-4481-9fae-079f623b984b - from http://localhost:8844/media/uploads/2022/01/11/node_params_01f9d62e-84a0-433d-bc42-bb46815728df.pt


2022-01-11 14:20:56,158 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:20:56,158 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:20:56,631 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / INFO - results uploaded successfully 
2022-01-11 14:20:56,639 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / INFO - results uploaded successfully 
2022-01-11 14:21:02,243 fedbiomed INFO - Downloading model params after training on node_f92a07ae-353a-47e5-a14e-1c2fd160670f - from http://localhost:8844/media/uploads/2022/01/11/node_params_29b56756-05c1-4ca0-a7a5-9b685060ab78.pt
2022-01-11 14:21:02,319 fedbiomed INFO - Downloading model params after training on node_40002d89-e02b-45ff-a959-fcdb888f454f - from http://localhost:8844/media/uploads/2022/01/11/

2022-01-11 14:21:02,845 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - [TASKS QUEUE] Item:{'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'job_id': 'e81ec5de-7cfb-49de-b1e7-a62b7ace445c', 'params_url': 'http://localhost:8844/media/uploads/2022/01/11/aggregated_params_0c4afdd6-da67-45ed-9261-b0930de62f2c.pt', 'training_args': {'batch_size': 20, 'lr': 1e-05, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 250}, 'training_data': {'node_f92a07ae-353a-47e5-a14e-1c2fd160670f': ['dataset_3909c989-e2d5-40d8-8059-dab89c1ab00c']}, 'model_args': {'num_class': 6}, 'model_url': 'http://localhost:8844/media/uploads/2022/01/11/my_model_6f199215-985d-4c9a-a96e-342fda231090.py', 'model_class': 'MyTrainingPlan', 'command': 'train'}
2022-01-11 14:21:02,846 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / DEBUG - Message received: {'researcher_id': 'researcher_73a42523-e23d-4d9f-955b-8450849207fb', 'job_id': 'e81ec5de-7cfb-49de-b1e7

2022-01-11 14:22:10,575 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:22:11,240 fedbiomed INFO - log from: node_b8e09523-9f97-4481-9fae-079f623b984b / INFO - results uploaded successfully 
2022-01-11 14:22:17,920 fedbiomed INFO - Downloading model params after training on node_b8e09523-9f97-4481-9fae-079f623b984b - from http://localhost:8844/media/uploads/2022/01/11/node_params_dbbdcd99-02ac-40ab-a2db-ac8d375c35e3.pt
2022-01-11 14:23:53,986 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:23:54,514 fedbiomed INFO - log from: node_40002d89-e02b-45ff-a959-fcdb888f454f / INFO - results uploaded successfully 


2022-01-11 14:23:56,609 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / DEBUG - Reached 250 batches for this epoch, ignore remaining data
2022-01-11 14:23:57,039 fedbiomed INFO - log from: node_f92a07ae-353a-47e5-a14e-1c2fd160670f / INFO - results uploaded successfully 
2022-01-11 14:24:03,141 fedbiomed INFO - Downloading model params after training on node_40002d89-e02b-45ff-a959-fcdb888f454f - from http://localhost:8844/media/uploads/2022/01/11/node_params_ef03a681-ea2d-4220-8cdd-c213266705d1.pt
2022-01-11 14:24:03,206 fedbiomed INFO - Downloading model params after training on node_f92a07ae-353a-47e5-a14e-1c2fd160670f - from http://localhost:8844/media/uploads/2022/01/11/node_params_f54a60ef-5c09-46f0-8ee3-cc5638494e07.pt
2022-01-11 14:24:03,269 fedbiomed INFO - Nodes that successfully reply in round 2 ['node_b8e09523-9f97-4481-9fae-079f623b984b', 'node_40002d89-e02b-45ff-a959-fcdb888f454f', 'node_f92a07ae-353a-47e5-a14e-1c2fd160670f']
2022-01-11 14:24:03,754 
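
The device messages in the logs above suggest how the training device is chosen: `cuda` is reported when `cuda_available`, `node_gpu` and `use_gpu` are all `True`, and `cpu` is reported when `node_gpu` is `False`. The sketch below encodes that reading. `select_device` is a hypothetical name, and the handling of the remaining combinations is an assumption rather than Fed-BioMed's confirmed implementation.

```python
def select_device(cuda_available, node_gpu, use_gpu):
    """Pick 'cuda' only when the hardware, the node and the training plan all agree."""
    if cuda_available and node_gpu and use_gpu:
        return "cuda"
    return "cpu"


# The two combinations visible in the logs above.
print(select_device(cuda_available=True, node_gpu=True, use_gpu=True))   # prints: cuda
print(select_device(cuda_available=True, node_gpu=False, use_gpu=True))  # prints: cpu
```

Under this reading, setting `use_gpu = True` in the training plan is only a request: a node reporting `node_gpu=False` (presumably one started without `--gpu`) still trains on CPU.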

## Testing


Once the federated model is obtained, it is possible to test it locally on an independent testing partition.
The test dataset is available at this link:

https://drive.google.com/file/d/1YbwA0WitMoucoIa_Qao7IC1haPfDp-XD/

In [None]:
!pip install gdown

In [None]:
import os
import shutil
import tempfile
import PIL
import torch
import numpy as np
from sklearn.metrics import classification_report

from monai.config import print_config
from monai.data import decollate_batch
from monai.metrics import ROCAUCMetric
from monai.networks.nets import DenseNet121
import zipfile
from monai.transforms import (
    Activations,
    AddChannel,
    AsDiscrete,
    Compose,
    LoadImage,
    RandFlip,
    RandRotate,
    RandZoom,
    ScaleIntensity,
    EnsureType,
)
from monai.utils import set_determinism

print_config()

Download the testing dataset to the local temporary folder.

In [None]:
import gdown
import zipfile

resource = "https://drive.google.com/uc?id=1YbwA0WitMoucoIa_Qao7IC1haPfDp-XD"
base_dir = tmp_dir_model.name 
test_file = os.path.join(base_dir, "MedNIST_testing.zip")

gdown.download(resource, test_file, quiet=False)

zf = zipfile.ZipFile(test_file)

for file in zf.infolist():
    zf.extract(file, base_dir)
    
data_dir = os.path.join(base_dir, "MedNIST_testing")
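
The extraction loop above can also be written with a context manager and `extractall`, which closes the archive automatically. This is an equivalent sketch using only the standard library; `extract_archive` is a hypothetical helper name.

```python
import os
import zipfile


def extract_archive(archive_path, target_dir):
    """Extract every member of the zip archive and return the member names."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target_dir)
        return sorted(zf.namelist())
```

In the notebook it would be called as `extract_archive(test_file, base_dir)`, in place of the explicit loop.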

Parse the data and create the testing data loader:

In [None]:
class_names = sorted(x for x in os.listdir(data_dir)
                     if os.path.isdir(os.path.join(data_dir, x)))
num_class = len(class_names)
image_files = [
    [
        os.path.join(data_dir, class_names[i], x)
        for x in os.listdir(os.path.join(data_dir, class_names[i]))
    ]
    for i in range(num_class)
]

num_each = [len(image_files[i]) for i in range(num_class)]
image_files_list = []

image_class = []
for i in range(num_class):
    image_files_list.extend(image_files[i])
    image_class.extend([i] * num_each[i])
num_total = len(image_class)
image_width, image_height = PIL.Image.open(image_files_list[0]).size

print(f"Total image count: {num_total}")
print(f"Image dimensions: {image_width} x {image_height}")
print(f"Label names: {class_names}")
print(f"Label counts: {num_each}")

In [None]:
length = len(image_files_list)
indices = np.arange(length)
np.random.shuffle(indices)


test_split = int(0.1 * length)
test_indices = indices[:test_split]

test_x = [image_files_list[i] for i in test_indices]
test_y = [image_class[i] for i in test_indices]

val_transforms = Compose(
    [LoadImage(image_only=True), AddChannel(), ScaleIntensity(), EnsureType()])

y_pred_trans = Compose([EnsureType(), Activations(softmax=True)])
y_trans = Compose([EnsureType(), AsDiscrete(to_onehot=num_class)])

In [None]:
class MedNISTDataset(torch.utils.data.Dataset):
    def __init__(self, image_files, labels, transforms):
        self.image_files = image_files
        self.labels = labels
        self.transforms = transforms

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, index):
        return self.transforms(self.image_files[index]), self.labels[index]


test_ds = MedNISTDataset(test_x, test_y, val_transforms)
test_loader = torch.utils.data.DataLoader(
    test_ds, batch_size=300)

Define testing metric:

In [None]:
auc_metric = ROCAUCMetric()

To test the federated model, we need to create a model instance and assign it the model parameters estimated in the last federated optimization round.

In [None]:
model = exp.model_instance
model.load_state_dict(exp.aggregated_params[rounds - 1]['params'])

Compute the testing performance:

In [None]:
y_true = []
y_pred = []
with torch.no_grad():
    for test_data in test_loader:
        test_images, test_labels = (
            test_data[0],
            test_data[1],
        )
        pred = model(test_images).argmax(dim=1)
        for i in range(len(pred)):
            y_true.append(test_labels[i].item())
            y_pred.append(pred[i].item())


In [None]:
print(classification_report(
    y_true, y_pred, target_names=class_names, digits=4))
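
For a dependency-free cross-check of the report, per-class accuracy (the fraction of samples of each true class predicted correctly, which corresponds to the recall column) can be computed from the same two lists. The helper name and the toy labels below are hypothetical and only illustrate the computation.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred, class_names):
    """Fraction of correctly predicted samples for each true class."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {class_names[c]: correct[c] / total[c] for c in sorted(total)}


# Toy labels, not results from the experiment.
toy_true = [0, 0, 1, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 0, 2]
print(per_class_accuracy(toy_true, toy_pred, ["AbdomenCT", "BreastMRI", "CXR"]))
```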

Despite the relatively short training performed on the data shared by the 3 nodes, the performance of the federated model seems pretty good. Well done!