# PyTorch aggregation methods in Fed-BioMed

**Difficulty level**: **advanced**

## Introduction

This tutorial shows how to deal with heterogeneous datasets by changing the `Aggregator` of an `Experiment`. Fed-BioMed provides several aggregation methods. Selecting an appropriate one can be critical when confronted with unbalanced or heterogeneous datasets.

`Aggregators` provide a way to merge the local models sent by `Nodes` into a global, more generalized model. Please note that designing `Node` sampling `Strategies` can also help when working on heterogeneous datasets.

For more information about `Aggregator` objects in Fed-BioMed, and on how to create your own `Aggregator`, please see [`Aggregators` in the User Guide](../../../user-guide/researcher/aggregation).

### Before you start
For this tutorial, we will be using the heterogeneous MedNIST dataset. MedNIST is a collection of 2-D grayscale medical images. The MedNIST dataset was gathered from several sets from TCIA, the RSNA Bone Age Challenge, and the NIH Chest X-ray dataset. The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license and is distributed by MONAI for teaching and benchmarking simple deep-learning pipelines. For more information regarding the dataset, please see [MedNIST Dataset](../../../user-guide/datasets/mednist-dataset).

## 1. Defining an `Experiment` using `FedAverage` `Aggregator`

First, let's reuse the `TorchTrainingPlan` defined in the [previous MedNIST tutorial](../04_Transfer-learning_tutorial_usingDenseNet-121.ipynb). Federated Averaging (`FedAverage`) was introduced by McMahan et al. as the first aggregation method in the Federated Learning literature. It computes a weighted sum of the local model parameters of all `Nodes` to obtain a global model.
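The weighted sum can be sketched in a few lines of plain Python. This is only an illustration, not Fed-BioMed's implementation: the function `fed_average`, the parameter names and the sample counts are hypothetical, and the weights are assumed to be proportional to each `Node`'s number of samples, as in McMahan et al.

```python
from typing import Dict, List


def fed_average(
    local_params: List[Dict[str, List[float]]],
    n_samples: List[int],
) -> Dict[str, List[float]]:
    """Weighted average of per-node parameters (hypothetical helper).

    Assumption: each node's weight is its share of the total sample count.
    """
    total = sum(n_samples)
    weights = [n / total for n in n_samples]

    aggregated = {}
    for name in local_params[0]:
        size = len(local_params[0][name])
        # Sum each coordinate over nodes, scaled by the node's weight
        aggregated[name] = [
            sum(w * params[name][i] for w, params in zip(weights, local_params))
            for i in range(size)
        ]
    return aggregated


# Two illustrative nodes: node_b holds three times more samples than node_a
node_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
node_b = {"layer.weight": [3.0, 6.0], "layer.bias": [4.0]}

global_params = fed_average([node_a, node_b], n_samples=[100, 300])
print(global_params)
```

Because `node_b` contributes 75% of the samples, the aggregated parameters end up closer to its local model than to that of `node_a`.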

In this tutorial, we will keep the same `TrainingPlan` (and thus the same model) for all the `Experiments`; we will only change the `Aggregators`.

In [None]:
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.datamanager import DataManager
from fedbiomed.common.dataset import MedNistDataset

class MyTrainingPlan(TorchTrainingPlan):

    def init_model(self, model_args):
        model = models.densenet121(weights=None)  # here model coefficients are set to random weights

        # add the classifier 
        num_classes = model_args['num_classes'] 
        num_ftrs = model.classifier.in_features
        model.classifier= nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )      
        return model

    def init_dependencies(self):
        return [
            "from torchvision import transforms, models",
            "import torch.optim as optim",
            "from torchvision.models import densenet121",
            "from fedbiomed.common.dataset import MedNistDataset"
        ]

    def init_optimizer(self, optimizer_args):        
        return optim.Adam(self.model().parameters(), lr=optimizer_args["lr"])

    def training_data(self):

        # Normalize images with the ImageNet channel statistics (no data augmentation is applied here)
        preprocess = transforms.Normalize(mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225])
        target_transform = transforms.Lambda(lambda y: y.long())
    
        train_data = MedNistDataset(transform = preprocess, target_transform=target_transform)
        train_kwargs = { 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)

    def training_step(self, data, target):
        output = self.model().forward(data)
        loss_func = nn.CrossEntropyLoss()
        loss   = loss_func(output, target)
        return loss


Below, we define the parameters of the `Experiment` to be used with vanilla `FedAverage`:

In [None]:
training_args = {
    'loader_args': {
        'batch_size': 32,
    }, 
    'random_seed': 1234,
    'optimizer_args': {'lr': 1e-3}, 
    'epochs': 1, 
    'dry_run': False,  
    'num_updates': 50,  # number of training iterations (model updates) performed by each Node
}

model_args = {
    'num_classes': 6, # adapt this number to the number of classes in your dataset
}

We then import the `FedAverage` `Aggregator` from Fed-BioMed's `Aggregators`:

In [None]:
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators import FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy

tags =  ['#MEDNIST', '#dataset']
rounds = 3

exp_fed_avg = Experiment()
exp_fed_avg.set_model_args(model_args=model_args)
exp_fed_avg.set_training_args(training_args=training_args)
exp_fed_avg.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp_fed_avg.set_tags(tags = tags)
exp_fed_avg.set_training_data(training_data=None, from_tags=True)
exp_fed_avg.set_aggregator(aggregator=FedAverage())
exp_fed_avg.set_strategy(node_selection_strategy=DefaultStrategy())
exp_fed_avg.set_round_limit(rounds)
exp_fed_avg.set_tensorboard(True)

Activate Tensorboard

In [None]:
%load_ext tensorboard

In [None]:
fedavg_tensorboard_dir = exp_fed_avg.tensorboard_results_path

%tensorboard --logdir {fedavg_tensorboard_dir}

In [None]:
exp_fed_avg.run(increase=True)

Save trained model to file

In [None]:
exp_fed_avg.training_plan().export_model('./trained_model')

## 2. Defining an `Experiment` using `FedProx` `Aggregator`


To improve our results, we can change our `Aggregator` from `FedAverage` to `FedProx`.
Since `FedProx` is `FedAverage` with an added regularization term, we reuse the `FedAverage` `Aggregator` and add `fedprox_mu`, the regularization parameter, to the `training_args`.
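To give an intuition of what the regularization term does, here is a minimal sketch that assumes the usual `FedProx` formulation (Li et al.): a proximal penalty `mu / 2 * ||w_local - w_global||^2` is added to the local training loss. The function `fedprox_loss` and the numbers below are hypothetical and do not reproduce Fed-BioMed's internal code.

```python
from typing import List


def fedprox_loss(
    task_loss: float,
    local_w: List[float],
    global_w: List[float],
    mu: float,
) -> float:
    """Local loss with a proximal penalty (hypothetical helper).

    Assumption: the penalty is mu / 2 times the squared Euclidean distance
    between the local parameters and the last global parameters.
    """
    squared_distance = sum((l - g) ** 2 for l, g in zip(local_w, global_w))
    return task_loss + 0.5 * mu * squared_distance


global_w = [0.0, 0.0]

# The further the local model drifts from the global one, the larger the penalty
close_loss = fedprox_loss(1.0, [0.1, 0.1], global_w, mu=0.1)
far_loss = fedprox_loss(1.0, [1.0, 2.0], global_w, mu=0.1)
print(close_loss, far_loss)
```

With `mu=0` the penalty vanishes and the local objective is the one used with plain `FedAverage`; larger values of `mu` keep local models closer to the global model.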




In [None]:
training_args_fedprox = {
    'loader_args': {
        'batch_size': 32,
    }, 
    'random_seed': 1234,
    'optimizer_args': {'lr': 1e-3}, 
    'epochs': 1, 
    'dry_run': False,  
    'num_updates': 50,
    'fedprox_mu': .1,  # This parameter indicates that we are going to use FedProx
    
}

model_args = {
    'num_classes': 6, # adapt this number to the number of classes in your dataset
}


In [None]:
from fedbiomed.researcher.federated_workflows import Experiment
from fedbiomed.researcher.aggregators import FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy

tags =  ['#MEDNIST', '#dataset']
rounds = 3

exp_fedprox = Experiment()


exp_fedprox.set_model_args(model_args=model_args)
exp_fedprox.set_training_args(training_args=training_args_fedprox)
exp_fedprox.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp_fedprox.set_tags(tags = tags)
exp_fedprox.set_training_data(training_data=None, from_tags=True)
exp_fedprox.set_aggregator(aggregator=FedAverage())
exp_fedprox.set_strategy(node_selection_strategy=DefaultStrategy())
exp_fedprox.set_round_limit(rounds)
exp_fedprox.set_tensorboard(True)

In [None]:
%reload_ext tensorboard

In [None]:
fedprox_tensorboard_dir = exp_fedprox.tensorboard_results_path

%tensorboard --logdir {fedprox_tensorboard_dir}

In [None]:
exp_fedprox.run(increase=True)

Save trained model to file

In [None]:
exp_fedprox.training_plan().export_model('./trained_model')

## 3. Defining an `Experiment` using `SCAFFOLD` `Aggregator`


The purpose of the `Scaffold` aggregator is to limit the so-called *client drift* that may happen when dealing with heterogeneous datasets across `Nodes`.

In order to use `Scaffold`, we have to import another `Aggregator` from the `fedbiomed.researcher.aggregators` module, as you can see below.

`Scaffold` takes `server_lr` and `fds` as arguments:
 - `server_lr` is the Server Learning Rate (it is used to perform a gradient descent on global model's updates on `Scaffold` aggregation)
 - `fds` is the `Federated Dataset` containing information about the `Nodes` connected to the network after issuing a `TrainRequest`
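The role of `server_lr` can be illustrated with a deliberately simplified sketch. It assumes that the server moves the global model towards the unweighted mean of the local models, scaled by `server_lr`, and it leaves out the correction terms that `Scaffold` also exchanges. The function `scaffold_server_step` is hypothetical and is not Fed-BioMed's implementation.

```python
from typing import List


def scaffold_server_step(
    global_w: List[float],
    local_ws: List[List[float]],
    server_lr: float,
) -> List[float]:
    """Server-side update of the global model (hypothetical, simplified).

    Assumptions: unweighted mean over nodes, correction terms omitted.
    """
    n_nodes = len(local_ws)
    new_global = []
    for j, g in enumerate(global_w):
        # Average displacement of the local models with respect to the global one
        mean_delta = sum(w[j] - g for w in local_ws) / n_nodes
        new_global.append(g + server_lr * mean_delta)
    return new_global


global_w = [0.0, 1.0]
local_ws = [[1.0, 1.0], [3.0, 3.0]]

half_step = scaffold_server_step(global_w, local_ws, server_lr=0.5)
full_step = scaffold_server_step(global_w, local_ws, server_lr=1.0)
print(half_step, full_step)
```

In this sketch, `server_lr=1.0` reduces the update to a plain average of the local models, while smaller values make the global model move more cautiously.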

*Please note that it is possible to use `Scaffold` with a regularization parameter as suggested in `FedProx`. For that, you just have to specify `fedprox_mu` into the `training_args` dictionary, as shown in the `FedProx` example*

**Attention**: this version of `Scaffold` exchanges correction terms that are not protected, even when using [Secure Aggregation](../../../user-guide/secagg/introduction). Please do not use this version of `Scaffold` under heavy security constraints.

In [None]:
from fedbiomed.researcher.aggregators import Scaffold
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy

server_lr = .8
exp_scaffold = Experiment()

exp_scaffold.set_model_args(model_args=model_args)
exp_scaffold.set_training_args(training_args=training_args)
exp_scaffold.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp_scaffold.set_tags(tags = tags)
exp_scaffold.set_training_data(training_data=None, from_tags=True)
exp_scaffold.set_aggregator(Scaffold(server_lr=server_lr))
exp_scaffold.set_strategy(node_selection_strategy=DefaultStrategy())
exp_scaffold.set_round_limit(rounds)
exp_scaffold.set_tensorboard(True)

In [None]:
%reload_ext tensorboard

In [None]:
scaffold_tensorboard_dir = exp_scaffold.tensorboard_results_path

%tensorboard --logdir {scaffold_tensorboard_dir}

In [None]:
exp_scaffold.run(increase=True)

Save trained model to file

In [None]:
exp_scaffold.training_plan().export_model('./trained_model')

## 4. Going further

In this tutorial, we presented three important `Aggregators` that can be found in the Federated Learning literature. If you want to create your own custom `Aggregator`, please check our [Aggregation User guide](../../../user-guide/researcher/aggregation).


You may have noticed that, thanks to Fed-BioMed's modular structure, it is possible to switch from one aggregator to another while conducting an `Experiment`. For instance, you may start with the `SCAFFOLD` `Aggregator` for the first 3 rounds, and then switch to the `FedAverage` `Aggregator` for the remaining rounds, as shown in the example below:


In [None]:
from fedbiomed.researcher.aggregators import Scaffold, FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy

server_lr = .8
exp_multi_agg = Experiment()

# selecting how many rounds of each aggregator we will perform
rounds_scaffold = 3
rounds_fedavg = 1

exp_multi_agg.set_model_args(model_args=model_args)
exp_multi_agg.set_training_args(training_args=training_args)
exp_multi_agg.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp_multi_agg.set_tags(tags = tags)
exp_multi_agg.set_training_data(training_data=None, from_tags=True)
exp_multi_agg.set_aggregator(Scaffold(server_lr=server_lr))
exp_multi_agg.set_strategy(node_selection_strategy=DefaultStrategy())
exp_multi_agg.set_round_limit(rounds_scaffold + rounds_fedavg)

exp_multi_agg.run(rounds=rounds_scaffold)


In [None]:
exp_multi_agg.set_aggregator(FedAverage())
exp_multi_agg.run(rounds=rounds_fedavg)

Save trained model to file

In [None]:
exp_multi_agg.training_plan().export_model('./trained_model')

For more advanced Aggregators and Regularizers, like `FedOpt`, you may be interested in [`DecLearn` optimizers](../../optimizers/01-fedopt-and-scaffold), which are compatible with Fed-BioMed and provide more options for Aggregation and Optimization.