# Advanced optimizers in Fed-BioMed


**Difficulty level**: **advanced**
    
## Introduction

This tutorial shows how to deal with heterogeneous datasets by changing the `Optimizer` used for training.
In `Fed-BioMed`, one can specify two sorts of `Optimizer`s:

1. an `Optimizer` on the `Node` side, defined in the `Training Plan`
2. an `Optimizer` on the `Researcher` side, configured in the `Experiment`

Advanced `Optimizer`s are backed by the [`declearn` package](), a Python package focused on optimization for Federated Learning. They can be used regardless of the machine learning framework: they are compatible with both scikit-learn and PyTorch.


In this tutorial you will learn:
- how to use and chain one or several `Optimizer`s on the `Node` and `Researcher` sides
- how to use `FedOpt`
- how to use `Optimizer`s that exchange auxiliary variables, such as `Scaffold`

For further details, you can refer to the [`Optimizer` section in the User Guide]().

# 1. Configuring `Nodes`

Before starting, we need to configure several `Nodes` and add the MedNIST dataset to each of them. Node configuration steps require the `fedbiomed-node` conda environment. Please make sure that you have the necessary conda environment: this is explained in the [installation tutorial](../../installation/0-basic-software-installation).


Please open a terminal, `cd` to the base directory of the cloned Fed-BioMed project, and follow the steps below.

* **Configuration Steps:**
    * Run `${FEDBIOMED_DIR}/scripts/fedbiomed_run node config conf1.ini add` in the terminal
    * It will ask you to select the data type that you want to add. The third option has been configured to add the MedNIST dataset. Please type `3` and continue. 
    * Please use the default tags, which are `#MEDNIST` and `#dataset`.
    * For the next step, please select the directory where you want to download the MedNIST dataset.
    * After the download is completed, you will see the details of the MedNIST dataset on the screen.

Please run the command below in the same terminal to make sure the MedNIST dataset was successfully added to the Node.

```
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config conf1.ini list
```

Before starting the node, please make sure that you have already launched the network using the command `scripts/fedbiomed_run network`. Afterward, all you need to do is start the node.


```
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config conf1.ini start
```

In another terminal, you may proceed by launching a second `Node`.

# 2. Defining an `Optimizer` on `Node` side

`Optimizer`s are defined through the `init_optimizer` method of the `Training Plan`. Advanced `Optimizer`s must be set using the `Fed-BioMed` `Optimizer` object (i.e. `fedbiomed.common.optimizers.optimizer.Optimizer`).

## 2.1 With PyTorch framework

In [this tutorial]() we showcased the use of a PyTorch model with [PyTorch native optimizers](), such as `torch.optim.SGD`. In the present tutorial, we will see how to use `declearn` cross-framework optimizers.

### PyTorch `Training Plan`
Below is a simple implementation of a `declearn` SGD `Optimizer` on a PyTorch model. It is equivalent to the following native PyTorch definition:

```python

class MyTrainingPlan(TorchTrainingPlan):
    ...
    def init_optimizer(self, optimizer_args):
        return torch.optim.SGD(self.model().parameters(), lr = optimizer_args['lr'])
```

In [None]:
import torch
import torch.nn as nn
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms
from torchvision.models import densenet121
from fedbiomed.common.optimizers.optimizer import Optimizer

# Here we define the model to be used.
# We will use the densenet121 model.
class MyTrainingPlan(TorchTrainingPlan):
    
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms",
                "from torchvision.models import densenet121",
               "from fedbiomed.common.optimizers.optimizer import Optimizer"]

        return deps
    
    def init_model(self):
        self.loss_function = torch.nn.CrossEntropyLoss()
        model = densenet121(pretrained=True)
        model.classifier =nn.Sequential(nn.Linear(1024,512), nn.Softmax())
        return model 
    
    def init_optimizer(self, optimizer_args):
        # Defines and returns a declearn optimizer
        # equivalent: Optimizer(lr=optimizer_args['lr'], modules=[], regularizers=[])
        return Optimizer(lr=optimizer_args['lr'])

    def training_data(self, batch_size = 48):
        preprocess = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize(
                                            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                        )])
        train_data = datasets.ImageFolder(self.dataset_path,transform = preprocess)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.model().forward(data)
        loss   = self.loss_function(output, target)
        return loss
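
With no modules and no regularizers, the `Optimizer` above reduces to plain SGD: each weight moves by `-lr * gradient`. The sketch below illustrates that update rule in pure Python on a toy quadratic problem. `sgd_step` is a hypothetical helper written for this illustration; it is not part of Fed-BioMed or `declearn`.

```python
from typing import List


def sgd_step(weights: List[float], grads: List[float], lr: float) -> List[float]:
    """Return new weights after one plain SGD step: w <- w - lr * g."""
    return [w - lr * g for w, g in zip(weights, grads)]


# Toy problem: minimise f(w) = sum(w_i ** 2), whose gradient is 2 * w.
weights = [1.0, -2.0]
for _ in range(3):
    grads = [2.0 * w for w in weights]
    weights = sgd_step(weights, grads, lr=0.25)

# Each step halves the weights, so three steps divide them by 8.
print(weights)
```

With `lr=0.25` each step multiplies the weights by `0.5`, which makes the effect of the learning rate easy to follow by hand.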


## 2.2 Sklearn `Training Plan`

For another machine learning framework such as scikit-learn, the syntax is the same:

In [None]:
import numpy as np
from torchvision import datasets, transforms
from fedbiomed.common.training_plans import FedSGDClassifier
from fedbiomed.common.data import DataManager
from fedbiomed.common.optimizers.optimizer import Optimizer


class SGDClassifierTrainingPlan(FedSGDClassifier):
    # Declares and returns dependencies
    def init_dependencies(self):
        deps = ["import numpy as np",
                "from torchvision import datasets, transforms",
                "from fedbiomed.common.optimizers.optimizer import Optimizer"]
        return deps

    def training_data(self, batch_size):
        preprocess = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize(
                                            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                        )])
        train_data = datasets.ImageFolder(self.dataset_path, transform=preprocess)
        # scikit-learn expects 2D arrays: flatten each image into one feature vector
        # (this loads the whole dataset in memory)
        X_train = np.stack([image.numpy().reshape(-1) for image, _ in train_data])
        Y_train = np.array(train_data.targets)
        return DataManager(dataset=X_train, target=Y_train, batch_size=batch_size)

    # Defines and returns a declearn optimizer
    def init_optimizer(self, optimizer_args):
        return Optimizer(lr=optimizer_args['lr'])

## 2.3 Using a more advanced `Optimizer` with `Regularizer`

An `Optimizer` from `fedbiomed.common.optimizers.optimizer` with a learning rate of `.1` can be written as `Optimizer(lr=.1, decay=0., modules=[], regularizers=[])`, where:

- `decay` is the weight decay;
- `modules` is a Python list containing one or several [`declearn` `OptiModules`](https://magnet.gitlabpages.inria.fr/declearn/docs/2.2/api-reference/optimizer/modules/OptiModule/);
- `regularizers` is a Python list containing one or several [`declearn` `Regularizers`](https://magnet.gitlabpages.inria.fr/declearn/docs/2.2/api-reference/optimizer/regularizers/Regularizer/).
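
To make the role of these four arguments concrete, here is a minimal pure-Python sketch of how they might be chained into a single update. It assumes the order in which `declearn` is understood to apply them: regularizers first, then modules, then the learning rate and weight decay. The functions `ridge_regularizer`, `scaling_module` and `compute_updates` are hypothetical stand-ins written for this illustration, not the actual `declearn` or Fed-BioMed classes.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def ridge_regularizer(alpha: float) -> Callable[[Vector, Vector], Vector]:
    """Illustrative ridge term: adds 2 * alpha * w to each gradient."""
    def run(grads: Vector, weights: Vector) -> Vector:
        return [g + 2.0 * alpha * w for g, w in zip(grads, weights)]
    return run


def scaling_module(factor: float) -> Callable[[Vector], Vector]:
    """Illustrative stand-in for an OptiModule: rescales the gradients."""
    def run(grads: Vector) -> Vector:
        return [factor * g for g in grads]
    return run


def compute_updates(
    grads: Vector,
    weights: Vector,
    lr: float,
    decay: float = 0.0,
    modules: Sequence[Callable[[Vector], Vector]] = (),
    regularizers: Sequence[Callable[[Vector, Vector], Vector]] = (),
) -> Vector:
    """Chain regularizers, then modules, then apply lr and weight decay."""
    for regularizer in regularizers:
        grads = regularizer(grads, weights)
    for module in modules:
        grads = module(grads)
    # The returned updates are meant to be added to the weights.
    return [-(lr * g + decay * w) for g, w in zip(grads, weights)]


updates = compute_updates(
    grads=[0.5, -0.5],
    weights=[1.0, 2.0],
    lr=0.1,
    modules=[scaling_module(2.0)],
    regularizers=[ridge_regularizer(0.25)],
)
print(updates)
```

Because every stage only maps gradients to gradients, stages can be added or removed from the lists without touching the rest of the pipeline, which is what makes chaining several modules possible.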

We will re-use the PyTorch `Training Plan` already defined above and show how to use an `Adam` `Optimizer` with `Ridge` as the `Regularizer`. For that, we need to import the `declearn` versions of `Adam` and `Ridge` (`AdamModule` and `RidgeRegularizer`).

Then the `Training Plan` can be defined as follows:

In [None]:
import torch
import torch.nn as nn
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms
from torchvision.models import densenet121
from fedbiomed.common.optimizers.optimizer import Optimizer
from declearn.optimizer.modules import AdamModule
from declearn.optimizer.regularizers import RidgeRegularizer

# Here we define the model to be used.
# We will use the densenet121 model.
class MyTrainingPlan(TorchTrainingPlan):
    
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms",
                "from torchvision.models import densenet121",
                "from fedbiomed.common.optimizers.optimizer import Optimizer",
                "from declearn.optimizer.modules import AdamModule",
                "from declearn.optimizer.regularizers import RidgeRegularizer"]

        return deps
    
    def init_model(self):
        self.loss_function = torch.nn.CrossEntropyLoss()
        model = densenet121(pretrained=True)
        model.classifier =nn.Sequential(nn.Linear(1024,512), nn.Softmax())
        return model 
    
    def init_optimizer(self, optimizer_args):
        # Defines and returns a declearn optimizer
        # that chains an Adam module with a Ridge regularizer
        return Optimizer(lr=optimizer_args['lr'], modules=[AdamModule()], regularizers=[RidgeRegularizer()])

    def training_data(self, batch_size = 48):
        preprocess = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize(
                                            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
                                        )])
        train_data = datasets.ImageFolder(self.dataset_path,transform = preprocess)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=train_data, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.model().forward(data)
        loss   = self.loss_function(output, target)
        return loss
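
For intuition about what an Adam module does to the gradients before the learning rate is applied, the following pure-Python sketch implements the textbook Adam adjustment with bias correction. The function `adam_adjust` and its default hyperparameters are assumptions chosen for illustration; they are not taken from the `AdamModule` source, and the ridge term is left out to keep the example short.

```python
import math
from typing import List, Tuple

Vector = List[float]


def adam_adjust(
    grads: Vector,
    m: Vector,
    v: Vector,
    step: int,
    beta_1: float = 0.9,
    beta_2: float = 0.999,
    eps: float = 1e-7,
) -> Tuple[Vector, Vector, Vector]:
    """Return Adam-adjusted gradients and the updated moment estimates."""
    # Exponential moving averages of the gradients and squared gradients.
    m = [beta_1 * mi + (1.0 - beta_1) * g for mi, g in zip(m, grads)]
    v = [beta_2 * vi + (1.0 - beta_2) * g * g for vi, g in zip(v, grads)]
    # Bias correction compensates for the zero initialisation of m and v.
    m_hat = [mi / (1.0 - beta_1 ** step) for mi in m]
    v_hat = [vi / (1.0 - beta_2 ** step) for vi in v]
    adjusted = [mh / (math.sqrt(vh) + eps) for mh, vh in zip(m_hat, v_hat)]
    return adjusted, m, v


weights = [1.0, -2.0]
lr = 1e-3
grads = [0.5, -0.25]
adjusted, m, v = adam_adjust(grads, m=[0.0, 0.0], v=[0.0, 0.0], step=1)
# The learning rate is applied after the module has adjusted the gradients.
weights = [w - lr * a for w, a in zip(weights, adjusted)]
print(adjusted, weights)
```

On the first step the adjusted gradients are close to the sign of the raw gradients, so every weight moves by roughly `lr` regardless of the gradient's magnitude.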


## 2.4 Create the `Experiment`

Once the `Training Plan` has been created with a specific framework model, the `Experiment` is defined in the same way as for a regular PyTorch or scikit-learn `Training Plan`.

In [None]:
model_args = {}

training_args = {
    'batch_size': 8,
    'optimizer_args': {
        "lr" : 1e-3
    },
    'dry_run': False,
    'num_updates': 50
}

tags =  ['#dataset', '#MEDNIST']
rounds = 2
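
As a quick sanity check on these settings, the snippet below works out how much training each `Node` performs. It is plain Python and assumes that `num_updates` counts one optimizer step per batch on each `Node`; the variable names `samples_per_round` and `updates_per_node` are introduced here for illustration only.

```python
training_args = {
    "batch_size": 8,
    "optimizer_args": {"lr": 1e-3},
    "dry_run": False,
    "num_updates": 50,
}
rounds = 2

# Assumption: one optimizer step is taken per batch.
samples_per_round = training_args["batch_size"] * training_args["num_updates"]
updates_per_node = training_args["num_updates"] * rounds
# This is the value that init_optimizer reads from optimizer_args['lr'].
lr = training_args["optimizer_args"]["lr"]

print(samples_per_round, updates_per_node, lr)
```

Under that assumption, each `Node` sees 400 samples per round and takes 100 optimizer steps over the whole experiment.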

In [None]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators import FedAverage
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy


exp = Experiment()
exp.set_training_plan_class(training_plan_class=MyTrainingPlan)
exp.set_model_args(model_args=model_args)
exp.set_training_args(training_args=training_args)
exp.set_tags(tags = tags)
exp.set_aggregator(aggregator=FedAverage())
exp.set_round_limit(rounds)
exp.set_training_data(training_data=None, from_tags=True)
exp.set_job()
exp.set_strategy(node_selection_strategy=DefaultStrategy)

exp.run(increase=True)
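
The experiment above aggregates the `Node` models with `FedAverage`. As a rough illustration of the idea, the sketch below averages parameter vectors weighted by the number of samples each `Node` trained on. `fed_average` is a hypothetical helper operating on flat Python lists; it is not the Fed-BioMed implementation, which works on full model parameters.

```python
from typing import List


def fed_average(node_weights: List[List[float]], sample_counts: List[int]) -> List[float]:
    """Average parameter vectors, weighting each node by its sample count."""
    total = sum(sample_counts)
    n_params = len(node_weights[0])
    return [
        sum(w[i] * n for w, n in zip(node_weights, sample_counts)) / total
        for i in range(n_params)
    ]


# Two nodes with different amounts of data: the larger node weighs more.
node_weights = [[1.0, 2.0], [3.0, 6.0]]
sample_counts = [100, 300]
global_weights = fed_average(node_weights, sample_counts)
print(global_weights)
```

With heterogeneous datasets, this plain weighted average is the baseline that `Researcher`-side optimizers aim to improve on.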

# 3. Defining an `Optimizer` on `Researcher` side: `FedOpt`

# 4. Defining `Scaffold` through `Optimizer`

# 5. Explore advanced `Optimizer` features through `declearn` and the Fed-BioMed user guide