# In Depth Experiment Configuration

The `Experiment` class provides an interface for managing your experiment with backward compatibility: even after an experiment has been built, you can still reconfigure its parameters. This gives you more control over the experiment, even after running it for several rounds. This tutorial explains the experiment interface in detail, using the basic MNIST example.

## Configuring Environment
Before running this notebook, you need to configure your environment by completing the following steps:

### Deploying MNIST Dataset in the Node
Please run the following command to add the MNIST dataset to your node. This command deploys the MNIST dataset in your default node, whose config file `config_node.ini` is located in the `${FEDBIOMED_DIR}/etc` directory.

After running the command, please select data type `2) default`, use the default `tags` and select the folder where the MNIST dataset will be saved.

```shell
${FEDBIOMED_DIR}/scripts/fedbiomed_run node add
```

### Starting the Node
After you have successfully completed the previous step, please run the following command to start your node.

```shell
${FEDBIOMED_DIR}/scripts/fedbiomed_run node start
```

## Creating a Training Plan

Before declaring an experiment, the training plan that will be used for federated training should be defined. The training plan used here is exactly the same as the one created in the Basic MNIST tutorial. We recommend following the Basic MNIST tutorial on the PyTorch framework to understand the following steps.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F  # used as `F` in Net.forward below
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms


# Here we define the training plan to be used.
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingPlan):
    
    # Defines and return model 
    def init_model(self, model_args):
        return self.Net(model_args = model_args)
    
    # Defines and return optimizer
    def init_optimizer(self, optimizer_args):
        return torch.optim.Adam(self.model().parameters(), lr = optimizer_args["lr"])
    
    # Declares and return dependencies
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms"]
        return deps
    
    class Net(nn.Module):
        def __init__(self, model_args):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
            self.conv2 = nn.Conv2d(32, 64, 3, 1)
            self.dropout1 = nn.Dropout(0.25)
            self.dropout2 = nn.Dropout(0.5)
            self.fc1 = nn.Linear(9216, 128)
            self.fc2 = nn.Linear(128, 10)

        def forward(self, x):
            x = self.conv1(x)
            x = F.relu(x)
            x = self.conv2(x)
            x = F.relu(x)
            x = F.max_pool2d(x, 2)
            x = self.dropout1(x)
            x = torch.flatten(x, 1)
            x = self.fc1(x)
            x = F.relu(x)
            x = self.dropout2(x)
            x = self.fc2(x)


            output = F.log_softmax(x, dim=1)
            return output

    def training_data(self):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = { 'shuffle': True}
        return DataManager(dataset=dataset1, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.model().forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss
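
The input size of `fc1` (`9216`) follows from the layer shapes above. The short calculation below is a sketch of that arithmetic only, assuming 28x28 MNIST inputs and no padding; `conv2d_out` is a helper introduced here for illustration and is not part of Fed-BioMed or PyTorch.

```python
def conv2d_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


side = 28                                    # MNIST images are 28x28
side = conv2d_out(side, kernel=3)            # conv1 (3x3, stride 1): 28 -> 26
side = conv2d_out(side, kernel=3)            # conv2 (3x3, stride 1): 26 -> 24
side = conv2d_out(side, kernel=2, stride=2)  # max_pool2d(x, 2): 24 -> 12

channels = 64                                # output channels of conv2
flattened = channels * side * side           # features seen by fc1 after flatten
print(flattened)
```

If you change the kernel sizes or the number of channels, the first argument of `nn.Linear` in `fc1` has to change accordingly.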


## Creating an Experiment Step by Step  

The experiment class can be created without passing any argument. This will just build an empty experiment object. Afterwards, you will be able to define your arguments using setters of the experiment object.


<div class="note"><p>It is always possible to create a fully configured experiment by passing all arguments during the initialization. You can also create your experiment with some of the arguments and set the other arguments after.</p></div>

### Building an Empty Experiment


After building an empty experiment you won't be able to perform federated training, since it is not fully configured. That's why the output of the initialization will always remind you that the experiment is not fully configured.

In [2]:
from fedbiomed.researcher.federated_workflows import Experiment
exp = Experiment()

2024-01-09 11:30:11,495 fedbiomed INFO - Starting researcher service...
2024-01-09 11:30:11,496 fedbiomed INFO - Waiting 3s for nodes to connect...
2024-01-09 11:30:11,830 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:30:11,831 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:30:14,511 fedbiomed DEBUG - Experiment not fully configured yet: no training data
2024-01-09 11:30:14,515 fedbiomed DEBUG - Experiment not fully configured yet: no node selection strategy


### Displaying Current Status of Experiment
In addition to the initialization output, you can call the `info()` method of your experiment object to find out more about its current status. This method prints information about your experiment and what you still need to complete before you can start federated training.

In [3]:
exp.info()

Arguments             Values
--------------------  ------------------------------------------------------------
Aggregator            FedAverage
Strategy              None
Aggregator Optimizer  None
Rounds already run    0
Rounds total          None
Breakpoint State      False
Training Plan Class   None
Model Arguments       None
Tags                  None
Nodes filter          None
Training Data         None
Training Arguments    {'optimizer_args': {}, 'loader_args': {}, 'epochs': None, 'n
                      um_updates': None, 'dry_run': False, 'batch_maxnum': None, '
                      test_ratio': 0.0, 'test_on_local_updates': False, 'test_on_g
                      lobal_updates': False, 'test_metric': None, 'test_metric_arg
                      s': {}, 'log_interval': 10, 'fedprox_mu': None, 'use_gpu': F
                      alse, 'dp_args': None, 'share_persistent_buffers': True, 'ra
                      ndom_seed': None}
Experiment folder     Experiment_0005
Experiment 

{'Arguments': ['Aggregator',
  'Strategy',
  'Aggregator Optimizer',
  'Rounds already run',
  'Rounds total',
  'Breakpoint State',
  'Training Plan Class',
  'Model Arguments',
  'Tags',
  'Nodes filter',
  'Training Data',
  'Training Arguments',
  'Experiment folder',
  'Experiment Path',
  'Secure Aggregation'],
 'Values': ['FedAverage',
  'None',
  'None',
  '0',
  'None',
  'False',
  'None',
  'None',
  'None',
  'None',
  'None',
  "{'optimizer_args': {}, 'loader_args': {}, 'epochs': None, 'n\num_updates': None, 'dry_run': False, 'batch_maxnum': None, '\ntest_ratio': 0.0, 'test_on_local_updates': False, 'test_on_g\nlobal_updates': False, 'test_metric': None, 'test_metric_arg\ns': {}, 'log_interval': 10, 'fedprox_mu': None, 'use_gpu': F\nalse, 'dp_args': None, 'share_persistent_buffers': True, 'ra\nndom_seed': None}",
  'Experiment_0005',
  '/home/ybouilla/github/fedbiomed_ssh/fedbiomed/var/experiment\ns/Experiment_0005',
  '- Using: <fedbiomed.researcher.secagg._secure_aggrega

Based on the output, some arguments are defined with default values, while others are not. Model arguments, training arguments, tags, round limit, training data, etc. have no usable default value and must be set. However, these arguments are related to each other. For example, to define your federated training data you need to define the `tags` first; then, while setting the training data argument, the experiment can send a search request to the nodes to receive information about their datasets. These relations between the arguments are explained in the following steps.
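
One way to picture these relations is as a small dependency map. The sketch below is purely illustrative: the dictionary keys and the helpers `missing_settings` and `blocked_by` are hypothetical names made up for this example and do not correspond to Fed-BioMed attributes or methods.

```python
# Hypothetical dependency map based on the relations described in this
# tutorial; the keys are illustrative names, not Fed-BioMed attributes.
REQUIRES = {
    "training_data": ["tags"],       # the search request needs tags first
    "strategy": ["training_data"],   # the strategy selects among nodes with data
}


def missing_settings(config):
    """Names of the settings that are still unset (None)."""
    return [name for name, value in config.items() if value is None]


def blocked_by(name, config):
    """Unset prerequisites that must be defined before `name` can be set."""
    return [dep for dep in REQUIRES.get(name, []) if config.get(dep) is None]


config = {"tags": None, "training_data": None, "strategy": None, "round_limit": None}
print(missing_settings(config))              # everything is still unset
print(blocked_by("training_data", config))   # tags must come first

config["tags"] = ["#MNIST", "#dataset"]
print(blocked_by("training_data", config))   # nothing blocks training data now
```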


--------------------
Fed-BioMed researcher stopped due to exception:
Experiment not fully configured yet: no job. Missing training data
--------------------


FedbiomedSilentTerminationError: 

2024-01-09 11:31:11,824 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:31:11,826 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks


### Setting Model and Training Arguments
In a previous step, the training plan was defined. Now you can define the model arguments and training arguments, which are used respectively for building your model class and for training your model on the node side. The methods `set_model_args` and `set_training_args` of the experiment class allow you to set these arguments.

<div class="">
    <p>There isn't any requirement on the order of defining the model class and the model/training arguments. It is also possible to
        define the model/training arguments first and the model class afterwards.
    </p>
</div>


In [7]:
# Model arguments should be an empty Dict, since our model does not require 
# any argument for initialization
model_args = {}

# Training Arguments
training_args = {
    'loader_args': { 'batch_size': 48, }, 
    'optimizer_args': {
        'lr': 1e-3, 
    },
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}

exp.set_model_args(model_args=model_args)
exp.set_training_args(training_args=training_args)

{'loader_args': {'batch_size': 48},
 'optimizer_args': {'lr': 0.001},
 'epochs': 1,
 'dry_run': False,
 'batch_maxnum': 100,
 'num_updates': None,
 'test_ratio': 0.0,
 'test_on_local_updates': False,
 'test_on_global_updates': False,
 'test_metric': None,
 'test_metric_args': {},
 'log_interval': 10,
 'fedprox_mu': None,
 'use_gpu': False,
 'dp_args': None,
 'share_persistent_buffers': True,
 'random_seed': None}

2024-01-09 11:00:36,626 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
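
To see what the `batch_maxnum` fast pass means in practice, the arithmetic below estimates how many samples each node processes per round with the arguments set above. It is a back-of-the-envelope sketch, assuming the full 60000-sample MNIST training set on each node and `test_ratio` equal to `0.0`.

```python
import math

dataset_size = 60000   # MNIST training samples available on a node
batch_size = 48        # loader_args['batch_size']
batch_maxnum = 100     # training_args['batch_maxnum']
epochs = 1             # training_args['epochs']

# Batches needed to see the whole dataset once
batches_per_epoch = math.ceil(dataset_size / batch_size)

# batch_maxnum caps the number of batches actually processed per epoch
batches_used = min(batches_per_epoch, batch_maxnum)

# Samples processed per round on one node
samples_used = min(dataset_size, batches_used * batch_size) * epochs

print(batches_per_epoch, batches_used, samples_used)
```

With these values, only 100 of the 1250 available batches (4800 samples) are used per round, which keeps development runs short.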


### Setting Tags

The tags for the dataset search request can be set using the `set_tags` method of the experiment object.

<br><div class="note"><p>Setting tags does not send a dataset search request. The search request is sent when setting the training data; `tags` is the argument required for that search request.</p></div>

The `tags` argument of the `set_tags` method should be a list of tags of type `string`, or a single tag of type `string`.

In [8]:
tags = ['#MNIST', '#dataset']
exp.set_tags(tags = tags)

['#MNIST', '#dataset']

To see the tags that are set, you can call the `tags()` method of the experiment object.

In [9]:
exp.tags()

['#MNIST', '#dataset']
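
Accepting either a single string or a list of strings is a common pattern. The helper below is a minimal sketch of such input normalization; `normalize_tags` is a hypothetical function written for this example and is not how Fed-BioMed validates tags internally.

```python
from typing import List, Union


def normalize_tags(tags: Union[str, List[str]]) -> List[str]:
    """Return tags as a list of strings, accepting a single string too."""
    if isinstance(tags, str):
        return [tags]
    if isinstance(tags, list) and all(isinstance(tag, str) for tag in tags):
        return list(tags)
    raise TypeError("tags must be a string or a list of strings")


print(normalize_tags("#MNIST"))
print(normalize_tags(["#MNIST", "#dataset"]))
```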

### Setting Nodes
The `nodes` argument indicates the nodes that are going to be used for the experiment. By default, it is `None`, which means that every node that is up and running will be part of the experiment, as long as it has the dataset that is going to be used for training. If `nodes` has been set in advance, the dataset search request will be sent only to the indicated nodes. You can set nodes using the method `set_nodes(nodes=nodes)`. This method takes a `nodes` argument, which should be a list of node ids of type `string`, or a single node id as a `string`.

Since node ids can differ from one environment to another, we won't set nodes for the experiment, so that this notebook remains runnable in all environments.
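
The effect of the `nodes` filter on the dataset search can be sketched with plain dictionaries. Everything below is illustrative: `select_nodes` and the node ids `NODE_a`, `NODE_b`, `NODE_c` are made up, and the rule that a dataset matches when it carries all requested tags is an assumption of this sketch.

```python
from typing import Dict, List, Optional, Union


def select_nodes(
    available: Dict[str, List[str]],
    tags: List[str],
    nodes: Optional[Union[str, List[str]]] = None,
) -> List[str]:
    """Return ids of nodes whose dataset tags include all requested tags.

    `available` maps a node id to the tags of its dataset. When `nodes`
    is None every node is considered, otherwise only the listed ones.
    """
    if isinstance(nodes, str):
        nodes = [nodes]
    wanted = set(tags)
    return sorted(
        node_id
        for node_id, node_tags in available.items()
        if (nodes is None or node_id in nodes) and wanted <= set(node_tags)
    )


available = {
    "NODE_a": ["#MNIST", "#dataset"],
    "NODE_b": ["#MNIST", "#dataset"],
    "NODE_c": ["#other"],
}
print(select_nodes(available, ["#MNIST", "#dataset"]))
print(select_nodes(available, ["#MNIST", "#dataset"], nodes=["NODE_b", "NODE_c"]))
```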


### Setting Training Data
Training data is a `FederatedDataSet` instance, which comes from the module `fedbiomed.researcher.datasets`. There are several ways to define your training data.

1. You can run `set_training_data(training_data=None, from_tags=True)`. This sends a search request to the nodes to get dataset information using the `tags` defined earlier.
2. You can provide the `training_data` argument as an instance of `FederatedDataSet`.
3. You can provide the `training_data` argument as a Python `dict`, and the setter will create a `FederatedDataSet` object itself.

When using the last option, please make sure that your `dict` object is consistent with the `FederatedDataSet` schema. Otherwise, you might get an error while running your experiment.

A `FederatedDataSet` object must have **one unique** dataset per node to ensure training uses only one dataset for each node. This is checked and enforced when creating a `FederatedDataSet`.

If you run `set_training_data(training_data=None)` without `from_tags=True`, no training data is defined for the experiment (`training_data` is set to `None`).


In [10]:
training_data = exp.set_training_data(training_data=None, from_tags=True)

2024-01-09 11:00:42,914 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:00:42,917 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:00:42,919 fedbiomed INFO - Node selected for training -> NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
2024-01-09 11:00:42,920 fedbiomed INFO - Node selected for training -> NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b


Since this sends a search request to the nodes, the output informs you about the nodes selected for training. It means that those nodes have the dataset and are able to train your model.

`set_training_data` returns a `FederatedDataSet` object. You can use either the return value of the setter or the getter for training data, which is `training_data()`.

In [11]:
training_data = exp.training_data()

2024-01-09 11:00:51,485 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks


To inspect the result in detail, you can call the method `data()` of the `FederatedDataSet` object. This returns a Python dictionary that includes information about the datasets that have been found on the nodes.

In [12]:
training_data.data()

{'NODE_2277a35a-722f-4a30-8a60-85cf8d873d82': {'name': 'MNIST',
  'data_type': 'default',
  'tags': ['#MNIST', '#dataset'],
  'description': 'MNIST database',
  'shape': [60000, 1, 28, 28],
  'dataset_id': 'dataset_12994421-68ab-4c5a-b589-18633081ec9e',
  'dtypes': [],
  'dataset_parameters': None},
 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b': {'name': 'MNIST',
  'data_type': 'default',
  'tags': ['#MNIST', '#dataset'],
  'description': 'MNIST database',
  'shape': [60000, 1, 28, 28],
  'dataset_id': 'dataset_076711ca-794f-44b8-a784-5b30cc7f6df8',
  'dtypes': [],
  'dataset_parameters': None}}

As mentioned before, setting the training data once doesn't mean that you can't change it. You can create a new `FederatedDataSet` with a `dict` that includes the information about the datasets. This allows you to select the datasets that will be used for federated training.

<div class="note"><p>Since the dataset information is provided, there is no need to send a request to the nodes.</p></div>

In [13]:
from fedbiomed.researcher.datasets import FederatedDataSet 

tr_data = training_data.data()
federated_dataset = FederatedDataSet(tr_data)
exp.set_training_data(training_data = federated_dataset)

<fedbiomed.researcher.datasets.FederatedDataSet at 0x7fe6bc8e7010>

Or, you can directly use `tr_data` in `set_training_data()`:

In [14]:
exp.set_training_data(training_data = tr_data)

<fedbiomed.researcher.datasets.FederatedDataSet at 0x7fe6bc8e5270>
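
Because the dataset information is an ordinary dictionary keyed by node id, selecting a subset of datasets comes down to filtering that dictionary. The sketch below uses a shortened, hand-written stand-in for the output of `training_data.data()`; the node ids `NODE_a` and `NODE_b` are placeholders, and real entries contain more fields than shown here.

```python
# Shortened stand-in for the dictionary returned by `training_data.data()`
tr_data = {
    "NODE_a": {"name": "MNIST", "tags": ["#MNIST", "#dataset"], "shape": [60000, 1, 28, 28]},
    "NODE_b": {"name": "MNIST", "tags": ["#MNIST", "#dataset"], "shape": [60000, 1, 28, 28]},
}

# Keep only the nodes we want to train on
keep = {"NODE_a"}
selected = {node_id: meta for node_id, meta in tr_data.items() if node_id in keep}

# Total number of samples across the selected datasets
total_samples = sum(meta["shape"][0] for meta in selected.values())

print(list(selected))
print(total_samples)
```

In a real session you would filter the dictionary returned by `data()` itself, so that every entry keeps its complete set of fields, and then pass the result to `set_training_data` as in the cells above.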

<div class="note">
    <p>
        If you change the tags for the dataset by using <code>set_tags</code> and training data is already defined in your experiment object, you have to update your training data, for example by running <code>exp.set_training_data(training_data=None, from_tags=True)</code> to search again with the new tags.
    </p>
</div>

### Setting Training Plan for The Experiment

The training plan that is going to be used for training can be set in the experiment using the method `set_training_plan_class`.

In [None]:
exp.set_training_plan_class(training_plan_class=MyTrainingPlan)

### Setting an Aggregator  

An aggregator is one of the required arguments for the experiment. It is used for aggregating the model parameters received from the nodes after every round. By default, when the experiment is initialized without an aggregator, it uses the default `FedAverage` aggregator class. However, it is also possible to set a different aggregation algorithm with the method `set_aggregator`. Currently, Fed-BioMed has only `FedAverage`, but it is possible to create custom aggregator classes.

You can see the current aggregator by running `exp.aggregator()`. It returns the aggregator object that will be used for aggregation.

In [16]:
exp.aggregator()

<fedbiomed.researcher.aggregators.fedavg.FedAverage at 0x7fe6bc8e5c00>

Supposing that you have created your own aggregator, you can set it as follows:

In [17]:
from fedbiomed.researcher.aggregators.fedavg import FedAverage
exp.set_aggregator(aggregator=FedAverage)

<fedbiomed.researcher.aggregators.fedavg.FedAverage at 0x7fe6bc8e70d0>

2024-01-09 11:01:30,848 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:01:51,487 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks


If your aggregator class needs initialization parameters, you can instantiate it and pass the resulting object.

In [18]:
fed_average = FedAverage()
exp.set_aggregator(aggregator=fed_average)

<fedbiomed.researcher.aggregators.fedavg.FedAverage at 0x7fe6bc8e7d90>
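
To make the role of the aggregator concrete, here is a minimal sketch of weighted federated averaging on plain Python lists. It illustrates the general FedAvg idea only; it is not Fed-BioMed's `FedAverage` implementation, and the function name, the parameter name `fc2.bias` and the use of sample counts as weights are assumptions of this example.

```python
from typing import Dict, List


def federated_average(
    updates: List[Dict[str, List[float]]],
    weights: List[float],
) -> Dict[str, List[float]]:
    """Weighted average of per-node parameters, parameter by parameter."""
    total = sum(weights)
    averaged = {}
    for name in updates[0]:
        size = len(updates[0][name])
        averaged[name] = [
            sum(w * update[name][i] for update, w in zip(updates, weights)) / total
            for i in range(size)
        ]
    return averaged


# Two nodes report the same parameter; the second node has 3x more samples
node_updates = [{"fc2.bias": [0.0, 2.0]}, {"fc2.bias": [4.0, 6.0]}]
sample_counts = [1, 3]

result = federated_average(node_updates, sample_counts)
print(result)
```

The node with more samples pulls the average towards its own values, which is the usual motivation for weighting by dataset size.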

### Setting Node Selection Strategy

The node selection strategy is also one of the required arguments for the experiment. It is used for selecting nodes before each round of training. Since the strategy selects among nodes, the training data should already be set before setting the strategy. The strategy will then know which nodes currently hold a dataset.

By default, `set_strategy(node_selection_strategy=None)` uses the `DefaultStrategy` class, which selects all the nodes currently available with their datasets. However, it is also possible to set different strategies. Currently, Fed-BioMed has only `DefaultStrategy`, but you can create your own custom strategy classes.



In [19]:
exp.set_strategy(node_selection_strategy=None)

<fedbiomed.researcher.strategies.default_strategy.DefaultStrategy at 0x7fe6bc8e66b0>

2024-01-09 11:02:30,849 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks


Or, you can directly pass `DefaultStrategy`

In [20]:
from fedbiomed.researcher.strategies.default_strategy import DefaultStrategy
exp.set_strategy(node_selection_strategy=DefaultStrategy)

# To make sure the strategy has been set
exp.strategy()

<fedbiomed.researcher.strategies.default_strategy.DefaultStrategy at 0x7fe6bc8e7e80>
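
The idea of a node selection strategy can be sketched independently of Fed-BioMed. The two classes below are hypothetical and do not follow the interface of Fed-BioMed's strategy classes: `AllNodes` mimics the "select every available node" behaviour described above, while `RandomFraction` shows what an alternative strategy could look like.

```python
import random
from typing import List


class AllNodes:
    """Select every node that holds a dataset (default-like behaviour)."""

    def sample(self, node_ids: List[str], round_index: int) -> List[str]:
        return list(node_ids)


class RandomFraction:
    """Select a random fraction of the nodes, reproducibly per round."""

    def __init__(self, fraction: float, seed: int = 0):
        self.fraction = fraction
        self.seed = seed

    def sample(self, node_ids: List[str], round_index: int) -> List[str]:
        rng = random.Random(self.seed + round_index)  # one stream per round
        count = max(1, round(self.fraction * len(node_ids)))
        return sorted(rng.sample(list(node_ids), count))


nodes = ["NODE_a", "NODE_b", "NODE_c", "NODE_d"]
print(AllNodes().sample(nodes, round_index=0))
print(RandomFraction(0.5).sample(nodes, round_index=0))
```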

### Setting Round Limit

The round limit is the maximum number of training rounds. By default, it is `None`, and it needs to be set before running your experiment. You can set the round limit with the method `set_round_limit`. The round limit can be changed after running one or several rounds of training. You can always execute `exp.round_limit()` to see the current round limit.

In [21]:
exp.set_round_limit(round_limit=2)
exp.round_limit()

2

### Setting Job to Manage Federated Training Rounds

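`Job` is a class that manages federated training rounds. Before setting the job, the node selection strategy, the training plan and the training data should be set, so please make sure they are all defined first. The method `set_job` creates the `Job` instance and does not take any argument. Note that in the Fed-BioMed version that produced the output below, `Experiment` has no `set_job` method and the call raises an `AttributeError`, so this step may not apply to your installed version.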

In [22]:
exp.set_job()
exp.job()

AttributeError: 'Experiment' object has no attribute 'set_job'

2024-01-09 11:02:51,484 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:03:06,621 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:03:30,847 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:03:44,424 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:04:06,616 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:04:14,014 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:04:44,423 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:04:55,620 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:05:14,008 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for th

### Setting Secure Aggregation

Secure aggregation enables nodes to send encrypted updates to the researcher. The researcher therefore cannot read the cleartext updates of individual nodes, but can still aggregate them. The aggregated model parameters are in cleartext, and the researcher can read them.

The method `set_use_secagg` toggles the use of secure aggregation. When set to `True`, a secure aggregation context (cryptographic material) for the experiment is negotiated between the experiment parties if it does not exist yet; otherwise the existing context is reused. Secure aggregation is then activated for the experiment. Note that in the Fed-BioMed version that produced the outputs below, `set_use_secagg` and `use_secagg` raise an `AttributeError`, so these calls may not apply to your installed version.

Secure aggregation needs at least 2 active nodes in the experiment (thus 3 parties including the researcher).
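
The property described above (individual updates are hidden, the aggregate is readable) can be illustrated with a toy additive masking scheme. This is explicitly not the cryptographic scheme used by Fed-BioMed; it is a simplified stand-in chosen for illustration, with integer updates, a fixed seed and a made-up `mask_updates` helper.

```python
import random

MODULUS = 2**32


def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel out when all masked updates are summed."""
    rng = random.Random(seed)
    masked = list(updates)
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            mask = rng.randrange(MODULUS)            # secret shared by nodes i and j
            masked[i] = (masked[i] + mask) % MODULUS  # node i adds the mask
            masked[j] = (masked[j] - mask) % MODULUS  # node j subtracts it
    return masked


updates = [12, 30]                  # (quantized) updates of two nodes
masked = mask_updates(updates)      # what the researcher receives
aggregate = sum(masked) % MODULUS   # masks cancel, only the sum is revealed

print(aggregate)
```

With a single node there would be no partner to share a mask with, which gives an intuition for why at least two active nodes are required.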

In [23]:
exp.set_use_secagg(use_secagg=True)

AttributeError: 'Experiment' object has no attribute 'set_use_secagg'

2024-01-09 11:13:08,316 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:13:14,832 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:13:34,413 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:13:39,370 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks


In [24]:
print("Using secagg: ", exp.use_secagg())
exp_servkey, exp_biprime = exp.secagg_context()
if exp_servkey:
    print(f"Secagg servkey:\n- status {exp_servkey.status()}\n- secagg_id {exp_servkey.secagg_id()}"
          f"\n- context {exp_servkey.context()}")
else:
    print("No secagg servkey")
if exp_biprime:
    print(f"Secagg biprime:\n- status {exp_biprime.status()}\n- secagg_id {exp_biprime.secagg_id()}"
          f"\n- context {exp_biprime.context()}")
else:
    print("No secagg biprime")

AttributeError: 'Experiment' object has no attribute 'use_secagg'

In [None]:
exp.set_use_secagg(False)

### Controlling Experiment Status Before Starting Training Rounds
Now, let's see if our experiment is ready for the training.

In [25]:
exp.info()

Arguments             Values
--------------------  ------------------------------------------------------------
Aggregator            FedAverage
Strategy              <fedbiomed.researcher.strategies.default_strategy.DefaultStr
                      ategy object at 0x7fe6bc8e7e80>
Aggregator Optimizer  None
Rounds already run    0
Rounds total          2
Breakpoint State      False
Training Plan Class   <class '__main__.MyTrainingPlan'>
Model Arguments       {}
Tags                  ['#MNIST', '#dataset']
Nodes filter          None
Training Data         <fedbiomed.researcher.datasets.FederatedDataSet object at 0x
                      7fe6bc8e5270>
Training Arguments    {'loader_args': {'batch_size': 48}, 'optimizer_args': {'lr':
                       0.001}, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 100,
                       'num_updates': None, 'test_ratio': 0.0, 'test_on_local_upda
                      tes': False, 'test_on_global_updates': False, 'test_metric':
           

{'Arguments': ['Aggregator',
  'Strategy',
  'Aggregator Optimizer',
  'Rounds already run',
  'Rounds total',
  'Breakpoint State',
  'Training Plan Class',
  'Model Arguments',
  'Tags',
  'Nodes filter',
  'Training Data',
  'Training Arguments',
  'Experiment folder',
  'Experiment Path',
  'Secure Aggregation'],
 'Values': ['FedAverage',
  '<fedbiomed.researcher.strategies.default_strategy.DefaultStr\nategy object at 0x7fe6bc8e7e80>',
  'None',
  '0',
  '2',
  'False',
  "<class '__main__.MyTrainingPlan'>",
  '{}',
  "['#MNIST', '#dataset']",
  'None',
  '<fedbiomed.researcher.datasets.FederatedDataSet object at 0x\n7fe6bc8e5270>',
  "{'loader_args': {'batch_size': 48}, 'optimizer_args': {'lr':\n 0.001}, 'epochs': 1, 'dry_run': False, 'batch_maxnum': 100,\n 'num_updates': None, 'test_ratio': 0.0, 'test_on_local_upda\ntes': False, 'test_on_global_updates': False, 'test_metric':\n None, 'test_metric_args': {}, 'log_interval': 10, 'fedprox_\nmu': None, 'use_gpu': False, 'dp_args': No

2024-01-09 11:14:14,828 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks


If the experiment is ready, you will see the message `Experiment can be run now (fully defined)` at the bottom of the output. We can now run the experiment.

## Running The Experiment

As long as `info()` says that the experiment is fully defined, you will be able to run your experiment. The experiment has two methods for running training rounds: `run()` and `run_once()`.

 - `run()` runs the experiment rounds from the current round up to the round limit. If the round limit has already been reached, it indicates so and runs nothing. The method `run` also takes two arguments, `rounds` and `increase`.
    - `rounds` is an integer that indicates the number of rounds to run. If the experiment is at round `0`, the round limit is `4`, and you pass `rounds` as `3`, it will run the experiment for only `3` rounds.
    - `increase` is a boolean that indicates whether the round limit should be increased if the given `rounds` would go past the round limit. For example, if the current round is `3`, the round limit is `4`, and `rounds` is `2`, the experiment will increase the round limit to `5`.

 - `run_once()` runs the experiment for a single round of training. If the round limit has been reached, it indicates so. However, if it is executed as `run_once(increase=True)` when the round limit has been reached, it increases the round limit by one round.
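
The bookkeeping described in this list can be modelled with a small counter. The class below is a toy model for reasoning about round limits only: no training happens, `RoundCounter` is a name made up for this sketch, its defaults are sketch-only choices, and capping `rounds` at the limit when `increase` is `False` is an assumption rather than documented behaviour.

```python
from typing import Optional


class RoundCounter:
    """Toy model of the round bookkeeping of `run()` and `run_once()`."""

    def __init__(self, round_limit: int):
        self.round_limit = round_limit
        self.round_current = 0

    def run_once(self, increase: bool = False) -> int:
        if self.round_current >= self.round_limit:
            if not increase:
                return 0               # limit reached, nothing is run
            self.round_limit += 1      # make room for one more round
        self.round_current += 1
        return 1

    def run(self, rounds: Optional[int] = None, increase: bool = False) -> int:
        remaining = self.round_limit - self.round_current
        if rounds is None:
            rounds = remaining         # run up to the round limit
        elif rounds > remaining:
            if increase:
                self.round_limit = self.round_current + rounds
            else:
                rounds = remaining     # assumption: cap at the limit
        self.round_current += rounds
        return rounds


counter = RoundCounter(round_limit=2)
results = [counter.run_once(), counter.run_once(), counter.run_once(), counter.run()]
counter.round_limit = 4                                  # raise the limit
results.append(counter.run())                            # runs the 2 remaining rounds
results.append(counter.run(rounds=2, increase=True))     # limit goes from 4 to 6
results.append(counter.run(rounds=2, increase=False))    # limit reached, runs nothing
print(results, counter.round_limit, counter.round_current)
```

The returned values are the number of rounds run by each call, which is also what the cells in the rest of this section display.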

In [26]:
exp.run_once()

2024-01-09 11:14:21,363 fedbiomed INFO - Sampled nodes in round 0 ['NODE_2277a35a-722f-4a30-8a60-85cf8d873d82', 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b']
2024-01-09 11:14:21,376 fedbiomed INFO - Sending request
					 To: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:14:21,377 fedbiomed INFO - Sending request
					 To: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:14:21,380 fedbiomed INFO - Node NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 is in WAITING status. Server is waiting for receiving a request from this node to convert it as ACTIVE. Node will be updated as DISCONNECTED soon if no request received.
2024-01-09 11:14:21,441 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:14:21,578

1

2024-01-09 11:14:43,225 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:15:22,818 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:15:30,256 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:15:43,225 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks


After running the experiment once, you can check the current round. It returns `1`, which means that only one round has been run.

In [27]:
exp.round_current()

1

Now, let's run the experiment with `run_once()` again. 

In [28]:
exp.run_once()

2024-01-09 11:15:52,479 fedbiomed INFO - Sampled nodes in round 1 ['NODE_2277a35a-722f-4a30-8a60-85cf8d873d82', 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b']
2024-01-09 11:15:52,490 fedbiomed INFO - Sending request
					 To: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:15:52,493 fedbiomed INFO - Sending request
					 To: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:15:52,540 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:15:52,544 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:15:52,708 fedbiomed INFO - TRAINING
					 NODE_ID: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b 
					 Round 2 Epoch: 1 | Iteration: 1/100 (1%) | Sam

1

2024-01-09 11:16:39,546 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:16:58,095 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks


Since the round limit was set to `2`, the round limit has now been reached. If you call `run()` or `run_once()`, the experiment will indicate that the round limit has been reached.

In [29]:
exp.run_once()



0

In [30]:
exp.run()



0

After this point, if you would like to keep running the experiment, you can increase the round limit with `set_round_limit(round_limit)`.

In [31]:
exp.set_round_limit(4)
print('Round Limit    : ' , exp.round_limit())
print('Current Round  : ' , exp.round_current())

Round Limit    :  4
Current Round  :  2


The round limit of the experiment has been set to `4` and the number of completed rounds is `2`. This means that if you run the experiment with `run()` without passing any argument, it will run for `2` more rounds.

In [32]:
exp.run()

2024-01-09 11:17:33,990 fedbiomed INFO - Sampled nodes in round 2 ['NODE_2277a35a-722f-4a30-8a60-85cf8d873d82', 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b']
2024-01-09 11:17:34,002 fedbiomed INFO - Sending request
					 To: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:17:34,003 fedbiomed INFO - Sending request
					 To: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:17:34,046 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:17:34,055 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:17:34,243 fedbiomed INFO - TRAINING
					 NODE_ID: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b 
					 Round 3 Epoch: 1 | Iteration: 1/100 (1%) | Sam

2

2024-01-09 11:18:02,233 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:18:46,456 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:18:56,166 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks


Let's check the current round status of the experiment. 

In [33]:
print('Round Limit    : ' , exp.round_limit())
print('Current Round  : ' , exp.round_current())

Round Limit    :  4
Current Round  :  4


Another way to run the experiment once the round limit is reached is to pass `rounds` to the method `run()`. For example, the following cell will run the experiment for `2` more rounds.

In [34]:
exp.run(rounds=2, increase=True) # increase is True by default

2024-01-09 11:19:02,129 fedbiomed DEBUG - Auto increasing total rounds for experiment from 4 to 6
2024-01-09 11:19:02,132 fedbiomed INFO - Sampled nodes in round 4 ['NODE_2277a35a-722f-4a30-8a60-85cf8d873d82', 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b']
2024-01-09 11:19:02,137 fedbiomed INFO - Sending request
					 To: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:19:02,138 fedbiomed INFO - Sending request
					 To: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:19:02,177 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:19:02,180 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:19:02,343 fedbiomed INFO - TRAINING
					 NODE_I

2

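If the argument `increase` is `False`, the round limit is not increased automatically. Since the round limit has already been reached, the following call runs no additional rounds and returns `0`.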

In [35]:
exp.run(rounds=2, increase=False)



0

In [36]:
print('Round Limit    : ' , exp.round_limit())
print('Current Round  : ' , exp.round_current())

Round Limit    :  6
Current Round  :  6


2024-01-09 11:20:11,464 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:20:14,439 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:20:20,569 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:21:11,465 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:21:20,565 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:21:24,811 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:21:31,863 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:22:24,812 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:22:31,862 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for th

It is also possible to increase the round limit while running a single round with `run_once()`, by passing the argument `increase` as `True`.

In [37]:
exp.run_once(increase=True)

2024-01-09 11:23:10,397 fedbiomed DEBUG - Auto increasing total rounds for experiment from 6 to 7
2024-01-09 11:23:10,399 fedbiomed INFO - Sampled nodes in round 6 ['NODE_2277a35a-722f-4a30-8a60-85cf8d873d82', 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b']
2024-01-09 11:23:10,405 fedbiomed INFO - Sending request
					 To: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:23:10,406 fedbiomed INFO - Sending request
					 To: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:23:10,444 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:23:10,446 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:23:10,609 fedbiomed INFO - TRAINING
					 NODE_I

1

In [38]:
print('Round Limit    : ' , exp.round_limit())
print('Current Round  : ' , exp.round_current())

Round Limit    :  7
Current Round  :  7
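
The round bookkeeping observed in the cells above can be summarized with a small stand-alone model. This is only a sketch that reproduces the return values and round limits printed in this tutorial; it is not Fed-BioMed code. The class name `RoundCounter` is hypothetical, and two behaviours are assumptions: that `run_once()` defaults to `increase=False`, and that a call with `increase=False` is clamped to the rounds still available.

```python
class RoundCounter:
    """Toy model of the round accounting (hypothetical, not part of Fed-BioMed)."""

    def __init__(self, round_limit, round_current=0):
        self.round_limit = round_limit
        self.round_current = round_current

    def run(self, rounds=None, increase=True):
        """Advance the counter and return the number of rounds actually run."""
        remaining = self.round_limit - self.round_current
        if rounds is None:
            # No argument: run until the round limit is reached.
            rounds = remaining
        elif rounds > remaining:
            if increase:
                # Raise the limit so that all requested rounds fit.
                self.round_limit = self.round_current + rounds
            else:
                # Assumption: without `increase`, only the remaining rounds run.
                rounds = remaining
        rounds = max(rounds, 0)
        self.round_current += rounds
        return rounds

    def run_once(self, increase=False):
        """Run a single round (the default value of `increase` is an assumption)."""
        return self.run(rounds=1, increase=increase)


counter = RoundCounter(round_limit=4, round_current=2)
print(counter.run())                               # 2 (limit stays 4)
print(counter.run(rounds=2, increase=True))        # 2 (limit raised to 6)
print(counter.run(rounds=2, increase=False))       # 0 (limit already reached)
print(counter.run_once(increase=True))             # 1 (limit raised to 7)
print(counter.round_limit, counter.round_current)  # 7 7
```

The printed values match the cell outputs above: `2`, `2`, `0`, `1`, and finally a round limit and current round of `7`.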


### Changing Training Arguments for the Next Round

The method `set_training_args()` allows you to change the training arguments even after you have already run your experiment several times. This makes it possible to reconfigure the training from one round to the next. For example, we can change `batch_size` to `64` and `batch_maxnum` to `50` for the next round.


In [39]:
# Training Arguments
training_args = {
    'loader_args': { 'batch_size': 64, }, 
    'optimizer_args': {
        'lr': 1e-3
    },
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 50
}

exp.set_training_args(training_args=training_args)

{'loader_args': {'batch_size': 64},
 'optimizer_args': {'lr': 0.001},
 'epochs': 1,
 'dry_run': False,
 'batch_maxnum': 50,
 'num_updates': None,
 'test_ratio': 0.0,
 'test_on_local_updates': False,
 'test_on_global_updates': False,
 'test_metric': None,
 'test_metric_args': {},
 'log_interval': 10,
 'fedprox_mu': None,
 'use_gpu': False,
 'dp_args': None,
 'share_persistent_buffers': True,
 'random_seed': None}
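
When only a few values change between rounds, it can be convenient to derive the new dictionary from the previous one instead of retyping it. The following is a minimal sketch in plain Python, independent of Fed-BioMed: `merge_training_args` is a hypothetical helper, and the values in `previous_args` are illustrative placeholders rather than the arguments used earlier in this tutorial.

```python
import copy


def merge_training_args(base, overrides):
    """Return a copy of `base` with `overrides` applied (hypothetical helper).

    Nested dictionaries such as 'loader_args' are merged key by key,
    so untouched entries are preserved.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_training_args(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder starting values (assumption, for illustration only)
previous_args = {
    'loader_args': {'batch_size': 48},
    'optimizer_args': {'lr': 1e-3},
    'epochs': 1,
    'dry_run': False,
    'batch_maxnum': 100,
}

# Only the entries that should change for the next round
next_args = merge_training_args(
    previous_args,
    {'loader_args': {'batch_size': 64}, 'batch_maxnum': 50},
)
print(next_args)
```

The resulting dictionary could then be passed as `exp.set_training_args(training_args=next_args)`, in the same way as in the cell above. The original `previous_args` is left unmodified.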

In [40]:
exp.run_once(increase=True)

2024-01-09 11:23:22,702 fedbiomed DEBUG - Auto increasing total rounds for experiment from 7 to 8
2024-01-09 11:23:22,703 fedbiomed INFO - Sampled nodes in round 7 ['NODE_2277a35a-722f-4a30-8a60-85cf8d873d82', 'NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b']
2024-01-09 11:23:22,706 fedbiomed INFO - Sending request
					 To: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:23:22,707 fedbiomed INFO - Sending request
					 To: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b
					 Request: : TRAIN
 -----------------------------------------------------------------
2024-01-09 11:23:22,749 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:23:22,751 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:23:22,922 fedbiomed INFO - TRAINING
					 NODE_I

1

2024-01-09 11:23:47,861 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:23:49,488 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:24:47,856 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:24:49,485 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:24:54,661 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:25:04,803 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for the tasks
2024-01-09 11:25:54,657 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:25:59,620 fedbiomed DEBUG - Node: NODE_2277a35a-722f-4a30-8a60-85cf8d873d82 polling for the tasks
2024-01-09 11:26:04,798 fedbiomed DEBUG - Node: NODE_fbc33f24-6d43-4d43-9031-ed1c4981a72b polling for th

### Conclusions 
The experiment class is the interface to, and the orchestrator of, the whole process behind federated training on the researcher side. It allows you to manage your federated training experiment easily. It has been extended with setter and getter methods to ease its declaration, which also gives you more control before, during, and after the training rounds. The purpose of the experiment class is to provide a robust interface that lets end users easily perform federated training on Fed-BioMed nodes.