# Fed-BioMed secure aggregation tutorial

## Example experimentation setup

This part sets up a basic Fed-BioMed example. At this point, nothing is specific to secure aggregation.

### Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

### Setting nodes up
**At least two nodes** must be configured beforehand:
1. `./scripts/fedbiomed_run node config config_node1.ini add` (for the second node: `./scripts/fedbiomed_run node config config_node2.ini add`)
  * Select option 2 (default) to add MNIST to the node
  * Confirm the default tags by hitting "y" and ENTER
  * Pick the folder where MNIST is downloaded (this is due to a pytorch issue https://github.com/pytorch/vision/issues/3549)
  * The data must have been added (a warning saying that data must be unique means it has already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node config config_node1.ini list`
3. Run the node using `./scripts/fedbiomed_run node config config_node1.ini run`. Wait until you get `Starting task manager`, which means the node is online. Repeat steps 2 and 3 with `config_node2.ini` for the second node.

### Define an experiment model and parameters

Declare a torch training plan class, `MyTrainingPlan`, to send to the nodes for training.

In [28]:
import torch
import torch.nn as nn
import torch.nn.functional as F  # needed for F.relu, F.max_pool2d and F.log_softmax below
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms


# Here we define the model to be used. 
# You can use any class name (here 'Net')
class MyTrainingPlan(TorchTrainingPlan):
    
    # Defines and returns the model
    def init_model(self, model_args):
        return self.Net(model_args = model_args)
    
    # Defines and returns the optimizer
    def init_optimizer(self, optimizer_args):
        return torch.optim.Adam(self.model().parameters(), lr = optimizer_args["lr"])
    
    # Declares and returns the dependencies
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms"]
        return deps
    
    class Net(nn.Module):
        def __init__(self, model_args):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
            self.conv2 = nn.Conv2d(32, 64, 3, 1)
            self.dropout1 = nn.Dropout(0.25)
            self.dropout2 = nn.Dropout(0.5)
            self.fc1 = nn.Linear(9216, 128)
            self.fc2 = nn.Linear(128, 10)

        def forward(self, x):
            x = self.conv1(x)
            x = F.relu(x)
            x = self.conv2(x)
            x = F.relu(x)
            x = F.max_pool2d(x, 2)
            x = self.dropout1(x)
            x = torch.flatten(x, 1)
            x = self.fc1(x)
            x = F.relu(x)
            x = self.dropout2(x)
            x = self.fc2(x)

            output = F.log_softmax(x, dim=1)
            return output

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=dataset1, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.model().forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss
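
The input size of `fc1` (9216) follows from the shapes of the layers before it. As a quick check, the sketch below recomputes it in plain Python, assuming 28x28 MNIST inputs, no padding, and a pooling stride equal to the pooling kernel (the `max_pool2d` default). `conv_out` is a helper introduced here for illustration; it is not part of Fed-BioMed or PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


side = 28                           # MNIST images are 28x28 (assumed input size)
side = conv_out(side, 3)            # conv1: kernel 3, stride 1
side = conv_out(side, 3)            # conv2: kernel 3, stride 1
side = conv_out(side, 2, stride=2)  # max_pool2d(x, 2)
flat_features = 64 * side * side    # conv2 outputs 64 channels

print(side, flat_features)
```

If you change the kernel sizes or the number of channels in `Net`, the first argument of `nn.Linear` in `fc1` has to be updated accordingly.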


The following arguments are, respectively:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side.
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.

**NOTE:** typos and/or missing required (positional) arguments will raise an error. 🤓

In [29]:
model_args = {}

training_args = {
    'batch_size': 48, 
    'optimizer_args': {
        "lr" : 1e-3
    },
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
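
As the comment on `batch_maxnum` says, only `batch_maxnum * batch_size` samples are used. The sketch below works this out for the values above, which matches the `Samples: 480/4800` counter in the training log of this tutorial. It assumes each node holds the 60000-image MNIST training set; `dataset_size` and `samples_used` are names introduced here for illustration.

```python
batch_size = 48
batch_maxnum = 100
dataset_size = 60000  # assumption: full MNIST training set on each node

# The cap only matters when the dataset is larger than batch_maxnum batches.
samples_used = min(dataset_size, batch_maxnum * batch_size)
samples_after_10_iterations = 10 * batch_size

print(samples_after_10_iterations, samples_used)
```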

### Declare and run the experiment

In [30]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
from fedbiomed.researcher.secagg import SecureAggregation
tags =  ['#MNIST', '#dataset']
rounds = 2

exp = Experiment(tags=tags,
                 model_args=model_args,
                 training_plan_class=MyTrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None,
                 secagg=True, # or custom SecureAggregation(active=<bool>, clipping_range=<int>, timeout=<int>)
                 save_breakpoints=True)

2023-06-21 09:31:22,954 fedbiomed INFO - Searching dataset with data tags: ['#MNIST', '#dataset'] for all nodes
2023-06-21 09:31:32,964 fedbiomed INFO - Node selected for training -> node_cc8823f5-b234-447e-a41a-c2ad64a1072f
2023-06-21 09:31:32,966 fedbiomed INFO - Node selected for training -> node_8d470259-46b9-46cd-adba-3d050899da73
2023-06-21 09:31:32,970 fedbiomed INFO - Checking data quality of federated datasets...
Secure RNG turned off. This is perfectly fine for experimentation as it allows for much faster training performance, but remember to turn it on and retrain one last time before production with ``secure_mode`` turned on.
2023-06-21 09:31:32,983 fedbiomed DEBUG - using native torch optimizer
2023-06-21 09:31:32,985 fedbiomed DEBUG - Model file has been saved: /workspaces/Projects/fedbiomed/var/experiments/Experiment_0014/my_model_9c785a41-e198-4e14-913a-6bbcf6e7b454.py
2023-06-21 09:31:33,004 fedbiomed DEBUG - HTTP POST request of file /workspaces/Projects/fedbiomed/var

### Access secure aggregation context

Use the `secagg` attribute to verify that secure aggregation is active.

In [31]:
print("Is using secagg: ", exp.secagg.active)

Is using secagg:  True


It is also possible to check the secure aggregation context using the `secagg` attribute. Since the secure aggregation context is negotiated during the experiment run, `biprime` and `servkey` should be `None` at this point.

In [32]:
print("Secagg Biprime ", exp.secagg.biprime)
print("Secagg Servkey ", exp.secagg.servkey)

Secagg Biprime  None
Secagg Servkey  None


Run the experiment using secure aggregation. The secure aggregation context is created before the first training round and is updated before each round in which nodes have been added to or removed from the experiment.

In [33]:
exp.run(increase=True)

2023-06-21 09:31:33,154 fedbiomed DEBUG - researcher_73f4aba0-45be-4b82-ae3f-ce4cfd2dea68
2023-06-21 09:31:33,154 fedbiomed DEBUG - researcher_73f4aba0-45be-4b82-ae3f-ce4cfd2dea68
2023-06-21 09:31:33,155 fedbiomed DEBUG - Secagg context for default_biprime0 is already existing on researcher researcher_id='researcher_73f4aba0-45be-4b82-ae3f-ce4cfd2dea68'
2023-06-21 09:31:33,158 fedbiomed INFO - INFO
					 NODE node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 MESSAGE: Node secagg context element for default_biprime0 is already existing for job None
-----------------------------------------------------------------
2023-06-21 09:31:33,161 fedbiomed INFO - INFO
					 NODE node_8d470259-46b9-46cd-adba-3d050899da73
					 MESSAGE: Node secagg context element for default_biprime0 is already existing for job None
-----------------------------------------------------------------
2023-06-21 09:31:34,158 fedbiomed DEBUG - researcher_73f4aba0-45

2023-06-21 09:31:39,506 fedbiomed INFO - TRAINING
					 NODE_ID: node_8d470259-46b9-46cd-adba-3d050899da73
					 Round 1 Epoch: 1 | Iteration: 10/100 (10%) | Samples: 480/4800
					 Loss: 1.115050
					 ---------
2023-06-21 09:31:40,002 fedbiomed INFO - TRAINING
					 NODE_ID: node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 Round 1 Epoch: 1 | Iteration: 10/100 (10%) | Samples: 480/4800
					 Loss: 1.446807
					 ---------
2023-06-21 09:31:40,761 fedbiomed INFO - TRAINING
					 NODE_ID: node_8d470259-46b9-46cd-adba-3d050899da73
					 Round 1 Epoch: 1 | Iteration: 20/100 (20%) | Samples: 960/4800
					 Loss: 0.764307
					 ---------
2023-06-21 09:31:41,140 fedbiomed INFO - TRAINING
					 NODE_ID: node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 Round 1 Epoch: 1 | Iteration: 20/100 (20%) | Samples: 960/4800
					 Loss: 0.952498
					 ---------
2023-06-21 09:31:42,081 fedbiomed INFO - TRAINING
					 NOD

2023-06-21 09:32:53,692 fedbiomed INFO - Validation is completed.
2023-06-21 09:32:53,692 fedbiomed INFO - Aggregating encrypted parameters. This process may take some time depending on model size.
2023-06-21 09:33:47,571 fedbiomed DEBUG - Aggregation is completed in 53.61 seconds.
2023-06-21 09:33:47,924 fedbiomed DEBUG - HTTP POST request of file /workspaces/Projects/fedbiomed/var/experiments/Experiment_0014/aggregated_params_1b3f1db9-6976-407d-adde-3f3f1b9e8c7a.mpk successful, with status code 201
2023-06-21 09:33:47,925 fedbiomed INFO - Saved aggregated params for round 0 in /workspaces/Projects/fedbiomed/var/experiments/Experiment_0014/aggregated_params_1b3f1db9-6976-407d-adde-3f3f1b9e8c7a.mpk
2023-06-21 09:33:47,945 fedbiomed INFO - breakpoint for round 0 saved at /workspaces/Projects/fedbiomed/var/experiments/Experiment_0014/breakpoint_0000
2023-06-21 09:33:47,962 fedbiomed INFO - Sampled nodes in round 1 ['node_cc8823f5-b234-447e-a41a-c2ad64a1072f', 'node_8d470259-46b9-46cd-adb

2023-06-21 09:33:56,506 fedbiomed INFO - TRAINING
					 NODE_ID: node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 Round 2 Epoch: 1 | Iteration: 70/100 (70%) | Samples: 3360/4800
					 Loss: 0.149095
					 ---------
2023-06-21 09:33:57,521 fedbiomed INFO - TRAINING
					 NODE_ID: node_8d470259-46b9-46cd-adba-3d050899da73
					 Round 2 Epoch: 1 | Iteration: 80/100 (80%) | Samples: 3840/4800
					 Loss: 0.235595
					 ---------
2023-06-21 09:33:57,691 fedbiomed INFO - TRAINING
					 NODE_ID: node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 Round 2 Epoch: 1 | Iteration: 80/100 (80%) | Samples: 3840/4800
					 Loss: 0.114578
					 ---------
2023-06-21 09:33:58,796 fedbiomed INFO - TRAINING
					 NODE_ID: node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 Round 2 Epoch: 1 | Iteration: 90/100 (90%) | Samples: 4320/4800
					 Loss: 0.116449
					 ---------
2023-06-21 09:33:59,729 fedbiomed INFO - TRAINING
					

2

Display the secure aggregation context after training.

In [34]:
print("Secagg Biprime context: ", exp.secagg.biprime.context)
print("Secagg Servkey context: ", exp.secagg.servkey.context)

Secagg Biprime context:  {'secagg_id': 'default_biprime0', 'parties': None, 'context': {'biprime': 158820908809271716671659880613366104677813341255487834154303909761107215283569995523817428402987962641429395032343305343341950966867458277812575065022203120547706127493272939455658018882112230042773163870472621818892994896895819790062496734944602899772583591514631486212290112369502692304700112819186167541107, 'max_keysize': 2048}}
Secagg Servkey context:  {'job_id': 'bfacbe08-283a-4215-bf43-6bf7b3854b80', 'context': {'server_key': -188310427165755324208836701042237103358281273043466688319813734932437400034024381615584713641255368419717346850634295892653493246830862327418390507611776701499867703039856877906810180199583258835615511352542604831805936943591192723945935235537840508879575966994046400313459517155830636661389140680063132577159689031803983707019007509670757974499414603847639687945555700953374365296210798553882444647608026765380392066224117172549038752338323108129700344587225286239

#### Changes in the experiment trigger re-creation of the secure aggregation context

Changes that re-create the job, such as adding a new node to the experiment, automatically trigger a new secure aggregation setup for the next round.

In [None]:
# sends new dataset search request
from fedbiomed.researcher.strategies import DefaultStrategy
from fedbiomed.researcher.aggregators.fedavg import FedAverage
exp.set_training_data(None, True)
exp.set_strategy(DefaultStrategy)
exp.set_aggregator(FedAverage)
exp.set_job()

In [None]:
exp.run_once(increase=True)

### Changing arguments of secure aggregation

Setting the `secagg` argument to `True` in `Experiment` creates a default `SecureAggregation` instance. It is also possible to create a `SecureAggregation` instance yourself and pass it as the argument. The following arguments can be set for `SecureAggregation`:

- `active`: `True` if the round will use secure aggregation. Default is `True`.
- `clipping_range`: Clipping range used for the quantization of model parameters. The default clipping range is `3`. However, some models have weights greater than `3`. If the clipping range is exceeded during encryption on the nodes, `Experiment` logs a warning message. In such cases, you can provide a higher clipping range through the `clipping_range` argument.
- `timeout`: The maximum amount of time, in seconds, that the experiment waits for responses from all parties during secure aggregation setup. Since the secure aggregation context depends on network communication and multi-party computation, this argument lets you set a higher timeout for larger context setups, or a lower one for smaller setups.
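
The effect of `clipping_range` can be illustrated with a minimal quantization sketch. It is not Fed-BioMed's implementation: `quantize`, `dequantize` and the 16-bit `target_bits` are assumptions made for this example. Only the default range of `3` comes from the description above.

```python
def quantize(weights, clipping_range=3.0, target_bits=16):
    """Clip each weight to [-clipping_range, clipping_range], then map it to an
    integer in [0, 2**target_bits - 1]. Also count how many weights were clipped."""
    levels = 2**target_bits - 1
    quantized, n_clipped = [], 0
    for w in weights:
        clipped = max(-clipping_range, min(clipping_range, w))
        if clipped != w:
            n_clipped += 1
        quantized.append(round((clipped + clipping_range) / (2 * clipping_range) * levels))
    return quantized, n_clipped


def dequantize(quantized, clipping_range=3.0, target_bits=16):
    """Map integers back to floats in [-clipping_range, clipping_range]."""
    levels = 2**target_bits - 1
    return [q / levels * 2 * clipping_range - clipping_range for q in quantized]


weights = [0.5, -2.0, 7.5]  # 7.5 lies outside the default range of 3

q_default, clipped_default = quantize(weights, clipping_range=3.0)
restored_default = dequantize(q_default, clipping_range=3.0)

q_wide, clipped_wide = quantize(weights, clipping_range=100.0)
restored_wide = dequantize(q_wide, clipping_range=100.0)

print(clipped_default, restored_default[2])
print(clipped_wide, round(restored_wide[2], 2))
```

In this sketch a wider range avoids clipping the large weight, at the cost of a coarser step between representable values for the same number of bits.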

In [None]:
from fedbiomed.researcher.secagg import SecureAggregation
secagg = SecureAggregation(
    active=True,
    clipping_range=100,
    timeout=15
)
exp.set_secagg(secagg=secagg)


In [None]:
exp.run_once(increase=True)

### Load experiment from a breakpoint

Once a breakpoint is loaded, no context setup takes place if the context already exists.

In [None]:
loaded_exp = Experiment.load_breakpoint()
loaded_exp.info()

In [None]:
loaded_exp.run_once(increase=True)

2023-06-19 15:26:33,226 fedbiomed INFO - CRITICAL
					 NODE node_8d470259-46b9-46cd-adba-3d050899da73
					 MESSAGE: Node stopped in signal_handler, probably by user decision (Ctrl C)
-----------------------------------------------------------------
2023-06-19 15:26:35,064 fedbiomed INFO - CRITICAL
					 NODE node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 MESSAGE: Node stopped in signal_handler, probably by user decision (Ctrl C)
-----------------------------------------------------------------
2023-06-19 15:26:45,543 fedbiomed INFO - INFO
					 NODE node_8d470259-46b9-46cd-adba-3d050899da73
					 MESSAGE: Starting task manager
-----------------------------------------------------------------
2023-06-19 15:26:53,902 fedbiomed INFO - INFO
					 NODE node_cc8823f5-b234-447e-a41a-c2ad64a1072f
					 MESSAGE: Starting task manager
------------------------------------------------