# Fed-BioMed secure aggregation tutorial


<font size=+2>
    Warning: secure aggregation is a work in progress. In the current version it is not fully implemented and does not provide any effective security or functionality. This notebook exists for demonstration purposes only.
</font>


## Example experimentation setup

This part sets up a basic Fed-BioMed example. Nothing at this point is specific to secure aggregation.

### Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

### Setting nodes up
You must first configure **at least two nodes**:
1. `./scripts/fedbiomed_run node config config_node1.ini add` (respectively for the second node: `./scripts/fedbiomed_run node config config_node2.ini add`)
  * Select option 2 (default) to add MNIST to the node
  * Confirm default tags by hitting "y" and ENTER
  * Pick the folder where MNIST is downloaded (this is due to a pytorch issue https://github.com/pytorch/vision/issues/3549)
  * The data must now be added (a warning saying that data must be unique means it has already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node config config_node1.ini list`
3. Run the node using `./scripts/fedbiomed_run node config config_node1.ini run`. Wait until you get `Starting task manager`, which means the node is online.

### Define an experiment model and parameters

Declare a torch training plan class, `MyTrainingPlan`, to send to the nodes for training.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F  # used as F.relu, F.max_pool2d and F.log_softmax in Net.forward
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms


# Here we define the model to be used. 
# You can use any class name (here 'Net')
class MyTrainingPlan(TorchTrainingPlan):
    
    # Defines and returns the model
    def init_model(self, model_args):
        return self.Net(model_args = model_args)

    # Defines and returns the optimizer
    def init_optimizer(self, optimizer_args):
        return torch.optim.Adam(self.model().parameters(), lr = optimizer_args["lr"])

    # Declares and returns the dependencies needed on the node side
    def init_dependencies(self):
        deps = ["from torchvision import datasets, transforms"]
        return deps
    
    class Net(nn.Module):
        def __init__(self, model_args):
            super().__init__()
            self.conv1 = nn.Conv2d(1, 32, 3, 1)
            self.conv2 = nn.Conv2d(32, 64, 3, 1)
            self.dropout1 = nn.Dropout(0.25)
            self.dropout2 = nn.Dropout(0.5)
            self.fc1 = nn.Linear(9216, 128)
            self.fc2 = nn.Linear(128, 10)

        def forward(self, x):
            x = self.conv1(x)
            x = F.relu(x)
            x = self.conv2(x)
            x = F.relu(x)
            x = F.max_pool2d(x, 2)
            x = self.dropout1(x)
            x = torch.flatten(x, 1)
            x = self.fc1(x)
            x = F.relu(x)
            x = self.dropout2(x)
            x = self.fc2(x)


            output = F.log_softmax(x, dim=1)
            return output

    def training_data(self, batch_size = 48):
        # Custom torch DataLoader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=dataset1, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.model().forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss
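
The input size `9216` of `fc1` in the training plan above follows from the MNIST image size and the layers that precede it. The following is a minimal sketch of that arithmetic, assuming 28x28 single-channel inputs and the usual output-size formula for unpadded convolution and pooling layers. `conv_out` is a hypothetical helper written for this illustration; it is not part of Fed-BioMed or PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of one spatial dimension after a conv or pooling layer.

    Hypothetical helper for illustration only.
    """
    return (size + 2 * padding - kernel) // stride + 1


side = 28                              # MNIST images are 28x28 (assumption stated above)
side = conv_out(side, kernel=3)        # conv1: kernel 3, stride 1
side = conv_out(side, kernel=3)        # conv2: kernel 3, stride 1
side = conv_out(side, kernel=2, stride=2)  # max_pool2d(x, 2): kernel 2, stride 2

out_channels = 64                      # output channels of conv2
flat_features = out_channels * side * side
print(side, flat_features)
```

If you change the kernel sizes, the pooling, or the input resolution, the first argument of `nn.Linear` in `fc1` has to change accordingly.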


These arguments are, respectively:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side.
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.

**NOTE:** typos or missing required arguments will raise an error. 🤓

In [2]:
model_args = {}

training_args = {
    'batch_size': 48, 
    'optimizer_args': {
        "lr" : 1e-3
    },
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
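
As a rough check of the `batch_maxnum` comment above, the number of samples a node processes per epoch can be worked out directly from `training_args`. This is a small sketch, assuming each node's dataset holds at least `batch_maxnum * batch_size` samples; `samples_seen` is a hypothetical helper, not a Fed-BioMed function.

```python
def samples_seen(iterations, batch_size):
    """Samples processed after a given number of training iterations (illustrative helper)."""
    return iterations * batch_size


batch_size = 48      # same value as training_args['batch_size']
batch_maxnum = 100   # same value as training_args['batch_maxnum']

per_epoch = samples_seen(batch_maxnum, batch_size)
after_ten = samples_seen(10, batch_size)
print(f"{after_ten}/{per_epoch}")
```

These values line up with the `Samples: 480/4800` progress lines reported by the nodes at iteration 10 of 100.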

### Declare and run the experiment

In [3]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage
from fedbiomed.researcher.secagg import SecureAggregation
tags =  ['#MNIST', '#dataset']
rounds = 2

exp = Experiment(tags=tags,
                 model_args=model_args,
                 training_plan_class=MyTrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None,
                 secagg=SecureAggregation(clipping_range=100),
                 save_breakpoints=True)

2023-04-04 11:57:31,047 fedbiomed INFO - Messaging researcher_a97b0749-849d-4acd-b6c7-058fc7714e28 successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x7f5702c44b20>
2023-04-04 11:57:31,076 fedbiomed INFO - Searching dataset with data tags: ['#MNIST', '#dataset'] for all nodes
2023-04-04 11:57:41,092 fedbiomed INFO - Node selected for training -> node_c1270b1a-8b86-4eb2-8933-40a9e0400c94
2023-04-04 11:57:41,094 fedbiomed INFO - Node selected for training -> node_38c4ec0d-6b95-41b6-9518-ec343d18b951
2023-04-04 11:57:41,098 fedbiomed INFO - Checking data quality of federated datasets...
Secure RNG turned off. This is perfectly fine for experimentation as it allows for much faster training performance, but remember to turn it on and retrain one last time before production with ``secure_mode`` turned on.
2023-04-04 11:57:41,133 fedbiomed DEBUG - Model file has been saved: /home/scansiz/projects/fedbiomed-dev/fedbiomed/var/experiments/Expe

### Access secure aggregation context

Use the `secagg` attribute to verify that secure aggregation is active.

In [None]:
print("Is using secagg: ", exp.secagg.active)

You can also inspect the secure aggregation context through the `secagg` attribute. Because the context is negotiated during the experiment run, the context and id should still be `None` at this point.

In [None]:
print("Secagg Biprime ", exp.secagg.biprime)
print("Secagg Servkey ", exp.secagg.servkey)

Run the experiment using secure aggregation. The secure aggregation context is created in the first round and is updated if nodes are added or removed.

In [4]:
exp.run(increase=True)

2023-04-04 11:57:41,405 fedbiomed DEBUG - researcher_a97b0749-849d-4acd-b6c7-058fc7714e28
2023-04-04 11:57:41,406 fedbiomed DEBUG - researcher_a97b0749-849d-4acd-b6c7-058fc7714e28
2023-04-04 11:57:41,409 fedbiomed DEBUG - Secagg context for default_biprime0 is already existing on researcher researcher_id='researcher_a97b0749-849d-4acd-b6c7-058fc7714e28'
2023-04-04 11:57:41,410 fedbiomed INFO - INFO
					 NODE node_c1270b1a-8b86-4eb2-8933-40a9e0400c94
					 MESSAGE: Node secagg context element for default_biprime0 is already existing for job None
-----------------------------------------------------------------
2023-04-04 11:57:41,412 fedbiomed INFO - INFO
					 NODE node_38c4ec0d-6b95-41b6-9518-ec343d18b951
					 MESSAGE: Node secagg context element for default_biprime0 is already existing for job None
-----------------------------------------------------------------
2023-04-04 11:57:42,440 fedbiomed DEBUG - researcher_a97b0749-84

2023-04-04 11:57:49,300 fedbiomed INFO - TRAINING
					 NODE_ID: node_c1270b1a-8b86-4eb2-8933-40a9e0400c94 
					 Round 1 Epoch: 1 | Iteration: 10/100 (10%) | Samples: 480/4800
 					 Loss: 1.424600
					 ---------
2023-04-04 11:57:49,373 fedbiomed INFO - TRAINING
					 NODE_ID: node_38c4ec0d-6b95-41b6-9518-ec343d18b951 
					 Round 1 Epoch: 1 | Iteration: 10/100 (10%) | Samples: 480/4800
 					 Loss: 1.441414
					 ---------
2023-04-04 11:57:50,299 fedbiomed INFO - TRAINING
					 NODE_ID: node_c1270b1a-8b86-4eb2-8933-40a9e0400c94 
					 Round 1 Epoch: 1 | Iteration: 20/100 (20%) | Samples: 960/4800
 					 Loss: 1.022113
					 ---------
2023-04-04 11:57:50,434 fedbiomed INFO - TRAINING
					 NODE_ID: node_38c4ec0d-6b95-41b6-9518-ec343d18b951 
					 Round 1 Epoch: 1 | Iteration: 20/100 (20%) | Samples: 960/4800
 					 Loss: 1.141063
					 ---------
2023-04-04 11:57:51,118 fedbiomed INFO - TRAINING
					 NOD

0.044
-6850986132463309025139972046234190769166128136584901668649806908943762809947770779160842399038589117112756161515228517210309124709749052917078289880214421475960694029160784796980864069363163167478157174073192116610698198862326515328711026228750044548482335961050577108159467059554427787736868641900604932001053987114723769312157749120105576635099860602527689646778950745144891604255084752600742266749621302155682441439283180096751010176570290980686561957583691489128215980354101120376898553144691593391844108401386906849722099774462982266994635541684969560190289977822847581691205242021889026698858914395898454966750


2023-04-04 11:59:52,019 fedbiomed DEBUG - Aggregation is completed in 48.55 seconds.
2023-04-04 11:59:52,451 fedbiomed DEBUG - HTTP POST request of file /home/scansiz/projects/fedbiomed-dev/fedbiomed/var/experiments/Experiment_0022/aggregated_params_89dfc075-ab1e-4834-80a5-4645d55d6f5d.mpk successful, with status code 201
2023-04-04 11:59:52,452 fedbiomed INFO - Saved aggregated params for round 0 in /home/scansiz/projects/fedbiomed-dev/fedbiomed/var/experiments/Experiment_0022/aggregated_params_89dfc075-ab1e-4834-80a5-4645d55d6f5d.mpk
2023-04-04 11:59:52,474 fedbiomed INFO - breakpoint for round 0 saved at /home/scansiz/projects/fedbiomed-dev/fedbiomed/var/experiments/Experiment_0022/breakpoint_0000
2023-04-04 11:59:52,519 fedbiomed INFO - Sampled nodes in round 1 ['node_c1270b1a-8b86-4eb2-8933-40a9e0400c94', 'node_38c4ec0d-6b95-41b6-9518-ec343d18b951']
2023-04-04 11:59:52,521 fedbiomed INFO - Sending request
					 To: node_c1270b1a-8b86-4eb2-8933-40a9e0400c94 
					

2023-04-04 11:59:59,991 fedbiomed INFO - TRAINING
					 NODE_ID: node_38c4ec0d-6b95-41b6-9518-ec343d18b951 
					 Round 2 Epoch: 1 | Iteration: 80/100 (80%) | Samples: 3840/4800
 					 Loss: 74224.265625
					 ---------
2023-04-04 12:00:00,375 fedbiomed INFO - TRAINING
					 NODE_ID: node_c1270b1a-8b86-4eb2-8933-40a9e0400c94 
					 Round 2 Epoch: 1 | Iteration: 80/100 (80%) | Samples: 3840/4800
 					 Loss: 96884.601562
					 ---------
2023-04-04 12:00:00,912 fedbiomed INFO - TRAINING
					 NODE_ID: node_38c4ec0d-6b95-41b6-9518-ec343d18b951 
					 Round 2 Epoch: 1 | Iteration: 90/100 (90%) | Samples: 4320/4800
 					 Loss: 55777.347656
					 ---------
2023-04-04 12:00:01,291 fedbiomed INFO - TRAINING
					 NODE_ID: node_c1270b1a-8b86-4eb2-8933-40a9e0400c94 
					 Round 2 Epoch: 1 | Iteration: 90/100 (90%) | Samples: 4320/4800
 					 Loss: 36828.890625
					 ---------
2023-04-04 12:00:01,790 fedbiomed INFO - TRA

0.032
-6850986132463309025139972046234190769166128136584901668649806908943762809947770779160842399038589117112756161515228517210309124709749052917078289880214421475960694029160784796980864069363163167478157174073192116610698198862326515328711026228750044548482335961050577108159467059554427787736868641900604932001053987114723769312157749120105576635099860602527689646778950745144891604255084752600742266749621302155682441439283180096751010176570290980686561957583691489128215980354101120376898553144691593391844108401386906849722099774462982266994635541684969560190289977822847581691205242021889026698858914395898454966750


2023-04-04 12:01:58,704 fedbiomed DEBUG - Aggregation is completed in 50.79 seconds.
2023-04-04 12:01:58,902 fedbiomed CRITICAL - Fed-BioMed stopped due to unknown error:
shape '[10]' is invalid for input of size 6



--------------------
Fed-BioMed researcher stopped due to unknown error:
shape '[10]' is invalid for input of size 6
More details in the backtrace extract below
--------------------
Traceback (most recent call last):
  File "/home/scansiz/projects/fedbiomed-dev/fedbiomed/fedbiomed/researcher/experiment.py", line 66, in payload
    ret = function(*args, **kwargs)
  File "/home/scansiz/projects/fedbiomed-dev/fedbiomed/fedbiomed/researcher/experiment.py", line 1541, in run_once
    self._job.training_plan._model.unflatten(flatten_params)
  File "/home/scansiz/projects/fedbiomed-dev/fedbiomed/fedbiomed/common/models/_torch.py", line 124, in unflatten
    torch.nn.utils.vector_to_parameters(vector, model.parameters())
  File "/user/scansiz/home/miniconda3/envs/fedbiomed-researcher/lib/python3.9/site-packages/torch/nn/utils/convert_parameters.py", line 51, in vector_to_parameters
    param.data = vec[pointer:pointer + num_param].view_as(param).data
RuntimeError: shape '[10]' is invalid for 

FedbiomedSilentTerminationError: 

Display the context after running one round of training.

In [None]:
print("Secagg Biprime context: ", exp.secagg.biprime.context)
print("Secagg Servkey context: ", exp.secagg.servkey.context)

#### Changes in the experiment trigger re-creation of the secure aggregation context

Changes that re-create jobs, such as adding a new node to the experiment, trigger a new secure aggregation setup for the next round.

In [None]:
# sends new dataset search request
from fedbiomed.researcher.strategies import DefaultStrategy
from fedbiomed.researcher.aggregators.fedavg import FedAverage
exp.set_training_data(None, True)
exp.set_strategy(DefaultStrategy)
exp.set_aggregator(FedAverage)
exp.set_job()

In [None]:
exp.run_once(increase=True)

### Changing arguments of secure aggregation

Setting the `secagg` argument to `True` in `Experiment` creates a default `SecureAggregation` instance. It is also possible to create a `SecureAggregation` instance yourself and pass it as the argument. Here are the arguments that can be set for `SecureAggregation`:

- `active`: `True` if the round will use secure aggregation. Default is `True`.
- `clipping_range`: The clipping range used for quantization of model parameters. The default clipping range is `3`. However, some models can have weights greater than `3`. If the clipping range is exceeded during encryption on the nodes, the experiment logs a warning message. In such cases, you can provide a higher clipping range through the `clipping_range` argument.
- `timeout`: The maximum time, in seconds, that the experiment waits for responses from all parties during secure aggregation setup. Since the secure aggregation context depends on network communication and multi-party computation, this argument lets you set a higher timeout for larger context setups, or a lower one for smaller setups.
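
To make the role of `clipping_range` more concrete, here is a minimal sketch of clipping followed by uniform quantization. It is a generic illustration and not Fed-BioMed's actual implementation: the function names, the `target_range` parameter, and its value of `2**16` are assumptions made for this example.

```python
def quantize(weights, clipping_range, target_range=2**16):
    """Clip weights to [-clipping_range, clipping_range], then map them to integers.

    Illustrative only: names and target_range are assumptions, not Fed-BioMed API.
    Returns the integer values and the number of weights that had to be clipped.
    """
    quantized = []
    n_clipped = 0
    for w in weights:
        clipped = max(-clipping_range, min(clipping_range, w))
        if clipped != w:
            n_clipped += 1
        scaled = (clipped + clipping_range) / (2 * clipping_range)  # now in [0, 1]
        quantized.append(round(scaled * (target_range - 1)))
    return quantized, n_clipped


def dequantize(quantized, clipping_range, target_range=2**16):
    """Map integers back to approximate float weights (inverse of quantize)."""
    return [q / (target_range - 1) * 2 * clipping_range - clipping_range
            for q in quantized]


weights = [0.5, -2.0, 7.5]

# With a range of 3, the weight 7.5 is clipped and its value is lost.
q_small, clipped_small = quantize(weights, clipping_range=3)
restored_small = dequantize(q_small, clipping_range=3)

# With a range of 100, nothing is clipped, at the cost of a coarser resolution.
q_large, clipped_large = quantize(weights, clipping_range=100)
restored_large = dequantize(q_large, clipping_range=100)

print(clipped_small, restored_small)
print(clipped_large, restored_large)
```

The trade-off in this sketch is that a larger clipping range avoids distorting large weights but spreads the same number of integer levels over a wider interval, so each restored weight is less precise.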

In [None]:
# Same import path as in the experiment declaration cell
from fedbiomed.researcher.secagg import SecureAggregation
secagg = SecureAggregation(
    active=True, 
    clipping_range=100,
    timeout=15
)
exp.set_secagg(secagg=secagg)


In [None]:
exp.run_once(increase=True)

### Load experiment from a breakpoint

Once a breakpoint is loaded, no context setup takes place if the context already exists.

In [None]:
loaded_exp = Experiment.load_breakpoint()
loaded_exp.info()

In [None]:
loaded_exp.run_once(increase=True)