# Training Process for Approved Models 

Fed-BioMed can restrict nodes to running only pre-approved models. A node that receives your model may require approval. In that case, the model file that a researcher sends with the training request must have been approved on the node side in advance. In this workflow, approval is done by a person who reviews the code in the model file and makes sure it contains nothing that might cause privacy issues or harm the node.

In this tutorial, we will create a node with the model approval option activated.

## Start the network
Before running this notebook, start the network with `./scripts/fedbiomed_run network`

## Setting Up a Node


Model approval can be enabled either in the config file or through the Fed-BioMed CLI when starting the node. Creating and starting a node with the model approval option is not very different from setting up a normal node. By default, if no option is specified in the CLI when the node is launched for the first time, the node disables model approval in the security section of the config file. The section then looks like the snippet below:

```ini
[security]
hashing_algorithm = SHA256
allow_default_models = True
model_approval = False
```
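
To check these options programmatically, the section can be read with Python's standard `configparser`. The sketch below is only an illustration: it parses an inline copy of the snippet above rather than a real node config file, and the helper name `read_security_options` is hypothetical, not part of Fed-BioMed.

```python
import configparser

# Inline copy of the [security] section shown above; a real node keeps it in its config file.
SAMPLE_CONFIG = """
[security]
hashing_algorithm = SHA256
allow_default_models = True
model_approval = False
"""


def read_security_options(config_text: str) -> dict:
    """Return the [security] options with booleans parsed (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["security"]
    return {
        "hashing_algorithm": section.get("hashing_algorithm"),
        "allow_default_models": section.getboolean("allow_default_models"),
        "model_approval": section.getboolean("model_approval"),
    }


options = read_security_options(SAMPLE_CONFIG)
print(options)
```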
The Fed-BioMed CLI has two optional parameters, `--enable-model-approval` and `--allow-default-models`, that control model approval. When the node is launched for the first time, they set the corresponding options in the config file. At later launches, they override the config file options for that launch only.

* `--enable-model-approval`: Enables model approval for the node. If the node has no config file yet when the CLI runs, a new config file is created with model approval enabled (`model_approval = True`).
* `--allow-default-models`: Allows default models for train requests. These are the models that come with the Fed-BioMed tutorials, for example the model for the MNIST dataset used in this tutorial. If default models are enabled, the node updates/registers the model files located in the `envs/common/default_models` directory while it starts. This option has no effect if model approval is not enabled.


### Adding MNIST Dataset to the Node

In this section we will add the MNIST dataset to the node. While adding the dataset through the CLI, we'll also specify the `--enable-model-approval` and `--allow-default-models` options. This will create a new `config-n1.ini` file with the following configuration.

```
[security]
hashing_algorithm = SHA256
allow_default_models = True
model_approval = True

```
Now, let's run the following command. 

```shell
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --enable-model-approval --allow-default-models add 
```

The CLI will ask you to select the dataset type. Since we will be working with the MNIST dataset, select `2` (default), continue by typing `y` at the next prompt, and select the folder where you want to store the MNIST dataset. Afterward, if you go to the `etc` directory of Fed-BioMed, you can see the `config-n1.ini` file.

### Starting the Node

Now you can start your node by running the following command:

```
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini start
```

Since the config file has been configured to enable model approval mode, you do not need to specify any extra parameter while starting the node. It is also possible to start the node with `--enable-model-approval`, `--allow-default-models` or `--disable-model-approval`, `--disable-default-models`. If you start your node with `--disable-model-approval`, model approval is disabled even if it is enabled in the config file.

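The precedence described above (a command-line flag overrides the config file for the current launch) can be sketched as a small pure function. This is an illustrative model, not Fed-BioMed's implementation: the function name and its arguments are hypothetical. What happens when both flags are given is not covered in this tutorial, so the sketch simply rejects that case.

```python
def effective_model_approval(config_value: bool,
                             enable_flag: bool = False,
                             disable_flag: bool = False) -> bool:
    """Return the approval mode used for one launch (hypothetical model of the override).

    config_value: `model_approval` as stored in the config file.
    enable_flag:  True if `--enable-model-approval` was given at launch.
    disable_flag: True if `--disable-model-approval` was given at launch.
    """
    if enable_flag and disable_flag:
        # The tutorial does not describe this case, so the sketch refuses to guess.
        raise ValueError("choose only one of the enable/disable flags")
    if enable_flag:
        return True
    if disable_flag:
        return False
    # No flag given: fall back to the value stored in the config file.
    return config_value


print(effective_model_approval(True))                     # no flag: config value is used
print(effective_model_approval(True, disable_flag=True))  # flag overrides the config
```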

## Creating An Experiment

In this section we will use the default MNIST model, which has already been registered by the node.

The following model is the one that will be sent to the node for training. Since the Experiment processes model files to configure dependencies, the import part of the final file might differ from this one.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F  # used by forward() below
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms

# Here we define the model to be used.
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self, model_args: dict = {}):
        super(MyTrainingPlan, self).__init__(model_args)
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on MNIST, we need datasets and transform from torchvision
        deps = ["from torchvision import datasets, transforms"]
        
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        
        
        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        return DataManager(dataset=dataset1, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss


To be able to get/see the final model file we need to initialize the experiment. 

In [None]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['#MNIST', '#dataset']
rounds = 2

model_args = {}

training_args = {
    'batch_size': 48, 
    'lr': 1e-3, 
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}

exp = Experiment(tags=tags,
                 model_args=model_args,
                 model_class=MyTrainingPlan,
                 training_args=training_args,
                 round_limit=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

### Getting Final Model File From Experiment

`model_file()` displays the model file that will be sent to the nodes.  

In [None]:
exp.model_file(display = True)

`exp.check_model_status()` sends a request to the nodes to check whether the model is approved. The request goes to the nodes that were found when searching for datasets.

In [None]:
status = exp.check_model_status()

In [None]:
status

In [None]:
exp.run_once()

The logs should indicate that the model is approved. You can also get the status objects from the result of `check_model_status()`. It returns a list of status objects, one per node. Since we have launched only a single node, it returns only one status object. Each status object includes:

* `approval_obligation` : Indicates whether the model approval is enabled in the node.  
* `status`         : Indicates model status.

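With several nodes, it can help to summarize the replies before launching training. The sketch below assumes each status object behaves like a dictionary with the keys mentioned in this tutorial (`approval_obligation`, `status`, `is_approved`) plus a `node_id`. The exact structure returned by `check_model_status()` is not shown here, so the key layout and the sample data are assumptions, and the helper name is hypothetical.

```python
def nodes_blocking_training(statuses: list) -> list:
    """Return ids of nodes that require approval but have not approved the model."""
    blocking = []
    for reply in statuses:
        requires_approval = reply.get("approval_obligation", False)
        approved = reply.get("is_approved", False)
        if requires_approval and not approved:
            blocking.append(reply.get("node_id", "<unknown>"))
    return blocking


# Made-up replies for illustration; real replies come from the nodes.
sample_statuses = [
    {"node_id": "node-1", "approval_obligation": True, "is_approved": True, "status": "Approved"},
    {"node_id": "node-2", "approval_obligation": True, "is_approved": False, "status": "Pending"},
    {"node_id": "node-3", "approval_obligation": False, "is_approved": False, "status": None},
]

print(nodes_blocking_training(sample_statuses))
```
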
## Changing the Model and Testing Model Approval Status

Let's change the model code and test whether it is approved. We will change the network structure.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from fedbiomed.common.training_plans import TorchTrainingPlan
from fedbiomed.common.data import DataManager
from torchvision import datasets, transforms

class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self, model_args: dict = {}):
        super(MyTrainingPlan, self).__init__(model_args)
        self.conv1 = nn.Conv2d(1, 16, 5, 1, 2)
        self.conv2 = nn.Conv2d(16, 32, 5, 1, 2)
        self.fc1 = nn.Linear(32 * 7 * 7, 10)
        deps = ["from torchvision import datasets, transforms"]
        
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)
        x = self.fc1(x)

        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        
        transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}        
        return DataManager(dataset1, **train_kwargs)
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        loss += 1
        return loss

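The `fc1` input sizes in the two training plans (9216 and `32 * 7 * 7`) follow from the layer shapes. The sketch below recomputes them with the usual output-size formula for convolution and pooling layers, `floor((n + 2p - k) / s) + 1`, assuming 28x28 MNIST images. The helper names are hypothetical and nothing here depends on PyTorch.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def first_plan_features(size: int = 28) -> int:
    # conv1 (3x3, stride 1) -> conv2 (3x3, stride 1) -> max_pool2d(2)
    size = conv_out(size, kernel=3)
    size = conv_out(size, kernel=3)
    size = conv_out(size, kernel=2, stride=2)
    return 64 * size * size  # conv2 has 64 output channels


def second_plan_features(size: int = 28) -> int:
    # conv1 (5x5, padding 2) -> max_pool2d(2) -> conv2 (5x5, padding 2) -> max_pool2d(2)
    size = conv_out(size, kernel=5, padding=2)
    size = conv_out(size, kernel=2, stride=2)
    size = conv_out(size, kernel=5, padding=2)
    size = conv_out(size, kernel=2, stride=2)
    return 32 * size * size  # conv2 has 32 output channels


print(first_plan_features())   # 9216, the input size of nn.Linear(9216, 128)
print(second_plan_features())  # 1568, i.e. 32 * 7 * 7
```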

Update the model class using `set_model_class`.

In [None]:
exp.set_model_class(MyTrainingPlan)
# update job since model_path has been changed
exp.set_job()

Since we changed the model code, the output of the following method should say that the model is not approved by the node, and the `is_approved` key of the result object should be equal to `False`.

In [None]:
status = exp.check_model_status()

In [None]:
exp.model_file()

In [None]:
status

Since the model is not approved, you won't be able to train it on the node, and the experiment will return an error.

In [None]:
exp.run_once(increase=True)

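Why does a small code change invalidate the approval? The node config lists `hashing_algorithm = SHA256`, which suggests that approved models are identified by a hash of their source. The sketch below illustrates that idea with Python's `hashlib`. It is an assumption about the mechanism, not Fed-BioMed's actual code, and any preprocessing a node may apply to the file before hashing is not modeled.

```python
import hashlib


def source_fingerprint(source: str) -> str:
    """Return the SHA-256 hex digest of a model's source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


approved_source = "loss = torch.nn.functional.nll_loss(output, target)\nreturn loss\n"
modified_source = "loss = torch.nn.functional.nll_loss(output, target)\nloss += 1\nreturn loss\n"

# Hypothetical registry of approved fingerprints kept by a node.
approved_fingerprints = {source_fingerprint(approved_source)}

print(source_fingerprint(approved_source) in approved_fingerprints)
print(source_fingerprint(modified_source) in approved_fingerprints)
```
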
## Registering/Approving the Model 

To register/approve the model created in the previous section, we can use the Fed-BioMed CLI.
In Fed-BioMed, there are two ways of approving a model:
 1. By sending an `ApprovalRequest` to the `Node`
 2. By adding it directly to the `Node` through the model registration facility

 
### 1. Approving a Model through an `ApprovalRequest`

Fed-BioMed's `Experiment` interface provides a method to submit a model to the `Node` for approval. The `Node` can then review the code and approve the model using the CLI or GUI.

The `Experiment` method that sends such a request is `model_approve`.

In [None]:
exp.model_approve(MyTrainingPlan, description="my new training plan")

Once the model has been sent, we need to approve it (or reject it) on the `Node` side.

Before approving, optionally list the models known to the node with their status (`Approved`, `Pending`, `Rejected`). Your new model should appear with `Pending` status and the name `my new training plan`.

```bash
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --list-models
```

Then approve the model using the following command in a new terminal:

```shell
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --approve-model
```

Models with either `Pending` or `Rejected` status will be displayed. Select the model you have sent to approve it. You should see a message confirming that the model has been approved successfully.

Optionally, list the models known to the node again with their status. Your model should now appear with `Approved` status.

```bash
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --list-models
```

Back on the `Researcher` side, let's check its status by calling the `check_model_status` method:

In [None]:
exp.check_model_status()

The model's status should have changed from `Pending` to `Approved`, which means the model can now be trained on the `Node`. The `Researcher` can now run an `Experiment` on the `Node`!

In [None]:
exp.run_once(increase=True)

### 2. Registering a Model through the Node Interface

You do not need to stop your node to register new models; you can perform the registration process in a different terminal window. However, we first need to get the final model from the `exp` object.

In [None]:
exp.model_file()

The output of `exp.model_file()` is the file path where the final model is saved. It also prints the content of the model file. You can get the content of the model either from the output cell or from the path where it is saved. Either way, you need to create a new `txt` file that contains the model content. For example, create a new directory in Fed-BioMed called `my_approved_model` and copy the model file into it as `my_model.txt`:


```shell
$ mkdir ${FEDBIOMED_DIR}/my_approved_model
$ cp <model_path_file> ${FEDBIOMED_DIR}/my_approved_model/my_model.txt
```
Here, `<model_path_file>` is the path of the model file returned by `exp.model_file(display=False)`.

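The same copy step can be done from Python, for example in the notebook itself. This sketch uses only the standard library. The function name is hypothetical, and the demo copies a temporary stand-in file rather than a real model file.

```python
import shutil
import tempfile
from pathlib import Path


def stage_model_file(model_path: str, target_dir: str, name: str = "my_model.txt") -> Path:
    """Copy a model file into target_dir under the given name and return the new path."""
    destination_dir = Path(target_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)  # similar to `mkdir -p`
    return Path(shutil.copy(model_path, destination_dir / name))


# Demo in a temporary directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "model_source.py"
    source.write_text("# stand-in for the file returned by exp.model_file()\n")
    staged = stage_model_file(str(source), str(Path(tmp) / "my_approved_model"))
    staged_name = staged.name
    staged_exists = staged.exists()

print(staged_name, staged_exists)
```

In the tutorial, `model_path` would be the path returned by `exp.model_file(display=False)` and `target_dir` the `my_approved_model` directory.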

Then the model needs to be approved on the `Node` side. First, copy the saved model file `my_model.txt` to the node.

Afterward, run the following command in another terminal to register the model file.

```shell
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --register-model
```

You should type a unique name for your model, e.g. 'MyTestModel-1', and a description. The CLI will then ask you to select the model file you want to register. Select the file that you saved and continue.

Optionally, list the models known to the node again with their status. Your model should now appear with `Approved` status.

```bash
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --list-models
```


Back on the `Researcher` side, you should now be able to train your model.

In [None]:
exp.check_model_status()

In [None]:
exp.run_once(increase=True)

## Rejecting a Model

On the `Node` side, it is possible to reject a model using the CLI or GUI. Any type of model can be `Rejected`, even `Default` models. In Fed-BioMed, `Rejected` means that the model cannot be trained on the `Node` (but the model is still `Registered` in the database).

Using the CLI, the `Node` can run:

```shell
$ ${FEDBIOMED_DIR}/scripts/fedbiomed_run node config config-n1.ini --reject-model
```

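and select the model to be `Rejected`. Back on the `Researcher` side, the status check and training request below should then reflect that the model can no longer be trained on this node.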


In [None]:
exp.check_model_status()

In [None]:
exp.run_once(increase=True)
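
The statuses seen in this tutorial can be summarized as a small lookup. The sketch below is a hypothetical model of the rules described here (a node with approval enabled trains only `Approved` models, and the approve command offers `Pending` and `Rejected` models). It is not Fed-BioMed's data model, and the class and function names are made up for illustration.

```python
from enum import Enum


class ModelStatus(Enum):
    PENDING = "Pending"
    APPROVED = "Approved"
    REJECTED = "Rejected"


def can_train(status: ModelStatus, approval_enabled: bool) -> bool:
    """With approval enabled, only approved models may be trained."""
    return (not approval_enabled) or status is ModelStatus.APPROVED


def offered_for_approval(statuses: dict) -> list:
    """Names of models an approve step would list: pending or rejected ones."""
    return [name for name, status in statuses.items()
            if status in (ModelStatus.PENDING, ModelStatus.REJECTED)]


# Made-up registry for illustration.
registry = {
    "default MNIST plan": ModelStatus.APPROVED,
    "my new training plan": ModelStatus.REJECTED,
}

print(can_train(registry["my new training plan"], approval_enabled=True))
print(offered_for_approval(registry))
```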