# Fedbiomed Researcher

Use for developing (autoreloads changes made across packages)

In [1]:
%load_ext autoreload
%autoreload 2

## Setting the node up
A node must be configured before running this notebook:
1. `./scripts/fedbiomed_run node add`
  * Select option 2 (default) to add MNIST to the node
  * Confirm default tags by hitting "y" and ENTER
  * Pick the folder where MNIST is downloaded (this is needed because of a torch issue: https://github.com/pytorch/vision/issues/3549)
  * The data must be added successfully (a warning saying that data must be unique means it has already been added)
  
2. Check that your data has been added by executing `./scripts/fedbiomed_run node list`
3. Model approval is not enabled by default, so start the node with both model approval and default models enabled. Default models are the models that are registered automatically when the node starts. Run `./scripts/fedbiomed_run node --enable-model-approval --allow-default-models start`. The output should look like the following:

```
	- 🆔 Your node ID: node_ba338496-736c-471d-8e0c-944493a36e57 

2021-11-22 17:36:31,408 fedbiomed INFO - Node started as process with pid = 42355
To stop press Ctrl + C.
2021-11-22 17:36:31,409 fedbiomed INFO - Launching node...
2021-11-22 17:36:31,409 fedbiomed INFO - Checking hashes for registered models...
2021-11-22 17:36:31,410 fedbiomed INFO - There is no models registered
2021-11-22 17:36:31,410 fedbiomed INFO - Loading default models
2021-11-22 17:36:31,563 fedbiomed INFO - Starting communication channel with network
2021-11-22 17:36:31,571 fedbiomed INFO - Messaging node_ba338496-736c-471d-8e0c-944493a36e57 successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x7f4b44843280>
2021-11-22 17:36:31,577 fedbiomed DEBUG -  adding handler: MQTT
2021-11-22 17:36:31,577 fedbiomed INFO - Starting task manager

```
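
The log line `Checking hashes for registered models` suggests that approval is decided by comparing a hash of the received model file against the hashes registered on the node. The block below is a minimal sketch of that idea, not Fed-BioMed's actual implementation: the helper names `hash_source` and `is_approved`, the choice of SHA-256, and the in-memory set of registered hashes are all assumptions made for illustration.

```python
import hashlib


def hash_source(source: str) -> str:
    """Return the SHA-256 hex digest of a model source string (hypothetical helper)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def is_approved(source: str, registered_hashes: set) -> bool:
    """A model counts as approved only if its hash is already registered."""
    return hash_source(source) in registered_hashes


# Toy "registered" model and a modified copy with one extra line.
approved_source = "class MyTrainingPlan:\n    pass\n"
modified_source = approved_source + "print('extra')\n"

# Hypothetical registry: in this sketch it is just a set of hex digests.
registered = {hash_source(approved_source)}

print(is_approved(approved_source, registered))
print(is_approved(modified_source, registered))
```

Under this scheme any change to the hashed text, even a single added line, produces a different digest, so the modified copy is no longer recognised.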

## Create an experiment to train a model on the data found

Declare a torch.nn MyTrainingPlan class to send for training on the node

In [2]:
from fedbiomed.researcher.environ import environ
import tempfile
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+'/')
model_file = tmp_dir_model.name + '/class_export_mnist.py'

Note : write **only** the code to export in the following cell

In [3]:
%%writefile "$model_file"

import torch
import torch.nn as nn
import torch.nn.functional as F  # `F` is used in forward() below
from fedbiomed.common.torchnn import TorchTrainingPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Here we define the model to be used.
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self):
        super(MyTrainingPlan, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on MNIST, we need datasets and transform from torchvision
        deps = ["from torchvision import datasets, transforms",
               "from torch.utils.data import DataLoader"]
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        
        
        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        return data_loader
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss


Writing /user/scansiz/home/Desktop/Inria/development/fedbiomed/var/tmp/tmpgw_010lb/class_export_mnist.py


The two groups of arguments used below are:
* `model_args`: a dictionary with the arguments related to the model (e.g. number of layers, features, etc.). This will be passed to the model class on the node side.
* `training_args`: a dictionary containing the arguments for the training routine (e.g. batch size, learning rate, epochs, etc.). This will be passed to the routine on the node side.

**NOTE:** typos and/or missing positional (required) arguments will raise an error. 🤓

In [4]:
model_args = {}

training_args = {
    'batch_size': 48, 
    'lr': 1e-3, 
    'epochs': 1, 
    'dry_run': False,  
    'batch_maxnum': 100 # Fast pass for development : only use ( batch_maxnum * batch_size ) samples
}
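
As a quick sanity check on the `batch_maxnum` comment above, the cap on the number of samples used in the fast pass can be computed directly from `training_args`. This is only an illustrative sketch: the helper name `max_samples_fast_pass` is hypothetical, and it assumes the cap is exactly `batch_maxnum * batch_size`, as the comment states.

```python
# Same values as the training_args cell above.
training_args = {
    'batch_size': 48,
    'lr': 1e-3,
    'epochs': 1,
    'dry_run': False,
    'batch_maxnum': 100,
}


def max_samples_fast_pass(args: dict) -> int:
    """Upper bound on samples seen when `batch_maxnum` limits the number of batches."""
    return args['batch_maxnum'] * args['batch_size']


print(max_samples_fast_pass(training_args))  # 4800
```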

Define an experiment
- search for nodes serving data with these `tags`, optionally filtering on a list of node IDs with `nodes`
- run a round of local training on nodes with model defined in `model_path` + federation with `aggregator`
- run for `rounds` rounds, applying the `node_selection_strategy` between the rounds
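
To make the aggregation step more concrete, here is a minimal sketch of federated averaging over plain Python dictionaries. It is not the code of `FedAverage`: the function name `federated_average`, the use of scalar parameters instead of tensors, and the explicit per-node weights are assumptions chosen to keep the example self-contained.

```python
def federated_average(node_params, node_weights):
    """Weighted average of per-node parameter dicts (illustrative sketch only).

    node_params:  list of dicts mapping parameter name -> value
    node_weights: list of non-negative weights, one per node
    """
    total = sum(node_weights)
    keys = node_params[0].keys()
    return {
        key: sum(w * params[key] for params, w in zip(node_params, node_weights)) / total
        for key in keys
    }


# Two hypothetical nodes; the second one contributes three times as much.
params_node_a = {'w': 1.0, 'b': 0.0}
params_node_b = {'w': 3.0, 'b': 2.0}

averaged = federated_average([params_node_a, params_node_b], [1, 3])
print(averaged)  # {'w': 2.5, 'b': 1.5}
```

With real models the same weighted sum would be applied to each tensor in the model's state instead of to scalars.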

In [5]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['#MNIST', '#dataset']
rounds = 2

exp = Experiment(tags=tags,
                 #nodes=None,
                 model_path=model_file,
                 model_args=model_args,
                 model_class='MyTrainingPlan',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

2021-11-22 17:43:33,063 fedbiomed INFO - Messaging researcher_dac6b4aa-3359-4918-a20e-36f8564c3910 successfully connected to the message broker, object = <fedbiomed.common.messaging.Messaging object at 0x7fa4e86bd7f0>
2021-11-22 17:43:33,111 fedbiomed INFO - Searching dataset with data tags: ['#MNIST', '#dataset'] for all nodes
2021-11-22 17:43:33,113 fedbiomed INFO - log from: node_ba338496-736c-471d-8e0c-944493a36e57 - DEBUG Message received: {'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2021-11-22 17:43:33,113 fedbiomed INFO - log from: node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4 - DEBUG Message received: {'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2021-11-22 17:43:43,147 fedbiomed INFO - Node selected for training -> node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4
2021-11-22 17:43:43,257 fedbiomed DEBUG - torchnn saved model filename: /us

### Getting Final Model File From Experiment

`get_model_file` displays the model file that will be sent to the nodes.

In [6]:
exp.get_model_file(display = True)

from fedbiomed.common.torchnn import TorchTrainingPlan
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self):
        super(MyTrainingPlan, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on MNIST, we need datasets and transform from torchvision
        deps = ["from torchvision import datasets, transforms",
               "from torch.util

'/user/scansiz/home/Desktop/Inria/development/fedbiomed/var/tmpwujcu76s/my_model_2b5987a1-4d77-4719-8565-5e434e970086.py'

`exp.check_model_status()` sends a request to the nodes found by the dataset search to check whether the model is approved.

In [7]:
exp.check_model_status()

2021-11-22 17:43:46,994 fedbiomed INFO - Sending request to node node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4 to check model is approved or not
2021-11-22 17:43:46,995 fedbiomed DEBUG - researcher_dac6b4aa-3359-4918-a20e-36f8564c3910
2021-11-22 17:43:47,022 fedbiomed INFO - log from: node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4 - DEBUG Message received: {'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910', 'job_id': 'b33424ee-0d2f-4afa-860c-8ebdb7bea24a', 'model_url': 'http://localhost:8844/media/uploads/2021/11/22/my_model_2b5987a1-4d77-4719-8565-5e434e970086.py', 'command': 'model-status'}
2021-11-22 17:43:57,006 fedbiomed INFO - Model has been approved by the node: node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4


[{'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910',
  'node_id': 'node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4',
  'job_id': 'b33424ee-0d2f-4afa-860c-8ebdb7bea24a',
  'success': True,
  'approval_obligation': True,
  'is_approved': True,
  'msg': 'Model is approved by the node',
  'model_url': 'http://localhost:8844/media/uploads/2021/11/22/my_model_2b5987a1-4d77-4719-8565-5e434e970086.py',
  'command': 'model-status'}]
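
The returned list holds one dictionary per node, so it can be filtered to find nodes that would refuse the model. The sketch below assumes only the keys visible in the output above (`node_id`, `approval_obligation`, `is_approved`); the helper name `unapproved_nodes` and the sample node IDs are hypothetical.

```python
def unapproved_nodes(responses):
    """Return IDs of nodes that require approval but have not approved the model."""
    return [
        r['node_id']
        for r in responses
        if r.get('approval_obligation') and not r.get('is_approved')
    ]


# Hand-written sample responses mirroring the structure shown above.
sample_responses = [
    {'node_id': 'node_a', 'approval_obligation': True, 'is_approved': True},
    {'node_id': 'node_b', 'approval_obligation': True, 'is_approved': False},
    {'node_id': 'node_c', 'approval_obligation': False, 'is_approved': False},
]

print(unapproved_nodes(sample_responses))  # ['node_b']
```

`node_c` is not reported because, in this sketch, a node without an approval obligation is assumed to accept any model.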

## Changing Model And Testing Model Approval Status

Let's change the model code and test whether it is still approved. The only change is an added `print` call in the `training_data` method.

In [10]:
from fedbiomed.researcher.environ import environ
import tempfile
tmp_dir_model = tempfile.TemporaryDirectory(dir=environ['TMP_DIR']+'/')
model_file_2 = tmp_dir_model.name + '/class_export_mnist_2.py'

In [11]:
%%writefile "$model_file_2"

import torch
import torch.nn as nn
import torch.nn.functional as F  # `F` is used in forward() below
from fedbiomed.common.torchnn import TorchTrainingPlan
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Here we define the model to be used.
# You can use any class name (here 'MyTrainingPlan')
class MyTrainingPlan(TorchTrainingPlan):
    def __init__(self):
        super(MyTrainingPlan, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
        
        # Here we define the custom dependencies that will be needed by our custom Dataloader
        # In this case, we need the torch DataLoader classes
        # Since we will train on MNIST, we need datasets and transform from torchvision
        deps = ["from torchvision import datasets, transforms",
               "from torch.utils.data import DataLoader"]
        self.add_dependency(deps)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        
        
        output = F.log_softmax(x, dim=1)
        return output

    def training_data(self, batch_size = 48):
        # Custom torch Dataloader for MNIST data
        
        transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])
        dataset1 = datasets.MNIST(self.dataset_path, train=True, download=False, transform=transform)
        train_kwargs = {'batch_size': batch_size, 'shuffle': True}
        data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
        
        # New added code
        print(dataset1)
        
        return data_loader
    
    def training_step(self, data, target):
        output = self.forward(data)
        loss   = torch.nn.functional.nll_loss(output, target)
        return loss


Writing /user/scansiz/home/Desktop/Inria/development/fedbiomed/var/tmp/tmp4vk9uq4_/class_export_mnist_2.py


In [12]:
from fedbiomed.researcher.experiment import Experiment
from fedbiomed.researcher.aggregators.fedavg import FedAverage

tags =  ['#MNIST', '#dataset']
rounds = 2

exp2 = Experiment(tags=tags,
                 #nodes=None,
                 model_path=model_file_2,
                 model_args=model_args,
                 model_class='MyTrainingPlan',
                 training_args=training_args,
                 rounds=rounds,
                 aggregator=FedAverage(),
                 node_selection_strategy=None)

2021-11-22 17:44:23,211 fedbiomed INFO - Searching dataset with data tags: ['#MNIST', '#dataset'] for all nodes
2021-11-22 17:44:23,213 fedbiomed INFO - log from: node_ba338496-736c-471d-8e0c-944493a36e57 - DEBUG Message received: {'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2021-11-22 17:44:23,214 fedbiomed INFO - log from: node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4 - DEBUG Message received: {'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910', 'tags': ['#MNIST', '#dataset'], 'command': 'search'}
2021-11-22 17:44:33,222 fedbiomed INFO - Node selected for training -> node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4
2021-11-22 17:44:33,265 fedbiomed DEBUG - torchnn saved model filename: /user/scansiz/home/Desktop/Inria/development/fedbiomed/var/tmprgrylmi5/my_model_15521569-303f-42a1-a7dd-ef797b7d8a0d.py


Since we changed the code, the following method should report that the model is not approved by the node.

In [13]:
exp2.check_model_status()

2021-11-22 17:44:33,479 fedbiomed INFO - Sending request to node node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4 to check model is approved or not
2021-11-22 17:44:33,480 fedbiomed DEBUG - researcher_dac6b4aa-3359-4918-a20e-36f8564c3910
2021-11-22 17:44:33,506 fedbiomed INFO - log from: node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4 - DEBUG Message received: {'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910', 'job_id': '11f11c99-f065-49de-b6a2-a74da2492d29', 'model_url': 'http://localhost:8844/media/uploads/2021/11/22/my_model_15521569-303f-42a1-a7dd-ef797b7d8a0d.py', 'command': 'model-status'}


[{'researcher_id': 'researcher_dac6b4aa-3359-4918-a20e-36f8564c3910',
  'node_id': 'node_5f0bce0e-4f54-4ca1-aecf-f0a9c0ba1bd4',
  'job_id': '11f11c99-f065-49de-b6a2-a74da2492d29',
  'success': True,
  'approval_obligation': True,
  'is_approved': False,
  'msg': 'Model is not approved by the node',
  'model_url': 'http://localhost:8844/media/uploads/2021/11/22/my_model_15521569-303f-42a1-a7dd-ef797b7d8a0d.py',
  'command': 'model-status'}]
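
When a model is rejected like this, it can help to see exactly which lines differ from the approved version. The following sketch uses Python's standard `difflib` on two short excerpts of `training_data`; the excerpts are simplified stand-ins for the real files and the variable names are chosen for illustration.

```python
import difflib

# Simplified excerpts of the approved and modified `training_data` bodies.
approved_lines = [
    "data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)\n",
    "return data_loader\n",
]
modified_lines = [
    "data_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)\n",
    "print(dataset1)\n",
    "return data_loader\n",
]

diff = list(difflib.unified_diff(approved_lines, modified_lines,
                                 fromfile='approved', tofile='modified'))

# Keep only added lines, skipping the '+++' file header.
added = [line for line in diff if line.startswith('+') and not line.startswith('+++')]

print(''.join(diff))
print(added)  # ['+print(dataset1)\n']
```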