# Running FL Admin to orchestrate an FL Experiment 

This notebook walks you through the workflow of the FL admin, the persona who conducts FL experiments. 
Note that this is the only persona with control over the FL experiments. 
That is, once the server and clients have started, the lead researcher can run FL experiments using the CLI through the admin client.
The following types of commands are available:
- Check system operating status
- View system logs
- Deploy MMARs (training configuration) to server and clients
- Start and stop training
- Clean up training results data (not the training datasets)
- Shut down or restart the server or clients

This notebook walks you through how to use the commands above to complete an FL experiment. 


## Prerequisites
- Ran [Provisioning Notebook](Provisioning.ipynb) and started the server.
- (Optional) Looked at [Client Notebook](Client.ipynb). 


## Dataset 

##### Option 1 
This notebook uses a sample dataset (i.e., a single image volume of the spleen dataset) provided in the package to train a small neural network for a few epochs. 
This single file is duplicated 32 times for the training set and 9 times for the validation set to mimic the full spleen dataset. 
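
As a rough illustration of the duplication described above, the sketch below builds a data list that repeats one image/label pair 32 times for training and 9 times for validation. The file names and the `training`/`validation`/`image`/`label` keys are assumptions made for illustration; check the dataset JSON shipped with the MMAR for the actual layout.

```python
import json

# Hypothetical file names; the real sample volume in the package may be named differently.
SAMPLE = {
    "image": "imagesTr/spleen_sample.nii.gz",
    "label": "labelsTr/spleen_sample.nii.gz",
}


def build_datalist(sample, n_train=32, n_val=9):
    """Repeat a single sample to mimic a larger training/validation split."""
    return {
        "training": [dict(sample) for _ in range(n_train)],
        "validation": [dict(sample) for _ in range(n_val)],
    }


datalist = build_datalist(SAMPLE)
print(len(datalist["training"]), len(datalist["validation"]))
print(json.dumps(datalist["training"][0], indent=2))
```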


##### Option 2  
You can make the minor changes recommended in the exercise section to train on the spleen segmentation task. The dataset used is Task09_Spleen.tar from 
the [Medical Segmentation Decathlon](http://medicaldecathlon.com/). 
Before running this notebook, download the data by following 
the steps in the [Data Download Notebook](../../Data_Download.ipynb).

##### Option 3  
You can use your own dataset with your own MMAR, for example an MMAR from the domainExample folder.


### Disclaimer  
We will train a small network so that both clients can fit the model on 1 GPU. 
Training runs for only a couple of epochs to show the concepts; we are not targeting accuracy.    


# Let's get started
To learn how FL works in the Clara Train SDK and get hands-on experience with every participant (server, clients, admin), 
we will start by running all participants from the same docker container using different terminals, as in the image below. 
<br><img src="screenShots/Workshop_1Docker.png" alt="Drawing" style="height: 300px;"/><br> 



### Recommended JupyterLab setup 
We recommend opening multiple terminals as in the image below. 
This lets you see the output of the clients and the server. 
In the layout below, the server terminal and the 2 client terminals are on the top right and the admin shell is on the bottom right. 
The admin tool runs in an interactive shell, so unfortunately its commands cannot be run from notebook cells. 
Instead, keep this notebook open on the left so you can follow the instructions.  
<br><img src="screenShots/JLabLayout.png" alt="Drawing" style="height: 300px;"/><br>

In order to open terminals you will need to:
- Click on the folder tab
- Click + sign
- Select terminal as shown below
<br><img src="screenShots/openTerminals.png" alt="Drawing" style="height: 300px;"/><br>
 

### Start server and clients
In the server terminal run:
```
cd /claraDevDay/FL/project1/server/startup
./start.sh
```  
In the client1 terminal run:
```
cd /claraDevDay/FL/project1/client1/startup
./start.sh
```  
In the client2 terminal run:
```
cd /claraDevDay/FL/project1/client2/startup
./start.sh
```  
In the Admin terminal run:
```
cd /claraDevDay/FL/project1/admin/startup
./fl_admin.sh
```  

# Admin Workflow
By now the server and 2 clients are running and waiting for the lead researcher to start the experiments. 
<br><img src="screenShots/AdminSteps.png" alt="Drawing" style="height: 500px;"/><br>

The figure above shows the steps she needs to perform:
1. Start the admin tool and log in
1. Check server and client status
1. Upload the MMAR to the server and clients
1. Start training 
1. Get metrics 
 

Let's start by installing `tree` so we can look at directory structures. 

In [None]:
MMAR_DIR="/claraDevDay/FL/project1/"
#!ln -s /claraDevDay/sampleData /data
!ln -s /claraDevDay/sampleData /data_4FL
!apt-get install tree

## 1- Starting Admin Shell
In the admin terminal, if you haven't already started the admin console, go to the admin folder inside your project and run
```
cd /claraDevDay/FL/project1/admin/startup
./fl_admin.sh
``` 
You should see
```
Admin Server: localhost on port 5000
User Name:
```
Type `admin@admin.com` at the prompt. The console then shows
```
Admin Server: localhost on port 5000
User Name: admin@admin.com

Type ? to list commands; type "? cmdName" to show usage of a command.
```

Type `?` to list the available commands. You should see 
```
> ?
Client Commands
-------------------------------------------------------------------------------------------
| SCOPE         | COMMAND         | DESCRIPTION                                           |
-------------------------------------------------------------------------------------------
|               | bye             | exit from the client                                  |
|               | help            | get command help information                          |
|               | lpwd            | print local work dir of the admin client              |
| file_transfer | download_binary | download one or more binary files in the download_dir |
| file_transfer | download_folder | download a folder from the server                     |
| file_transfer | download_text   | download one or more text files in the download_dir   |
| file_transfer | info            | show folder setup info                                |
| file_transfer | upload_folder   | upload a folder to the server                         |
-------------------------------------------------------------------------------------------

Server Commands
------------------------------------------------------------------------
| SCOPE      | COMMAND           | DESCRIPTION                         |
------------------------------------------------------------------------
| sys        | sys_info          | get the system info                 |
| training   | abort             | abort the FL server/client training |
| training   | check_status      | check_status the FL server/client   |
| training   | delete_run_number | delete the FL training run number   |
| training   | deploy            | deploy MMAR to client/server        |
| training   | restart           | restart the FL server/client        |
| training   | set_run_number    | set the FL training run number      |
| training   | set_timeout       | set the admin commands timeout      |
| training   | shutdown          | shutdown the FL server/client       |
| training   | start             | start the FL server/client training |
| utils      | cat               | show content of a file              |
| utils      | env               | show system environment vars        |
| utils      | grep              | search for PATTERN in a file.       |
| utils      | head              | print the first 10 lines of a file  |
| utils      | ls                | list files in work dir              |
| utils      | pwd               | print the name of work directory    |
| utils      | tail              | print the last 10 lines of a file   |
| validation | taskname          | get the FL taskname                 |
| validation | validate          | cross sites validation              |
------------------------------------------------------------------------
```

## 2- Check server/client status
Type 
```
> check_status server
```
to see 
```
FL run number has not been set.
FL server status: training not started
Registered clients: 2
 client name:client1    instance name:client1   token: 3c3d2276-c3bf-40c1-bc02-9be84d7c339f 
client name:client2     instance name:client2   token: 92806548-5515-4977-894e-612900ff8b1b
```
To check on the clients, type 
```
> check_status client
```
to see 
```
instance:client1 : client name: client1 token: 3c3d2276-c3bf-40c1-bc02-9be84d7c339f     status: training not started
instance:client2 : client name: client2 token: 92806548-5515-4977-894e-612900ff8b1b     status: training not started
```
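
If you copy this output out of the admin shell, a few lines of Python can turn it into a dictionary of per-client statuses. This is only a sketch: it assumes the line layout matches the sample above, and `parse_client_status` is a hypothetical helper, not part of the admin tool.

```python
import re

# Sample text copied from the `check_status client` output above.
SAMPLE_OUTPUT = """\
instance:client1 : client name: client1 token: 3c3d2276-c3bf-40c1-bc02-9be84d7c339f     status: training not started
instance:client2 : client name: client2 token: 92806548-5515-4977-894e-612900ff8b1b     status: training not started
"""

# Assumed line layout: instance, client name, token, then free-text status.
LINE_RE = re.compile(
    r"instance:(?P<instance>\S+)\s*:\s*client name:\s*(?P<name>\S+)"
    r"\s+token:\s*(?P<token>\S+)\s+status:\s*(?P<status>.+)"
)


def parse_client_status(text):
    """Map each client name to its reported status string."""
    statuses = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            statuses[match.group("name")] = match.group("status").strip()
    return statuses


statuses = parse_client_status(SAMPLE_OUTPUT)
print(statuses)
```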
To check the folder setup, type 

```
> info
```
to see
```
Local Upload Source: /claraDevDay/FL/project1/admin/startup/../transfer
Local Download Destination: /claraDevDay/FL/project1/admin/startup/../transfer
Server Upload Destination: /claraDevDay/FL/project1/server/startup/../transfer
Server Download Source: /claraDevDay/FL/project1/server/startup/../transfer
```
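
The paths reported by `info` contain `startup/../transfer`, which is simply the `transfer` folder next to `startup`. A minimal sketch that normalizes the paths from the sample output above using the standard library:

```python
import posixpath

# Paths copied from the `info` output above.
reported = {
    "admin transfer": "/claraDevDay/FL/project1/admin/startup/../transfer",
    "server transfer": "/claraDevDay/FL/project1/server/startup/../transfer",
}

# normpath collapses the "startup/.." segment without touching the file system.
resolved = {label: posixpath.normpath(path) for label, path in reported.items()}

for label, path in resolved.items():
    print(f"{label}: {path}")
```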

## 3- Upload files to server staging 

To upload the MMAR folder, run 
```
> upload_folder ../../../adminMMAR
```
Or, if you already have an MMAR ready (for example from the DomainExample folder), copy it to the transfer folder and then run 
```
> upload_folder <my MMAR from DomainExample>
```
This creates a folder in the server's `transfer` folder: 
```
Created folder /claraDevDay/FL/project1/server/startup/../transfer/adminMMAR
```

We can verify that the files have been transferred to staging. 


In [None]:
!tree $MMAR_DIR/server/transfer


## 4- Deploy from server staging into a run in the server and client  
The folder is now in the server staging area, in the `transfer` folder. 
We need to create a run and then copy this folder into the server and clients.

1. `> set_run_number 1`
2. `> deploy adminMMAR server`
3. `> deploy adminMMAR client` to deploy to all clients, or
    1. `> deploy adminMMAR client client1` to deploy only to client1

We can verify that the files have been transferred to the server and clients. 

In [None]:
!tree $MMAR_DIR/server/run_1

In [None]:
!tree $MMAR_DIR/client1/run_1

In [None]:
!tree $MMAR_DIR/client2/run_1
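
As an alternative to reading the `tree` output by eye, the sketch below reports which participants are missing a `run_<N>` folder. `missing_run_folders` is a hypothetical helper, and the demo runs on a throwaway directory so it works anywhere; in this notebook you would pass `/claraDevDay/FL/project1` as the project directory.

```python
import tempfile
from pathlib import Path


def missing_run_folders(project_dir, run_number, participants):
    """Return the participants that have no run_<N> folder under project_dir."""
    run_name = f"run_{run_number}"
    return [
        name
        for name in participants
        if not (Path(project_dir) / name / run_name).is_dir()
    ]


# Demo: only server and client1 have a run_1 folder in this temporary layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("server", "client1"):
        (Path(tmp) / name / "run_1").mkdir(parents=True)
    demo_missing = missing_run_folders(tmp, 1, ["server", "client1", "client2"])

print(demo_missing)
```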


## 5- Start Training
Now you can start training by:
1. `> start server`
2. `> start client`
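
The deploy and start steps above always follow the same order, so it can help to generate the command list once and paste it into the admin shell line by line. This is a sketch only: `experiment_commands` is a hypothetical helper that formats the commands shown in this notebook, it does not talk to the admin tool.

```python
def experiment_commands(mmar_name, run_number, clients=None):
    """Build the ordered admin commands for one run.

    If `clients` is None, the MMAR is deployed to all clients;
    otherwise one deploy command is emitted per named client.
    """
    commands = [f"set_run_number {run_number}", f"deploy {mmar_name} server"]
    if clients:
        commands += [f"deploy {mmar_name} client {name}" for name in clients]
    else:
        commands.append(f"deploy {mmar_name} client")
    commands += ["start server", "start client"]
    return commands


all_clients = experiment_commands("adminMMAR", 1)
only_client1 = experiment_commands("adminMMAR", 1, clients=["client1"])

for command in all_clients:
    print(">", command)
```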

You can check on the status of the training using:
1. `> check_status client` or `> check_status server`  to see 
```
> check_status server
FL run number:1
FL server status: training started
run number:1    start round:0   max round:20    current round:0
min_num_clients:1       max_num_clients:100
Registered clients: 1
 client name:client1    instance name:client1   token: 3eb835cc-2359-4683-8a0a-3083abf2e5d2
```
2. View the logs, for example the server log with `cat server log.txt`
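
To track progress from a copied `check_status server` output, you can pull out the round counters with a regular expression. The sketch below assumes the `max round:` and `current round:` fields appear as in the sample above; `parse_round_progress` is a hypothetical helper.

```python
import re

# Sample text copied from the `check_status server` output above.
SAMPLE_OUTPUT = """\
FL run number:1
FL server status: training started
run number:1    start round:0   max round:20    current round:0
min_num_clients:1       max_num_clients:100
Registered clients: 1
"""


def parse_round_progress(text):
    """Return (current_round, max_round), or None if the fields are absent."""
    max_match = re.search(r"max round:\s*(\d+)", text)
    cur_match = re.search(r"current round:\s*(\d+)", text)
    if not (max_match and cur_match):
        return None
    return int(cur_match.group(1)), int(max_match.group(1))


progress = parse_round_progress(SAMPLE_OUTPUT)
print(progress)
```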



## 6- Stop Training (if needed) 
If you need to stop the training, you can send an abort signal using:
- `abort client`
- `abort server`



## 7- Cross site validate
Once training is complete, you will want the validation metrics. 
This is another area where Clara FL shines: the true power of FL is in the off-diagonal values, which show whether the model generalizes across sites. 
Previously you had to move either the data or the selected model to each site and run validation there. 
With the cross-site validation feature this is done automatically for you. 
All you need is the file `config_cross_site_validataion.json` as part of your MMAR, and the flag 
`"cross_site_validate": true` set in the client section of the config_fed_client.json. 
These settings are already in place in this example, so all that is left is to run the validation.

Run `validate all` to pull the results back to the server. You can also run `validate source_site target_site`.

You should see something like 
```
validate all
{'client1': {'client2': {'validation': {'mean_dice': 0.0637669786810875}}, 'client1': {'validation': {'mean_dice': 0.07123523205518723}}, 'server': {'validation': {'mean_dice': 0.07032141834497452}}}, 'client2': {'client2': {'validation': {'mean_dice': 0.06376668065786362}}, 'client1': {'validation': {'mean_dice': 0.07123514264822006}}, 'server': {'validation': {'mean_dice': 0.07032135874032974}}}}
Done [11570 usecs] 2020-09-03 18:49:41.485214
``` 
Parsing this output and arranging it as a table gives the following, with the outer keys as rows and the inner keys as columns (metric: validation `mean_dice`):

| | Client 1 | Client 2 | Server |
| :--- | :--- | :--- | :--- |
| Client 1 | 0.07123523205518723 | 0.0637669786810875 | 0.07032141834497452 |
| Client 2 | 0.07123514264822006 | 0.06376668065786362 | 0.07032135874032974 |
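
A minimal sketch of that parsing step, using the dictionary printed by `validate all` above. Outer keys become rows and inner keys become columns; `to_rows` is a hypothetical helper, and the sketch assumes every row reports the same set of inner keys.

```python
# Dictionary copied from the `validate all` output above.
results = {
    "client1": {
        "client2": {"validation": {"mean_dice": 0.0637669786810875}},
        "client1": {"validation": {"mean_dice": 0.07123523205518723}},
        "server": {"validation": {"mean_dice": 0.07032141834497452}},
    },
    "client2": {
        "client2": {"validation": {"mean_dice": 0.06376668065786362}},
        "client1": {"validation": {"mean_dice": 0.07123514264822006}},
        "server": {"validation": {"mean_dice": 0.07032135874032974}},
    },
}


def to_rows(results, metric="mean_dice"):
    """Return (column names, rows) with one row per outer key."""
    columns = sorted({inner for row in results.values() for inner in row})
    rows = []
    for outer in sorted(results):
        values = [results[outer][col]["validation"][metric] for col in columns]
        rows.append([outer] + values)
    return columns, rows


columns, rows = to_rows(results)
print(" | ".join(["site"] + columns))
for row in rows:
    print(" | ".join([row[0]] + [f"{value:.4f}" for value in row[1:]]))
```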


## 8- Check models in your model dir 
After training is complete, you can check the models in the model folder of the MMAR. 
Note that each client will have a slightly different model, because each client selects the best model for its own validation data. 
The FL server will also have a different model in its model folder.  

Running the cells below shows the model folder for the server and for a client. 

In [None]:
!tree $MMAR_DIR/server/run_1

In [None]:
!tree $MMAR_DIR/client1/run_1


## 9- (Optional) Restart clients or server 
If something goes wrong on one of your clients, for example a GPU issue or a shortage of memory or temporary disk space, 
you can restart the server or client using
- `restart client`
- `restart server`


## 10- Finally shut down clients and server 
Once you are done with all experiments, you can shut down the clients and the server.
- `shutdown client`
- `shutdown server`

__Note: this will kill the client/server connection. 
For any new experiments you will need to contact the client sites to run start.sh again.__


# Next steps:
You can now move to more advanced features of FL by running [Admin BYOC Notebook](Admin_BYOC.ipynb)


# Exercise:
1. You could change the dataset to the spleen dataset downloaded by following the [Data Download Notebook](../../Data_Download.ipynb). 
Things you would want to change:
    1. Make sure to split the data between different clients.
    1. Change number of `local_epochs` in the config_fed_client.json to a higher number 
    ```
        "client": {
        "local_epochs": 5,
    ``` 
    1. Change parameters in the config_fed_server.json, such as the number of rounds:
    ```
        "min_num_clients": 1,
        "max_num_clients": 100,
        "start_round": 0,
        "num_rounds": 2,  --> Number of training rounds 
        "num_rounds_per_valid": 1, --> how often to run validation on clients
    ```
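
If you prefer to script these config changes rather than edit the files by hand, a sketch along the following lines may help. `set_nested` is a hypothetical helper, and the config fragment is an assumption based on the `"client": {"local_epochs": ...}` snippet above; the real config_fed_client.json contains more fields, so load it with `json.load` and inspect its structure before writing anything back.

```python
import copy
import json


def set_nested(config, keys, value):
    """Return a copy of config with the value at the nested key path replaced."""
    updated = copy.deepcopy(config)
    node = updated
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return updated


# Hypothetical fragment standing in for the loaded config_fed_client.json.
client_config = {"client": {"local_epochs": 1}}

new_config = set_nested(client_config, ["client", "local_epochs"], 5)
print(json.dumps(new_config, indent=2))
```

The helper returns a modified copy, so the original dictionary stays untouched until you decide to write the new one out.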