## Running FL with secure aggregation using homomorphic encryption

This notebook walks you through how to set up FL with homomorphic encryption (HE).


## Prerequisites
Before starting this notebook, please familiarize yourself with the other FL notebooks in this repo.

- (Optional) Look at the introduction Notebook for [Federated Learning with Clara Train SDK](FederatedLearning.ipynb).
- (Optional) Look at [Client Notebook](Client.ipynb).
- (Optional) Look at [Admin Notebook](Admin.ipynb).
- Run the [Provisioning Notebook](Provisioning.ipynb) and start the server.

Make sure the `project.yml` used for provisioning contains these HE-related settings:

    # homomorphic encryption
    he:
      lib: tenseal
      config:
        poly_modulus_degree: 8192
        coeff_mod_bit_sizes: [60, 40, 40]
        scale_bits: 40
        scheme: CKKS
        
*Note:* These settings are recommended and should work for most tasks, but they could be further optimized depending on your specific model architecture and machine learning task. See this [tutorial on the CKKS scheme](https://github.com/OpenMined/TenSEAL/blob/master/tutorials/Tutorial%202%20-%20Working%20with%20Approximate%20Numbers.ipynb) and the [benchmarking tutorial](https://github.com/OpenMined/TenSEAL/blob/master/tutorials/Tutorial%203%20-%20Benchmarks.ipynb) for more information on the different settings.
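
As a rough sanity check of the settings above, the sketch below derives a few quantities from them using only the Python standard library. It does not call TenSEAL. The bound table is the 128-bit classical security limit on the total coefficient modulus as tabulated by HomomorphicEncryption.org and used by Microsoft SEAL. The `summarize` helper and its field names are hypothetical, and the `rescale_levels` figure follows the usual CKKS rule of thumb (number of primes minus two), which is an assumption and not something stated in this notebook.

```python
# CKKS parameters copied from the project.yml snippet above.
he_config = {
    "poly_modulus_degree": 8192,
    "coeff_mod_bit_sizes": [60, 40, 40],
    "scale_bits": 40,
    "scheme": "CKKS",
}

# Maximum total coefficient-modulus bits for 128-bit classical security,
# keyed by polynomial modulus degree.
MAX_COEFF_MOD_BITS_128 = {
    1024: 27, 2048: 54, 4096: 109, 8192: 218, 16384: 438, 32768: 881,
}


def summarize(config):
    """Derive a few properties of a CKKS parameter set (hypothetical helper)."""
    degree = config["poly_modulus_degree"]
    sizes = config["coeff_mod_bit_sizes"]
    total_bits = sum(sizes)
    return {
        # CKKS packs degree / 2 values into one ciphertext.
        "slots": degree // 2,
        "total_coeff_mod_bits": total_bits,
        "within_128_bit_bound": total_bits <= MAX_COEFF_MOD_BITS_128[degree],
        # Rule of thumb: first and last primes are not available for rescaling.
        "rescale_levels": max(len(sizes) - 2, 0),
        # The inner primes are usually chosen to match the scale.
        "scale_matches_inner_primes": all(
            bits == config["scale_bits"] for bits in sizes[1:-1]
        ),
    }


summary = summarize(he_config)
for key, value in summary.items():
    print(f"{key}: {value}")
```

If `within_128_bit_bound` comes out false after you change the settings, either increase `poly_modulus_degree` or shorten `coeff_mod_bit_sizes`.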

## Dataset 

##### Option 1 
This notebook uses a sample dataset (i.e., a single image volume of the spleen dataset) provided in the package to train a small neural network for a few epochs. 
This single file is duplicated 32 times for the training set and 9 times for the validation set to mimic the full spleen dataset. 

##### Option 2  
You can make the minor changes recommended in the exercise section to train on the spleen segmentation task. The dataset used is Task09_Spleen.tar from 
the [Medical Segmentation Decathlon](http://medicaldecathlon.com/). 
Prior to running this notebook the data should be downloaded following 
the steps in [Data Download Notebook](../../Data_Download.ipynb).

### Disclaimer  
We will train a small network so that both clients can fit the model on one GPU. 
Training will run for only a couple of epochs to demonstrate the concepts; we are not targeting accuracy.

# Let's get started
To learn how FL works with homomorphic encryption (HE) in Clara Train SDK, we first give some background on what homomorphic encryption is and how the MMAR configurations need to be modified to enable it.
<br><img src="screenShots/homomorphic_encryption.png" alt="Drawing" style="height: 450px;"/><br> 
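
To make the idea of aggregating encrypted updates concrete, here is a minimal sketch of additively homomorphic aggregation. Note the assumptions: it uses a toy Paillier cryptosystem with tiny primes instead of the CKKS scheme configured above, it is not secure, and it is not how Clara Train SDK or TenSEAL implement HE. The division of roles (clients hold the key, the server only combines ciphertexts) is also an assumption made for illustration, and all function names are hypothetical.

```python
import math
import random

# Toy Paillier key pair built from tiny primes: illustration only, NOT secure.
P, Q = 1789, 1931
N = P * Q                                            # public key
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # private key: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                 # valid for generator N + 1
SCALE = 1000                                         # fixed-point scale for floats


def encrypt(value):
    """Encrypt a float as a fixed-point integer under the public key."""
    m = round(value * SCALE) % N                     # negatives wrap modulo N
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def add_encrypted(c1, c2):
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQ


def decrypt(ciphertext):
    """Decrypt with the private key and undo the fixed-point encoding."""
    m = ((pow(ciphertext, LAM, N_SQ) - 1) // N * MU) % N
    if m > N // 2:                                   # upper half encodes negatives
        m -= N
    return m / SCALE


# Each client encrypts its local weight update before sending it.
client1_update = [0.25, -1.5, 3.0]
client2_update = [0.75, 0.5, -1.0]
enc1 = [encrypt(w) for w in client1_update]
enc2 = [encrypt(w) for w in client2_update]

# The server sums the ciphertexts without seeing any plaintext weights.
enc_sum = [add_encrypted(a, b) for a, b in zip(enc1, enc2)]

# A key holder decrypts the sum and divides by the number of clients.
averaged = [decrypt(c) / 2 for c in enc_sum]
print(averaged)
```

CKKS works on vectors of approximate real numbers rather than on single integers, but the property the aggregation relies on is the same: the sum can be computed on ciphertexts.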

## *TODO: Explain new HE components*
- Encryptor (all layers, partial, regex)
- Just in time HE aggregator
- Decryptor
- HE ShareableGenerator
- HE Persistor
- Cross-site validation with HE

# Run FL experiment with HE

## 1 - Start server and clients (if they are not already running)
In the server terminal run:
```
cd /claraDevDay/FL/project1/server/startup
./start.sh
```  
In the client1 terminal run:
```
cd /claraDevDay/FL/project1/client1/startup
./start.sh
```  
In the client2 terminal run:
```
cd /claraDevDay/FL/project1/client2/startup
./start.sh
```  
In the Admin terminal run:
```
cd /claraDevDay/FL/project1/admin/startup
./fl_admin.sh
```

## 2 - Starting Admin Shell
In the admin terminal, if you haven't already started the admin console, go to the admin folder inside your project and run
```
cd /claraDevDay/FL/project1/admin/startup
./fl_admin.sh
``` 
You should see
```
Admin Server: localhost on port 5000
User Name: 
```
Type `admin@admin.com` to get

    Admin Server: localhost on port 8003
    User Name: admin@admin.com

    Type ? to list commands; type "? cmdName" to show usage of a command.

## 3 - Check server/client status
Type 
```
> check_status server
```
to see 
```
FL run number has not been set.
FL server status: training not started
Registered clients: 2 
-------------------------------------------------------------------------------------------------
| CLIENT NAME | TOKEN                                | LAST ACCEPTED ROUND | CONTRIBUTION COUNT |
-------------------------------------------------------------------------------------------------
| client1     | f735c245-ce35-4a08-89e0-0292bb053a9c |                     | 0                  |
| client2     | e36db52e-2624-4989-855a-28fa195f58e9 |                     | 0                  |
-------------------------------------------------------------------------------------------------
```
To check on the clients, type 
```
> check_status client
```
to see 
```
instance:client1 : client name: client1 token: 3c3d2276-c3bf-40c1-bc02-9be84d7c339f     status: training not started
instance:client2 : client name: client2 token: 92806548-5515-4977-894e-612900ff8b1b     status: training not started
```
To check the folder structure, type 

```
> info
```
to see
```
Local Upload Source: /claraDevDay/FL/project1/admin/startup/../transfer
Local Download Destination: /claraDevDay/FL/project1/admin/startup/../transfer
Server Upload Destination: /claraDevDay/FL/project1/server/startup/../transfer
Server Download Source: /claraDevDay/FL/project1/server/startup/../transfer
```

## 4 - Upload and deploy the MMAR configurations for HE and set the FL run number
First, set a run number (choose a different one if you don't want to overwrite previous results):
```
> set_run_number 1
```

Then upload the HE MMAR and deploy it to the server and clients:
```
> upload_folder ../../../adminMMAR_HE
> deploy adminMMAR_HE server
> deploy adminMMAR_HE client
```

## 5 - Start Training
Now you can start training by:
1. `> start server`
2. `> start client`

You can check on the status of the training using:
1. `> check_status client` or `> check_status server`  to see 
```
> check_status server
FL run number:1
FL server status: training started
run number:1    start round:0   max round:2     current round:0
min_num_clients:2       max_num_clients:100
Registered clients: 2 
Total number of clients submitted models for current round: 0
-------------------------------------------------------------------------------------------------
| CLIENT NAME | TOKEN                                | LAST ACCEPTED ROUND | CONTRIBUTION COUNT |
-------------------------------------------------------------------------------------------------
| client1     | f735c245-ce35-4a08-89e0-0292bb053a9c |                     | 0                  |
| client2     | e36db52e-2624-4989-855a-28fa195f58e9 |                     | 0                  |
-------------------------------------------------------------------------------------------------
```
2. Get logs from the server or clients using `cat server log.txt` or `cat client1 log.txt`.
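
If you copy the status text out of the admin console, a few lines of Python can turn the round counters into numbers. This is only a sketch: the sample line is taken from the `check_status server` output shown above, `parse_round_info` is a hypothetical helper, and the exact output format may differ between SDK versions.

```python
import re

# Sample line copied from the `check_status server` output shown above.
status_line = "run number:1    start round:0   max round:2     current round:0"


def parse_round_info(line):
    """Extract `name:value` integer pairs such as 'max round:2' from a line."""
    pairs = re.findall(r"([a-z_ ]+?):(\d+)", line)
    return {name.strip(): int(value) for name, value in pairs}


info = parse_round_info(status_line)
print(info)
print(f"current round {info['current round']} of max round {info['max round']}")
```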

## 6 - Stop Training (if needed)
If you need to, you can send signals to stop the training using:
- `abort client`
- `abort server`

## 7 - Cross-site validation
Once training is complete, you will want to get the cross-site validation metrics. 
This is another area where Clara FL shines. 
The true power of FL lies in the off-diagonal values, which show whether the model generalizes across sites. 
Previously, you had to move either the data or the selected model to each site and run validation there. 
With the cross-site validation feature, this is done automatically for you. 
All you need to do is have the file `config_cross_site_validataion.json` as part of your MMAR and set the flag 
`"cross_site_validate": true` in the client section of `config_fed_client.json`. 
These settings are already in place in this example, so all that is left is to run the validation.

Run `validate all` to pull the results back to the server. You could also run `validate source_site target_site`.

You should see something like this:
```
validate all
{'client1': {'client2': {'validation': {'mean_dice': 0.0637669786810875}}, 'client1': {'validation': {'mean_dice': 0.07123523205518723}}, 'server': {'validation': {'mean_dice': 0.07032141834497452}}}, 'client2': {'client2': {'validation': {'mean_dice': 0.06376668065786362}}, 'client1': {'validation': {'mean_dice': 0.07123514264822006}}, 'server': {'validation': {'mean_dice': 0.07032135874032974}}}}
Done [11570 usecs] 2020-09-03 18:49:41.485214
``` 
Parsing this output and putting it in a table gives the following (metric: validation `mean_dice`):

 _ | Client 1 | Client 2 | Server
 :--- | :--- | :--- | :---
Client 1 | 0.07123523205518723 | 0.0637669786810875 | 0.07032141834497452
Client 2 | 0.07123514264822006 | 0.06376668065786362 | 0.07032135874032974
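
A table like the one above can be generated from the `validate all` output with a short script. The sketch below is one way to do it, assuming you have copied the result line into a string. The printed result uses Python dict syntax with single quotes, so `ast.literal_eval` is used instead of `json.loads`. The helper name `to_markdown_table` is hypothetical; rows are the outer keys and columns the inner keys, as in the table above, and values are rounded to four decimals for readability.

```python
import ast

# Result line copied from the `validate all` output shown above.
raw = (
    "{'client1': {'client2': {'validation': {'mean_dice': 0.0637669786810875}}, "
    "'client1': {'validation': {'mean_dice': 0.07123523205518723}}, "
    "'server': {'validation': {'mean_dice': 0.07032141834497452}}}, "
    "'client2': {'client2': {'validation': {'mean_dice': 0.06376668065786362}}, "
    "'client1': {'validation': {'mean_dice': 0.07123514264822006}}, "
    "'server': {'validation': {'mean_dice': 0.07032135874032974}}}}"
)


def to_markdown_table(results, metric="mean_dice", digits=4):
    """Render nested {row: {column: {'validation': {metric: value}}}} as Markdown."""
    columns = sorted({col for row in results.values() for col in row})
    lines = [
        "| | " + " | ".join(columns) + " |",
        "| :--- |" + " :---: |" * len(columns),
    ]
    for row_name in sorted(results):
        cells = [
            f"{results[row_name][col]['validation'][metric]:.{digits}f}"
            for col in columns
        ]
        lines.append(f"| {row_name} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


results = ast.literal_eval(raw)
table = to_markdown_table(results)
print(table)
```

In a notebook cell you could pass `table` to `IPython.display.Markdown` to render it instead of printing it.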