# FL Client: Joining an FL Experiment

The purpose of this notebook is to walk a client through participating in an FL experiment.
     

## Prerequisites
- You should have received a provisioning package along with its password.  
- You should have extracted it as in the last cell of the provisioning notebook (or by running the first cell in this notebook).   


## Set up your dataset 
- The Docker start script will map your data to the `/dataset` folder 
- Each client needs to provide a file named `dataset.json`  

This notebook uses a sample dataset (a single image from the spleen dataset) provided in the package to train small networks for a couple of epochs. 
This single file is duplicated 32 times in the training set and 9 times in the validation set to mimic the full spleen dataset. 
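
As a rough illustration of the duplication described above, the sketch below builds a data list that repeats one image/label pair 32 and 9 times. The `training`/`validation` keys, the `image`/`label` fields, and the file names are assumptions made for illustration; check them against the `dataset.json` shipped in your package.

```python
import json

def build_sample_dataset(image, label, n_train=32, n_val=9):
    """Repeat one image/label pair to fill two lists of the requested sizes."""
    pair = {"image": image, "label": label}
    return {
        # Assumed section names; the real dataset.json may differ.
        "training": [dict(pair) for _ in range(n_train)],
        "validation": [dict(pair) for _ in range(n_val)],
    }

# Hypothetical file names, relative to the dataset folder.
sample = build_sample_dataset("imagesTr/spleen_1.nii.gz", "labelsTr/spleen_1.nii.gz")
print(json.dumps(sample["training"][:1], indent=2))
print(len(sample["training"]), len(sample["validation"]))
```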

#### Disclaimer  


# Let's get started

If you followed the provisioning notebook, you have already unzipped the packages for all clients. 
Otherwise, start by unzipping the provisioning package: 
`unzip -oP <password> filename.zip -d <directoryToUnzip>`
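
If `unzip` is not available, the same extraction can be sketched with Python's standard `zipfile` module. This is a minimal sketch, not part of the provisioning tooling: `extract_package` is a hypothetical helper, and `zipfile` can only decrypt archives that use the legacy ZIP encryption scheme, so an AES-encrypted package would still need `unzip` or another tool.

```python
import zipfile
from pathlib import Path

def extract_package(zip_path, password, dest_dir):
    """Extract a password-protected zip into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # The password is only applied to members that are actually encrypted.
        archive.extractall(path=dest, pwd=password.encode("utf-8"))
        return sorted(archive.namelist())

# Example call with placeholder values:
# extract_package("filename.zip", "<password>", "<directoryToUnzip>")
```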

Let's start by installing `tree` to look at directory structures.

In [None]:
!apt-get install -y tree


Let's examine what is in the package. 


In [None]:
MMAR_DIR="/claraDevDay/FL/project1/client1/"
!tree $MMAR_DIR

## 1- Starting the Docker container for the client 
Inside the `startup` folder, edit the `docker.sh` file, then run it to start the Docker container.

The Docker start script expects:
1. The dataset to be mapped as /dataset
2. The dataset.json file to be at /dataset/dataset.json

Modify the top part of the file so that `MY_DATA_DIR` points to your data:
```
MY_DATA_DIR=/mydata/
```
You may also want to limit or change the number of GPUs exposed to the container.


In [None]:
!cat $MMAR_DIR/startup/docker.sh 


It should look like this: 
```
#!/bin/bash
MY_DATA_DIR=/raid
# for all gpus use line below 
#GPU2USE=all   
# for 2 gpus use line below
#GPU2USE=2  
# for specific gpus as gpu#0 and gpu#2 use line below
GPU2USE='"device=0,2"'

DOCKER_IMAGE=<<latest clara image will change from one release to another>>
echo "Starting docker with $DOCKER_IMAGE"
docker run --rm -it --name=client1 \
--gpus $GPU2USE \
-u $(id -u):$(id -g) -v /etc/passwd:/etc/passwd -v /etc/group:/etc/group \
-v $PWD/..:/workspace/ \
-v $MY_DATA_DIR:/data/ \
-w /workspace/ \
--shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
$DOCKER_IMAGE \
/bin/bash
```

## 2- Check network connection to server   
Run the cell below and read the server name and port from the `target` tag. 

In [None]:
!cat $MMAR_DIR/startup/fed_client.json
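
Reading the `target` value by eye works, but it can also be pulled out programmatically. The sketch below searches a parsed JSON document for any key named `target` and splits it into host and port. The nested layout of the example document and the `host:port` form of the value are assumptions for illustration; the helper names are hypothetical.

```python
import json

def find_targets(node):
    """Recursively collect every string stored under a key named "target"."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "target" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(find_targets(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_targets(item))
    return found

def split_target(target):
    """Split "host:port" on the last colon and return (host, port as int)."""
    host, _, port = target.rpartition(":")
    return host, int(port)

# Hypothetical structure; load the real file with json.load(open(path)) instead.
example = json.loads('{"servers": [{"service": {"target": "example-server:8002"}}]}')
for target in find_targets(example):
    print(split_target(target))
```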

Run telnet with the server name and port. 
If the connection is established, telnet waits for your input; simply escape out of it. 
The cell below targets `localhost` rather than the server, so it should fail with an error such as: 
```
Trying 127.0.0.1...
Trying ::1...
telnet: Unable to connect to remote host: Cannot assign requested address
```

In [None]:
!telnet localhost 8000

A successful connection should print a message like the one below.
```
Trying 3.135.235.198...
Connected to ec2-3-135-235-198.us-east-2.compute.amazonaws.com.
Escape character is '^]'.
```

In [None]:
!telnet ec2-3-135-235-198.us-east-2.compute.amazonaws.com 8002

See the list of commands at the end of this notebook to debug network issues.
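
If telnet is not installed, a plain TCP connection attempt gives the same yes/no answer. The sketch below uses Python's standard `socket` module; `can_connect` is a hypothetical helper, and it only checks that the port accepts connections, not that the FL server behind it is healthy.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

# Example call with placeholder values taken from the `target` tag:
# can_connect("<serverName>", 8002)
```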

## 2b- Find your external IP (optional)
In some cases, administrators of the federated server want to restrict network traffic to known client IPs. 
They do this by adding client IPs to a whitelist. 
As a client, you would then be required to provide your external IP. 
You can simply visit [https://whatismyipaddress.com/](https://whatismyipaddress.com/) or run the cell below.

In [None]:
!curl ifconfig.co
!curl icanhazip.com
!curl ifconfig.me
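
The same lookup can be sketched in Python when `curl` is unavailable. This assumes the service returns the address as plain text, which is how `icanhazip.com` responds to `curl`; `parse_ip` and `fetch_external_ip` are hypothetical helpers, and the parsing step rejects anything that is not a valid IP address (for example an HTML error page).

```python
import ipaddress
import urllib.request

def parse_ip(text):
    """Validate a response body and return it as a normalized IP string."""
    return str(ipaddress.ip_address(text.strip()))

def fetch_external_ip(url="https://icanhazip.com", timeout=5.0):
    """Ask an external service which address this machine's traffic arrives from."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_ip(response.read().decode("ascii"))

# Requires outbound internet access:
# print(fetch_external_ip())
```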

## 3- Create your data file (most important step) 
This is the only thing clients need to worry about: creating the data list JSON file. 
To simulate this, we create a symbolic link to the spleen data downloaded earlier by the performance script. 

In [None]:
!ln -s /claraDevDay/spleenData/Task09_Spleen /data


Now run the cell below to list the data folder and confirm that its contents are where you expect them.  

In [None]:
dataset="/data/dataset.json"
folder="/data/"
!ls $folder
#!cat $dataset


Then make sure that all files exist in the right location by running the cell below.

In [None]:
!python /opt/nvidia/medical/ai4med/tools/check_image_files.pyc -d $dataset -f $folder
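
For readers who want to see what such a check involves, here is a minimal sketch of an existence check in plain Python. It is not the implementation of the tool above; the `training`/`validation` sections and `image`/`label` fields are assumptions about the data list layout, and `find_missing_files` is a hypothetical helper.

```python
import json
from pathlib import Path

def find_missing_files(dataset_json, data_root, sections=("training", "validation")):
    """Return the relative paths listed in dataset_json that do not exist under data_root."""
    with open(dataset_json) as handle:
        datalist = json.load(handle)
    root = Path(data_root)
    missing = []
    for section in sections:
        for entry in datalist.get(section, []):
            if not isinstance(entry, dict):
                continue  # skip entries that are not image/label records
            for key in ("image", "label"):
                rel = entry.get(key)
                if rel and not (root / rel).is_file():
                    missing.append(rel)
    return missing

# Example call using the paths from the cells above:
# print(find_missing_files("/data/dataset.json", "/data/"))
```

An empty list means every referenced file was found.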


## 4- Starting the Client 

Run either: 
- `./start.sh` if you plan to use a single GPU 
- `./start_mgpu.sh <number_of_GPUS> ` if you plan to run with multiple GPUs 


In [None]:
!$MMAR_DIR/startup/start.sh
 

## List of helpful commands for network debugging:   
- `telnet <serverName> <port>` 
- `traceroute -p <port> <serverName>`: you can increase the maximum number of hops if needed using `-m <maxHops>` 
- `ping <serverName>`: make sure the server name resolves to an IP, as below
```
ping ec2-3-135-235-198.us-east-2.compute.amazonaws.com
PING ec2-3-135-235-198.us-east-2.compute.amazonaws.com (18.224.5.19) 56(84) bytes of data.
```
- `ping <serverIP>`: use this to rule out problems with your DNS 
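
The DNS part of these checks can also be sketched in Python, which helps on machines where ICMP is blocked and `ping` gives no answer. `resolve_host` is a hypothetical helper built on the standard `socket.getaddrinfo` call; an empty result suggests a name-resolution problem rather than a firewall or routing problem.

```python
import socket

def resolve_host(hostname):
    """Return the sorted set of IP addresses a host name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

# Example call with a placeholder value:
# print(resolve_host("<serverName>"))
```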
