# Scikit-Learn SVM with NVFLARE


## Prepare data

In this section, we download the data, split it, and save the results to local disk.

### Download data

In [1]:
from utils.prepare_data import download_data

The `download_data` function downloads one of two Scikit-learn datasets, Iris or Cancer:
* the file is saved to the output directory
* the file is in CSV format (comma-separated)
* the header row is removed
* the default dataset is iris
* the filename is the dataset name


In [2]:
output_dir="/tmp/nvflare/sklearn/data"
download_data(output_dir)

Verify that the file was downloaded:


In [3]:
!ls {output_dir}

data_split.json  iris.csv  site-1  site-2  valid


### Split Data
* **Split Method**


Split the data into separate datasets, one for each client.
Several split methods are available so that we can test our algorithms in different scenarios. Here we pick the uniform split from the following:
* Uniform
* Linear
* Square
* Exponential
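
To make the four split methods listed above concrete, here is a minimal sketch of how per-site sample counts could be derived from relative weights. The function `split_sizes` and its weighting scheme (site `i` gets weight 1, `i`, `i**2`, or `exp(i)`) are assumptions for illustration; they are not taken from `utils.prepare_data_split`, which may compute the split differently.

```python
import math


def split_sizes(total, site_num, method="uniform"):
    """Return a list of per-site sample counts (hypothetical helper).

    Site i (1-based) receives a share of `total` proportional to an
    assumed weight that depends on the split method.
    """
    weight_fns = {
        "uniform": lambda i: 1.0,              # every site gets the same share
        "linear": lambda i: float(i),          # share grows linearly with site index
        "square": lambda i: float(i * i),      # share grows quadratically
        "exponential": lambda i: math.exp(i),  # share grows exponentially
    }
    weights = [weight_fns[method](i) for i in range(1, site_num + 1)]
    weight_sum = sum(weights)

    # Round each share down, then give the leftover samples to the last site
    # so that the counts always add up to `total`.
    sizes = [int(total * w / weight_sum) for w in weights]
    sizes[-1] += total - sum(sizes)
    return sizes
```

Under these assumed weights, `split_sizes(100, 4, "linear")` returns `[10, 20, 30, 40]`, while the uniform method gives every site the same count apart from the rounding remainder.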



* **Data store method**

As in a real application, we split the data into different directories (sites), and each client reads the data of one site.

```
   /tmp/nvflare/sklearn/data/site-1/iris.csv
   /tmp/nvflare/sklearn/data/site-2/iris.csv
   /tmp/nvflare/sklearn/data/valid/iris.csv
```
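
As a small illustration of this layout, a client could resolve its own file from the data root and its site name. The helper `site_data_path` below is hypothetical and only mirrors the directory structure shown above; the example's custom code may locate the file in another way.

```python
from pathlib import Path


def site_data_path(data_root, site_name, dataset="iris"):
    """Build <data_root>/<site_name>/<dataset>.csv (hypothetical helper)."""
    return Path(data_root) / site_name / f"{dataset}.csv"


# Path that the client named "site-1" would read under this layout.
print(site_data_path("/tmp/nvflare/sklearn/data", "site-1").as_posix())
```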
 

In [4]:
from utils.prepare_data_split import split_data, SplitMethod, StoreMethod

In [5]:
input_path = "/tmp/nvflare/sklearn/data/iris.csv"
output_dir = "/tmp/nvflare/sklearn/data"
site_num = 2
valid_frac = 0.3
split_method: SplitMethod = SplitMethod.UNIFORM
# store_method = StoreMethod.STORE_DATA

In [6]:

split_data(input_path, output_dir, site_num, valid_frac, split_method=split_method)

In [7]:
!ls -l {output_dir}

total 20
-rw-rw-r-- 1 chester chester  316 Dec 17 10:52 data_split.json
-rw-rw-r-- 1 chester chester 3000 Dec 17 11:02 iris.csv
drwxrwxr-x 2 chester chester 4096 Dec 17 10:55 site-1
drwxrwxr-x 2 chester chester 4096 Dec 17 10:55 site-2
drwxrwxr-x 2 chester chester 4096 Dec 17 10:55 valid


In [8]:
! head -n 10 {output_dir}/site-1/iris.csv

0.0,4.8,3.0,1.4,0.3
0.0,5.1,3.8,1.6,0.2
0.0,4.6,3.2,1.4,0.2
0.0,5.3,3.7,1.5,0.2
0.0,5.0,3.3,1.4,0.2
1.0,7.0,3.2,4.7,1.4
1.0,6.4,3.2,4.5,1.5
1.0,6.9,3.1,4.9,1.5
1.0,5.5,2.3,4.0,1.3
1.0,6.5,2.8,4.6,1.5
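
The rows above have no header, and the first column appears to hold the class label, followed by the four feature values. Assuming that layout, a site's file could be read into features and labels with the standard library alone. `load_site_csv` is a hypothetical helper written for this illustration, not part of the example's utilities.

```python
import csv


def load_site_csv(path):
    """Read a header-less CSV whose first column is assumed to be the label.

    Returns (features, labels), where features is a list of float lists
    and labels is a list of floats.
    """
    features, labels = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:  # skip blank lines
                continue
            values = [float(v) for v in row]
            labels.append(values[0])     # assumed label column
            features.append(values[1:])  # remaining columns are features
    return features, labels
```

The returned lists can be passed directly to a Scikit-learn estimator's `fit(X, y)` method.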


## Setup Jobs

Multiple jobs are configured; they differ in dataset and in how the data is distributed across sites:
* iris -- uniformly distributed
* cancer -- linearly distributed
* cancer -- uniformly distributed

They all share the same custom code.


In [1]:
! ./setup_jobs.sh

/home/chester/projects/NVFlare/examples/sklearn-svm
/home/chester/projects/NVFlare/examples/sklearn-svm
/home/chester/projects/NVFlare/examples/sklearn-svm


## Running Job

### FL Simulator

In [2]:
! nvflare simulator -w /tmp/nvflare/ -n 2 -t 2 job_configs/sklearn_svm_iris

2022-12-17 21:57:45,742 - SimulatorRunner - INFO - Create the Simulator Server.
2022-12-17 21:57:45,790 - nvflare.fuel.hci.server.hci - INFO - Starting Admin Server localhost on Port 42869
2022-12-17 21:57:45,791 - SimulatorServer - INFO - starting insecure server at localhost:43837
2022-12-17 21:57:45,793 - SimulatorRunner - INFO - Deploy the Apps.
2022-12-17 21:57:45,795 - SimulatorRunner - INFO - Create the simulate clients.
2022-12-17 21:57:46,129 - ClientManager - INFO - Client: New client site-1@127.0.0.1 joined. Sent token: f59b2aee-32fd-41ff-8a6c-da24c168ca0f.  Total clients: 1
2022-12-17 21:57:46,129 - FederatedClient - INFO - Successfully registered client:site-1 for project simulator_server. Token:f59b2aee-32fd-41ff-8a6c-da24c168ca0f SSID:
2022-12-17 21:57:51,476 - ClientManager - INFO - Client: New client site-2@127.0.0.1 joined. Sent token: 19a9dc01-6cc5-4857-a87e-bdc677b0c034.  Total clients: 2
2022-12-17 21:57:51,477 - FederatedClient - INFO - Successfully registered cli

In [10]:
!ls -l  /tmp/nvflare/simulate_job/app_site-1

total 16
-rw-rw-r-- 1 chester chester    0 Dec 16 21:55 audit.log
drwxrwxr-x 2 chester chester 4096 Dec 16 21:55 config
drwxrwxr-x 3 chester chester 4096 Dec 16 21:55 custom
-rw-rw-r-- 1 chester chester   40 Dec 16 21:55 events.out.tfevents.1671256551.RTX.21091.0
-rw-rw-r-- 1 chester chester 1454 Dec 16 21:55 log.txt


In [2]:
!ls -l  /tmp/nvflare/sklearn/model

total 0
