# Hello World Examples

In this notebook, we will walk you through some Hello World examples in `NVFlare/examples/hello-world` to get familiar with the basic workflow of NVIDIA FLARE.

We will run the examples in FLARE's POC mode using the [FLARE API](../tutorials/flare_api.ipynb). You can also run these examples in the [FLARE simulator](../tutorials/flare_simulator.ipynb).

Each example below is self-contained. You can start from any example, but you must run through the 3 steps of each example in sequence.

## Prerequisites
Before you can run the examples here, the following preparation work must be done:

1. Set up a virtualenv following the instructions in [README.md](https://github.com/NVIDIA/NVFlare/tree/dev/examples)
2. Install Jupyter Lab and install a new kernel for the virtualenv called `nvflare_example`
3. Install NVFlare following this [notebook](../nvflare_setup.ipynb)
4. Start NVFlare in POC mode following this [notebook](../tutorials/setup_poc.ipynb). All the examples in this notebook require 2 clients to run.

Just to quickly recap the NVFLARE installation and POC installation steps:

**NVFLARE Installation**

In [None]:
! pip install nvflare

Check the nvflare version with `nvflare --version` or `nvflare -V`:

In [None]:
! nvflare -V



**Start NVFLARE in POC mode**

The `nvflare poc` commands are:
```
   nvflare poc --prepare -n N
   nvflare poc --start -ex admin  
   nvflare poc --stop 
   nvflare poc --clean
```

``-ex admin`` excludes the FLARE Console (Admin Console) from the start command.

The default workspace is `/tmp/nvflare/poc`.
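If you prefer to drive these commands from Python rather than shell cells, the argument lists can be assembled first and inspected before anything runs. The sketch below only builds the command lines shown above; `poc_command` is a hypothetical helper and not part of NVFlare, and it uses only the flags that appear in this notebook.

```python
from typing import List, Optional


def poc_command(action: str, n_clients: Optional[int] = None,
                exclude: Optional[str] = None) -> List[str]:
    """Build the argument list for one of the `nvflare poc` commands listed above.

    Hypothetical helper for illustration; it does not run anything.
    """
    if action not in {"prepare", "start", "stop", "clean"}:
        raise ValueError(f"unknown POC action: {action}")
    cmd = ["nvflare", "poc", f"--{action}"]
    if action == "prepare" and n_clients is not None:
        cmd += ["-n", str(n_clients)]
    if action == "start" and exclude is not None:
        cmd += ["-ex", exclude]
    return cmd


# Print the command lines used in this notebook without executing them.
for args in (
    poc_command("prepare", n_clients=2),
    poc_command("start", exclude="admin"),
    poc_command("stop"),
    poc_command("clean"),
):
    print(" ".join(args))
```

Each list can then be passed to `subprocess.run(args, check=True)` if you want Python to launch the command.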


In [1]:
import os
NVFLARE_HOME=os.path.abspath(os.path.join(os.getcwd(), "../.."))
%env NVFLARE_HOME={NVFLARE_HOME}

env: NVFLARE_HOME=/home/chester/projects/NVFlare
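The cell above derives `NVFLARE_HOME` by walking two levels up from the notebook's working directory. Here is a minimal sketch of that path arithmetic, assuming the notebook runs from `examples/hello-world` inside the repository. The sample path combines the output above with that assumption, and `repo_root_from` is a hypothetical helper name.

```python
import os


def repo_root_from(notebook_dir: str) -> str:
    """Return the directory two levels above `notebook_dir`, as the cell above does."""
    return os.path.abspath(os.path.join(notebook_dir, "../.."))


# Assumed working directory of this notebook (illustrative only).
print(repo_root_from("/home/chester/projects/NVFlare/examples/hello-world"))
```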


In [2]:
! echo ${NVFLARE_HOME}

$/home/chester/projects/NVFlare


In [3]:
! echo y | nvflare poc --prepare -n 2 

prepare_poc at /tmp/nvflare/poc for 2 clients
This will delete poc folder in /tmp/nvflare/poc directory and create a new one. Is it OK to proceed? (y/N) Successfully creating poc folder at /tmp/nvflare/poc.  Please read poc/Readme.rst for user guide.


******* Files generated by this poc command are NOT intended for production environments.
link examples from /home/chester/projects/NVFlare/examples to /tmp/nvflare/poc/admin/transfer


In [4]:
ls -al  /tmp/nvflare/poc/admin/transfer

lrwxrwxrwx 1 chester chester 39 Mar 21 21:12 /tmp/nvflare/poc/admin/transfer -> /home/chester/projects/NVFlare/examples/


In [5]:
%%bash --bg  

nvflare poc --start -ex admin
echo "sleep for a few seconds for the system to start"
sleep 3

Check system status

In [6]:
import os
import time

from nvflare.fuel.flare_api.flare_api import new_insecure_session
from nvflare.fuel.flare_api.flare_api import (
    ClientInfo,
    JobInfo,
    NoConnection,
    ServerInfo,
    SystemInfo,
)
workspace = "/tmp/nvflare/poc"
admin_dir = os.path.join(workspace, "admin")

# The following try/except is usually not needed. We need it here to handle "Run all cells" and notebook automation.
# In the "Run all cells" case, JupyterLab seems to try to connect to the server before it starts (even though
# execution is supposed to be sequential), which results in a connection timeout. The try/except captures that
# scenario; extra sleep time alone doesn't seem to help.

try:
    sess = new_insecure_session(admin_dir, timeout=5)
except NoConnection:
    time.sleep(2)
    
    
flare_not_ready = True
while flare_not_ready: 
    print("trying to connect to server")
    sess = new_insecure_session(admin_dir)
    sys_info = sess.get_system_info()

    print(f"Server info:\n{sys_info.server_info}")
    print("\nclient_info")
    for client in sys_info.client_info:
        print(client)
    flare_not_ready = len(sys_info.client_info) < 2
        
    time.sleep(2)


trying to connect to server
Server info:
status: stopped, start_time: Tue Mar 21 21:12:28 2023

client_info
trying to connect to server
Server info:
status: stopped, start_time: Tue Mar 21 21:12:28 2023

client_info
trying to connect to server
Server info:
status: stopped, start_time: Tue Mar 21 21:12:28 2023

client_info
site-1(last_connect_time: Tue Mar 21 21:12:33 2023)
trying to connect to server
Server info:
status: stopped, start_time: Tue Mar 21 21:12:28 2023

client_info
site-1(last_connect_time: Tue Mar 21 21:12:33 2023)
site-2(last_connect_time: Tue Mar 21 21:12:36 2023)
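The readiness loop above polls until both clients have connected, but it never gives up if the system fails to start. One way to bound the wait is a small polling helper with a deadline. This is a sketch under stated assumptions: `wait_until` is a hypothetical helper, not an NVFlare API, and the stand-in check below replaces the real `get_system_info()` call so that the example runs without a server.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass.

    Returns True on success and False on timeout. Exceptions raised by
    `check` (for example a connection error) are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if check():
                return True
        except Exception as e:
            print(f"not ready yet: {e}")
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "are both clients connected?": becomes true on the third poll.
polls = {"count": 0}


def two_clients_connected() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


print(wait_until(two_clients_connected, timeout=5.0, interval=0.01))
```

In the notebook, the check would wrap the session calls, for example a function that opens a session and returns `len(sess.get_system_info().client_info) >= 2`.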


## Utilities

**Monitoring Job**

You can choose your own monitoring output. Here is one function that displays the job information:

In [7]:
import json
from nvflare.fuel.flare_api.flare_api import Session

def status_monitor_cb(
        session: Session, job_id: str, job_meta, *cb_args, **cb_kwargs
    ) -> bool:
    if job_meta["status"] == "RUNNING":
        if cb_kwargs["cb_run_counter"]["count"] < 3 or cb_kwargs["cb_run_counter"]["count"]%15 == 0:
            print(job_meta)            
        else: 
            print(".", end="")
    else:
        print("\n" + str(job_meta))
    
    cb_kwargs["cb_run_counter"]["count"] += 1
    return True



def format_json(data):
    # Pretty-print any JSON-serializable object (dict or list); this prints and returns None.
    print(json.dumps(data, sort_keys=True, indent=4, separators=(',', ': ')))
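While a job is running, `status_monitor_cb` prints the full job metadata only on the first three callbacks and then on every 15th, printing a dot otherwise. The sketch below isolates that throttling rule so you can see which callback counts produce full output. `should_print_meta` is a hypothetical name used only for this illustration.

```python
def should_print_meta(count: int, first_n: int = 3, every: int = 15) -> bool:
    """Mirror the throttling rule in status_monitor_cb: full output for the
    first `first_n` callbacks, then once every `every` callbacks."""
    return count < first_n or count % every == 0


# Callback counts (0-based) that print the full job metadata.
printed = [c for c in range(31) if should_print_meta(c)]
print(printed)
```

Changing `every` is the simplest way to make the output more or less chatty for long-running jobs.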


## Hello Scatter and Gather

The example job in `hello-world/hello-numpy-sag/jobs/hello-numpy-sag` demonstrates the scatter and gather workflow. See the [documentation](https://nvflare.readthedocs.io/en/main/examples/hello_scatter_and_gather.html#hello-scatter-and-gather) for details of the example.

### 1. Submit job using FLARE API

Start a FLARE API session and submit the `hello-numpy-sag` job.

In [None]:
import os
from nvflare.fuel.flare_api.flare_api import new_insecure_session

poc_workspace = "/tmp/nvflare/poc"
admin_dir = os.path.join(poc_workspace, "admin")
sess = new_insecure_session(admin_dir)

job_folder = os.path.join(os.getcwd(), "hello-numpy-sag/jobs/hello-numpy-sag")
job_id = sess.submit_job(job_folder)

print(f"Job is running with ID {job_id}")
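The job folder is built relative to the current working directory, so submitting from a different directory fails. A quick local check before calling `submit_job` can give a clearer error. This is a sketch, assuming that a job folder contains a `meta.json` file at its top level; `check_job_folder` is a hypothetical helper, and the demonstration uses a throwaway folder rather than a real job.

```python
import os
import tempfile


def check_job_folder(job_folder: str) -> str:
    """Return the absolute job folder path, or raise if it does not look like a job.

    Assumption: a job folder has a `meta.json` file at its top level.
    """
    job_folder = os.path.abspath(job_folder)
    if not os.path.isdir(job_folder):
        raise FileNotFoundError(f"job folder not found: {job_folder}")
    if not os.path.isfile(os.path.join(job_folder, "meta.json")):
        raise FileNotFoundError(f"no meta.json in {job_folder}")
    return job_folder


# Demonstrate on a temporary folder instead of a real job.
with tempfile.TemporaryDirectory() as tmp:
    demo_job = os.path.join(tmp, "demo-job")
    os.makedirs(demo_job)
    with open(os.path.join(demo_job, "meta.json"), "w") as f:
        f.write("{}")
    print(check_job_folder(demo_job) == os.path.abspath(demo_job))
```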

### 2. Wait for the job

`monitor_job()` waits until the job is done. Before calling it, you can list all jobs and inspect this job's metadata.

In [None]:
list_jobs_output_detailed = sess.list_jobs(detailed=True)
format_json(list_jobs_output_detailed)

In [None]:
sess.get_job_meta(job_id)

In [None]:
sess.monitor_job(job_id, cb=status_monitor_cb, cb_run_counter={"count":0})

### 3. Get the result


In [None]:
import numpy as np
result = sess.download_job_result(job_id)
array = np.load(result + "/workspace/models/server.npy")
print(array)

#### Cleanup result directory

In [None]:
rm -r {result}

## Hello Cross-Site Validation

The example job in `hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val` demonstrates how to perform cross-site validation after training.

Please refer to the [documentation](https://nvflare.readthedocs.io/en/main/examples/hello_cross_val.html) for the details.

### 1. Submit job using FLARE API

Start a FLARE API session and submit the `hello-numpy-cross-val` job.

In [None]:
import os
from nvflare.fuel.flare_api.flare_api import new_insecure_session

poc_workspace = "/tmp/nvflare/poc"
admin_dir = os.path.join(poc_workspace, "admin")
sess = new_insecure_session(admin_dir)

job_folder = os.path.join(os.getcwd(), "hello-numpy-cross-val/jobs/hello-numpy-cross-val")
job_id = sess.submit_job(job_folder)

print(f"Job is running with ID {job_id}")

### 2. Wait for the job

In [None]:
sess.get_job_meta(job_id)

In [None]:
sess.monitor_job(job_id, cb=status_monitor_cb, cb_run_counter={"count":0})

### 3. Get the result

In [None]:
import json
import pprint

result = sess.download_job_result(job_id)
with open(result + "/workspace/cross_site_val/cross_val_results.json", "r") as f:
    cross_val_result = json.load(f)

pp = pprint.PrettyPrinter(indent=2)
pp.pprint(cross_val_result)
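Nested results are easier to compare when flattened into rows. The sketch below assumes the loaded JSON is a three-level mapping of the form `{site: {model: {metric: value}}}`. That layout, the names, and the values in `sample` are all hypothetical, so check the actual file before relying on it.

```python
from typing import Dict, List, Tuple

# Hypothetical sample; the real file's layout and values may differ.
sample = {
    "site-1": {"site-1": {"accuracy": 0.5}, "site-2": {"accuracy": 0.25}},
    "site-2": {"site-1": {"accuracy": 0.75}, "site-2": {"accuracy": 1.0}},
}


def flatten_results(
    results: Dict[str, Dict[str, Dict[str, float]]]
) -> List[Tuple[str, str, str, float]]:
    """Turn {outer: {inner: {metric: value}}} into sorted (outer, inner, metric, value) rows."""
    rows = []
    for outer in sorted(results):
        for inner in sorted(results[outer]):
            for metric in sorted(results[outer][inner]):
                rows.append((outer, inner, metric, results[outer][inner][metric]))
    return rows


for outer, inner, metric, value in flatten_results(sample):
    print(f"{outer:<8} {inner:<8} {metric:<10} {value}")
```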

#### Cleanup result directory

In [None]:
rm -r {result}

## Hello Cyclic Weight Transfer

This example uses the CyclicController workflow to implement [Cyclic Weight Transfer](https://pubmed.ncbi.nlm.nih.gov/29617797/) with TensorFlow as the deep learning training framework. The job is `hello-world/hello-cyclic/jobs/hello-cyclic`.

To use this example, TensorFlow must be installed using the `requirements.txt`:

    pip install -r hello-world/hello-cyclic/requirements.txt
    
This example needs access to the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).


In [None]:
! pwd

In [None]:
! pip install -r ../hello-world/hello-cyclic/requirements.txt    


### 1. Submit job using FLARE API

Start a FLARE API session and submit the `hello-cyclic` job.

In [None]:
import os
from nvflare.fuel.flare_api.flare_api import new_insecure_session

poc_workspace = "/tmp/nvflare/poc"
admin_dir = os.path.join(poc_workspace, "admin")
sess = new_insecure_session(admin_dir)

job_folder = os.path.join(os.getcwd(), "hello-cyclic/jobs/hello-cyclic")
job_id = sess.submit_job(job_folder)
print(f"Job is running with ID {job_id}")

### 2. Wait for the job

In [None]:
sess.monitor_job(job_id)

### 3. Get the result

In [None]:
from nvflare.fuel.utils import fobs
from nvflare.app_common.decomposers import common_decomposers
import pprint

# This example stores NumPy arrays in FOBS format. Decomposers for NumPy are not registered automatically.
common_decomposers.register()

result = sess.download_job_result(job_id)
with open(result + "/workspace/app_server/tf2weights.fobs", "rb") as f:
    data = f.read()  # avoid shadowing the built-in `bytes`

weights = fobs.loads(data)

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(weights)
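Pretty-printing the full weights can produce a lot of output. If you only want an overview, a summary of shapes is often enough. This is a sketch that assumes the loaded object is a dict mapping names to array-like values with a `shape` attribute. `summarize_weights`, `FakeArray`, and the layer names are hypothetical stand-ins so that the example runs without NumPy.

```python
from typing import Any, Dict


class FakeArray:
    """Stand-in for a NumPy array so this sketch runs without NumPy."""

    def __init__(self, shape):
        self.shape = shape


def summarize_weights(weights: Dict[str, Any]) -> Dict[str, Any]:
    """Map each entry name to its shape if it has one, else to its type name."""
    return {
        name: getattr(value, "shape", type(value).__name__)
        for name, value in weights.items()
    }


# Hypothetical layer names and shapes, for illustration only.
demo = {"dense/kernel": FakeArray((784, 128)), "dense/bias": FakeArray((128,)), "step": 3}
for name, shape in summarize_weights(demo).items():
    print(name, shape)
```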

#### Cleanup result directory

In [None]:
rm -r {result}

## Hello PyTorch

This example demonstrates how to use NVFlare with the popular deep learning framework PyTorch. The job is `hello-world/hello-pt/jobs/hello-pt`.

Refer to the [documentation](https://nvflare.readthedocs.io/en/main/examples/hello_pt.html) for details.

To use this example, PyTorch must be installed using the `requirements.txt`:

    pip install -r hello-world/hello-pt/requirements.txt
    
This example also needs access to the CIFAR10 dataset.


In [None]:
! pip install -r ../hello-world/hello-pt/requirements.txt    

### 1. Submit job using FLARE API

Start a FLARE API session and submit the `hello-pt` job.

In [None]:
import os
from nvflare.fuel.flare_api.flare_api import new_insecure_session

poc_workspace = "/tmp/nvflare/poc"
admin_dir = os.path.join(poc_workspace, "admin")
sess = new_insecure_session(admin_dir)

job_folder = os.path.join(os.getcwd(), "hello-pt/jobs/hello-pt")
job_id = sess.submit_job(job_folder)

print(f"Job is running with ID {job_id}")

### 2. Wait for the job

In [None]:
sess.monitor_job(job_id)

### 3. Get the result

In [None]:
import os
import pprint
import torch

print("this will take a bit of time")
result = sess.download_job_result(job_id)
model_path = os.path.join(result, "workspace/app_server/FL_global_model.pt")

model = torch.load(model_path)

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(model)

#### Cleanup result directory

In [None]:
rm -r {result}

## Hello TensorFlow 2

This example demonstrates how to use NVFlare with the popular deep learning framework TensorFlow 2. The job is `examples/hello-world/hello-tf2/jobs/hello-tf2`.

Refer to the [documentation](https://nvflare.readthedocs.io/en/main/examples/hello_tf2.html) for details.

To use this example, TensorFlow must be installed using the `requirements.txt`:

    pip install -r hello-world/hello-tf2/requirements.txt
    
This example also needs access to the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).

In [8]:
! pip install -r hello-tf2/requirements.txt

You should consider upgrading via the '/home/chester/nvflare_example/bin/python3 -m pip install --upgrade pip' command.

### 1. Submit job using FLARE API

Start a FLARE API session and submit the `hello-tf2` job.

This time, we tail the server log.

In [9]:
import os
from nvflare.fuel.flare_api.flare_api import new_insecure_session

poc_workspace = "/tmp/nvflare/poc"
admin_dir = os.path.join(poc_workspace, "admin")
sess = new_insecure_session(admin_dir)

job_folder = os.path.join(os.getcwd(), "hello-tf2/jobs/hello-tf2")
job_id = sess.submit_job(job_folder)                          
print(f"Job is running with ID {job_id}")

Job is running with ID 58b34e97-4158-4a78-b419-9565410828dd


In [None]:
! tail -f /tmp/nvflare/poc/server/log.txt

2023-03-21 21:12:36,661 - ClientManager - INFO - Client: New client site-2@10.2.92.54 joined. Sent token: c53d3d59-0508-418b-8f2e-7cc031d54179.  Total clients: 2
2023-03-21 21:12:40,234 - JobRunner - INFO - [identity=example_project, run=?]: Got the job: b7b9bbe0-f94e-4321-a6a5-98177803962a from the scheduler to run
2023-03-21 21:12:40,237 - JobRunner - INFO - [identity=example_project, run=?]: Application app deployed to the server for job: b7b9bbe0-f94e-4321-a6a5-98177803962a
2023-03-21 21:12:40,237 - JobRunner - INFO - [identity=example_project, run=?]: App app to be deployed to the clients: site-1,site-2 for run: b7b9bbe0-f94e-4321-a6a5-98177803962a
2023-03-21 21:12:40,305 - JobRunner - INFO - [identity=example_project, run=?]: Started run: b7b9bbe0-f94e-4321-a6a5-98177803962a for clients: site-1,site-2
2023-03-21 21:12:51,215 - JobRunner - INFO - [identity=example_project, run=?]: Try to abort run (b7b9bbe0-f94e-4321-a6a5-98177803962a) on clients.
2023-03-21 21:13:06,348 - JobRunn

### 2. Wait for the job

In [None]:
sess.monitor_job(job_id)

In [None]:
! ls -al /tmp/nvflare/poc

### 3. Get the result

In [None]:
from nvflare.fuel.utils import fobs
from nvflare.app_common.decomposers import common_decomposers
import pprint

common_decomposers.register()
result = sess.download_job_result(job_id)
print(result)
with open(result + "/workspace/app_server/tf2weights.fobs", "rb") as f:
    data = f.read()  # avoid shadowing the built-in `bytes`

weights = fobs.loads(data)

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(weights)

In [None]:
ls -l {result}/workspace/app_server/


#### Cleanup result directory

In [None]:
rm -r {result}

## Cleanup
We need to shut down the NVFLARE system and clean up the POC workspace.


In [None]:
! nvflare poc --stop
! sleep 3

In [None]:
!ps -eaf | grep nvflare


In [None]:
! nvflare poc --clean


In [None]:
# Cleanup NVFLARE storages
import os
import shutil
storage_paths = ["/tmp/nvflare/jobs-storage", "/tmp/nvflare/snapshot-storage"]
for p in storage_paths:
    if os.path.isdir(p): 
        print(f"removing {p}")
        shutil.rmtree(p)
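Because `shutil.rmtree` deletes without confirmation, it can help to preview what will be removed first. The following sketch adds a dry-run switch to the same loop. `remove_dirs` is a hypothetical helper, and the demonstration runs on temporary directories rather than the real storage paths.

```python
import os
import shutil
import tempfile
from typing import Iterable, List


def remove_dirs(paths: Iterable[str], dry_run: bool = True) -> List[str]:
    """Return the directories among `paths` that exist; delete them unless `dry_run`."""
    found = [p for p in paths if os.path.isdir(p)]
    if not dry_run:
        for p in found:
            shutil.rmtree(p)
    return found


# Demonstrate on throwaway directories instead of the real storage paths.
base = tempfile.mkdtemp()
jobs_dir = os.path.join(base, "jobs-storage")
os.makedirs(jobs_dir)
candidates = [jobs_dir, os.path.join(base, "snapshot-storage")]

print(remove_dirs(candidates, dry_run=True))   # lists jobs_dir, deletes nothing
print(remove_dirs(candidates, dry_run=False))  # deletes jobs_dir
shutil.rmtree(base)
```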

