# Notebook #4: Model Training on Federated Data

#### Import the Rhino Health Python library & Authenticate to the Rhino Cloud
We again import the functions we need from the `rhino_health` library and authenticate to the Rhino Cloud. Please refer to Notebook #1 for an explanation of the `session` interface for interacting with the various endpoints in the Rhino Health ecosystem. You can always find more information about the Rhino SDK in our <a target="_blank" href="https://rhinohealth.github.io/rhino_sdk_docs/html/autoapi/index.html">Official SDK Documentation</a> and on our <a target="_blank" href="https://pypi.org/project/rhino-health/">PyPI Repository Page</a>.

In [None]:
import getpass
import rhino_health as rh
from rhino_health.lib.endpoints.code_object.code_object_dataclass import (
    CodeObjectCreateInput,
    CodeTypes,  # referenced below as CodeTypes.GENERALIZED_COMPUTE and CodeTypes.NVIDIA_FLARE_V2_2
    CodeObjectRunInput,
    CodeRunMultiDatasetInput,
    ModelTrainInput,
)

my_username = "FCP_LOGIN_EMAIL" # Replace this with the email you use to log into Rhino Health
session = rh.login(username=my_username, password=getpass.getpass())

#### Retrieve Project and Dataset Information
As in the previous notebooks, we start by retrieving the `Project` object, using the same project name as in the rest of this guided sandbox experience. We also retrieve the JPEG datasets that were produced in Notebook #2 so that we can pass their identifiers to the code that trains our AI model.

In [None]:
project = session.project.get_project_by_name("YOUR_PROJECT_NAME")  # Replace with your project name

datasets = project.datasets
hco_cxr_dataset = project.get_datasets_by_name("mimic_cxr_hco_conv")
aidev_cxr_dataset = project.get_datasets_by_name("mimic_cxr_dev_conv")
cxr_datasets = [aidev_cxr_dataset.uid, hco_cxr_dataset.uid]
print(f"Loaded CXR Datasets '{hco_cxr_dataset.uid}', '{aidev_cxr_dataset.uid}'")

#### Create a Code Object to generate distinct training and testing datasets
When training any machine learning algorithm in a supervised fashion, we need to 'hold out' a segment of the data so that we can then use that held-out segment to generate an unbiased estimate of model performance. We can accomplish this using another container image that executes Python code to generate both a training set and a testing set. 
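
The container image's actual splitting logic is not shown in this notebook. Purely as an illustration of the hold-out idea, a split over a list of case identifiers might look like the sketch below. The function name `split_train_test`, the 80/20 ratio, and the fixed seed are all assumptions made for the example, not details of the Rhino container.

```python
import random


def split_train_test(case_ids, test_fraction=0.2, seed=42):
    """Shuffle case IDs reproducibly and hold out a fraction for testing.

    Hypothetical helper: not part of the Rhino SDK or the container image.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(case_ids)              # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded RNG makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    # Everything after the first n_test items is training data.
    return shuffled[n_test:], shuffled[:n_test]


case_ids = [f"case_{i:03d}" for i in range(10)]
train_ids, test_ids = split_train_test(case_ids)
print(len(train_ids), len(test_ids))
```

With imaging data, the split is often made per patient rather than per image, so that images from one patient do not land in both sets; whether the container does this is not stated here.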

In [None]:
# get the container image that'll create the test-train split
train_split_image_uri = "MY CONTAINER IMAGE URI"

# get the schema that was created after JPG conversion
cxr_schema = project.get_data_schema_by_name('mimic_cxr_hco_conv', project_uid=project.uid)
cxr_schema_uid = cxr_schema.uid

# create a code object using the container image
test_train_split = CodeObjectCreateInput(
    name="Train Test Split",
    description="Splitting data into train and test datasets per site",
    input_data_schema_uids=[cxr_schema_uid],
    output_data_schema_uids=[None], # Auto-Generating the Output Data Schema for the Code Object
    code_type=CodeTypes.GENERALIZED_COMPUTE,
    project_uid=project.uid,
    config={"container_image_uri": train_split_image_uri}
)
test_train_compute = session.code_object.create_code_object(test_train_split)
print(f"Got Code Object named '{test_train_compute.name}' with uid {test_train_compute.uid}")

# run the code object to create new datasets at each site
run_params = CodeRunMultiDatasetInput(
    code_object_uid=test_train_compute.uid,
    input_dataset_uids=[aidev_cxr_dataset.uid, hco_cxr_dataset.uid],
    output_dataset_naming_templates=['{{ input_dataset_names.0 }} - Train', '{{ input_dataset_names.0 }} - Test'],
    timeout_seconds=600,
    sync=False,
)

print(f"Starting to run {test_train_compute.name}")
code_run = session.code_object.run_code_object(run_params)
run_result = code_run.wait_for_completion()
print(f"Finished running {test_train_compute.name}")
print(f"Result status is '{run_result.status.value}', errors={run_result.result_info.get('errors') if run_result.result_info else None}")
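
The status-and-errors expression in the last line of the cell above is used again later in this notebook. One way to keep it readable is a small helper, sketched below. It assumes only the two attributes the notebook already reads (`status.value` and an optional `result_info` dictionary); the helper name `summarize_run`, the stand-in object, and the status string `"Completed"` are made up for the example.

```python
from types import SimpleNamespace


def summarize_run(result):
    """Return (status, errors) for a finished run result.

    Relies only on `result.status.value` and `result.result_info`,
    where `result_info` may be None or a dict with an 'errors' key.
    """
    info = result.result_info or {}  # treat a missing result_info as empty
    return result.status.value, info.get("errors")


# Stand-in object for illustration; a real result comes from wait_for_completion().
fake_result = SimpleNamespace(
    status=SimpleNamespace(value="Completed"),  # hypothetical status string
    result_info={"errors": []},
)
status, errors = summarize_run(fake_result)
print(f"Result status is '{status}', errors={errors}")
```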

#### Use NVIDIA's FLARE framework to federate model training 
Rhino's platform includes a seamless integration of NVIDIA's Federated Learning framework (NVFlare), enabling you to train machine learning models collaboratively across distributed health data sources. This framework offers a few key advantages:

1. **Secure Distributed Training**: NVFlare empowers users to conduct Federated Training across a network of healthcare institutions, each contributing their data insights without sharing raw data. This distributed approach ensures that sensitive patient information remains secure behind institutional firewalls.
2. **NVIDIA GPU Acceleration**: NVFlare training jobs can run on NVIDIA GPUs, which shortens model training and optimization on large healthcare datasets.
3. **Versatility Across ML Frameworks**: NVFlare's framework compatibility extends to major machine learning frameworks such as PyTorch and TensorFlow. Adapt your existing machine learning code to NVFlare, ensuring seamless integration into the Federated Learning ecosystem.
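
This notebook does not say which aggregation strategy the training container uses. One widely used approach in federated learning is federated averaging, in which the server combines each site's locally trained weights in proportion to the number of training samples at that site. The pure-Python sketch below illustrates that idea with toy numbers; the function name and data layout are hypothetical, and this is not NVFlare's implementation.

```python
def federated_average(site_updates):
    """Sample-weighted average of per-site weight vectors.

    site_updates: list of (num_samples, weights) pairs, where `weights`
    is a list of floats with the same length at every site.
    """
    total = sum(n for n, _ in site_updates)
    if total <= 0:
        raise ValueError("at least one site must contribute samples")
    length = len(site_updates[0][1])
    averaged = [0.0] * length
    for n, weights in site_updates:
        if len(weights) != length:
            raise ValueError("all sites must send weights of the same shape")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total  # each site counts in proportion to its data
    return averaged


# Two sites: only sample counts and weights are exchanged, never the images.
updates = [(100, [0.2, 0.4]), (300, [0.6, 0.8])]
print(federated_average(updates))  # approximately [0.5, 0.7]
```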



#### Create a code object for model training
In the call below, we pass only the input and output data schemas, because this cell merely *defines* the code object. The actual datasets are not passed until we execute the code object.

In [None]:
# path for container image
model_train_image_uri = "YOUR CONTAINER URI"

# create code object to train the model using our container image
flare_model = CodeObjectCreateInput(
    name="Pneumonia Prediction Model Training",
    description="Pneumonia Prediction Model Training",
    input_data_schema_uids=[cxr_schema_uid],
    output_data_schema_uids=[None], # Auto-Generating the Output Data Schema for the Code Object
    project_uid=project.uid,
    code_type=CodeTypes.NVIDIA_FLARE_V2_2,
    config={"container_image_uri": model_train_image_uri}
)

flare_model = session.code_object.create_code_object(flare_model)
print(f"Got FLARE model '{flare_model.name}' with uid {flare_model.uid}")

#### Run the model training code object
To execute model training, we pass the code object's unique identifier to the function that runs the container image, along with both the training and the testing datasets. The `config_*` and `secrets_*` arguments can be left blank, because here we are not required to supply a federated server configuration or the federated configuration file that normally accompanies an NVFlare implementation.

In [None]:
# retrieve training Dataset
input_training_datasets = session.dataset.search_for_datasets_by_name('Train')
print(['Training Datasets: ' + x.name for x in input_training_datasets])

# retrieve testing Dataset
input_validation_datasets = session.dataset.search_for_datasets_by_name('Test')
print(['Testing Datasets: ' + x.name for x in input_validation_datasets])

run_params = ModelTrainInput(
    code_object_uid=flare_model.uid,
    input_dataset_uids=[x.uid for x in input_training_datasets], 
    simulate_federated_learning=True,
    validation_dataset_uids=[x.uid for x in input_validation_datasets], 
    validation_datasets_inference_suffix=" - Pneumonia training results",
    timeout_seconds=600,
    config_fed_server="",
    config_fed_client="",
    secrets_fed_client="",
    secrets_fed_server="",
    sync=False,
)

print(f"Starting to run federated training of {flare_model.name}")
model_train = session.code_object.train_model(run_params)
train_result = model_train.wait_for_completion()
print(f"Finished running {flare_model.name}")
print(f"Result status is '{train_result.status.value}', errors={train_result.result_info.get('errors') if train_result.result_info else None}")
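
Searching by a short fragment such as `'Train'` or `'Test'` may return more datasets than intended, depending on how the search matches names. One example would be the inference outputs that a previous run named with the `" - Pneumonia training results"` suffix. A defensive sketch is to keep only the names that end with the suffixes produced by the split step. It assumes only that each dataset object exposes a `.name` attribute, as used above; the helper `filter_by_suffix` and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def filter_by_suffix(datasets, suffix):
    """Keep only the datasets whose name ends with the given suffix."""
    return [d for d in datasets if d.name.endswith(suffix)]


# Stand-in objects; in the notebook these would come from the dataset search.
found = [
    SimpleNamespace(name="mimic_cxr_dev_conv - Train"),
    SimpleNamespace(name="mimic_cxr_hco_conv - Train"),
    SimpleNamespace(name="mimic_cxr_hco_conv - Test - Pneumonia training results"),
]
training_only = filter_by_suffix(found, " - Train")
print([d.name for d in training_only])
```

Printing the selected names before launching a run, as the cell above already does, remains a useful check in either case.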