# Working with MONAI Auto3DSeg using MONAI Cloud API
This guide walks through training a 3D segmentation model with MONAI Auto3DSeg and running inference with it on NVIDIA DGX Cloud, using the MONAI Cloud API.

## Table of Contents

- Introduction
- Setup
- Datasets Creation
- Auto3DSeg Experiment Creation
- Monitoring Job Status
- AutoML Generated Model Inference
- Cleaning Up
- Conclusion

## Introduction

Auto3DSeg is a MONAI-native project that demonstrates optimal 3D segmentation workflows for various algorithms. It lets non-experts train models on 3D CT or MRI data with just a few lines of code. For experts, it offers a compilation of best practices for segmentation training using MONAI components. Both groups can reach, and then customize, state-of-the-art baseline segmentation performance.

A key focus of Auto3DSeg is on computational efficiency, aiming to minimize training and inference times while maximizing GPU compute utilization. Leveraging the MONAI Cloud API enhances this efficiency, streamlining data management and model training. Integrated with NVIDIA DGX Cloud, it provides scalable computational resources, ideal for handling large medical imaging datasets and complex training scenarios. This combination accelerates the development of advanced medical imaging solutions.

If you haven't already generated your key or if you're unsure about the process, follow our step-by-step guide for [Generating and Managing Your Credentials](./Generating%20and%20Managing%20Your%20Credentials.ipynb).


## Setup

In [None]:
import json
import os
import requests
import time

In [None]:
# API Endpoint and Credentials
host_url = "<MONAI Cloud API URL>"
ngc_api_key = "<NGC API Key>"

# Object Storage URL and Credentials
train_data_url = "<container url for the training dataset>"      # Training dataset container URL
inference_data_url = "<container url for the inference dataset>" # Inference dataset container URL
storage_client_id = "<object storage ID>"                        # Object storage username/id
storage_client_secret = "<object storage secret>"                # Object storage password/secret
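If you prefer not to keep secrets in the notebook itself, the same settings can be read from environment variables. The following is a minimal sketch of that idea; the function name `load_settings` and every environment variable name in it are hypothetical and should be renamed to match your own setup.

```python
import os


def load_settings(env=os.environ):
    """Read API and storage settings from environment variables.

    The variable names below are hypothetical placeholders, not names
    required by the MONAI Cloud API.
    """
    names = {
        "host_url": "MONAI_CLOUD_HOST_URL",
        "ngc_api_key": "NGC_API_KEY",
        "train_data_url": "TRAIN_DATA_URL",
        "inference_data_url": "INFERENCE_DATA_URL",
        "storage_client_id": "STORAGE_CLIENT_ID",
        "storage_client_secret": "STORAGE_CLIENT_SECRET",
    }
    # Collect every missing or empty variable so one error reports them all.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in names.items()}
```

With the variables exported in your shell, `settings = load_settings()` returns a dictionary whose keys match the variable names used in this notebook, for example `settings["host_url"]`.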

#### Log in to NGC and set up the API

In [None]:
# NGC Login
api_url = f"{host_url}/api/v1"
response = requests.post(f"{api_url}/login", data=json.dumps({"ngc_api_key": ngc_api_key}))
assert response.status_code == 201, f"Login failed, got status code: {response.status_code}."
assert "user_id" in response.json().keys(), "user_id is not in response."
assert "token" in response.json().keys(), "token is not in response."

uid = response.json()["user_id"]
token = response.json()["token"]

# Construct the URL and Headers
base_url = f"{api_url}/users/{uid}"
headers = {"Authorization": f"Bearer {token}"}
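The cells in this guide check responses with `assert` and put `response.json()` in the failure message. That pattern has two weak spots: `assert` statements are skipped when Python runs with `-O`, and `response.json()` itself raises if the server returns a non-JSON body. A small helper can avoid both. This is only a sketch, and `expect_status` is a hypothetical name, not part of the MONAI Cloud API or of `requests`; it relies only on the `status_code`, `text`, and `json()` members of a `requests.Response`.

```python
def expect_status(response, expected, action):
    """Return the parsed JSON body, or raise with the raw body on a mismatch.

    `response` is expected to behave like a requests.Response.
    """
    if response.status_code != expected:
        # Use the raw text so the error is readable even for non-JSON bodies.
        raise RuntimeError(
            f"{action} failed: HTTP {response.status_code}, body: {response.text[:500]}"
        )
    return response.json()
```

For example, `res = expect_status(response, 201, "Create dataset")` could stand in for an `assert` followed by `response.json()`.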

## Datasets Creation

### **1. Remote Object as Data Sources**

The MONAI Cloud platform supports several cloud storage services, including Azure Blob Storage, Google Cloud Storage (GCS), and Amazon S3, so you can choose the one that best fits your project's needs. The example below uses Azure:

**Steps:**
1. Creating a Storage Account and Container
   - **Storage Account**: Start by creating a new storage account in your Azure portal. This account will host your blob storage containers.
   - **Container Creation**: Within your storage account, create a new container. This container will hold your datasets.

2. Container URL
   - Once the container is created, it has a unique URL. You will pass this URL as the dataset's `client_url` when registering the dataset.

#### Obtaining Credentials

- **Access Keys**: Access your storage account and navigate to the 'Access keys' section. Here, you will find the necessary credentials to access your Blob Storage programmatically.
- **Shared Access Signature (SAS)**: Alternatively, you can create a SAS for more granular control over permissions and access duration.

#### Creating a Manifest JSON File

In the root of your Azure container, create a manifest JSON file to keep track of your datasets. The file format is as follows:

```json
{
    "root_path": "https://[your-storage-account-name].blob.core.windows.net/[your-container-name]",
    "data": [
        {
            "image": {
                "path": ["path/to/your/image_1"],
                "id": "unique-uuid-1"
            },
            "label": {
                "path": ["path/to/your/label_1"],
                "id": "unique-uuid-2"
            }
        }
    ]
}
```

Additional entries in `data` follow the same format.
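For more than a handful of cases, the manifest can be generated rather than written by hand. The sketch below builds the structure shown above from a list of image/label path pairs. The function name `build_manifest` is hypothetical, and generating the `id` values with `uuid.uuid4()` is an assumption based on the `unique-uuid-*` placeholders in the example.

```python
import json
import uuid


def build_manifest(root_path, pairs):
    """Build a manifest dict from (image_path, label_path) pairs.

    Paths are relative to root_path. Each id is a freshly generated UUID4
    string (an assumption; the example above only shows placeholders).
    """
    data = []
    for image_path, label_path in pairs:
        data.append(
            {
                "image": {"path": [image_path], "id": str(uuid.uuid4())},
                "label": {"path": [label_path], "id": str(uuid.uuid4())},
            }
        )
    return {"root_path": root_path, "data": data}


manifest = build_manifest(
    "https://[your-storage-account-name].blob.core.windows.net/[your-container-name]",
    [("path/to/your/image_1", "path/to/your/label_1")],
)
print(json.dumps(manifest, indent=4))
```

The printed JSON can be saved as `manifest.json` and uploaded to the root of the container.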

- Each dataset (training, testing, etc.) should have its own root directory
- All the data of a dataset should be under its root directory
- The root directory should contain a `manifest.json` file
- The `manifest.json` file should contain a "data" field, which is a list of all the data entries
- Each data entry should contain "image" and "label" fields
- Each "image"/"label" field should contain a "path" field, which is the list of relative paths to the image/label files
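The rules above can be checked locally before uploading. The following sketch covers only the rules listed here; it is not the validation the service performs, and `validate_manifest` is a hypothetical helper name.

```python
def validate_manifest(manifest):
    """Return a list of problems found; an empty list means the checks passed."""
    data = manifest.get("data")
    if not isinstance(data, list):
        return ['"data" must be a list of entries']
    problems = []
    for i, entry in enumerate(data):
        for field in ("image", "label"):
            item = entry.get(field) if isinstance(entry, dict) else None
            if not isinstance(item, dict):
                problems.append(f'entry {i}: missing "{field}" field')
                continue
            paths = item.get("path")
            if not isinstance(paths, list) or not paths:
                problems.append(f'entry {i}: "{field}" needs a non-empty "path" list')
            elif any(not isinstance(p, str) or "://" in p or p.startswith("/") for p in paths):
                # Paths are meant to be relative to the dataset root.
                problems.append(f'entry {i}: "{field}" paths must be relative strings')
    return problems


example = {"data": [{"image": {"path": ["images/case_1.nii.gz"]}, "label": {"path": []}}]}
print(validate_manifest(example))
```

Returning a list of messages instead of raising on the first problem makes it easier to fix a large manifest in one pass.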


After preparing your dataset, please modify the following variables in [Setup](#Setup):

```python
train_data_url = ...
inference_data_url = ...
storage_client_id = ...
storage_client_secret = ...
```

### **2. Creating the training dataset**

In [None]:
dataset_api = f"{base_url}/datasets"
data = {
    "name": "train_sim_data_azure",
    "description": "Simulated dataset for training Auto3DSeg on Azure",
    "type": "semantic_segmentation",
    "format": "monai",
    "client_url": train_data_url,
    "client_id": storage_client_id,
    "client_secret": storage_client_secret,
}
response = requests.post(dataset_api, json=data, headers=headers)
assert response.status_code == 201, f"Create dataset failed, got {response.json()}."
res = response.json()
train_dataset_id = res["id"]
print("Train dataset created with dataset ID:", train_dataset_id)
print("----------------------------------------------------------------------------")
print(json.dumps(res, indent=2))
print("----------------------------------------------------------------------------")

### **3. Creating the inference dataset**

In [None]:
dataset_api = f"{base_url}/datasets"
data = {
    "name": "test_sim_data_azure",
    "description": "Simulated dataset for evaluating Auto3DSeg on Azure",
    "type": "semantic_segmentation",
    "format": "monai",
    "client_url": inference_data_url,
    "client_id": storage_client_id,
    "client_secret": storage_client_secret,
}
response = requests.post(dataset_api, json=data, headers=headers)
assert response.status_code == 201, f"Create dataset failed, got {response.json()}."
res = response.json()
infer_dataset_id = res["id"]
print("Inference dataset created with dataset ID:", infer_dataset_id)
print("-------------------------------------------------------------------------------")
print(json.dumps(res, indent=2))
print("-------------------------------------------------------------------------------")

## Auto3DSeg Experiment Creation

To start the Auto3DSeg pipeline, create an experiment and run the **auto3dseg** action on it. The pipeline automatically sets up four distinct neural networks and trains them across multiple folds to reach state-of-the-art performance in segmentation tasks. The module is highly configurable, yet it requires only minimal user input.

Incorporating MONAI Cloud API into this workflow further enhances the user experience. The API facilitates seamless integration and management of data, models, and computational resources within a unified interface. This integration not only simplifies the process but also ensures efficient use of computational resources, particularly when running complex and resource-intensive tasks.

**Minimal Inputs**

With minimal input, users get these capabilities without complex configuration, which makes the Auto3DSeg pipeline accessible to a wide range of users, from beginners to experts in medical imaging.

### **1. Find the base experiment for Auto3DSeg**

In [None]:
endpoint = f"{base_url}/experiments"
response = requests.get(endpoint, headers=headers)
assert response.status_code == 200, f"List experiment failed, got {response.json()}."
res = response.json()
automl_base_exps = [p for p in res if p["network_arch"] == "monai_automl" and p["name"] == "MONAI Auto3dSeg"]
assert len(automl_base_exps) > 0, "No base experiment found for Auto3DSeg Experiment"
print("List of available base experiments for Auto3DSeg:")
print({p["id"]: {"name": p["name"], "version": p["version"]} for p in automl_base_exps})
base_experiment = sorted(automl_base_exps, key=lambda x: x["version"])[-1]  # take the latest version
base_experiment_id = base_experiment["id"]
print("-----------------------------------------------------------------------------------------")
print(f"Base experiment ID for '{base_experiment['name']}' v{base_experiment['version']}: {base_experiment_id}")
print("-----------------------------------------------------------------------------------------")
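The cell above picks the latest base experiment by sorting on the raw `version` value. If that value is a dotted string, plain string comparison would rank `"1.9.0"` above `"1.10.0"`. The sketch below compares versions numerically instead. It assumes `version` is a dotted numeric string, which this guide does not confirm, and `version_key` is a hypothetical helper name.

```python
def version_key(version):
    """Turn a dotted version string such as "1.10.0" into a comparable tuple."""
    parts = []
    for piece in str(version).split("."):
        # Keep only the digits of each piece; non-numeric pieces count as 0.
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


candidates = [{"id": "a", "version": "1.9.0"}, {"id": "b", "version": "1.10.0"}]
latest = max(candidates, key=lambda exp: version_key(exp["version"]))
print(latest["id"])  # prints: b
```

Under that assumption, the same key could be passed to `sorted` in the cell above in place of the raw `version` field.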

### **2. Create MONAI AutoML Experiment**

In [None]:
data = {
    "name": "automl_experiment",
    "description": "MONAI AutoML Experiment for Segmentation",
    "type": "medical",
    "base_experiment": [base_experiment_id],
    "network_arch": "monai_automl",
    "train_datasets": [train_dataset_id],
    "inference_dataset": infer_dataset_id,
    "realtime_infer": False,
}

endpoint = f"{base_url}/experiments"
response = requests.post(endpoint, json=data, headers=headers)
assert response.status_code == 201, f"Experiment creation failed, got {response.json()}."
res = response.json()
automl_experiment_id = res["id"]
print("Experiment creation succeeded with experiment ID:", automl_experiment_id)
print("--------------------------------------------------------------------------------------")
print(json.dumps(res, indent=2))
print("--------------------------------------------------------------------------------------")

### **3. Run Auto3DSeg Action**

In [None]:
data = {
    "action": "auto3dseg",
    "specs": {
        "num_gpu": 2,
        "output_experiment_name": "Auto3DSegGenModel",
        "output_experiment_description": "AutoML generated segmentation experiment using MONAI Auto3DSeg",
        "modality": "MRI",
        "num_fold": 1,
        "train_params": {
            "num_epochs_per_validation": 1,
            "num_images_per_batch": 2,
            "num_epochs": 1,
            "num_warmup_epochs": 1,
            "use_pretrain": False,  # can modify to True to use pretrained weights
        },
    },
}

endpoint = f"{base_url}/experiments/{automl_experiment_id}/jobs"
response = requests.post(endpoint, json=data, headers=headers)
assert response.status_code == 201, f"Create job failed, got {response.json()}."
automl_job_id = response.json()
print("Job creation succeeded with job ID:", automl_job_id)

**Experiment Management**

Users can track Auto3DSeg experiments by adding an MLflow tracking server URL to the payload data:

```python
data = {
    # ... other payload fields ...
    "mlflow_tracking_uri": "<MLflow tracking server URL>",
}
```

## Monitoring Job Status

After submitting a job, poll its status to follow its progress and catch failures early.

**Basic Status Overview**: The API reports whether a job is in a pending, running, done, or error state. This is enough to assess overall progress and to detect issues that need attention.

Status interpretation:
- "Pending": MONAI Cloud is looking for resources and preparing the datasets. This can take quite a while, depending on the size of the dataset.
- "Running": MONAI Cloud has submitted the job to the DGX.
- "Done": The job is complete.
- "Error": The job failed. Download the job as a `.tar.gz` archive and inspect the detailed log.
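The status list above can be written as a small lookup that maps a status to the suggested next step. This is an illustrative sketch only: `NEXT_STEP` and `next_step` are hypothetical names, and normalizing the letter case with `.title()` mirrors what the helper cell below does.

```python
NEXT_STEP = {
    "Pending": "wait: resources and datasets are being prepared",
    "Running": "wait: the job has been submitted to the DGX",
    "Done": "download the results",
    "Error": "download the job archive and inspect the log",
}


def next_step(raw_status):
    """Map a status string to a suggested action, ignoring letter case."""
    status = str(raw_status).strip().title()
    return NEXT_STEP.get(status, f"unrecognized status: {raw_status!r}")


print(next_step("running"))
```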

In [None]:
# Helper functions for running jobs
def wait_for_job(endpoint, headers, timeout):
    start_time = time.time()
    response = requests.get(endpoint, headers=headers)
    assert response.status_code == 200, f"Failed to get job status, got {response.json()}."
    status = response.json()["status"].title()
    print("Waiting for job to complete...")
    print(status, end="", flush=True)
    while True:
        if status not in ["Pending", "Running"]:
            assert status == "Done", f"Job failed with status: {status}"
            break
        time.sleep(5)
        response = requests.get(endpoint, headers=headers)
        assert response.status_code == 200, f"Failed to get job status, got {response.json()}."
        status_new = response.json()["status"].title()
        if status_new != status:
            status = status_new
            print(f"\n{status}", end="", flush=True)
        else:
            print(".", end="", flush=True)
        if time.time() - start_time > timeout:
            print(f"Job timeout after {timeout} seconds.")
            break
    print(f"\nJob status: {status}")

endpoint = f"{base_url}/experiments/{automl_experiment_id}/jobs/{automl_job_id}"
response = requests.get(endpoint, headers=headers)

assert response.status_code == 200, f"Failed to get job status, got {response.json()}."
for k, v in response.json().items():
    if k != "result":
        print(f"{k}: {v}")
    else:
        print("result:")
        for k1, v1 in v.items():
            print(f"    {k1}: {v1}")

print("------------------------------------------------------------------------")
wait_for_job(endpoint, headers, 600)
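`wait_for_job` prints a message when it times out, so the calling code cannot tell a timeout from a finished job. A variant that returns the final status and raises on timeout is sketched below. The status source is passed in as a function so the loop can be exercised without a live API; `poll_job` and its parameters are hypothetical names, not part of the MONAI Cloud API.

```python
import time


def poll_job(get_status, timeout, interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the job leaves the Pending/Running states.

    Returns the final status string, or raises TimeoutError if `timeout`
    seconds pass while the job is still pending or running.
    """
    deadline = clock() + timeout
    while True:
        status = str(get_status()).title()
        if status not in ("Pending", "Running"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still {status} after {timeout} seconds")
        sleep(interval)


# Offline demonstration with a scripted sequence of statuses.
scripted = iter(["pending", "running", "done"])
print(poll_job(lambda: next(scripted), timeout=60, sleep=lambda seconds: None))  # prints: Done
```

Against the API, `get_status` could be `lambda: requests.get(endpoint, headers=headers).json()["status"]`, using the `endpoint` and `headers` defined in the cell above.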

## AutoML Generated Model Inference

Users can easily deploy models trained via the Auto3DSeg pipeline for inference on their test datasets. This process involves selecting an AutoML-optimized model, tailored for high accuracy and efficiency in medical imaging tasks. The trained model is then applied to the test dataset, allowing users to evaluate its performance in real-world scenarios. This seamless integration from training to inference exemplifies the practical utility of Auto3DSeg in streamlining complex medical imaging analyses.


### **1. List the experiments and select the first generated Auto3DSeg experiment**

In [None]:
endpoint = f"{base_url}/experiments"
params = {"user_only": True, "network_arch": "monai_segmentation"}
# you can use the assigned "output_experiment_name" in the previous steps to filter the experiments
# params = {"user_only": True, "network_arch": "monai_segmentation", "name": "Auto3DSegGenModel"}
response = requests.get(endpoint, params=params, headers=headers)
assert response.status_code == 200, f"List experiment failed, got {response.json()}."
experiments = response.json()
assert len(experiments) > 0, "No experiments found!"
selected = "x"
for m in experiments:
    print(f'- {selected} {m["name"]:25} : {m["id"]} ({m["created_on"]})')
    selected = " "
experiment_id = experiments[0]["id"]

### **2. [Optional] List more information about the selected experiments**

In [None]:
endpoint = f"{base_url}/experiments/{experiment_id}"
response = requests.get(endpoint, headers=headers)

assert response.status_code == 200, f"Failed to get experiment info, got {response.json()}."
for k, v in response.json().items():
    if k != "result":
        print(f"{k}: {v}")
    else:
        print("result:")
        for k1, v1 in v.items():
            print(f"    {k1}: {v1}")

### **3. Run Inference**

With the model and the inference dataset ready, build the payload and submit an inference request as below:

In [None]:
data = {
    "action": "inference",
    "specs": {
        "inference_dataset": infer_dataset_id,
        "num_gpu": 2,
    },
}
endpoint = f"{base_url}/experiments/{experiment_id}/jobs"
response = requests.post(endpoint, json=data, headers=headers)

assert response.status_code == 201, f"Create job failed, got {response.json()}."
infer_job_id = response.json()
print("Job creation succeeded with job ID:", infer_job_id)

### **4. Check on the Inference Job**

After the job is submitted, users can continue to use the APIs to check the status of the inference job:

In [None]:
endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{infer_job_id}"
response = requests.get(endpoint, headers=headers)

assert response.status_code == 200, f"Failed to get job status, got {response.json()}."
for k, v in response.json().items():
    if k != "result":
        print(f"{k}: {v}")
    else:
        print("result:")
        for k1, v1 in v.items():
            print(f"    {k1}: {v1}")

wait_for_job(endpoint, headers, 600)

### **5. Download the result of the Inference Job**

Finally, when the jobs are completed, users can download the result to their local drive and examine the outputs, models, and logs.

In [None]:
endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{infer_job_id}"
response = requests.get(endpoint, headers=headers)
# In order to download the job, the inference process should be finished
if response.json()["status"].title() == "Done":
    endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{infer_job_id}:download"

    # Download the results
    with requests.get(endpoint, headers=headers, stream=True) as r:
        r.raise_for_status()
        print("Downloading job results...")
        output_file = f"{infer_job_id}.tar.gz"
        with open(output_file, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
        print(f"Inference results are downloaded at {output_file}")

    assert os.path.exists(output_file), "Download failed, archive has not been created."
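Once the archive is on disk, its contents can be listed and individual text files read without unpacking everything. The sketch below uses only the standard-library `tarfile` module. The helper names are hypothetical, and the layout of the archive (which files it contains, and where the logs are) is not described in this guide, so inspect the listing first.

```python
import tarfile


def list_archive(archive_path):
    """Return the names of the regular files stored in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return [member.name for member in tar.getmembers() if member.isfile()]


def read_text_member(archive_path, member_name, max_bytes=20000):
    """Return the last max_bytes of one archived file, decoded as text."""
    with tarfile.open(archive_path, "r:gz") as tar:
        handle = tar.extractfile(member_name)
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        data = handle.read()
    # Keep only the tail, which is usually the interesting part of a log.
    return data[-max_bytes:].decode("utf-8", errors="replace")
```

For example, `list_archive(output_file)` shows what was downloaded, and `read_text_member(output_file, name)` prints the tail of any file picked from that listing.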

## Cleaning Up

Delete the experiments and datasets after the jobs are done.

In [None]:
# Cancel the AutoML job and the inference job if they are not Done. This step is required before cleaning up the data.
endpoint = f"{base_url}/experiments/{automl_experiment_id}/jobs/{automl_job_id}"
response = requests.get(endpoint, headers=headers)
if response.json()["status"].title() != "Done":
    endpoint = f"{base_url}/experiments/{automl_experiment_id}/jobs/{automl_job_id}:cancel"
    response = requests.post(endpoint, headers=headers)
    assert response.status_code == 200, f"Cancel job {automl_job_id} failed, got {response.json()}."
    print(response)

In [None]:
endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{infer_job_id}"
response = requests.get(endpoint, headers=headers)
if response.json()["status"].title() != "Done":
    endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{infer_job_id}:cancel"
    response = requests.post(endpoint, headers=headers)
    assert response.status_code == 200, f"Cancel job {infer_job_id} failed, got {response.json()}."
    print(response)

In [None]:
endpoint = f"{base_url}/experiments/{automl_experiment_id}"
response = requests.delete(endpoint, headers=headers)
assert response.status_code == 200, f"Delete automl experiment failed, got {response.json()}."
print(response)

endpoint = f"{base_url}/experiments/{experiment_id}"
response = requests.delete(endpoint, headers=headers)
assert response.status_code == 200, f"Delete inference experiment failed, got {response.json()}."
print(response)

endpoint = f"{base_url}/datasets/{train_dataset_id}"
response = requests.delete(endpoint, headers=headers)
assert response.status_code == 200, f"Delete train dataset failed, got {response.json()}."
print(response)

endpoint = f"{base_url}/datasets/{infer_dataset_id}"
response = requests.delete(endpoint, headers=headers)
assert response.status_code == 200, f"Delete inference dataset failed, got {response.json()}."
print(response)

## Conclusion

This guide covered the full Auto3DSeg workflow through the MONAI Cloud API on NVIDIA DGX Cloud: registering datasets, creating an experiment, running the auto3dseg action, monitoring jobs, running inference with the generated model, and cleaning up. Together, these pieces simplify 3D segmentation and make AutoML-based medical imaging analysis accessible to both beginners and experts, with a smooth path from model training to inference.