In [None]:
import json
import requests

# API Endpoint and Credentials
host_url = "<MONAI Cloud API URL>"
ngc_api_key = "<NGC API Key>"

# Log in with the NGC API key to obtain the user ID and JWT
data = json.dumps({"ngc_api_key": ngc_api_key})
response = requests.post(f"{host_url}/api/v1/login", data=data)
assert response.status_code in (200, 201)
login = response.json()  # parse the response body once and reuse it
assert "user_id" in login
user_id = login["user_id"]
print("User ID", user_id)
assert "token" in login
token = login["token"]
print("JWT", token)

# Construct the URL and Headers
base_url = f"{host_url}/api/v1/users/{user_id}"
print("API calls will be forwarded to", base_url)

headers = {"Authorization": f"Bearer {token}"}

# MLflow server (optional experiment tracking)
use_mlflow = False
mlflow_server_address = ""  # For example "http://127.0.0.1:5000"
mlflow_experiment_name = ""  # For example "my_experiment"

## Dataset Creation

### **1. Remote Object as Data Sources**

The MONAI Cloud platform supports several cloud storage services, including Azure Blob Storage, Google Cloud Storage (GCS), and Amazon S3, so you can choose the one that best fits your project. The example below uses Azure:

**Steps:**
1. Creating a Storage Account and Container
   - **Storage Account**: Start by creating a new storage account in your Azure portal. This account will host your blob storage containers.
   - **Container Creation**: Within your storage account, create a new container. This container will hold your datasets.

2. Container URL
   - Once the container is created, you will be provided with a unique URL that can be used to access it. This URL will be essential for accessing your data.

## Obtaining Credentials

- **Access Keys**: Access your storage account and navigate to the 'Access keys' section. Here, you will find the necessary credentials to access your Blob Storage programmatically.
- **Shared Access Signature (SAS)**: Alternatively, you can create a SAS for more granular control over permissions and access duration.

## Creating a Manifest JSON File

In the root of your Azure container, create a manifest JSON file to keep track of your datasets. The file format is as follows:

```json
{
    "root_path": "https://[your-storage-account-name].blob.core.windows.net/[your-container-name]",
    "data": [
        {
            "image": {
                "path": ["path/to/your/image_1"],
                "id": "unique-uuid-1"
            },
            "label": {
                "path": ["path/to/your/label_1"],
                "id": "unique-uuid-2"
            }
        }
    ]
}
```

Additional entries in `data` follow the same format.

- Each dataset (training, testing, etc.) should have its own root directory.
- All the data should be under that root directory.
- The root directory should contain a `manifest.json` file.
- The `manifest.json` file should contain a "data" field, which is a list of all the data entries.
- Each data entry should contain "image" and "label" fields.
- Each "image"/"label" field should contain a "path" field, which is a list of relative paths to the image/label files.
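The rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the MONAI Cloud API: `build_manifest` and `validate_manifest` are hypothetical helper names, and generating the `id` values with `uuid.uuid4()` is an assumption based on the `unique-uuid-…` placeholders in the example.

```python
import json
import uuid


def build_manifest(root_path, pairs):
    """Build a manifest dict from (image_path, label_path) pairs of relative paths."""
    entries = []
    for image_path, label_path in pairs:
        entries.append(
            {
                # Assumption: any unique string works as an id; uuid4 is used here.
                "image": {"path": [image_path], "id": str(uuid.uuid4())},
                "label": {"path": [label_path], "id": str(uuid.uuid4())},
            }
        )
    return {"root_path": root_path, "data": entries}


def validate_manifest(manifest):
    """Check the structural rules listed above; raise ValueError on the first violation."""
    data = manifest.get("data")
    if not isinstance(data, list):
        raise ValueError('"data" must be a list of entries')
    for index, entry in enumerate(data):
        for field in ("image", "label"):
            item = entry.get(field)
            if not isinstance(item, dict):
                raise ValueError(f'entry {index}: missing "{field}" field')
            paths = item.get("path")
            if not isinstance(paths, list) or not paths:
                raise ValueError(f'entry {index}: "{field}" needs a non-empty "path" list')
            for path in paths:
                # Paths are expected to be relative to the root directory.
                if path.startswith("/") or "://" in path:
                    raise ValueError(f'entry {index}: "{path}" is not a relative path')


manifest = build_manifest(
    "https://[your-storage-account-name].blob.core.windows.net/[your-container-name]",
    [("path/to/your/image_1", "path/to/your/label_1")],
)
validate_manifest(manifest)
manifest_text = json.dumps(manifest, indent=4)
print(manifest_text)
```

The resulting text can be saved as `manifest.json` and uploaded to the root of the container.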

In [2]:
container_url = "<remote object storage address>"
access_id = "<user id>"
access_secret = "<storage secret>"

## Using the Remote Object to Create Datasets

After you've completed the steps above, call the API to create your dataset. The example request below shows the required parameters.

In [3]:
data = {
    "name": "MONAI_CLOUD",
    "description":"Object storage dataset",
    "type": "semantic_segmentation",
    "format": "monai",
    "client_url": container_url,
    "client_id": access_id,
    "client_secret": access_secret,
}

endpoint = f"{base_url}/datasets"
response = requests.post(endpoint, json=data, headers=headers)

if response.status_code == 201:
    res = response.json()
    dataset_id = res["id"]
    print("Dataset creation succeeded with dataset ID:", dataset_id)
    print("---------------------------------\n")
    print(json.dumps(res, indent=2))
else:
    print(response.json())
    print(response)

Dataset creation succeeded with dataset ID: 50942089-f445-466d-b42d-d9a0a0ed53c0
---------------------------------

{
  "actions": [
    "nextimage",
    "cacheimage",
    "notify"
  ],
  "client_url": "https://monaiserviceadmin.blob.core.windows.net/msd-spleen-subset",
  "created_on": "2024-01-07T15:21:40.338948",
  "description": "Object storage dataset",
  "docker_env_vars": {},
  "format": "monai",
  "id": "50942089-f445-466d-b42d-d9a0a0ed53c0",
  "jobs": [],
  "last_modified": "2024-01-07T15:21:40.338959",
  "logo": "https://www.nvidia.com",
  "name": "MONAI_CLOUD",
  "pull": null,
  "type": "semantic_segmentation",
  "version": "1.0.0"
}


## Custom MONAI Bundle Creation

1. **MONAI Bundle**: This example uses the Spleen Segmentation bundle. Choose the bundle that fits your application from the MONAI Model Zoo.
2. **Dataset Setup**: For this demo, all data is under one dataset ID. Adjust this to match your data structure.
3. **Pretrained Weights**: The official MONAI bundles include pretrained weights.

In [4]:
bundle_url = "https://github.com/Project-MONAI/model-zoo/releases/download/hosting_storage_v1/spleen_ct_segmentation_v0.5.3.zip"

data = {
    "name": "my_spleen_seg",
    "description": "from MONAI model zoo",
    "network_arch": "monai_custom",  # must be "monai_custom" for a custom bundle
    "eval_dataset": dataset_id,
    "train_datasets": [dataset_id],
    "bundle_url": bundle_url,
}

endpoint = f"{base_url}/experiments"
response = requests.post(endpoint, json=data, headers=headers)
if response.status_code == 201:
    res = response.json()
    experiment_id = res["id"]
    print("Model creation succeeded with model ID:", experiment_id)
    print("---------------------------------\n")
    print(json.dumps(res, indent=2))
else:
    print(response.json())
    print(response)

Model creation succeeded with model ID: ff3699b9-2965-403e-a532-7a7876525f1e
---------------------------------

{
  "actions": [
    "train"
  ],
  "additional_id_info": null,
  "automl_add_hyperparameters": "[]",
  "automl_algorithm": null,
  "automl_enabled": false,
  "automl_remove_hyperparameters": "[]",
  "base_experiment": [
    "708809fe-2a0b-4a06-943c-53f6717b5483"
  ],
  "calibration_dataset": null,
  "checkpoint_choose_method": "best_model",
  "checkpoint_epoch_number": {},
  "created_on": "2024-01-07T15:21:44.155081",
  "dataset_type": "user_custom",
  "description": "from MONAI model zoo",
  "docker_env_vars": {},
  "encryption_key": "tlt_encode",
  "eval_dataset": "50942089-f445-466d-b42d-d9a0a0ed53c0",
  "id": "ff3699b9-2965-403e-a532-7a7876525f1e",
  "inference_dataset": null,
  "is_ptm_backbone": true,
  "jobs": [],
  "last_modified": "2024-01-07T15:21:44.155092",
  "logo": "https://www.nvidia.com",
  "metric": null,
  "model_params": {},
  "name": "my_spleen_seg",
  "

## Training on DGX Cloud

1. You can submit training jobs directly through the cloud API.
1. You can modify the job submission payload to add parameters that tailor the run to your requirements.
1. The payload format follows the MONAI bundle configuration conventions, so parameters are structured the same way as in a bundle config.
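To make the payload format more concrete, here is a small sketch of how a `#`-separated key such as `train#handlers#-1#artifacts` can be read as a path into a nested configuration. It assumes that `#` separates nested keys and list indices, as in MONAI bundle config IDs. The `apply_override` helper and the sample `config` are hypothetical and only illustrate the idea; this is not the parser that MONAI or MONAI Cloud uses.

```python
def _step(node, part):
    """Return the key or index used to descend into `node` for one path segment."""
    return int(part) if isinstance(node, list) else part


def apply_override(config, key, value, sep="#"):
    """Set the entry addressed by a separator-joined key, e.g. "a#b#-1#c"."""
    parts = key.split(sep)
    node = config
    for part in parts[:-1]:
        step = _step(node, part)
        if isinstance(node, dict) and step not in node:
            node[step] = {}  # create missing intermediate dicts
        node = node[step]
    node[_step(node, parts[-1])] = value
    return config


# Hypothetical bundle-like config used only for this illustration.
config = {
    "epochs": 5,
    "train": {"handlers": [{"name": "first"}, {"name": "last", "artifacts": "/tmp/out"}]},
}

apply_override(config, "epochs", 2)
apply_override(config, "train#handlers#-1#artifacts", None)
print(config)
```

Here the second override clears the `artifacts` entry of the last handler, which mirrors the key used in the MLflow settings of the training cell.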

In [5]:
train_spec = {
    "epochs": 2,
}
if use_mlflow:
    mlflow_spec = {
        "tracking": "mlflow",
        "tracking_uri": f"{mlflow_server_address}",
        "experiment_name": f"{mlflow_experiment_name}",
        "train#handlers#-1#artifacts": None
    }
    train_spec.update(mlflow_spec)

data = {
  "action": "train",
  "specs": train_spec
}

endpoint = f"{base_url}/experiments/{experiment_id}/jobs"
response = requests.post(endpoint, json=data, headers=headers)

if response.status_code == 201:
    job_id = response.json()
    print(f"Job submitted successfully with {job_id}.")
else:
    print(response.json())
    print(response)

Job submitted successfully with a8a08170-155b-47b0-81f4-c9713842510e.


## Monitoring and Downloading

Job monitoring reports the current state of each job. Here's what you need to know:

1. **Basic Status Overview**: The monitoring API reports whether a job is pending, running, done, or in an error state, so you can track progress and spot problems early.

Status interpretation:
- "Pending": MONAI Cloud is looking for resources and preparing the datasets. This can take a while, depending on the size of the dataset.
- "Running": MONAI Cloud has submitted the job to DGX Cloud.
- "Done": The training is complete.
- "Error": The job failed. Download the job as a `.tar.gz` archive and inspect the detailed log.

2. **Detailed Logging Through Download API**: For detailed logs, use the Download API. It gives you access to in-depth logs, model checkpoints, and data outputs, which help with troubleshooting and analysis. It is particularly useful when a job ends in an error or when you need to examine its performance and behavior more closely.

In [None]:
endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{job_id}"
response = requests.get(endpoint, headers=headers)

if response.status_code == 200:
    for k, v in response.json().items():
        if k != "result":
            print(f"{k}: {v}")
        else:
            print("result:")
            for k1, v1 in v.items():
                print(f"    {k1}: {v1}")
else:
    print(response.json())
    print(response)
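A single status request only shows one snapshot. The sketch below polls until the job reaches "Done" or "Error". It is a minimal illustration under stated assumptions: `wait_for_job` is a hypothetical helper, and the status is supplied by a callable so the loop can be tried without a live job. With a real job, the callable would wrap the GET request above and read the status from the response, for example from a `"status"` key (the key name is an assumption).

```python
import time

# Terminal states taken from the status list above.
TERMINAL_STATES = {"Done", "Error"}


def wait_for_job(fetch_status, interval_seconds=30.0, timeout_seconds=3600.0):
    """Call fetch_status() repeatedly until it returns a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} seconds")
        time.sleep(interval_seconds)


# Stand-in for the real API call, replaying a typical status sequence.
# With a live job this could be something like (assumed "status" key):
#   lambda: requests.get(endpoint, headers=headers).json()["status"]
fake_statuses = iter(["Pending", "Running", "Running", "Done"])
final_status = wait_for_job(lambda: next(fake_statuses), interval_seconds=0.0, timeout_seconds=5.0)
print(final_status)
```

Because "Pending" can last a long time for large datasets, a generous timeout and a polling interval of tens of seconds are reasonable starting points.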

**Downloading**

In [10]:
endpoint = f"{base_url}/experiments/{experiment_id}/jobs/{job_id}/download"
response = requests.get(endpoint, headers=headers)

In [12]:
if response.status_code == 200:
    # Save the archive to a local file
    attachment_data = response.content
    with open(f"{job_id}.tar.gz", 'wb') as f:
        f.write(attachment_data)
    print(f"Bundle training results are downloaded as {job_id}.tar.gz")
else:
    print(response)

Bundle training results are downloaded as ce111b2c-c1d9-4fcd-85d6-c402df4484d7.tar.gz
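Once the archive is on disk, the standard-library `tarfile` module can list and read its contents without unpacking everything. The sketch below is illustrative only: the helper names are hypothetical, and the archive layout and the `.log`/`.txt` suffixes are assumptions, since the actual structure of a downloaded job archive may differ. A tiny demo archive is created so the example runs on its own.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_log_members(archive_path, suffixes=(".log", ".txt")):
    """Return the sorted names of regular files in the archive that look like logs."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(tuple(suffixes))
        )


def read_member_text(archive_path, member_name, max_bytes=100_000):
    """Read up to max_bytes of one archive member as text, without extracting to disk."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        handle = archive.extractfile(member_name)
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        with handle:
            return handle.read(max_bytes).decode("utf-8", errors="replace")


# Build a small demo archive with a hypothetical layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_archive = Path(tmp) / "demo_job.tar.gz"
    payload = b"epoch 1 done\n"
    with tarfile.open(demo_archive, mode="w:gz") as archive:
        info = tarfile.TarInfo("demo_job/logs/train.log")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

    demo_logs = list_log_members(demo_archive)
    demo_text = read_member_text(demo_archive, demo_logs[0])

print(demo_logs)
print(demo_text)
```

For a real job, pass `f"{job_id}.tar.gz"` as `archive_path` and adjust the suffixes to match the files you find in the archive.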


## Conclusion

Congratulations on reaching this milestone! With your dataset created, your model trained, and the results downloaded, you are now equipped to use the NVIDIA MONAI Cloud APIs for your medical imaging projects.