# Dataset Creation and Experiment Selection using NVIDIA MONAI Cloud APIs

In this guide, we'll walk you through the foundational steps of creating a dataset suitable for use with NVIDIA MONAI Cloud APIs and selecting an appropriate base experiment for your medical imaging needs.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NVIDIA/monai-cloud-api/blob/main/notebooks/Dataset%20Creation%20and%20Experiment%20Selection.ipynb)

## Table of Contents

- Dataset Creation
- Experiment Selection
- Deleting Datasets and Experiments
- Conclusion

## Introduction

The creation of a coherent dataset and the choice of the right base experiment are cornerstones of any medical imaging project. NVIDIA MONAI Cloud APIs are designed to streamline this process, allowing you to focus on what's essential. This guide provides step-by-step instructions to facilitate these foundational preparations.

If you haven't already generated your key or if you're unsure about the process, follow our step-by-step guide, [Generating and Managing Your Credentials](./Generating%20and%20Managing%20Your%20Credentials.ipynb).

In [None]:
import json
import requests
import os

<a id='Parameters'></a>

### Parameters

The following cell contains all the parameters you need to replace with your own values before running the notebook.

In [None]:
# API Endpoint and Credentials
host_url = "<MONAI Cloud API URL>"
ngc_api_key = os.environ.get('MONAI_API_KEY')

# dicomweb parameters (will be introduced in Section: Dataset Creation)
dicom_web_endpoint = "<DICOMWeb address>"  # For example "http://127.0.0.1:8042/dicom-web".
dicom_client_id = "<DICOMWeb user ID>"     # If Authentication is enabled, then provide username
dicom_client_secret = "<DICOMWeb secret>"  # If Authentication is enabled, then provide password
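
Before running the remaining cells, it can help to confirm that no parameter is still a template placeholder. The following is a minimal sketch: the helper `find_unset_parameters` is hypothetical (not part of the MONAI Cloud APIs), and it assumes placeholders follow the `<...>` convention used in the cell above.

```python
def find_unset_parameters(params):
    """Return names whose values are missing or still look like "<placeholder>"."""
    unset = []
    for name, value in params.items():
        # None or "" means the value (e.g. an environment variable) was never set.
        is_missing = value is None or value == ""
        # "<...>" is the placeholder convention used in the Parameters cell.
        is_placeholder = (
            isinstance(value, str) and value.startswith("<") and value.endswith(">")
        )
        if is_missing or is_placeholder:
            unset.append(name)
    return unset


# Illustrative values only: one placeholder, one missing key, one real endpoint.
example = {
    "host_url": "<MONAI Cloud API URL>",
    "ngc_api_key": None,
    "dicom_web_endpoint": "http://127.0.0.1:8042/dicom-web",
}
print(find_unset_parameters(example))
```

In the notebook, you could pass a dictionary built from `host_url`, `ngc_api_key`, and the three DICOMWeb variables, and stop early if the returned list is not empty.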

In [None]:
# Log in with the NGC API key to obtain the user ID and access token
api_url = f"{host_url}/api/v1"
response = requests.post(f"{api_url}/login", data=json.dumps({"ngc_api_key": ngc_api_key}))
assert response.status_code == 201, f"Login failed, got status code: {response.status_code}."
assert "user_id" in response.json().keys(), "user_id is not in response."
assert "token" in response.json().keys(), "token is not in response."

uid = response.json()["user_id"]
token = response.json()["token"]

# Construct the URL and Headers
base_url = f"{api_url}/users/{uid}"
headers = {"Authorization": f"Bearer {token}"}

## Dataset Creation

### **1. Data Sources**

The first substantial step is curating an appropriate dataset. Once that is done, you are ready to create a reference object to that dataset by utilizing DICOMWeb. DICOMWeb is a modern web standard for accessing DICOM data. By connecting NVIDIA MONAI Cloud APIs to a DICOMWeb endpoint, you can seamlessly integrate your data with many modern viewers.

**Steps:**
1. Set up your DICOMWeb Interface: Before integrating with MONAI, ensure your DICOMWeb interface is up and functional. This will serve as the primary bridge between your data storage and NVIDIA MONAI Cloud APIs.
2. Permissions and Security: Given the sensitive nature of medical data, always ensure that the necessary security measures are in place. Grant NVIDIA MONAI Cloud APIs the appropriate access permissions while maintaining strict compliance with regulations.

For a comprehensive walkthrough on these steps and more, refer to our detailed guide on [DICOMWeb Server Configuration](./DICOMWeb%20Server%20Configuration.ipynb).

After preparing your DICOMWeb dataset, please modify the following variables in [Parameters](#Parameters):

```
dicom_web_endpoint = ...
dicom_client_id = ...
dicom_client_secret = ...
```
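
As a quick sanity check of these values, you could build a small DICOMWeb query (QIDO-RS `GET {endpoint}/studies`) and send it yourself before handing the endpoint to the API. The sketch below only constructs the request. The helper name `build_qido_probe` is hypothetical, and it assumes your server uses HTTP Basic authentication with the user ID and secret, which may not match your setup.

```python
import base64
import urllib.request


def build_qido_probe(endpoint, client_id=None, client_secret=None):
    """Build (but do not send) a QIDO-RS request asking for at most one study."""
    url = endpoint.rstrip("/") + "/studies?limit=1"
    request = urllib.request.Request(url, headers={"Accept": "application/dicom+json"})
    if client_id and client_secret:
        # Assumption: the server expects HTTP Basic credentials.
        token = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request


# Illustrative endpoint and credentials only.
probe = build_qido_probe("http://127.0.0.1:8042/dicom-web/", "user", "pass")
print(probe.full_url)
# To actually send it: urllib.request.urlopen(probe, timeout=10)
```

A successful response suggests the endpoint is reachable from your machine; it does not guarantee that the MONAI Cloud service can reach it from its own network.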

### **2. Using a DICOMWeb Endpoint to Create Datasets**

After you've completed the steps above, it's time to call the API to create your dataset. Below you'll find an example request along with its associated parameters.



In [None]:
data = {
    "name": "mydataset",
    "description": "a demo dataset",
    "type": "semantic_segmentation",
    "format": "monai",
    "client_url": dicom_web_endpoint,
    "client_id": dicom_client_id,
    "client_secret": dicom_client_secret,
}

endpoint = f"{base_url}/datasets"
response = requests.post(endpoint, json=data, headers=headers)
assert response.status_code == 201, f"Create dataset failed, got {response.text}."
res = response.json()
dataset_id = res["id"]
print("Dataset creation succeeded with dataset ID: ", dataset_id)
print("---------------------------------\n")
print(json.dumps(res, indent=2))
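
The cells in this guide check status codes with `assert`, which is convenient in a notebook but is skipped when Python runs with `-O`. If you reuse this code in a script, an explicit check may be more robust. This is a minimal sketch; `check_response` is a hypothetical helper, not part of the API.

```python
def check_response(status_code, body, expected, action):
    """Raise a descriptive error when the status code is not the expected one."""
    if status_code != expected:
        # Truncate the body so a large error page does not flood the output.
        raise RuntimeError(
            f"{action} failed: expected HTTP {expected}, got {status_code}: {body[:200]}"
        )
    return True


# Simulated failure, for illustration only.
try:
    check_response(500, "internal error", 201, "Create dataset")
except RuntimeError as err:
    print(err)
```

With `requests`, a call would look like `check_response(response.status_code, response.text, 201, "Create dataset")`.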

## Experiment Selection

### Available Base Experiments

NVIDIA MONAI Cloud APIs offer a variety of base experiments (pre-trained models and algorithm templates), each tuned for a different task. These include **DeepEdit**, **VISTA-3D** and **Auto3DSeg**.

**Recommendation:** Start with VISTA-3D. Its versatile design allows you to branch out and customize as your requirements evolve.

### List Available Base Experiments

When an API call asks for a base experiment, pass its Base Experiment ID. You can list all available experiments by calling the experiments API endpoint.

In [None]:
endpoint = f"{base_url}/experiments"
response = requests.get(endpoint, headers=headers)
assert response.status_code == 200, f"List experiments failed, got {response.text}."
res = response.json()

# VISTA-3D Base Experiment
ptm_vista = [p for p in res if p["network_arch"] == "monai_vista3d" and not len(p["base_experiment"])][0]["id"]
print(f"Base Experiment ID for VISTA-3D Experiment: {ptm_vista}")

# DeepEdit Base Experiment
ptm_deepedit = [p for p in res if p["network_arch"] == "monai_annotation" and not len(p["base_experiment"])][0]["id"]
print(f"Base Experiment ID for DeepEdit Experiment: {ptm_deepedit}")
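
The list comprehensions above raise a bare `IndexError` when no matching experiment exists. A slightly more defensive lookup could look like the sketch below. The helper name and the sample records are hypothetical; only the `network_arch`, `base_experiment`, and `id` fields come from the cell above.

```python
def find_base_experiment_id(experiments, network_arch):
    """Return the ID of the first experiment with this architecture and no parent."""
    for experiment in experiments:
        # A base experiment has an empty (or missing) "base_experiment" list.
        if experiment.get("network_arch") == network_arch and not experiment.get("base_experiment"):
            return experiment["id"]
    raise LookupError(f"No base experiment found for network_arch={network_arch!r}")


# Made-up records that mimic the shape of the list response.
sample = [
    {"id": "exp-a", "network_arch": "monai_vista3d", "base_experiment": []},
    {"id": "exp-b", "network_arch": "monai_vista3d", "base_experiment": ["exp-a"]},
    {"id": "exp-c", "network_arch": "monai_annotation", "base_experiment": []},
]
print(find_base_experiment_id(sample, "monai_vista3d"))
```

In the notebook, `find_base_experiment_id(res, "monai_vista3d")` would play the role of the `ptm_vista` lookup.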

### Create Experiment

1. **MONAI Bundle**: We're using the VISTA-3D bundle as an example. Choose the one fitting your application.
2. **Dataset Setup**: All data is under one dataset ID for this demo. Adjust as per your data structure.
3. **Pretrained Weights**: Opt for a pretrained model to enhance performance.
4. **Real-time Inference**: For real-time inference during annotation jobs or auto segmentation, set `realtime_infer` to **True** and provide an `inference_dataset`; otherwise, set it to **False**. In this example, we set it to **False** because we aren't starting an annotation job.

In [None]:
data = {
  "name": "monai_vista",
  "description": "Based on vista",
  "network_arch": "monai_vista3d",
  "type": "medical",
  "base_experiment": [ ptm_vista ],
  "inference_dataset": dataset_id,
  "eval_dataset": dataset_id,
  "train_datasets": [ dataset_id ],
  "realtime_infer": False,
}

endpoint = f"{base_url}/experiments"
response = requests.post(endpoint, json=data, headers=headers)
assert response.status_code == 201, f"Create experiment failed, got {response.text}."
res = response.json()
experiment_id = res["id"]
model_network = res["network_arch"]
print("Experiment creation succeeded with experiment ID:", experiment_id)
print("---------------------------------\n")
print(json.dumps(res, indent=2))

#### **Customize VISTA-3D Experiment**

The VISTA-3D model provides a comprehensive set of 117 classes. However, there might be scenarios where you need a subset of these classes or want to introduce new ones. Customizing is made easy with the MONAI Cloud APIs:

1. **Selecting a Subset of Classes**

If you're interested in specific classes such as liver, kidney, and spleen, you can choose them without using the entire set by modifying the data object to add a `model_params` key, along with the `labels` you want included from the base 117 classes.

In [None]:
data = {
  "name": "my_vista_3_organ",
  "description": "based on vista",
  "network_arch": "monai_vista3d",
  "base_experiment": [ ptm_vista ],
  "inference_dataset": dataset_id,
  "eval_dataset": dataset_id,
  "train_datasets": [ dataset_id ],
  "realtime_infer": True,
  "model_params":{
      "labels":{
          "1": "liver",
          "2": "kidney",
          "3": "spleen"
      }
  }
}
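
The cell above only defines the request body. One way to derive such a body from an existing experiment payload, without modifying the original dictionary, is sketched below. `with_labels` is a hypothetical helper, and the resulting payload would presumably be submitted with the same `POST` to `{base_url}/experiments` shown under Create Experiment.

```python
import copy
import json


def with_labels(payload, labels):
    """Return a copy of payload whose model_params.labels is set to labels."""
    updated = copy.deepcopy(payload)
    # JSON object keys are strings, so normalize integer indices up front.
    updated.setdefault("model_params", {})["labels"] = {
        str(index): name for index, name in labels.items()
    }
    return updated


# Trimmed-down payload for illustration; a real one also needs the dataset fields.
base_payload = {"name": "my_vista_3_organ", "network_arch": "monai_vista3d", "realtime_infer": True}
subset_payload = with_labels(base_payload, {1: "liver", 2: "kidney", 3: "spleen"})
print(json.dumps(subset_payload, indent=2))
```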

2. **Adding Custom Classes**

If you have specific classes not present in the base VISTA-3D model, you can easily add them. This customization allows developers to tailor the experiment to their specific needs, ensuring that only relevant classes are present, while also offering the flexibility to introduce new classes as needed.

In [None]:
# Fragment: only `model_params` is shown; the other experiment fields are omitted here.
data = {
  "model_params":{
      "labels":{
          "1": "liver",
          "2": "kidney",
          "118": "myorgan"  # add customized class
      }
  }
}
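
When mixing base and custom classes, it can be useful to see which entries fall outside the base set. The sketch below assumes that indices 1 to 117 refer to the base VISTA-3D classes and that anything above 117 is a new class, as the `"118"` example suggests; this is an inference from the text, not documented API behavior. `split_labels` is a hypothetical helper.

```python
BASE_CLASS_COUNT = 117  # number of base classes stated in this guide


def split_labels(labels, base_class_count=BASE_CLASS_COUNT):
    """Split a labels mapping into (base, custom) by numeric index."""
    base, custom = {}, {}
    for key, name in labels.items():
        index = int(key)  # raises ValueError for non-numeric keys
        if index < 1:
            raise ValueError(f"Label index must be positive, got {key!r}")
        # Assumption: indices above the base count denote newly added classes.
        target = base if index <= base_class_count else custom
        target[key] = name
    return base, custom


base, custom = split_labels({"1": "liver", "2": "kidney", "118": "myorgan"})
print("base:", base)
print("custom:", custom)
```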

## Deleting Datasets and Experiments

If you've created test datasets or experiments that you no longer need, you can easily remove them using the MONAI Cloud APIs. Let's walk through the cleanup process.

### Deleting an Experiment

To delete an experiment, send a `DELETE` request to its endpoint. The cell below uses the `experiment_id` returned when the experiment was created; substitute another ID to delete a different experiment:

In [None]:
endpoint = f"{base_url}/experiments/{experiment_id}"
response = requests.delete(endpoint, headers=headers)
assert response.status_code == 200, f"Delete experiment failed, got {response.text}."
print(response.json())
print(response)

### Deleting a Dataset

To delete a dataset, send a `DELETE` request to its endpoint. The cell below uses the `dataset_id` returned when the dataset was created; substitute another ID to delete a different dataset:

In [None]:
endpoint = f"{base_url}/datasets/{dataset_id}"
response = requests.delete(endpoint, headers=headers)
assert response.status_code == 200, f"Delete dataset failed, got {response.text}."
print(response.json())
print(response)

Removing datasets and experiments you no longer need keeps your workspace uncluttered and makes resources easier to manage.
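
If you have several test resources to remove, the two calls can be wrapped in a small loop. The sketch below is hypothetical: `cleanup` is not part of the API, and it takes the HTTP call as a `delete` argument so the logic can be tried without a network. It deletes experiments before datasets simply because that is the order used in this guide.

```python
def cleanup(base_url, experiment_ids, dataset_ids, delete):
    """Delete experiments, then datasets; return (url, status_code) pairs."""
    results = []
    for kind, ids in (("experiments", experiment_ids), ("datasets", dataset_ids)):
        for item_id in ids:
            url = f"{base_url}/{kind}/{item_id}"
            # `delete` is any callable that takes a URL and returns a status code.
            results.append((url, delete(url)))
    return results


# Stand-in for the real HTTP call, so the example runs offline.
calls = []

def fake_delete(url):
    calls.append(url)
    return 200

results = cleanup("https://example.invalid/api/v1/users/u1", ["exp-1"], ["ds-1"], fake_delete)
for url, status in results:
    print(status, url)
```

With `requests`, the callable could be `lambda url: requests.delete(url, headers=headers).status_code`.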

## Conclusion

Bravo! You've now created a dataset and selected and created an experiment, setting the stage to harness the full capabilities of the NVIDIA MONAI Cloud APIs. Keep your workspace organized, and managing complex projects becomes significantly more straightforward. The upcoming notebooks cover running annotation and continuous learning tasks and working with platforms such as the OHIF Viewer.