# MAIA

This tutorial showcases the major applications and functionalities available in MAIA. It covers the essential stages of the Medical AI lifecycle to give a comprehensive overview of the platform.

The tutorial is based on two different datasets:
- [**Decathlon Spleen**](http://medicaldecathlon.com/): a dataset of 3D spleen CT scans from the Medical Segmentation Decathlon challenge. This NIfTI dataset is used to demonstrate the model preprocessing, training and evaluation functionalities in MAIA.
- [**CT Lymph Nodes**](https://www.cancerimagingarchive.net/collection/ct-lymph-nodes/) from TCIA: a dataset of 3D lymph node CT scans from The Cancer Imaging Archive. This DICOM dataset is used to demonstrate the data management functionalities in MAIA, including DICOM upload, visualization, annotation and AI model inference (including Active Learning with MONAI Label).

The tutorial will cover all the necessary steps to download the Decathlon Spleen dataset, preprocess it, train a [ResEnc nnU-Net](https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/resenc_presets.md) model as a MONAI Bundle, evaluate the model, and finally deploy it for inference on the CT Lymph Nodes dataset. The tutorial will also cover the necessary steps to upload the CT Lymph Nodes dataset, visualize it, annotate it, and perform AI model inference on it.

## Setup environment and imports

In [None]:
import sys

# Install the tutorial dependencies into the kernel's own Python environment.
!{sys.executable} -m pip install odict plotly dtale "monai[nibabel, skimage, scipy, pillow, tensorboard, gdown, ignite, torchvision, itk, tqdm, pandas, mlflow, matplotlib, pydicom]" "pydicom==2.4.4" "nnunetv2==2.5.1" fire

In [2]:
from monai.apps import DecathlonDataset
import subprocess
import json
import os
import pandas as pd
import pathlib
import mlflow
from pathlib import Path
import numpy as np
import yaml
import random
import dtale
import dtale.app as dtale_app

from monai.bundle import ConfigParser

from mlflow.models import ModelSignature
from mlflow.types.schema import Schema, TensorSpec


## Download the Decathlon Spleen dataset

In [2]:
os.environ["MONAI_DATA_DIRECTORY"] = "/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data"

In [3]:
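import tempfile

# Use MONAI_DATA_DIRECTORY when it is set; otherwise fall back to a temporary directory.
directory = os.environ.get("MONAI_DATA_DIRECTORY")
if directory is not None:
    os.makedirs(directory, exist_ok=True)
root_dir = tempfile.mkdtemp() if directory is None else directory
print(root_dir)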

/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data


In [None]:
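# Positional arguments cannot follow a keyword argument, so every argument is passed by keyword.
DecathlonDataset(root_dir=root_dir, task="Task09_Spleen", section="training", download=True)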
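
Once the download has finished, it can be useful to confirm that every CT volume has a matching label before moving on. The following is a minimal sketch, assuming the archive is extracted into `imagesTr` and `labelsTr` subfolders (the layout used by the paths later in this tutorial); the helper name `list_image_label_pairs` is hypothetical.

```python
from pathlib import Path


def list_image_label_pairs(task_dir):
    """Return sorted (image, label) path pairs found under a Decathlon task folder."""
    task_dir = Path(task_dir)
    pairs = []
    for image in sorted((task_dir / "imagesTr").glob("*.nii.gz")):
        # Skip hidden files (for example archive metadata entries), if any are present.
        if image.name.startswith("."):
            continue
        # A label is expected to carry the same file name as its image.
        label = task_dir / "labelsTr" / image.name
        if label.exists():
            pairs.append((image, label))
    return pairs


# Usage (assuming root_dir is defined as in the cells above):
# pairs = list_image_label_pairs(Path(root_dir) / "Task09_Spleen")
# print(len(pairs), "image/label pairs found")
```

Images without a label are silently dropped here; comparing the result with the number of files in `imagesTr` shows whether anything is missing.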

## Visualize Spleen CT and SEG in 3D Slicer

You can navigate to [Remote Desktop](/user/{USERNAME}/proxy/80/desktop/{USERNAME}) to interact with the Remote Desktop environment and 3D Slicer.

In [None]:
slicer_executable = "/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer"

case_id = "spleen_2"

ct_volume = Path(root_dir).joinpath("Task09_Spleen/imagesTr",f"{case_id}.nii.gz")

seg_volume = Path(root_dir).joinpath("Task09_Spleen/labelsTr",f"{case_id}.nii.gz")

subprocess.run([
    slicer_executable,
    "--python-code", f"slicer.util.loadVolume('{ct_volume}'); seg=slicer.util.loadSegmentation('{seg_volume}'); seg.CreateClosedSurfaceRepresentation()"
])
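
The `--python-code` string above is assembled by hand, which becomes fragile if a path contains quotes or backslashes. As a sketch only, the hypothetical helper `build_slicer_code` below builds the same kind of snippet and uses `repr()` to quote each path; it relies only on the `slicer.util.loadVolume`, `slicer.util.loadSegmentation` and `CreateClosedSurfaceRepresentation` calls already used in this tutorial.

```python
def build_slicer_code(ct_volume, segmentations):
    """Build the Python snippet passed to Slicer via --python-code."""
    # repr() quotes and escapes each path, so quotes or backslashes in a
    # file name cannot break the generated statement.
    statements = [f"slicer.util.loadVolume({str(ct_volume)!r})"]
    for index, seg_path in enumerate(segmentations):
        name = f"seg_{index}"
        statements.append(f"{name} = slicer.util.loadSegmentation({str(seg_path)!r})")
        # Create the 3D surface so that the segmentation shows up in the 3D view.
        statements.append(f"{name}.CreateClosedSurfaceRepresentation()")
    return "; ".join(statements)


# Usage (assuming slicer_executable, ct_volume and seg_volume are defined as above):
# subprocess.run([slicer_executable, "--python-code", build_slicer_code(ct_volume, [seg_volume])])
```

The helper accepts any number of segmentations, so the same call can load a ground truth and a prediction together.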

## nnUNet Training

To run the nnUNet training, follow the tutorial [nnUNet Bundle Training](06_nnunet_monai_bundle.ipynb).


## Validation

To visualize the fold-0 validation predictions in 3D Slicer, together with the ground truth and the CT volume, we can use the following code:

In [None]:
with open(Path(root_dir).joinpath("nnUNet/nnUNet_raw_data_base/Dataset009_Task09_Spleen/datalist.json")) as f:
    datalist = json.load(f)

In [7]:
fold_0_datalist_cases = [case for case in datalist['training'] if case['fold'] == 0]
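
To inspect how the cases are distributed across folds, the datalist can be grouped by its `fold` key. The sketch below assumes only the `training`, `image`, `fold` and `new_name` keys used in this tutorial; the example entries and both helper names are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def case_id_from_image(image_path, suffix=".nii.gz"):
    """Strip the directory and the full '.nii.gz' suffix from an image path."""
    name = PurePosixPath(image_path).name
    return name[: -len(suffix)] if name.endswith(suffix) else name


def cases_by_fold(datalist):
    """Group the 'training' entries of a datalist by their 'fold' value."""
    folds = defaultdict(list)
    for case in datalist["training"]:
        folds[case["fold"]].append(case)
    return dict(folds)


# Hypothetical datalist, shaped like the entries used in this tutorial.
example_datalist = {
    "training": [
        {"image": "imagesTr/spleen_26.nii.gz", "fold": 0, "new_name": "case_0"},
        {"image": "imagesTr/spleen_2.nii.gz", "fold": 1, "new_name": "case_1"},
    ]
}

for fold, cases in sorted(cases_by_fold(example_datalist).items()):
    print(fold, [case_id_from_image(case["image"]) for case in cases])
```

Removing the suffix explicitly, rather than splitting on the first dot, keeps case identifiers intact even if a file name contains additional dots.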

In [14]:
slicer_executable = "/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer"

case = fold_0_datalist_cases[0]

case_id = case['image'].split("/")[-1].split(".")[0]

new_name = case["new_name"]


ct_volume = Path(root_dir).joinpath("Task09_Spleen/imagesTr",f"{case_id}.nii.gz")

seg_volume = Path(root_dir).joinpath("Task09_Spleen/labelsTr",f"{case_id}.nii.gz")

pred_volume = Path(root_dir).joinpath("nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/validation",new_name+".nii.gz")

subprocess.run([
    slicer_executable,
    "--python-code", f"slicer.util.loadVolume('{ct_volume}'); pred=slicer.util.loadSegmentation('{pred_volume}'); seg=slicer.util.loadSegmentation('{seg_volume}'); seg.CreateClosedSurfaceRepresentation()"
])

Switch to module:  "Welcome"
"Volume" Reader has successfully read the file "/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/Task09_Spleen/imagesTr/spleen_26.nii.gz" "[0.57s]"
ReferenceImageExtentOffset attribute was not found in NRRD segmentation file. Assume no offset.


"Segmentation" Reader has successfully read the file "/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/validation/case_0.nii.gz" "[0.21s]"
ReferenceImageExtentOffset attribute was not found in NRRD segmentation file. Assume no offset.


"Segmentation" Reader has successfully read the file "/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/Task09_Spleen/labelsTr/spleen_26.nii.gz" "[0.20s]"
Switch to module:  "Data"



### Validation Metrics

The validation metrics are stored in a JSON file named `summary.json` in the validation folder. We can load the file and visualize the metrics with [DTale](https://github.com/man-group/dtale) using the following code:

In [4]:
summary_file = Path(root_dir).joinpath("nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/validation","summary.json")


In [5]:
with open(summary_file) as f:
    summary = json.load(f)

In [6]:
config_dict = {
    "label_dict":{
        "Background": 0,
        "Spleen": 1
    },
    "label_suffix": ".nii.gz"
}

In [7]:
df = []

label_to_name = {v: k for k, v in config_dict["label_dict"].items()}

for case in summary['metric_per_case']:
    for label_id in case['metrics']:
        for metric in case['metrics'][label_id]:
           
            df.append({
                "Case": Path(case['reference_file']).name[:-len(config_dict["label_suffix"])],
                "Label": label_to_name[int(label_id)],
                "Metric": metric,
                "Value": case['metrics'][label_id][metric]
            })

In [8]:
df = pd.DataFrame(df)
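
The flattening step above can be wrapped in a small function so that it is easy to reuse for other folds. This is a sketch under the assumption that `summary.json` has the structure accessed above (`metric_per_case`, `reference_file`, and a `metrics` mapping from label ID to metric values); the example values and the function name `flatten_summary` are hypothetical.

```python
from pathlib import Path


def flatten_summary(summary, label_dict, label_suffix=".nii.gz"):
    """Turn the nested per-case metrics into one row per (case, label, metric)."""
    label_to_name = {value: name for name, value in label_dict.items()}
    rows = []
    for case in summary["metric_per_case"]:
        case_name = Path(case["reference_file"]).name
        if case_name.endswith(label_suffix):
            case_name = case_name[: -len(label_suffix)]
        for label_id, metrics in case["metrics"].items():
            for metric, value in metrics.items():
                rows.append({
                    "Case": case_name,
                    # JSON object keys are strings, so convert before the lookup.
                    "Label": label_to_name[int(label_id)],
                    "Metric": metric,
                    "Value": value,
                })
    return rows


# Hypothetical summary with made-up values, for illustration only.
example_summary = {
    "metric_per_case": [
        {"reference_file": "/tmp/labelsTr/case_0.nii.gz",
         "metrics": {"1": {"Dice": 0.97, "IoU": 0.94}}},
    ]
}
rows = flatten_summary(example_summary, {"Background": 0, "Spleen": 1})
print(rows)
```

The returned list of dictionaries can be passed directly to `pd.DataFrame`, as done in the cell above.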

In [10]:
dtale_app.JUPYTER_SERVER_PROXY = True

d = dtale.show(df, host="0.0.0.0")

In [11]:
from IPython.display import Markdown
from IPython.core.magic import register_cell_magic
import os


DTALE_URL = d._main_url
@register_cell_magic
def markdown(line, cell):
    return Markdown(cell.format(**globals()))

In [30]:
%%markdown

[DTale]({DTALE_URL})


[DTale](/user/simben@kth.se/proxy/40000/dtale/main/1)


The DTale charts can be recreated and visualized in the notebook using the following code:

In [13]:
# DISCLAIMER: 'df' refers to the data you passed in when calling 'dtale.show'

import pandas as pd

if isinstance(df, (pd.DatetimeIndex, pd.MultiIndex)):
    df = df.to_frame(index=False)

# remove any pre-existing indices for ease of use in the D-Tale code, but this is not required
df = df.reset_index().drop('index', axis=1, errors='ignore')
df.columns = [str(c) for c in df.columns]  # update columns to strings in case they are numbers

df = df.query("""`Metric` == 'Dice'""")

chart_data = pd.concat([
    df['Case'],
    df['Value'],
], axis=1)
chart_data = chart_data.sort_values(['Case'])
chart_data = chart_data.rename(columns={'Case': 'x'})
chart_data = chart_data.dropna()

import plotly.graph_objs as go

charts = []
charts.append(go.Bar(
    x=chart_data['x'],
    y=chart_data['Value']
))
figure = go.Figure(data=charts, layout=go.Layout({
    'barmode': 'group',
    'legend': {'orientation': 'h', 'y': -0.3},
    'title': {'text': 'Validation Fold 0, Dice score'},
    'xaxis': {'title': {'text': 'Case'}},
    'yaxis': {'title': {'text': 'Dice'}, 'type': 'linear'}
}))

# If you're having trouble viewing your chart in your notebook try passing your 'chart' into this snippet:
#
from plotly.offline import iplot, init_notebook_mode
#
init_notebook_mode(connected=True)
for chart in charts:
    chart.pop('id', None) # for some reason iplot does not like 'id'
#iplot(figure)

figure.write_html("Fold_0_Val_Dice.html")

In [14]:
df.groupby(["Metric"]).describe()

Summary statistics of `Value`, grouped by `Metric`:

| Metric | count | mean | std | min | 25% | 50% | 75% | max |
|--------|-------|------|-----|-----|-----|-----|-----|-----|
| Dice | 9.0 | 0.970359 | 0.012429 | 0.944765 | 0.965239 | 0.969701 | 0.978839 | 0.985021 |


In [15]:
df.groupby(["Metric"]).describe()['Value']['mean'].values[0]

0.9703585638730954
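
The same kind of per-metric summary can be computed without `describe()`, which makes it explicit what each number means. The sketch below uses only the standard library and works on rows shaped like the ones built earlier (`Metric` and `Value` keys); the function name `summarize_by_metric` and the example values are hypothetical. With pandas, `df.groupby("Metric")["Value"].mean()` gives the mean directly.

```python
from statistics import mean, stdev


def summarize_by_metric(rows):
    """Compute count, mean, sample standard deviation, min and max per metric."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["Metric"], []).append(row["Value"])
    summary = {}
    for metric, values in grouped.items():
        summary[metric] = {
            "count": len(values),
            "mean": mean(values),
            # Sample standard deviation (n - 1 in the denominator); undefined for one value.
            "std": stdev(values) if len(values) > 1 else float("nan"),
            "min": min(values),
            "max": max(values),
        }
    return summary


# Made-up values, for illustration only.
example_rows = [
    {"Metric": "Dice", "Value": 0.5},
    {"Metric": "Dice", "Value": 1.0},
]
print(summarize_by_metric(example_rows))
```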

Finally, we can upload the validation metrics and the plots to MLflow using the following code:

In [None]:
mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
mlflow.set_experiment("Task09_Spleen")

with mlflow.start_run(run_id="1b31a872e1b644cb9787cf91b60be449"):
    mean_dice = df.groupby(["Metric"]).describe()['Value']['mean'].values[0]
    
    mlflow.log_metric("Val_Dice_Fold_0", mean_dice)
    mlflow.log_artifact("Fold_0_Val_Dice.html")
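
The cell above reads `MLFLOW_TRACKING_URI` with `os.environ[...]`, which raises a `KeyError` when the variable is not set. A minimal sketch of a more forgiving lookup follows; the default `http://127.0.0.1:5000` is the local server address used later in this tutorial, and both helper names are hypothetical.

```python
import os


def resolve_tracking_uri(default="http://127.0.0.1:5000"):
    """Return MLFLOW_TRACKING_URI if it is set and non-empty, else the default."""
    return os.environ.get("MLFLOW_TRACKING_URI") or default


def fold_metric_name(metric, fold, split="Val"):
    """Build a metric name following the 'Val_Dice_Fold_0' pattern used above."""
    return f"{split}_{metric}_Fold_{fold}"


# Usage (assuming mlflow is imported and mean_dice is computed as above):
# mlflow.set_tracking_uri(resolve_tracking_uri())
# mlflow.log_metric(fold_metric_name("Dice", 0), mean_dice)
```

Building the metric name from the fold index keeps the naming consistent if more folds are logged later.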
    

In the final step of the validation phase, we export the trained model, saving the nnUNet Bundle as a zip file (`Task09_Spleen_nnUNet.zip`). The zip file contains the model, the configuration files, and the environment files.

In [22]:
%%bash

export nnUNet_raw="/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_raw_data_base"
export nnUNet_preprocessed="/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_preprocessed"
export nnUNet_results="/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_trained_models"

touch /home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/progress.png

nnUNetv2_export_model_to_zip -d 009 -c 3d_fullres -f 0 -o Task09_Spleen_nnUNet.zip

Configuration 3d_fullres
Exporting fold_0
No ensemble directory found for task 009
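
The three nnUNet path variables exported in the cell above can also be derived from `root_dir` in Python, which avoids repeating the absolute paths. This is a sketch only: the folder names match the ones used in this tutorial, the command-line flags are copied from the cell above, and the helper names `nnunet_env` and `export_command` are hypothetical.

```python
import os
from pathlib import Path


def nnunet_env(root_dir):
    """Return the nnUNet path variables for the folder layout used in this tutorial."""
    base = Path(root_dir) / "nnUNet"
    return {
        "nnUNet_raw": str(base / "nnUNet_raw_data_base"),
        "nnUNet_preprocessed": str(base / "nnUNet_preprocessed"),
        "nnUNet_results": str(base / "nnUNet_trained_models"),
    }


def export_command(dataset_id, configuration, fold, output_zip):
    """Build the export command as an argument list for subprocess.run."""
    return [
        "nnUNetv2_export_model_to_zip",
        "-d", str(dataset_id),
        "-c", configuration,
        "-f", str(fold),
        "-o", output_zip,
    ]


# Usage (assuming root_dir is defined as in the cells above):
# import subprocess
# subprocess.run(
#     export_command("009", "3d_fullres", 0, "Task09_Spleen_nnUNet.zip"),
#     env={**os.environ, **nnunet_env(root_dir)},
#     check=True,
# )
```

Passing the variables through `env` limits them to the child process, so the notebook kernel's own environment stays unchanged.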


## Package MONAI Bundle

After training and validating the nnUNet model, we can package it as a MONAI Bundle to be used for inference (e.g. Active Learning with MONAI Label). The following command runs the bundle's inference configuration on the test input:

In [None]:
%%bash

export BUNDLE_ROOT=nnUNetBundle
export PYTHONPATH=$PYTHONPATH:$BUNDLE_ROOT

python -m monai.bundle run \
    --config-file $BUNDLE_ROOT/configs/inference.yaml \
    --bundle-root $BUNDLE_ROOT \
    --data-dir $BUNDLE_ROOT/test_input \
    --output-dir $BUNDLE_ROOT/test_output \
    --logging-file $BUNDLE_ROOT/configs/logging.conf
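
Before launching the command above, it can help to check that the bundle folder contains the files the command refers to. The sketch below assumes the entries referenced in this tutorial (`configs/inference.yaml`, `configs/logging.conf`, `models/model.pt` and `test_input`); the helper name `missing_bundle_entries` is hypothetical, and a real bundle may require additional files.

```python
from pathlib import Path

# Entries referenced by the commands and cells in this tutorial.
REQUIRED_ENTRIES = (
    "configs/inference.yaml",
    "configs/logging.conf",
    "models/model.pt",
    "test_input",
)


def missing_bundle_entries(bundle_root, required=REQUIRED_ENTRIES):
    """Return the required files or folders that do not exist under bundle_root."""
    root = Path(bundle_root)
    return [entry for entry in required if not (root / entry).exists()]


# Usage:
# missing = missing_bundle_entries("nnUNetBundle")
# if missing:
#     print("Missing bundle entries:", missing)
```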

To visualize the prediction in 3D Slicer:

In [8]:
import os
import subprocess

slicer_executable = "/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer"

bundle_root = "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle"
data_dir = os.path.join(bundle_root, "test_input")
ct_volume = os.path.join(data_dir, "spleen_1","spleen_1.nii.gz")

pred_volume = os.path.join(bundle_root, "test_output", "spleen_1", "spleen_1_prediction.nii.gz")

subprocess.run([
    slicer_executable,
    "--python-code", f"slicer.util.loadVolume('{ct_volume}'); pred=slicer.util.loadSegmentation('{pred_volume}'); pred.CreateClosedSurfaceRepresentation()"
])

Switch to module:  "Welcome"
"Volume" Reader has successfully read the file "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/test_input/spleen_1/spleen_1.nii.gz" "[0.32s]"
ReferenceImageExtentOffset attribute was not found in NRRD segmentation file. Assume no offset.




vtkMRMLSegmentationStorageNode (0xffd0160): vtkMRMLSegmentationStorageNode::ReadBinaryLabelmapRepresentation: Segmentation is a floating point scalar type and will be cast to an integer type by truncation (rounding towards 0).




"Segmentation" Reader has successfully read the file "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/test_output/spleen_1/spleen_1_prediction.nii.gz" "[0.21s]"
Switch to module:  ""
Switch to module:  ""


CompletedProcess(args=['/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer', '--python-code', "slicer.util.loadVolume('/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/test_input/spleen_1/spleen_1.nii.gz'); pred=slicer.util.loadSegmentation('/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/test_output/spleen_1/spleen_1_prediction.nii.gz'); pred.CreateClosedSurfaceRepresentation()"], returncode=0)

Finally, we export the Python environment and the requirements for the MONAI Bundle:

In [1]:
%%bash

conda env export -n MONAI > nnUNetBundle/environment.yml
python -m pip freeze > nnUNetBundle/requirements.txt


In [2]:
%%bash

zip -r Task09_Spleen_Bundle.zip nnUNetBundle

  adding: nnUNetBundle/ (stored 0%)
  adding: nnUNetBundle/test_output/ (stored 0%)
  adding: nnUNetBundle/test_output/spleen_1/ (stored 0%)
  adding: nnUNetBundle/test_output/spleen_1/spleen_1_prediction.nii.gz (deflated 87%)
  adding: nnUNetBundle/docs/ (stored 0%)
  adding: nnUNetBundle/docs/README.md (deflated 49%)
  adding: nnUNetBundle/LICENSE (stored 0%)
  adding: nnUNetBundle/logs/ (stored 0%)
  adding: nnUNetBundle/logs/events.out.tfevents.1738945616.jupyter-simben-40kth-2ese.531043.0 (deflated 9%)
  adding: nnUNetBundle/logs/events.out.tfevents.1738936466.jupyter-simben-40kth-2ese.372773.0 (deflated 9%)
  adding: nnUNetBundle/logs/events.out.tfevents.1739311133.jupyter-simben-40kth-2ese.689007.0 (deflated 70%)
  adding: nnUNetBundle/logs/events.out.tfevents.1738945553.jupyter-simben-40kth-2ese.528235.1 (deflated 9%)
  adding: nnUNetBundle/logs/events.out.tfevents.1738936984.jupyter-simben-40kth-2ese.423421.1 (deflated 56%)
  adding: nnUNetBundle/logs/events.out.tfevents.17391
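
The `zip -r` command above archives everything in the bundle folder, including TensorBoard event files and test predictions. If a smaller archive is preferred, the bundle can be zipped from Python while skipping selected top-level folders. This is an optional sketch, not the procedure used in this tutorial; the function name `zip_bundle` and the default exclusion list are assumptions.

```python
import zipfile
from pathlib import Path


def zip_bundle(bundle_dir, zip_path, exclude_dirs=("logs", "test_output")):
    """Zip bundle_dir into zip_path, skipping the listed top-level folders."""
    bundle_dir = Path(bundle_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(bundle_dir.rglob("*")):
            relative = path.relative_to(bundle_dir)
            if not path.is_file() or relative.parts[0] in exclude_dirs:
                continue
            # Keep the bundle folder name as the top-level entry in the archive.
            archive.write(path, (Path(bundle_dir.name) / relative).as_posix())
    return Path(zip_path)


# Usage (write the archive outside the bundle folder so it does not include itself):
# zip_bundle("nnUNetBundle", "Task09_Spleen_Bundle.zip")
```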

## MLflow Model Upload

To store the model and deploy it for inference in future use cases, we can upload it to MLflow. We use the MLflow Python API to log the model:

In [3]:
import sys
import os
import yaml
from monai.bundle import ConfigParser
import torch
import numpy as np
import mlflow
from mlflow.models import ModelSignature
from mlflow.types.schema import Schema, TensorSpec

sys.path.append("nnUNetBundle")



In [4]:
config_files = [f.path for f in os.scandir("nnUNetBundle/configs") if f.path.endswith("inference.yaml")]

config = {}
for config_file in config_files:
    with open(config_file, 'r') as file:
        config.update(yaml.safe_load(file))

config["bundle_root"] = "nnUNetBundle"

parser = ConfigParser(config,globals={"os": "os",
                                      "pathlib":"pathlib",
                                      "json":"json",
                                      "ignite":"ignite"
                                     })

parser.parse(True)

In [5]:
net = parser.get_parsed_content("network_def",instantiate=True)

nnUNet_raw is not defined and nnU-Net can only be used on data for which preprocessed files are already present on your system. nnU-Net cannot be used for experiment planning and preprocessing like this. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up properly.
nnUNet_preprocessed is not defined and nnU-Net can not be used for preprocessing or training. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up.
nnUNet_results is not defined and nnU-Net cannot be used for training or inference. If this is not intended behavior, please read documentation/setting_up_paths.md for information on how to set this up.


You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.

In [6]:
net.network_weights.load_state_dict(torch.load("nnUNetBundle/models/model.pt")['network_weights'])

You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.


<All keys matched successfully>

In [7]:
os.environ["MLFLOW_TRACKING_URI"] = "http://127.0.0.1:5000"

In [8]:
mlflow.set_experiment("nnUNet_Bundle_Spleen")
mlflow.end_run()



# The input and output tensors share one shape: a single channel followed by the patch size.
patch_shape = (1, *net.predictor.configuration_manager.patch_size)

input_schema = Schema([TensorSpec(np.dtype(np.float32), patch_shape, name="ct")])
output_schema = Schema([TensorSpec(np.dtype(np.float32), patch_shape, name="Spleen")])

signature = ModelSignature(inputs=input_schema, outputs=output_schema)

with mlflow.start_run(run_id='01bd0abe19744cc598acb9a56c2e4ae5'):
    mlflow.pytorch.log_model(
        net,
        "Task09_Spleen",
        signature=signature,
        conda_env = "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/environment.yml",
        registered_model_name = "Task09_Spleen",
        extra_files = [
            "/home/maia-user/Documents/GitHub/tutorials/bundle/Task09_Spleen_Bundle.zip",
            "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/environment.yml",
            "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/requirements.txt"
        ]
    )

Downloading artifacts: 100%|██████████| 1/1 [00:00<00:00,  2.84it/s]
Downloading artifacts: 100%|██████████| 1/1 [00:00<00:00, 178.29it/s]
Downloading artifacts: 100%|██████████| 1/1 [00:00<00:00, 512.00it/s]
 - importlib-metadata (current: 8.6.1, required: importlib-metadata==8.5.0)
 - nnunetv2 (current: 2.5.1, required: nnunetv2==2.5.2)
 - platformdirs (current: 4.3.6, required: platformdirs==3.11.0)
To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
Successfully registered model 'Task09_Spleen'.
2025/02/12 20:55:09 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: Task09_Spleen, version 1
Created version '1' of model 'Task09_Spleen'.


🏃 View run nnUNet_Bundle_Spleen at: http://127.0.0.1:5000/#/experiments/444387398057374671/runs/01bd0abe19744cc598acb9a56c2e4ae5
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/444387398057374671
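
Once the model is registered, it can be referenced by name and version through a Model Registry URI. The sketch below only builds that URI; the helper name `registered_model_uri` is hypothetical, and the name and version in the usage comment are the ones reported in the output above.

```python
def registered_model_uri(name, version):
    """Build an MLflow Model Registry URI of the form 'models:/<name>/<version>'."""
    return f"models:/{name}/{version}"


# Usage (assuming mlflow is installed and the tracking URI points to the same server):
# import mlflow
# model = mlflow.pytorch.load_model(registered_model_uri("Task09_Spleen", 1))
```

Loading the model this way requires an environment compatible with the one recorded at logging time; the dependency mismatches listed in the output above indicate where the current environment differs.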
