# MAIA

This tutorial showcases the major applications and functionalities available in MAIA. It covers the essential stages of the Medical AI lifecycle to give a comprehensive overview of the platform.

The tutorial is based on two different datasets:
- [**Decathlon Spleen**](http://medicaldecathlon.com/): a dataset of 3D spleen CT scans from the Medical Segmentation Decathlon challenge. This NIfTI dataset is used to demonstrate the model preprocessing, training and evaluation functionalities in MAIA.
- [**CT Lymph Nodes**](https://www.cancerimagingarchive.net/collection/ct-lymph-nodes/) from TCIA: a dataset of 3D lymph node CT scans from The Cancer Imaging Archive. This DICOM dataset is used to demonstrate the data management functionalities in MAIA, including DICOM upload, visualization, annotation and AI model inference (including Active Learning with MONAI Label).

The tutorial will cover all the necessary steps to download the Decathlon Spleen dataset, preprocess it, train a [ResEnc nnU-Net](https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/resenc_presets.md) model as a MONAI Bundle, evaluate the model, and finally deploy it for inference on the CT Lymph Nodes dataset. The tutorial will also cover the necessary steps to upload the CT Lymph Nodes dataset, visualize it, annotate it, and perform AI model inference on it.

## Setup environment and imports

In [None]:
import sys

!{sys.executable} -m pip install odict plotly dtale "monai[nibabel, skimage, scipy, pillow, tensorboard, gdown, ignite, torchvision, itk, tqdm, pandas, mlflow, matplotlib, pydicom]" "pydicom==2.4.4" nnunetv2==2.5.1 fire

In [None]:
from monai.apps import DecathlonDataset
import subprocess
import json
import os
import pandas as pd
import pathlib
import mlflow
from pathlib import Path
import numpy as np
import yaml
import random
import dtale
import dtale.app as dtale_app

from monai.bundle import ConfigParser

from mlflow.models import ModelSignature
from mlflow.types.schema import Schema, TensorSpec

## Download the Decathlon Spleen dataset

In [None]:
os.environ["MONAI_DATA_DIRECTORY"] = "/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data"

In [None]:
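import tempfile

# Use MONAI_DATA_DIRECTORY when it is set; otherwise fall back to a temporary directory
directory = os.environ.get("MONAI_DATA_DIRECTORY")
if directory is not None:
    os.makedirs(directory, exist_ok=True)
root_dir = tempfile.mkdtemp() if directory is None else directory
print(root_dir)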

In [None]:
# Download (if needed) and load the training section of the Spleen task
DecathlonDataset(root_dir=root_dir, task="Task09_Spleen", section="training", download=True)

## Visualize Spleen CT and SEG in 3D Slicer

You can navigate to [Remote Desktop](/user/{USERNAME}/proxy/80/desktop/{USERNAME}) to interact with the Remote Desktop environment and 3D Slicer.

In [None]:
slicer_executable = "/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer"

case_id = "spleen_2"

ct_volume = Path(root_dir).joinpath("Task09_Spleen/imagesTr",f"{case_id}.nii.gz")

seg_volume = Path(root_dir).joinpath("Task09_Spleen/labelsTr",f"{case_id}.nii.gz")

subprocess.run([
    slicer_executable,
    "--python-code", f"slicer.util.loadVolume('{ct_volume}'); seg=slicer.util.loadSegmentation('{seg_volume}'); seg.CreateClosedSurfaceRepresentation()"
])
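
The Slicer launch command is assembled the same way several times in this tutorial: a list holding the executable, the `--python-code` flag, and a semicolon-separated Python snippet. As a minimal sketch, the pattern could be wrapped in a small helper. The function name `build_slicer_command` and the paths below are hypothetical and only serve as illustration; the `slicer.util` calls are the ones already used in the cell above.

```python
def build_slicer_command(slicer_executable, ct_volume, segmentations):
    """Build the argument list that opens a CT volume and segmentations in 3D Slicer.

    `segmentations` is a list of file paths; each one is loaded and converted
    to a closed-surface representation so that it can be shown in 3D.
    """
    # repr() produces a quoted Python string literal for each path
    statements = [f"slicer.util.loadVolume({str(ct_volume)!r})"]
    for index, seg_path in enumerate(segmentations):
        statements.append(f"seg{index} = slicer.util.loadSegmentation({str(seg_path)!r})")
        statements.append(f"seg{index}.CreateClosedSurfaceRepresentation()")
    return [str(slicer_executable), "--python-code", "; ".join(statements)]


# Hypothetical paths, for illustration only
command = build_slicer_command(
    "/opt/Slicer/Slicer",
    "/data/Task09_Spleen/imagesTr/spleen_2.nii.gz",
    ["/data/Task09_Spleen/labelsTr/spleen_2.nii.gz"],
)
print(command[2])
```

The returned list can be passed directly to `subprocess.run(command)`.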

## nnUNet Training

To run the nnUNet training, follow the tutorial [nnUNet Bundle Training](06_nnunet_monai_bundle.ipynb).


## Validation

To view the fold-0 validation predictions in 3D Slicer, together with the ground truth and the CT volume, we can use the following code:

In [None]:
with open(Path(root_dir).joinpath("nnUNet/nnUNet_raw_data_base/Dataset009_Task09_Spleen/datalist.json")) as f:
    datalist = json.load(f)

In [None]:
fold_0_datalist_cases = [case for case in datalist['training'] if case['fold'] == 0]
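
The filter above relies on each entry of the `training` list carrying an `image` path, a `fold` index and a `new_name`. The sketch below illustrates that structure on a small in-memory datalist and groups the cases by fold. All file names and `new_name` values are made up for illustration; only the keys mirror the ones used in this tutorial.

```python
from collections import defaultdict

# Hypothetical entries; only the keys ("image", "fold", "new_name") mirror the tutorial
datalist = {
    "training": [
        {"image": "./imagesTr/spleen_2.nii.gz", "fold": 0, "new_name": "case_a"},
        {"image": "./imagesTr/spleen_3.nii.gz", "fold": 1, "new_name": "case_b"},
        {"image": "./imagesTr/spleen_6.nii.gz", "fold": 0, "new_name": "case_c"},
    ]
}

cases_by_fold = defaultdict(list)
for case in datalist["training"]:
    # File name without its directory and without the ".nii.gz" suffix
    case_id = case["image"].split("/")[-1].split(".")[0]
    cases_by_fold[case["fold"]].append((case_id, case["new_name"]))

fold_0_cases = cases_by_fold[0]
print(fold_0_cases)
```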

In [None]:
slicer_executable = "/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer"

# Take the first fold-0 validation case
case = fold_0_datalist_cases[0]

# Original case name, derived from the image file name without its extension
case_id = case['image'].split("/")[-1].split(".")[0]

# Name of the case in the nnUNet dataset, used for the prediction file
new_name = case["new_name"]

ct_volume = Path(root_dir).joinpath("Task09_Spleen/imagesTr",f"{case_id}.nii.gz")

seg_volume = Path(root_dir).joinpath("Task09_Spleen/labelsTr",f"{case_id}.nii.gz")

pred_volume = Path(root_dir).joinpath("nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/validation",new_name+".nii.gz")

# Load the CT, the prediction and the ground truth, and build 3D surfaces for both segmentations
subprocess.run([
    slicer_executable,
    "--python-code", f"slicer.util.loadVolume('{ct_volume}'); pred=slicer.util.loadSegmentation('{pred_volume}'); pred.CreateClosedSurfaceRepresentation(); seg=slicer.util.loadSegmentation('{seg_volume}'); seg.CreateClosedSurfaceRepresentation()"
])

### Validation Metrics

The validation metrics are stored in a JSON file named `summary.json` in the validation folder. We can load the file and visualize the metrics with [DTale](https://github.com/man-group/dtale) using the following code:

In [None]:
summary_file = Path(root_dir).joinpath("nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/validation","summary.json")


In [None]:
with open(summary_file) as f:
    summary = json.load(f)

In [None]:
config_dict = {
    "label_dict":{
        "Background": 0,
        "Spleen": 1
    },
    "label_suffix": ".nii.gz"
}

In [None]:
df = []

label_to_name = {v: k for k, v in config_dict["label_dict"].items()}

# Flatten the per-case metrics into one row per (case, label, metric)
for case in summary['metric_per_case']:
    case_name = Path(case['reference_file']).name[:-len(config_dict["label_suffix"])]
    for label_id, metrics in case['metrics'].items():
        for metric, value in metrics.items():
            df.append({
                "Case": case_name,
                "Label": label_to_name[int(label_id)],
                "Metric": metric,
                "Value": value
            })

In [None]:
df = pd.DataFrame(df)
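
To make the reshaping step easier to follow, here is a minimal sketch of the same flattening on a hand-written excerpt, using only the standard library. The excerpt follows the structure the loop above relies on (`metric_per_case`, `reference_file`, `metrics`); the file paths, the `IoU` entries and all numeric values are made up for illustration.

```python
from pathlib import Path
from statistics import mean

# Hypothetical excerpt of summary.json; paths and values are made up
summary = {
    "metric_per_case": [
        {
            "reference_file": "/data/labelsTr/case_a.nii.gz",
            "metrics": {"1": {"Dice": 0.95, "IoU": 0.90}},
        },
        {
            "reference_file": "/data/labelsTr/case_b.nii.gz",
            "metrics": {"1": {"Dice": 0.85, "IoU": 0.74}},
        },
    ]
}
label_to_name = {1: "Spleen"}
label_suffix = ".nii.gz"

# One row per (case, label, metric), the same long format used for the DataFrame
rows = [
    {
        "Case": Path(case["reference_file"]).name[: -len(label_suffix)],
        "Label": label_to_name[int(label_id)],
        "Metric": metric,
        "Value": value,
    }
    for case in summary["metric_per_case"]
    for label_id, metrics in case["metrics"].items()
    for metric, value in metrics.items()
]

# Mean Dice over all cases
mean_dice = mean(row["Value"] for row in rows if row["Metric"] == "Dice")
print(len(rows), round(mean_dice, 3))
```

The long format keeps one value per row, which makes it straightforward to filter by metric or label before plotting or aggregating.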

In [None]:
dtale_app.JUPYTER_SERVER_PROXY = True

d = dtale.show(df, host="0.0.0.0")

In [None]:
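from IPython.display import Markdown
from IPython.core.magic import register_cell_magic

# URL of the running DTale instance (read from a private attribute of the DTale object)
DTALE_URL = d._main_url


@register_cell_magic
def markdown(line, cell):
    # Render the cell as Markdown, filling {NAME} placeholders from the notebook globals
    return Markdown(cell.format(**globals()))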

In [None]:
%%markdown

[DTale]({DTALE_URL})

The DTale charts can be recreated and visualized in the notebook using the following code:

In [None]:
# DISCLAIMER: 'df' refers to the data you passed in when calling 'dtale.show'

import pandas as pd

if isinstance(df, (pd.DatetimeIndex, pd.MultiIndex)):
    df = df.to_frame(index=False)

# remove any pre-existing indices for ease of use in the D-Tale code, but this is not required
df = df.reset_index().drop('index', axis=1, errors='ignore')
df.columns = [str(c) for c in df.columns]  # update columns to strings in case they are numbers

df = df.query("""`Metric` == 'Dice'""")

chart_data = pd.concat([
    df['Case'],
    df['Value'],
], axis=1)
chart_data = chart_data.sort_values(['Case'])
chart_data = chart_data.rename(columns={'Case': 'x'})
chart_data = chart_data.dropna()

import plotly.graph_objs as go

charts = []
charts.append(go.Bar(
    x=chart_data['x'],
    y=chart_data['Value']
))
figure = go.Figure(data=charts, layout=go.Layout({
    'barmode': 'group',
    'legend': {'orientation': 'h', 'y': -0.3},
    'title': {'text': 'Validation Fold 0, Dice score'},
    'xaxis': {'title': {'text': 'Case'}},
    'yaxis': {'title': {'text': 'Dice'}, 'type': 'linear'}
}))

# If you're having trouble viewing your chart in your notebook try passing your 'chart' into this snippet:
#
from plotly.offline import iplot, init_notebook_mode
#
init_notebook_mode(connected=True)
for chart in charts:
    chart.pop('id', None) # for some reason iplot does not like 'id'
#iplot(figure)

figure.write_html("Fold_0_Val_Dice.html")

In [None]:
df.groupby(["Metric"]).describe()

In [None]:
# Mean Dice over the validation cases, selected explicitly by metric name
df.loc[df["Metric"] == "Dice", "Value"].mean()

Finally, we can upload the validation metrics and the plots to MLFlow using the following code:

In [None]:
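# MLFLOW_TRACKING_URI must be set in the environment before running this cell
mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
mlflow.set_experiment("Task09_Spleen")

# Resume the existing run with this ID and attach the fold-0 validation results to it
with mlflow.start_run(run_id="1b31a872e1b644cb9787cf91b60be449"):
    # Mean Dice over the fold-0 validation cases
    mean_dice = df.loc[df["Metric"] == "Dice", "Value"].mean()

    mlflow.log_metric("Val_Dice_Fold_0", mean_dice)
    mlflow.log_artifact("Fold_0_Val_Dice.html")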

In the final step of the validation phase, we export the trained model, saving the nnUNet Bundle as a zip file (`Task09_Spleen_nnUNet.zip`). The zip file contains the model, the configuration files, and the environment files.

In [None]:
%%bash

export nnUNet_raw="/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_raw_data_base"
export nnUNet_preprocessed="/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_preprocessed"
export nnUNet_results="/home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_trained_models"

touch /home/maia-user/Documents/GitHub/tutorials/bundle/MONAI/Data/nnUNet/nnUNet_trained_models/Dataset009_Task09_Spleen/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/progress.png

nnUNetv2_export_model_to_zip -d 009 -c 3d_fullres -f 0 -o Task09_Spleen_nnUNet.zip
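
To check what an exported archive contains, the standard-library `zipfile` module can list its members. The sketch below is only an illustration: the helper name `list_zip_members` is hypothetical, and the demo archive is built on the fly with made-up file names so that the example runs anywhere. In practice the helper would be pointed at `Task09_Spleen_nnUNet.zip`.

```python
import tempfile
import zipfile
from pathlib import Path


def list_zip_members(zip_path):
    """Return the sorted member names stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())


# Build a throwaway archive with made-up file names so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_zip = Path(tmp_dir) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("fold_0/checkpoint_final.pth", b"")
        archive.writestr("plans.json", "{}")
    members = list_zip_members(demo_zip)

print(members)
```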

## Package MONAI Bundle

After training and validating the nnUNet model, we can package it as a MONAI Bundle to be used for inference (e.g. Active Learning with MONAI Label). The following command runs the bundle inference on the cases in `test_input` and writes the predictions to `test_output`:

In [None]:
%%bash

export BUNDLE_ROOT=nnUNetBundle
export PYTHONPATH=$PYTHONPATH:$BUNDLE_ROOT

python -m monai.bundle run \
    --config-file $BUNDLE_ROOT/configs/inference.yaml \
    --bundle-root $BUNDLE_ROOT \
    --data-dir $BUNDLE_ROOT/test_input \
    --output-dir $BUNDLE_ROOT/test_output \
    --logging-file $BUNDLE_ROOT/configs/logging.conf

To visualize the prediction in 3D Slicer:

In [None]:
import os
import subprocess

slicer_executable = "/home/maia-user/Documents/Slicer-5.7.0-2024-08-05-linux-amd64/Slicer"

bundle_root = "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle"
data_dir = os.path.join(bundle_root, "test_input")
ct_volume = os.path.join(data_dir, "spleen_1","spleen_1.nii.gz")

pred_volume = os.path.join(bundle_root, "test_output", "spleen_1", "spleen_1_prediction.nii.gz")

subprocess.run([
    slicer_executable,
    "--python-code", f"slicer.util.loadVolume('{ct_volume}'); pred=slicer.util.loadSegmentation('{pred_volume}'); pred.CreateClosedSurfaceRepresentation()"
])

Finally, we export the Python environment and the requirements for the MONAI Bundle, and compress the bundle folder into a zip archive:

In [None]:
%%bash

conda env export -n MONAI > nnUNetBundle/environment.yml
python -m pip freeze > nnUNetBundle/requirements.txt

In [None]:
%%bash

zip -r Task09_Spleen_Bundle.zip nnUNetBundle

## MLFlow Model Upload

To store the model and be able to deploy it for inference in future use cases, we can upload it to MLFlow. We will use the MLFlow Python API to log the model:

In [None]:
import sys
import os
import yaml
from monai.bundle import ConfigParser
import torch
import numpy as np
import mlflow
from mlflow.models import ModelSignature
from mlflow.types.schema import Schema, TensorSpec

sys.path.append("nnUNetBundle")

In [None]:
config_files = [f.path for f in os.scandir("nnUNetBundle/configs") if f.path.endswith("inference.yaml")]

config = {}
for config_file in config_files:
    with open(config_file, 'r') as file:
        config.update(yaml.safe_load(file))

config["bundle_root"] = "nnUNetBundle"

parser = ConfigParser(config,globals={"os": "os",
                                      "pathlib":"pathlib",
                                      "json":"json",
                                      "ignite":"ignite"
                                     })

parser.parse(True)

In [None]:
net = parser.get_parsed_content("network_def",instantiate=True)

In [None]:
net.network_weights.load_state_dict(torch.load("nnUNetBundle/models/model.pt")['network_weights'])

In [None]:
os.environ["MLFLOW_TRACKING_URI"] = "http://127.0.0.1:5000"

In [None]:
mlflow.set_experiment("nnUNet_Bundle_Spleen")
# Close any run that is still active before starting the upload
mlflow.end_run()

# Input and output tensors share the same shape: 1 followed by the nnUNet patch size
patch_shape = (1, *net.predictor.configuration_manager.patch_size)

input_schema = Schema([TensorSpec(np.dtype(np.float32), patch_shape, name="ct")])
output_schema = Schema([TensorSpec(np.dtype(np.float32), patch_shape, name="Spleen")])

signature = ModelSignature(inputs=input_schema, outputs=output_schema)

with mlflow.start_run(run_id='01bd0abe19744cc598acb9a56c2e4ae5'):
    mlflow.pytorch.log_model(
        net,
        "Task09_Spleen",
        signature=signature,
        conda_env = "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/environment.yml",
        registered_model_name = "Task09_Spleen",
        extra_files = [
            "/home/maia-user/Documents/GitHub/tutorials/bundle/Task09_Spleen_Bundle.zip",
            "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/environment.yml",
            "/home/maia-user/Documents/GitHub/tutorials/bundle/nnUNetBundle/requirements.txt"
        ]
    )