# Brain tumor 3D segmentation with AzureML and MONAI (BRATS21)

Glioma brain tumors are among the most aggressive and lethal types of brain tumors. They can cause a range of symptoms, including headaches, seizures, and difficulty with speech and movement. Gliomas can be difficult to diagnose and treat, and early detection is critical for improving patient outcomes.

Computer vision AI has emerged as a promising tool for supporting the diagnosis and treatment of glioma brain tumors. AI algorithms can analyze medical images of the brain and identify the location and extent of tumors with a high degree of accuracy. This can help clinicians make more informed decisions about treatment options, such as surgery or radiation therapy, and monitor the progress of the disease over time. Additionally, AI algorithms can help researchers better understand the underlying biology of gliomas and develop new therapies for this challenging disease.

This demo is based on the [MONAI 3D brain tumor segmentation tutorial](https://github.com/Project-MONAI/tutorials/blob/main/3d_segmentation/swin_unetr_brats21_segmentation_3d.ipynb) and shows how to construct a training workflow for a multi-label segmentation task.

The sub-regions considered for evaluation in the BraTS 21 challenge are the "enhancing tumor" (ET), the "tumor core" (TC), and the "whole tumor" (WT). The ET is described by areas that show hyper-intensity in T1Gd when compared to T1, but also when compared to “healthy” white matter in T1Gd. The TC describes the bulk of the tumor, which is what is typically resected. The TC entails the ET, as well as the necrotic (NCR) parts of the tumor. The appearance of NCR is typically hypo-intense in T1Gd when compared to T1. The WT describes the complete extent of the disease, as it entails the TC and the peritumoral edematous/invaded tissue (ED), which is typically depicted by the hyper-intense signal in FLAIR [BraTS 21].
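
The nesting of the three evaluation regions (ET inside TC inside WT) can be illustrated with a small NumPy sketch. It assumes the BraTS 2021 label convention used later in this notebook (1 = NCR, 2 = ED, 4 = ET); the toy array and the helper name `brats_regions` are illustrative only and not part of MONAI or the challenge tooling.

```python
import numpy as np

def brats_regions(label):
    """Return boolean masks (ET, TC, WT) from an integer label map.

    Assumes the BraTS 2021 convention: 1 = NCR, 2 = ED, 4 = ET.
    """
    et = label == 4
    tc = et | (label == 1)  # tumor core = ET + NCR
    wt = tc | (label == 2)  # whole tumor = TC + ED
    return et, tc, wt

# Toy 1D "volume": background, edema, necrosis, enhancing tumor, edema
toy = np.array([0, 2, 1, 4, 2])
et, tc, wt = brats_regions(toy)
print(int(et.sum()), int(tc.sum()), int(wt.sum()))  # voxel counts for ET, TC, WT
```

Because each region is built from the previous one, every ET voxel is also a TC voxel, and every TC voxel is also a WT voxel.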

![image](./media/fig_brats21.png)

This notebook has been developed and tested with VSCode connected to an AzureML `STANDARD_D13_V2` Compute Instance using the `azureml_py310_sdkv2` kernel.

## References

[1] Hatamizadeh, A., Nath, V., Tang, Y., Yang, D., Roth, H. and Xu, D., 2022. Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images. arXiv preprint arXiv:2201.01266.

[2] Tang, Y., Yang, D., Li, W., Roth, H.R., Landman, B., Xu, D., Nath, V. and Hatamizadeh, A., 2022. Self-supervised pre-training of swin transformers for 3d medical image analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 20730-20740).

[3] U. Baid, et al., "The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification", arXiv:2107.02314, 2021.

[4] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al., "The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)", IEEE Transactions on Medical Imaging 34(10), 1993-2024 (2015). DOI: 10.1109/TMI.2014.2377694

[5] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J.S. Kirby, et al., "Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features", Nature Scientific Data, 4:170117 (2017). DOI: 10.1038/sdata.2017.117

[6] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. Kirby, et al., "Segmentation Labels and Radiomic Features for the Pre-operative Scans of the TCGA-GBM collection", The Cancer Imaging Archive, 2017. DOI: 10.7937/K9/TCIA.2017.KLXWJJ1Q



## Installs and Imports
Run the install cell below only once per environment.

In [1]:
# based on azureml_py310_sdkv2 kernel
# %pip install torch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0
# %pip install 'monai[nibabel, ignite, tqdm]'
# %pip install itkwidgets

In [2]:
import os
import tempfile
import base64
import json
import datetime
import random

import numpy as np
import matplotlib.pyplot as plt

import torch

import tarfile
import urllib.request

from itkwidgets import view
from ipywidgets import interact

from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml import MLClient, command, Input, dsl, load_component
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    JobService,
    Data,
    CodeConfiguration,
    OnlineRequestSettings,
    AmlCompute,
)
from azure.core.exceptions import ResourceNotFoundError

from monai.apps import DecathlonDataset
from monai.data import DataLoader, Dataset
from monai.transforms import (
    Compose,
    LoadImaged,
    EnsureChannelFirstd,
    EnsureTyped,
    Orientationd,
    Spacingd,
    NormalizeIntensityd,
    MapTransform,
)
from monai.visualize.utils import blend_images

## Define central variables

In [3]:
# Training
experiment_name = 'monai-brain-tumor-segmentation' # AzureML experiment name
#dataset_name="BRATS-dataset"
train_target = 'NC64asT4v3-GPU'

# Model Deployment
registered_model_name = 'SegResNet'
# deployment_name = 'blue'

# Environments
train_env_name = "monai-demo-train-env"

# Pipeline
pipeline_name = 'training_pipeline'

# BraTS data from Kaggle
tar_location = 'azureml://subscriptions/b7d41fc8-d35d-41db-92ed-1f7f1d32d4d9/resourcegroups/monai-3d-rg/workspaces/aml-monai-3d/datastores/data/paths/braintumor/BraTS2021_Training_Data.tar'

# Visualization and validation sample
sample_image = '../samples/BraTS2021_00402_flair.nii.gz' 
sample_image_t1 = '../samples/BraTS2021_00402_t1.nii.gz'
sample_image_t1ce = '../samples/BraTS2021_00402_t1ce.nii.gz'
sample_image_t2 = '../samples/BraTS2021_00402_t2.nii.gz'
sample_label = '../samples/BraTS2021_00402_seg.nii.gz'
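
The `tar_location` value above is a long-form AzureML datastore URI. As a rough sketch, and assuming the general shape `azureml://subscriptions/<id>/resourcegroups/<rg>/workspaces/<ws>/datastores/<name>/paths/<path>`, it can be split into its parts as follows. The helper `parse_datastore_uri` and the placeholder values are hypothetical and not part of the Azure SDK.

```python
import re

# Assumed general shape of a long-form AzureML datastore URI
_DATASTORE_URI = re.compile(
    r"^azureml://subscriptions/(?P<subscription>[^/]+)"
    r"/resourcegroups/(?P<resource_group>[^/]+)"
    r"/workspaces/(?P<workspace>[^/]+)"
    r"/datastores/(?P<datastore>[^/]+)"
    r"/paths/(?P<path>.+)$"
)

def parse_datastore_uri(uri):
    """Split a long-form datastore URI into a dict of its named parts."""
    match = _DATASTORE_URI.match(uri)
    if match is None:
        raise ValueError(f"Not a long-form AzureML datastore URI: {uri}")
    return match.groupdict()

# Placeholder subscription, resource group and workspace values
parts = parse_datastore_uri(
    "azureml://subscriptions/<subscription-id>/resourcegroups/<resource-group>"
    "/workspaces/<workspace>/datastores/data/paths/braintumor/BraTS2021_Training_Data.tar"
)
print(parts["datastore"], parts["path"])
```

Checking the datastore name and relative path this way can help catch a mistyped URI before a pipeline job is submitted.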

In [4]:
try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work.
    # This opens a browser page for interactive sign-in.
    credential = InteractiveBrowserCredential()

ml_client = MLClient.from_config(credential=credential)

2023-04-06 06:31:14,848 - No environment configuration found.
2023-04-06 06:31:14,849 - ManagedIdentityCredential will use Azure ML managed identity
2023-04-06 06:31:14,957 - DefaultAzureCredential acquired a token from ManagedIdentityCredential


## Inspect sample image

In [5]:
class ConvertToMultiChannelBasedOnBratsClassesd(MapTransform):
    """
    Convert labels to multi channels based on brats 2021 classes:
    label 1 necrotic tumor core (NCR)
    label 2 peritumoral edematous/invaded tissue 
    label 3 is not used in the new dataset version
    label 4 GD-enhancing tumor 
    The possible classes are:
      TC (Tumor core): merge labels 1 and 4
      WT (Whole tumor): merge labels 1,2 and 4
      ET (Enhancing tumor): label 4

    """

    def __call__(self, data):
        d = dict(data)
        for key in self.keys:
            result = []
            # merge label 1 and label 4 to construct TC
            result.append(torch.logical_or(d[key] == 1, d[key] == 4))
            # merge labels 1, 2 and 4 to construct WT
            result.append(
                torch.logical_or(
                    torch.logical_or(d[key] == 1, d[key] == 2), d[key] == 4
                )
            )
            # label 4 is ET
            result.append(d[key] == 4)
            d[key] = torch.stack(result, axis=0).float()
        return d

val_transform = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        EnsureChannelFirstd(keys="image"),
        EnsureTyped(keys=["image", "label"]),
        ConvertToMultiChannelBasedOnBratsClassesd(keys="label"),
        Orientationd(keys=["image", "label"], axcodes="RAS"),
        Spacingd(
            keys=["image", "label"],
            pixdim=(1.0, 1.0, 1.0),
            mode=("bilinear", "nearest"),
        ),
        NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True),
    ]
)

data_list = [{'image': sample_image, 'label': sample_label}]
val_ds = Dataset(data=data_list, transform=val_transform)

img_vol = val_ds[0]["image"].numpy()
seg_vol = val_ds[0]["label"].numpy()
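
To make the effect of `NormalizeIntensityd(nonzero=True, channel_wise=True)` in the transform chain above more concrete, here is a minimal NumPy sketch of the idea: each channel is z-scored using only its nonzero voxels, so the zero-valued background stays at zero. This is an illustrative re-implementation under that assumption, not MONAI's actual code; the function name `normalize_nonzero_channelwise` is hypothetical.

```python
import numpy as np

def normalize_nonzero_channelwise(img):
    """Z-score each channel using only its nonzero voxels.

    img: array of shape (channels, *spatial). Zero voxels are left untouched.
    """
    out = img.astype(np.float32, copy=True)
    for c in range(out.shape[0]):
        mask = out[c] != 0
        if not mask.any():
            continue  # empty channel, nothing to normalize
        mean = out[c][mask].mean()
        std = out[c][mask].std()
        # Guard against division by zero for constant channels
        out[c][mask] = (out[c][mask] - mean) / (std if std > 0 else 1.0)
    return out

toy = np.array([[[0.0, 2.0], [4.0, 6.0]]])  # one channel, 2x2 image
normed = normalize_nonzero_channelwise(toy)
foreground = normed[0][toy[0] != 0]
print(normed[0, 0, 0], foreground)
```

In the toy example, the background voxel remains zero while the three foreground voxels end up with zero mean and unit standard deviation.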


In [6]:
# Inspect 3d structure - viewer only works in VSCode 

# img_vol_ch = img_vol[0,:,:,:]
# seg_vol_ch = seg_vol[0,:,:,:]


# viewer = view(image= img_vol_ch * 255,
#               label_image= seg_vol_ch * 255,
#               gradient_opacity=0.4,
#               background = (0.5, 0.5, 0.5))
              
# viewer

In [7]:
# viewer.close()

## Create compute resources, environments and datasets
__Note:__ Creating compute resources, training/scoring environments and the dataset __needs to be performed only once__. If you have executed these steps previously, skip to the next section of this notebook.

Note that we are using low-priority compute in this demo as the most cost-efficient option. Low-priority VMs are significantly cheaper than standard dedicated compute. However, these resources are not always available, and there is a risk that a training job might be pre-empted.

In [8]:
try:
    _ = ml_client.compute.get(train_target)
    print("Found existing compute target.")
except ResourceNotFoundError:
    print("Creating a new compute target...")
    compute_config = AmlCompute(
        name=train_target,
        type="amlcompute",
        size="STANDARD_NC24RS_V3", # 4 x Tesla V100, 16 GB GPU memory each
        tier="low_priority",
        idle_time_before_scale_down=600,
        min_instances=0,
        max_instances=2,
    )
    ml_client.begin_create_or_update(compute_config)

2023-04-06 06:31:16,284 - DefaultAzureCredential acquired a token from ManagedIdentityCredential
Found existing compute target.


In [9]:


# Get the existing training environment
training_environment = ml_client.environments.get(name=train_env_name, label='latest')

# Create the training environment
# Run this only once
# training_environment = Environment(
#     image="mcr.microsoft.com/azureml/" + "openmpi4.1.0-cuda11.1-cudnn8-ubuntu20.04:latest",
#     conda_file="./train-env.yml",
#     name=train_env_name,
#     description="Parallel PyTorch training on AzureML with MONAI")

# ml_client.environments.create_or_update(training_environment)

2023-04-06 06:31:17,046 - DefaultAzureCredential acquired a token from ManagedIdentityCredential


In [10]:
# Load the pipeline components from their YAML specifications
from azure.ai.ml import load_component

upload_component = load_component(source="../components/upload_from_blob/spec.yaml")
train_component = load_component(source="../components/train_segmentation/spec.yaml")

In [11]:
@dsl.pipeline(
    name=pipeline_name,
    description='Pipeline for segmentation.',
    default_compute=train_target,
)
def image_pipeline_func(pipeline_input_file):

    # Load data pipeline step   
    load_step = upload_component(
        blob_file_location=pipeline_input_file,
    )
    #load_step.compute = "cpucluster"
    
    # Train pipeline step
    train_step = train_component(
        input_data=load_step.outputs.image_data_folder,
        best_model_name=registered_model_name,
    )
    train_step.distribution.process_count_per_instance = 4
    train_step.resources = {'instance_count': 1, 'shm_size': '300g'}
    train_step.environment_variables = {'AZUREML_ARTIFACTS_DEFAULT_TIMEOUT': '1000'}

    return {
        "model" : train_step.outputs.model,
    }
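
One way to read the distribution settings in the pipeline above: assuming the usual PyTorch distributed convention of one training process per GPU, the total number of processes (world size) is the product of `instance_count` and `process_count_per_instance`. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def world_size(instance_count, process_count_per_instance):
    """Total number of training processes across all nodes."""
    return instance_count * process_count_per_instance

def global_rank(node_rank, local_rank, process_count_per_instance):
    """Global rank of a process, given its node index and its index on that node."""
    return node_rank * process_count_per_instance + local_rank

# Settings used in this notebook: 1 node with 4 processes
print(world_size(instance_count=1, process_count_per_instance=4))
```

With the values used here, the job runs four processes on a single node, which matches the four GPUs of the `STANDARD_NC24RS_V3` size configured earlier.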


In [12]:
pipeline_job = image_pipeline_func(
    pipeline_input_file=Input(type="uri_file", path=tar_location),
)

# pipeline_job.outputs.uncompressed_data = Output(type="uri_folder", path="...")
# pipeline_job.outputs.model = Output(type="uri_folder", path="...")

ml_client.jobs.create_or_update(pipeline_job, experiment_name=experiment_name)

2023-04-06 06:31:23,800 - DefaultAzureCredential acquired a token from ManagedIdentityCredential
2023-04-06 06:31:26,498 - DefaultAzureCredential acquired a token from ManagedIdentityCredential


| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| monai-brain-tumor-segmentation | sharp_office_l4h4r612dg | pipeline | Preparing | Link to Azure Machine Learning studio |
