# ACDC Tutorial <a class="jp-toc-ignore"></a>
This Jupyter notebook demonstrates how to run **ASCENT** on the **ACDC** dataset. The dataset utilized in this tutorial is a cropped version of the original dataset. You can find the original dataset [here](https://humanheart-project.creatis.insa-lyon.fr/database/#collection/637218c173e9f0047faa00fb).

Reference:  
*O. Bernard, A. Lalande, C. Zotti, F. Cervenansky, et al. "Deep Learning Techniques for Automatic MRI Cardiac Multi-structures Segmentation and Diagnosis: Is the Problem Solved ?" in IEEE Transactions on Medical Imaging, vol. 37, no. 11, pp. 2514-2525, Nov. 2018.*

## I. Setup and Dependencies Installation <a class="anchor" id="setup"></a>
The first step is to install the necessary dependencies. Follow the instructions in the **Install** section of the [README](../README.md) to create and activate a `conda` environment. If you haven't installed the required dependencies yet, run the code below to install them.

In [None]:
%%capture project_path_setup
import sys

if "../" in sys.path:
    print(sys.path)
else:
    sys.path.append("../")
    print(sys.path)

In [None]:
%%capture packages_install
# Install PyTorch
%conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y

# Install ASCENT as an editable package
%pip install -e ../.
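
After installation, a quick sanity check can confirm that the packages are importable. The sketch below is not part of ASCENT: `check_packages` is a hypothetical helper built on the standard library, and the names passed to it are the import names assumed for the packages installed above.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: find_spec(name) is not None for name in names}


# "torch", "torchvision" and "ascent" are the assumed import names.
status = check_packages(["torch", "torchvision", "ascent"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If a package is reported as missing, rerun the installation cells or check that the right environment is active.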

## II. Dataset and Preprocessing <a class="anchor" id="dataset_preprocessing"></a>

### A. Download the dataset <a class="anchor" id="download"></a>
Once the environment is successfully set up, download the ACDC dataset by executing the following cell. The dataset will be downloaded to the current working directory.

In [None]:
import sys
from pathlib import Path

from tqdm.auto import tqdm

if "./" in sys.path:
    print(sys.path)
else:
    sys.path.append("./")
    print(sys.path)

# Make sure the data is downloaded and extracted where it should be
if not Path("./ACDC").is_dir():
    import zipfile
    from io import BytesIO
    from urllib.request import urlopen

    zipurl = "https://humanheart-project.creatis.insa-lyon.fr/database/api/v1/collection/637218c173e9f0047faa00fb/download"
    with urlopen(zipurl) as zipresp:
        with zipfile.ZipFile(BytesIO(zipresp.read())) as zfile:
            for member in tqdm(
                zfile.infolist(), desc="Downloading and extracting data", position=0, leave=True
            ):
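                try:
                    zfile.extract(member, "./")
                except zipfile.BadZipFile as e:
                    # Report the corrupt member instead of silently skipping it
                    print(f"Could not extract {member.filename}: {e}")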

acdc_data_path = "./ACDC/database"
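
To confirm that the archive was extracted, it can help to count the files under the dataset folder. The following is a minimal sketch using only the standard library. `count_files_by_suffix` is a hypothetical helper, and no particular folder layout is assumed beyond the `./ACDC/database` path used above.

```python
from collections import Counter
from pathlib import Path


def count_files_by_suffix(root):
    """Count files under `root` (recursively), grouped by their full suffix."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Joining all suffixes keeps compound ones such as ".nii.gz" together
            counts["".join(path.suffixes) or "<none>"] += 1
    return counts


# Guarded so the cell is harmless if the download has not run yet.
if Path("./ACDC/database").is_dir():
    print(count_files_by_suffix("./ACDC/database"))
```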

### B. Convert the dataset to nnU-Net format

Next, we reformat the dataset to the nnU-Net format. You can find the conversion script we use [here](../ascent/dataset_conversion/acdc.py). For the purpose of this tutorial and to save time, we will be working with a cropped version of the ACDC dataset. Execute the following cells to convert the dataset to the nnU-Net format.

In [None]:
# Executing this cell will show the help message for the ACDC dataset conversion script
%run -i ../ascent/dataset_conversion/acdc.py -h

In [None]:
# Executing this cell will crop the ACDC dataset and convert it to the nnU-Net format
# Use -cf flag to crop the dataset
%run -i ../ascent/dataset_conversion/acdc.py -d {acdc_data_path} -o "../data" -n "ACDC" -cf

### C. Preprocess the dataset and generate experiment plans

The converted dataset is now located in `../data/ACDC/raw`. Execute the following cell to preprocess it and generate the experiment plans. The preprocessed dataset is written to `../data/ACDC/preprocessed`. During preprocessing, the data is cropped to non-zero regions, resampled to the median image spacing, and z-score normalized. The preprocessed data are stored in `.npz` format and then unpacked to `.npy` for faster loading.

> **ⓘ**
> In the next cell, we use the `%run` magic command in Jupyter for a more visually appealing, real-time output. In a real terminal environment, simply use the `ascent_preprocess_and_plan` command.

In [None]:
%run -i ../ascent/preprocess_and_plan.py dataset=ACDC
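
The cropping and z-score steps described above can be illustrated on a toy volume. This is a simplified NumPy sketch and not ASCENT's actual preprocessing code: `crop_to_nonzero` and `zscore_normalize` are hypothetical helpers, resampling is omitted, and normalizing over the whole cropped volume is an assumption.

```python
import numpy as np


def crop_to_nonzero(volume):
    """Crop a volume to the bounding box of its non-zero voxels."""
    nonzero = np.argwhere(volume != 0)
    if nonzero.size == 0:
        return volume  # nothing to crop
    lower = nonzero.min(axis=0)
    upper = nonzero.max(axis=0) + 1
    slices = tuple(slice(lo, hi) for lo, hi in zip(lower, upper))
    return volume[slices]


def zscore_normalize(volume, eps=1e-8):
    """Subtract the mean and divide by the standard deviation."""
    volume = volume.astype(np.float32)
    return (volume - volume.mean()) / (volume.std() + eps)


# Toy 3D volume: zeros everywhere except a small block of intensities.
volume = np.zeros((4, 6, 6), dtype=np.float32)
volume[1:3, 2:5, 1:4] = np.arange(18, dtype=np.float32).reshape(2, 3, 3) + 1.0

cropped = crop_to_nonzero(volume)
normalized = zscore_normalize(cropped)
print(cropped.shape)
print(float(normalized.mean()), float(normalized.std()))
```

After normalization, the mean is close to 0 and the standard deviation close to 1.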

**YAML** plans for training 2D and 3D nnU-Nets are generated and stored in `../ascent/configs`, starting with the `acdc_` prefix. Examine the generated 3D model plan by executing the following cell.

In [None]:
from ruamel.yaml import YAML

yaml = YAML(typ="safe")
# Use a context manager so the file handle is closed after loading
with open("../ascent/configs/model/acdc_3d.yaml") as f:
    plan = yaml.load(f)
print(plan)
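
A nested plan can be hard to read when printed as a single dictionary. As a sketch, the hypothetical helper `flatten` below turns nested dictionaries into dotted keys. The `example_plan` dictionary is an invented stand-in, and the keys in the real plan may differ.

```python
def flatten(mapping, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in mapping.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Hypothetical stand-in for a loaded plan; the real keys may differ.
example_plan = {"net": {"patch_size": [16, 128, 128], "in_channels": 1}, "name": "demo"}
for key, value in flatten(example_plan).items():
    print(f"{key}: {value}")
```

Passing the loaded plan instead of `example_plan` prints one setting per line.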

## III. Training and Validation
We are now ready to train the model. In this step, a 3D nnU-Net model is trained on the ACDC dataset using the `acdc_3d` experiment plan. We use the TensorBoard logger to monitor training; other loggers such as wandb and Comet can also be used.

The path to the log folder is printed at the end of training. Execute the following cell to start training. To save time and resources, we use a reduced number of epochs and a smaller batch size; keep the default values for better performance.

> **ⓘ**
> Mixed precision is enabled by default to speed up training. It may cause issues on Windows. You can disable it by passing `trainer.precision=32` on the command line, as the next cell does.  
> **ⓘ**
> In the next cell, we use the `%run` magic command in Jupyter for a more visually appealing, real-time output. In a real terminal environment, simply use the `ascent_train` command.

In [None]:
try:
    import ipywidgets
except ModuleNotFoundError:
    %pip install ipywidgets

model = "3d"  # Change to "2d" for 2D model

# Add trainer.accelerator=cpu to train on CPU
%run -i ../ascent/train.py experiment=acdc_{model} logger=tensorboard datamodule.test_splits=False datamodule.batch_size=2 fold=0 trainer.max_epochs=2 test=False trainer.precision=32
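
The training command above is a list of `key=value` overrides. When experimenting with several settings, it can be convenient to keep them in a dictionary and build the argument string from it. This is only a sketch: `build_overrides` is a hypothetical helper, and the option names are copied from the cell above.

```python
def build_overrides(options):
    """Turn {"trainer.max_epochs": 2} into ["trainer.max_epochs=2"]."""
    return [f"{key}={value}" for key, value in options.items()]


options = {
    "experiment": "acdc_3d",
    "logger": "tensorboard",
    "fold": 0,
    "trainer.max_epochs": 2,
    "datamodule.batch_size": 2,
    "test": False,
}
print(" ".join(build_overrides(options)))
```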

Visualize the training process using TensorBoard by executing the following cell.

In [None]:
from pathlib import Path

# Copy and paste output dir returned from the previous cell here
output_dir = r"<PASTE_HERE>"
output_dir = Path(output_dir)
tensorboard_dir = str((output_dir / "tensorboard").as_posix())

# Install tensorboard
try:
    import tensorboard
except ModuleNotFoundError:
    %pip install tensorboard
# Launch tensorboard
%load_ext tensorboard
%tensorboard --logdir {tensorboard_dir}

Normally, `ascent_train` trains the model and then, if `test=True` is set, runs inference on the validation data using the best weights. Due to problems with Jupyter, we run inference on the validation set manually in the next cell.
> **ⓘ**
> In the next cell, we are using the `%run` magic command in Jupyter for a more visually appealing and real-time output. In a real terminal environment, simply use the `ascent_evaluate` command.

In [None]:
# Checkpoint path
ckpt = str((output_dir / "checkpoints" / "last.ckpt").as_posix())

%run -i ../ascent/eval.py experiment=acdc_{model} logger=tensorboard datamodule.test_splits=False fold=0 trainer.precision=32 ckpt_path={ckpt}
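
If the checkpoint path is unknown, the output folder can be searched for `.ckpt` files. The sketch below uses only the standard library. `newest_checkpoint` is a hypothetical helper, and `./logs` is a placeholder for the output directory printed by the training cell.

```python
from pathlib import Path


def newest_checkpoint(root):
    """Return the most recently modified *.ckpt file under `root`, or None."""
    checkpoints = sorted(Path(root).rglob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    return checkpoints[-1] if checkpoints else None


# "./logs" is a placeholder; point this at the output directory printed by training.
print(newest_checkpoint("./logs"))
```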

## IV. Inference
Now, we will run inference on new data, such as the test data, using the trained model. The inference results will be saved in the `./inference` folder. Execute the following cell to run inference on the test data.

In [None]:
# Input folder
input_folder = "../data/ACDC/raw/imagesTs"

# Specify the output folder
output_folder = "./inference"

%run -i ../ascent/predict.py dataset=ACDC model=acdc_{model} trainer.precision=32 ckpt_path={ckpt} input_folder={input_folder} output_folder={output_folder}
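
Before visualizing, it may be worth checking that every input image has a prediction. The sketch below assumes the naming used in this notebook: inputs end in `_0000.nii.gz` and predictions are saved as `<case>.nii.gz` in the `inference_raw` subfolder. `missing_predictions` is a hypothetical helper.

```python
from pathlib import Path


def missing_predictions(input_dir, pred_dir, image_suffix="_0000.nii.gz"):
    """List case names that have an input image but no `<case>.nii.gz` prediction."""
    cases = sorted(
        p.name[: -len(image_suffix)] for p in Path(input_dir).glob(f"*{image_suffix}")
    )
    return [c for c in cases if not (Path(pred_dir) / f"{c}.nii.gz").is_file()]


# Paths as used in this notebook; an empty list means nothing is missing.
print(missing_predictions("../data/ACDC/raw/imagesTs", "./inference/inference_raw"))
```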

## V. Visualization
Finally, visualize the results of the inference, including the input image, ground truth, and the predicted segmentation mask, by executing the following cell.

In [None]:
# Path to the GT of the test set
gt_folder = "../data/ACDC/raw/labelsTs"

import numpy as np
import SimpleITK as sitk
from matplotlib import pyplot as plt
from matplotlib.colors import ListedColormap

from ascent.utils.file_and_folder_operations import subfiles
from ascent.utils.visualization import imagesc

pred_dir = Path(output_folder) / "inference_raw"

# Get a random index to display an image with label from inference folder
all_inference_files = subfiles(pred_dir, suffix=".nii.gz", join=False)
idx = np.random.randint(len(all_inference_files))

# Print the selected case
print("selected case: ", all_inference_files[idx][:-7])

# Load the image, label, and prediction
image = sitk.ReadImage(str(Path(input_folder) / f"{all_inference_files[idx][:-7]}_0000.nii.gz"))
label = sitk.ReadImage(str(Path(gt_folder) / all_inference_files[idx]))
pred = sitk.ReadImage(str(Path(pred_dir) / all_inference_files[idx]))
image_array = sitk.GetArrayFromImage(image)
label_array = sitk.GetArrayFromImage(label)
pred_array = sitk.GetArrayFromImage(pred)

# Select a random slice
slice_idx = np.random.randint(image_array.shape[0])
print("selected slice: ", slice_idx)

# Plot the image, ground truth, and prediction
colors = ["black", "red", "green", "blue"]
cmap = ListedColormap(colors)

figure = plt.figure(figsize=(8, 8))
ax = figure.add_subplot(1, 3, 1)
imagesc(ax, image_array[slice_idx], title="Image", show_colorbar=False)
ax = figure.add_subplot(1, 3, 2)
imagesc(
    ax,
    label_array[slice_idx],
    title="Ground Truth",
    show_colorbar=False,
    colormap=cmap,
    interpolation="nearest",
)
ax = figure.add_subplot(1, 3, 3)
imagesc(
    ax,
    pred_array[slice_idx],
    title="Prediction",
    show_colorbar=False,
    colormap=cmap,
    interpolation="nearest",
)
figure.tight_layout()
plt.show()
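
The visual comparison can be complemented by a per-label Dice score between the ground truth and the prediction. This is a minimal NumPy sketch, not ASCENT's evaluation code: `dice_per_label` is a hypothetical helper, and the toy arrays stand in for `label_array` and `pred_array`.

```python
import numpy as np


def dice_per_label(gt, pred, labels):
    """Dice = 2 * |A and B| / (|A| + |B|) per label; NaN if absent in both arrays."""
    scores = {}
    for label in labels:
        gt_mask = gt == label
        pred_mask = pred == label
        denom = gt_mask.sum() + pred_mask.sum()
        if denom == 0:
            scores[label] = float("nan")
        else:
            scores[label] = float(2.0 * np.logical_and(gt_mask, pred_mask).sum() / denom)
    return scores


# Toy arrays standing in for label_array and pred_array.
gt = np.array([[0, 1, 1], [2, 2, 3]])
pred = np.array([[0, 1, 2], [2, 2, 3]])
print(dice_per_label(gt, pred, labels=[1, 2, 3]))
```

Calling `dice_per_label(label_array, pred_array, labels=[1, 2, 3])` gives the scores for the displayed case, with labels 1 to 3 matching the three non-background colors in the colormap above.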