# MICCAI 2023 Tutorial - Part II:
## Analyze computation outputs
This part is designed to analyze the results from [Part I] and answer several reproducibility questions, such as:
- Are the outputs of the BraTS pipeline repeatable from one execution to another?
- Are they reproducible between version 1.8.1 and version 1.9.0?
- Can we reproduce the numerical results of the [published paper](https://hal.science/hal-04006057)?

**This Notebook will be run twice:**
1. While the computations of [Part I] are still running, download and analyze the results of the published paper (Author: `vip-team`).
2. When most computations of [Part I] are finished (see their progress on the [VIP Portal][vip-portal]), at the end of the tutorial, download and analyze the results of all participants (Author: `tutorial`).

[Part I]: 1-launch-application.ipynb
[vip-portal]: https://vip.creatis.insa-lyon.fr/ "https://vip.creatis.insa-lyon.fr/"

In [27]:
# Builtins
from pathlib import Path, PurePosixPath
# Installed
import matplotlib.pyplot as plt
import nibabel as nib
import numpy as np
import pandas as pd
from ipywidgets import interact
from parse import parse
from scipy.ndimage import gaussian_filter

from vip_client.classes import VipLoader

## Understand the application's outputs

The next section will help you understand the outputs of the BraTS pipeline.
These outputs will be downloaded first from the VIP servers and then visualized in 3 dimensions.

<img src="imgs/BraTS-Pipeline.png" alt="BraTS-Pipeline" height="150" title="Full Pipeline for Brain Tumor Segmentation"/>

First, download the outputs from previous executions.
- During the 1st run, the following cell will download results from the `vip-team` (paper results).
- During the 2nd run, the following cell will also download results from the `tutorial` team (including your results).

_N.B.: You need to paste your **VIP API key** again._

In [None]:
# Paste your VIP API key here
VipLoader.init(api_key="VIP_API_KEY");
# VIP and local directories
vip_dir = PurePosixPath("/vip/EGI tutorial (group)/outputs")
res_dir = Path("data")
# Use the client to download the data
VipLoader.download_dir(vip_dir, res_dir)

The following cell checks that the two BraTS outputs exist:
- 1 skull-stripped MRI **brain scan**
- 1 **tumor mask** which indicates the location of the tumor in the brain scan

In [None]:
# Define a filename for each type of result
filenames = {
    "tumor": 'brainTumorMask_SRI.nii.gz',
    "brain": 'T1_to_SRI_brain.nii.gz'
}
# Get 1 tumor file and 1 brain scan
tumor_file = next(res_dir.rglob(filenames["tumor"]))
brain_file = next(res_dir.rglob(filenames["brain"]))
# Display their path
print("\n".join([str(tumor_file), str(brain_file)]))
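
If the download is incomplete, `next()` on an empty search raises `StopIteration`, which is not very informative. The sketch below shows one possible way to search with a default value instead. The helper `find_first` and the temporary folder layout are assumptions made for illustration; they are not part of the tutorial code.

```python
from pathlib import Path
import tempfile

def find_first(root: Path, filename: str):
    """Return the first file named `filename` under `root`, or None if there is none."""
    return next(root.rglob(filename), None)

# Hypothetical layout: a temporary folder holding a tumor mask but no brain scan
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    target = root / "some_author" / "brainTumorMask_SRI.nii.gz"
    target.parent.mkdir(parents=True)
    target.touch()
    found = find_first(root, "brainTumorMask_SRI.nii.gz")
    missing = find_first(root, "T1_to_SRI_brain.nii.gz")
    found_name = found.name if found else None

print(found_name, missing)
```

A `None` result can then be turned into an explicit message, for example asking the user to run the download cell again.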

Extract data from the brain & tumor files and display the brain volumes slice by slice.

In [None]:
# Extract brain & tumor data from the previous files
brain = nib.load(brain_file).get_fdata()
tumor = nib.load(tumor_file).get_fdata()
tumor[tumor==0] = np.nan

# Interactive method to display the 3D images slice by slice
@interact
def show_slices(z=(0,150)) -> None:
    # Axes
    _, (ax_brain, ax_tumor) = plt.subplots(1, 2, figsize=(10,5))
    # Display the brain
    ax_brain.set_title("Brain Scan")
    ax_brain.imshow(brain[:,:,z], cmap='bone', origin="lower")
    ax_brain.axis('off')
    # Display the brain with tumor
    ax_tumor.set_title("With Tumor Detection")
    ax_tumor.imshow(brain[:,:,z], cmap='bone', origin="lower")
    ax_tumor.imshow(tumor[:,:,z], origin="lower")
    ax_tumor.axis('off')
    plt.show()
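
The overlay works because background voxels of the mask are replaced by `NaN`, which `imshow` leaves transparent with the default colormaps, so the brain scan stays visible underneath. The toy example below is a minimal sketch of that masking step. The array `mask` is made up, and deriving the slider bound from the volume shape (instead of a fixed upper bound) is an assumption about what may be convenient when volumes differ in depth.

```python
import numpy as np

# Made-up 4x4x3 mask with a small labelled region in the middle slice
mask = np.zeros((4, 4, 3))
mask[1:3, 1:3, 1] = 2.0

# Replace the background (0) by NaN on a copy of the mask
overlay = np.where(mask == 0, np.nan, mask)

# Highest valid slice index along the third axis
z_max = mask.shape[2] - 1

print(int(np.isnan(overlay).sum()), "background voxels hidden; z ranges from 0 to", z_max)
```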

## Compare Execution Results
The next section will help you compare BraTS outputs:
- Across pipeline executions (*exec_1*, *exec_2*, ...)
- Across pipeline versions (*1.8.1*, *1.9.0*)

### Get all output files with metadata

List the result files

In [None]:
all_files = [str(path) for path in res_dir.rglob(filenames["tumor"])] \
          + [str(path) for path in res_dir.rglob(filenames["brain"])]
all_files[0]

Build a [DataFrame](https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm) (*i.e.* a table) containing all files with their related metadata

In [None]:
# The file paths contain useful metadata
metadata_format = "{Author}/{Version}/{Execution}/{_}/{Subject}/{Filename}"
path_format = str(res_dir / metadata_format)
metadata_keys = metadata_format.replace("{","").replace("}","").split("/")
metadata_keys.remove('_')

# Function to get the metadata from 1 path 
def get_metadata_from_path(path: str) -> dict:
    metadata = parse(path_format, path)
    if metadata is None: 
        return {}
    result = metadata.named
    result.update({"Path": path}) 
    return result

# Build the dataframe
data = pd.DataFrame([get_metadata_from_path(file) for file in all_files])
# Drop incomplete examples
data.dropna(axis=0, inplace=True)
# Display
data.head()

*You should see a table with file paths, names, subjects, executions, versions & author.*
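
To see what the format string extracts, here is a standard-library sketch that splits one path into the same fields. It is not the `parse`-based function used in the notebook: it simply assumes that each field is exactly one folder level under `data`. The execution, job folder and subject names in `example` are hypothetical.

```python
from pathlib import PurePosixPath

metadata_format = "{Author}/{Version}/{Execution}/{_}/{Subject}/{Filename}"
keys = [field.strip("{}") for field in metadata_format.split("/")]

def split_metadata(path: str, root: str = "data") -> dict:
    """Map each folder level under `root` to its metadata field."""
    try:
        parts = PurePosixPath(path).relative_to(root).parts
    except ValueError:
        return {}  # the path is not located under `root`
    if len(parts) != len(keys):
        return {}  # unexpected folder depth
    fields = dict(zip(keys, parts))
    fields.pop("_")  # placeholder level, not kept as metadata
    fields["Path"] = path
    return fields

# Hypothetical path following the layout described by `metadata_format`
example = "data/vip-team/v181/25-09-2023_10:00:00/job_folder/subject_01/T1_to_SRI_brain.nii.gz"
meta = split_metadata(example)
print(meta)
```

Paths that do not match the expected depth yield an empty dictionary, which is what lets the incomplete rows be dropped afterwards.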

Change the execution names to simplify the dataframe (*this will be useful for the 2nd analysis*)

In [None]:
# Function to rename executions for each group defined below 
def map_names(group: pd.Series):
    executions = group.unique()
    execution_map = { 
        executions[i]: "exec_%d" %(i+1) for i in range(len(executions))
    }
    return group.map(execution_map)
# Samples are grouped by Author, Version & Subject;
# the mapping function is then applied to "Execution" within each group
data["Execution"] = data.groupby(["Author", "Version", "Subject"], group_keys=False)["Execution"].apply(map_names)
data.head()
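
The renaming can be checked on a toy table. The sketch below uses made-up execution names and `groupby(...).transform`, which is assumed here to be an equivalent way of applying the mapping group by group; the labels restart at `exec_1` in every group.

```python
import pandas as pd

# Made-up table: raw execution names differ from one version to another
toy = pd.DataFrame({
    "Version": ["v181", "v181", "v181", "v190", "v190"],
    "Subject": ["s1", "s1", "s1", "s1", "s1"],
    "Execution": ["run_A", "run_B", "run_A", "run_C", "run_C"],
})

def map_names(group: pd.Series) -> pd.Series:
    """Rename the executions of one group to exec_1, exec_2, ... in order of appearance."""
    labels = {name: f"exec_{i + 1}" for i, name in enumerate(group.unique())}
    return group.map(labels)

toy["Execution"] = toy.groupby(["Version", "Subject"])["Execution"].transform(map_names)
print(toy)
```

After this step, `exec_1` of one version lines up with `exec_1` of the other in the comparison tables, even though the raw names had nothing in common.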

### Checksums
The following cell computes and displays the checksums of all output files.

In [None]:
from hashlib import md5
# Method to compute the md5sum of a file using its path
def md5sum(file: str) -> str:
    """Computes the md5sum of `file`"""
    with open(file, "rb") as fid:
        return md5(fid.read()).hexdigest()
# Create a Checksum dataframe from the file paths
checksums = data.copy()
checksums["md5sum"] = checksums["Path"].apply(md5sum)
# Compare executions and versions
checksums.drop(columns="Path", inplace=True)
checksums.set_index(metadata_keys).unstack(["Author", "Execution"])

*This table can help you answer the reproducibility questions raised in the introduction.*
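
One possible way to read such a table programmatically is to count the distinct checksums per group: a single value means the files are bitwise identical. The sketch below runs on made-up checksums (`"aaa"`, `"bbb"`), not on real results, and the variable names are illustrative.

```python
import pandas as pd

# Made-up checksums: identical within a version, different across versions
toy = pd.DataFrame({
    "Version":   ["v181", "v181", "v190", "v190"],
    "Execution": ["exec_1", "exec_2", "exec_1", "exec_2"],
    "Subject":   ["s1"] * 4,
    "Filename":  ["brainTumorMask_SRI.nii.gz"] * 4,
    "md5sum":    ["aaa", "aaa", "bbb", "bbb"],
})

# Repeatability: one distinct checksum across the executions of a given version
repeatable = toy.groupby(["Subject", "Filename", "Version"])["md5sum"].nunique().eq(1)

# Cross-version reproducibility: one distinct checksum across all versions
reproducible = toy.groupby(["Subject", "Filename"])["md5sum"].nunique().eq(1)

print(repeatable)
print(reproducible)
```

Note that a checksum mismatch only says that two files differ, not by how much; the image analysis below addresses that second question.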

### Image Analysis
This section takes a closer look at the cross-version differences observed for the BraTS pipeline, by displaying the outputs side by side.

First, load the images from a single execution

In [None]:
# Select 1 execution
execution = "exec_1"
author = "vip-team"
images = data.query("Execution==@execution & Author==@author").reset_index(drop=True)
# Pre-load all files from subject and execution
def load(file: str) -> nib.Nifti1Image:
    """Loads the NIfTI image stored in `file` (voxel data is read lazily)"""
    return nib.load(file)
images["Img"] = images["Path"].apply(load)
images.drop(columns=["Execution", "Path"], inplace=True)
images.head()

Simplify the dataframe

In [None]:
# To make the images type more understandable, we map each filename to its type
filetypes = {
    'brainTumorMask_SRI.nii.gz': "tumor",
    'T1_to_SRI_brain.nii.gz': "brain",
}
images["Result"] = images.pop("Filename").map(filetypes)
# Sort the dataframe
subjects = images["Subject"].unique()
images.set_index(["Result", "Version", "Subject"], inplace=True)
images

Finally, show the differences in the segmented tumors between the two pipeline versions.

*Use parameter `sigma` to enhance the differences*

In [None]:
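def show_tumor(tumor: np.ndarray, brain: np.ndarray, ax: plt.Axes=plt):
    if brain is not None:
        ax.imshow(brain, cmap='bone', origin="lower")
    # Mask the background on a copy, so the caller's array is left untouched
    tumor = np.where(tumor == 0, np.nan, tumor)
    ax.imshow(tumor, origin="lower")
    ax.axis('off')

def make_diff(tumor_a, tumor_b):
    # Consider the labels of both masks, so a label missing from one is not skipped
    values = np.union1d(tumor_a, tumor_b)
    diff = np.zeros(np.shape(tumor_a))
    for val in values:
        # Mark the pixels where exactly one of the two masks carries this label
        diff[(tumor_a == val) ^ (tumor_b == val)] = val
    return diff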

@interact
def show_diff(z=(0,150), sigma=(0, 1, 0.1), subject=subjects):

    brain = images["Img"]["brain", "v181", subject].get_fdata()[:,:,z]
    _, (ax_181, ax_diff, ax_190) = plt.subplots(1, 3, figsize=(15,5))

    tumor_181 = images["Img"]["tumor", "v181", subject].get_fdata()[:,:,z]
    show_tumor(tumor_181, brain, ax_181)
    ax_181.set_title("Version 1.8.1")
    
    tumor_190 = images["Img"]["tumor", "v190", subject].get_fdata()[:,:,z]
    show_tumor(tumor_190, brain, ax_190)
    ax_190.set_title("Version 1.9.0")

    tumor_diff = make_diff(tumor_181, tumor_190)
    tumor_diff = gaussian_filter(tumor_diff, sigma=sigma)
    show_tumor(tumor_diff, brain, ax_diff)
    ax_diff.set_title("Difference")

    plt.show()
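
The difference map marks every pixel where the two masks disagree on a label. The toy example below is a minimal sketch of that label-wise comparison on two made-up 2x3 masks. The function `label_diff` is illustrative and takes its labels from both arrays.

```python
import numpy as np

def label_diff(mask_a: np.ndarray, mask_b: np.ndarray) -> np.ndarray:
    """Mark the pixels where exactly one of the two masks carries a given label."""
    diff = np.zeros(mask_a.shape)
    for val in np.union1d(mask_a, mask_b):
        diff[(mask_a == val) ^ (mask_b == val)] = val
    return diff

# Made-up masks that disagree on a single pixel (label 1 versus label 2)
mask_a = np.array([[0, 1, 1],
                   [0, 2, 2]])
mask_b = np.array([[0, 1, 2],
                   [0, 2, 2]])

diff = label_diff(mask_a, mask_b)
changed = int(np.count_nonzero(diff))
print(diff)
print(changed, "pixel(s) differ")
```

When a pixel carries two different non-zero labels, the larger label is the one kept in the map, because labels are processed in increasing order. Counting the non-zero entries gives a quick numerical summary of how far two segmentations are from each other.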

If some new tutorial executions are over, you may now **relaunch this Notebook** to download and analyze their outputs.