# Team Challenge Image Analysis - Team 6

This notebook shows an example workflow for the proposed image analysis software. It consists of two parts: training and evaluation. More information about the setup and file structure can be found on the GitHub page.

## Part 0: Setup
This part performs the general setup for the workflow: changing the working directory to the repository root (`TeamChallenge_team6`) and performing the relevant imports.

In [None]:
# Change working directory to the root folder
import os, sys
if os.path.split(os.getcwd())[-1] != 'TeamChallenge_team6':
    %cd ..
    
    if os.path.split(os.getcwd())[-1] != 'TeamChallenge_team6':
        # UserError is not a Python built-in; RuntimeError avoids a NameError here
        raise RuntimeError("Something went wrong in the directory reassignment!")

# Add relevant directories to path
if "" not in sys.path : sys.path.append("")
if "src" not in sys.path : sys.path.append("src")

In [None]:
# Relevant imports
import os
from data_preperation import data_prep, inspect_data
from preprocessing import preprocess_data
from model import define_discriminator, define_generator, define_gan
from training import train
from evaluation import evaluate, get_fsl_metrics, resp_vec_correlation
from util.tf_session import setup_tf_session
from util.general import *

# Setup the tf session for possible gpu usage
n_gpus = setup_tf_session()

## Part 1: Training
In this part, the cGAN model is trained. The data are first preprocessed, after which the datasets are loaded and the models are defined. Finally, the actual training is performed. By default, we train for 100 epochs with a batch size of 4 and an augmentation factor of 20.

### 1a: Data preprocessing

In [None]:
preprocess_data("data", verbose=True)

### 1b: Dataset generation

##### Data loading

In [None]:
# Load data
print("Dataset - TRAIN")
dataset_train, train_subjects = data_prep(os.path.join("data", "preprocessed"), True, "train", verbose=True)
print("Dataset - TEST")
dataset_test, test_subjects = data_prep(os.path.join("data", "preprocessed"), True, "test", verbose=True)

# Define image shape
image_shape = dataset_train[0].shape[1:]
image_shape = (image_shape[0], image_shape[1], 1)

##### Data inspection
We will now look at some of the preprocessed images, both for quality assurance and for a better understanding of the inner workings of the pipeline. Note that the left image is the day 3 image (input), while the right image is the day 0 image (target). You'll notice that the images used are brain-extracted and cropped so as to center the brain as much as possible. In addition, histogram equalization is performed to yield better image contrast.

In [None]:
inspect_data(dataset_train, n_samples = 12)
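
The histogram equalization mentioned above can be illustrated with a minimal sketch. This is not the code used by `preprocess_data`, whose implementation is not shown in this notebook; `equalize_histogram` is a hypothetical helper that works on a small 8-bit image stored as nested lists, and the pipeline may well use a library routine or an adaptive variant instead.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a 2D image given as nested lists of ints in [0, levels).

    Hypothetical helper for illustration only; not part of the project code.
    """
    pixels = [p for row in image for p in row]

    # Count how often each intensity level occurs
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the intensity levels
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch
        return [row[:] for row in image]

    # Map each level so that the cumulative distribution becomes roughly linear
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


low_contrast = [[100, 101], [102, 103]]
print(equalize_histogram(low_contrast))
```

In this toy input, four nearly identical gray values are spread over the full 0 to 255 range, which is the contrast gain the preprocessing aims for.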

### 1c: Model definition

In [None]:
# TODO: We should add Roos's schematic here!

# Define the models
d_model = define_discriminator(image_shape)
g_model = define_generator(image_shape)

gan_model = define_gan(g_model, d_model, image_shape)

# Show model summaries
print(print_style.BOLD+"=== DISCRIMINATOR MODEL ==="+print_style.END)
d_model.summary()
print(print_style.BOLD+"\n\n===== GENERATOR MODEL ====="+print_style.END)
g_model.summary()
print(print_style.BOLD+"\n\n======== GAN MODEL ========"+print_style.END)
gan_model.summary()

### 1d: Actual training
Here, we run the actual training. Augmentation is performed by default, with an augmentation factor of 20. The batch size and number of epochs are 4 and 100, respectively. An Adam optimizer is used, and the loss function comprises binary cross-entropy (discriminator), the mean absolute error (generator), and an SSIM loss term (1-SSIM) (generator).

In [None]:
# Train model
run_name = train(d_model, g_model, gan_model, dataset_train, n_epochs=100, n_batch=4)
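
To make the loss terms described above concrete, here is a minimal pure-Python sketch of how binary cross-entropy, the mean absolute error (MAE), and a `1 - SSIM` term could be combined into one generator objective. It is not the implementation inside `train` or `define_gan`: the function names and the weights `w_adv`, `w_mae`, and `w_ssim` are assumptions made for illustration, and the SSIM value is simply passed in as a number.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(y_true)


def mean_absolute_error(target, generated):
    """Mean absolute pixel difference between two flattened images."""
    return sum(abs(t - g) for t, g in zip(target, generated)) / len(target)


def generator_loss(disc_pred_on_fake, target, generated, ssim_value,
                   w_adv=1.0, w_mae=100.0, w_ssim=1.0):
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    # The generator wants the discriminator to label its output as real (1)
    real_labels = [1.0] * len(disc_pred_on_fake)
    adversarial = binary_crossentropy(real_labels, disc_pred_on_fake)
    pixel_term = mean_absolute_error(target, generated)
    ssim_term = 1.0 - ssim_value
    return w_adv * adversarial + w_mae * pixel_term + w_ssim * ssim_term


# Hypothetical values: discriminator outputs, flattened target and generated images
loss = generator_loss([0.4, 0.6], [0.0, 0.5, 1.0], [0.1, 0.5, 0.8], ssim_value=0.9)
print(round(loss, 3))
```

The discriminator itself would be trained with `binary_crossentropy` alone, using label 1 for real image pairs and 0 for generated ones.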

## Part 2: Evaluation

Based on the training above, we can evaluate the best-performing model. First, choose which model (identified by its training step, e.g. "0029400") you want to evaluate. To do so, run the following in your terminal (adjusting the path to the logs directory):

tensorboard --logdir "../logs" 

and go to http://localhost:6006/

The step that resulted in the best-performing model can be passed as the `specific_model` parameter below to evaluate the corresponding model (the argument `"last"` evaluates the model from the last step).

### Correlation analysis
First, let's run our test set through the generator model and calculate the SSIM scores. We will then compare these scores to the SSIM and DSC scores from the FSL run (Bart's method) and to the response vector data through a correlation analysis. We do this because all of these features are expected to quantify, in some way, the edema-related deformation in a specific subject. The analysis is displayed as a set of scatter plots, in which the darkness of each plot is directly proportional to the correlation between the two features shown.

In [None]:
# Calculate SSIM for our cGAN method
eval_model = "0000000"

eval_SSIMs = evaluate(d_model, g_model, gan_model, dataset_test, time=run_name, specific_model=eval_model)

# Calculate SSIM and DSC for the FSL method (Bart's method)
fsl_SSIM, fsl_DSC = get_fsl_metrics("data", test_subjects)

# Perform correlation analysis and display figure
resp_vec_cor = resp_vec_correlation("data", test_subjects, eval_SSIMs)
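
As a rough illustration of what a correlation analysis between two per-subject features involves, the sketch below computes a Pearson correlation coefficient in plain Python. The notebook does not state which coefficient `resp_vec_correlation` uses, so Pearson is an assumption here, `pearson_r` is a hypothetical helper, and the numbers are made up for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and standard deviations (the common 1/n factors cancel out)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Made-up per-subject values, for illustration only
cgan_ssim = [0.91, 0.85, 0.78, 0.88, 0.70]
deformation_measure = [1.2, 2.0, 3.1, 1.6, 4.0]
print(round(pearson_r(cgan_ssim, deformation_measure), 3))
```

If lower SSIM does track stronger deformation, a coefficient close to -1 would be expected for a pair of features like this one.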

### Calculate results
Now, let's calculate and display some results for our method.

In [None]:
eval_SSIMs = evaluate(d_model, g_model, gan_model, dataset_test, time=run_name, specific_model=eval_model, show_fig=True)

Additionally, the SSIM scores between the true day 3 and fake day 0 images quantify the deformation. Note that SSIM lies in the range `[-1, 1]`. The higher the deformation, the lower this number should be.

In [None]:
print(eval_SSIMs)
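
To see why SSIM falls in `[-1, 1]` and drops as two images diverge, the sketch below computes a simplified, single-window SSIM over whole flattened images. This is an assumption-laden simplification: `evaluate` most likely relies on a library implementation that averages SSIM over local windows, and `global_ssim` is a hypothetical helper, not project code.

```python
def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two flattened images of equal length."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population variances and covariance
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    # Standard stabilizing constants from the SSIM definition
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


image = [0.0, 0.25, 0.5, 0.75, 1.0]
slightly_changed = [0.0, 0.3, 0.5, 0.7, 1.0]
inverted = [1.0 - v for v in image]

print(global_ssim(image, image))             # identical images
print(global_ssim(image, slightly_changed))  # small change
print(global_ssim(image, inverted))          # contrast-inverted image
```

Identical inputs give the maximum score of 1, a small perturbation lowers it slightly, and a contrast-inverted image yields a negative score, which matches the interpretation that stronger deformation corresponds to lower SSIM.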