# UW-Madison GI Tract Image Segmentation
<br><center><img src="https://storage.googleapis.com/kaggle-competitions/kaggle/27923/logos/header.png?t=2021-06-02-20-30-25" width=100%></center>
  Students: David Vaisbrud, Shahar Cohen
  
  In 2019, an estimated 5 million people were diagnosed with a cancer of the gastro-intestinal tract worldwide. Of these patients, about half are eligible for radiation therapy, usually delivered over 10-15 minutes a day for 1-6 weeks.<br>
    Radiation oncologists try to deliver high doses of radiation using X-ray beams pointed to tumors while avoiding the stomach and intestines. <br>
    With newer technology such as integrated magnetic resonance imaging and linear accelerator systems, also known as MR-Linacs, oncologists are able to visualize the daily position of the tumor and intestines, which can vary day to day. <br>
    In these scans, radiation oncologists must manually outline the position of the stomach and intestines in order to adjust the direction of the x-ray beams to increase the dose delivery to the tumor and avoid the stomach and intestines. <br>
    This is a time-consuming and labor intensive process that can prolong treatments from 15 minutes a day to an hour a day, which can be difficult for patients to tolerate—unless deep learning could help automate the segmentation process.<br>
    <br>
This notebook covers the following topics:<br>
1. **Problem description**<br> 
    Create a model to automatically segment the stomach and intestines on MRI scans.<br> 
    The MRI scans are from actual cancer patients who had 1-5 MRI scans on separate days during their radiation treatment. <br> You'll base your algorithm on a dataset of these scans to come up with creative deep learning solutions that will help cancer patients get better care.<br><br>
2. **Data gathering**<br> 
    The UW-Madison Carbone Cancer Center is a pioneer in MR-Linac based radiotherapy, and has treated patients with MRI guided radiotherapy based on their daily anatomy since 2015.<br> UW-Madison has generously agreed to support this project which provides anonymized MRIs of patients treated at the UW-Madison Carbone Cancer Center. The University of Wisconsin-Madison is a public land-grant research university in Madison, Wisconsin.<br>  The Wisconsin Idea is the university's pledge to the state, the nation, and the world that their endeavors will benefit all citizens.<br><br>
3. **EDA - investigating and visualizing the dataset**
    * Exploring the dataset
    * Investigate the segmentation - look at the labeling distribution
    * Investigate pixel spacing
    * Investigate mask sizes/areas
    * Distribution of images per case ID
    * Case ID sequence data
    * Mask dataset creation, class overlap
    * Pixel values
    * Heuristics or rules regarding segmentation
    * 3D GIF with case mask


4. **Model selection and Evaluation**<br>
    **Model selection**
    
    * Unet
    * DeepLabv3 - TBD
   
   **Evaluation:** <br>
   This competition is evaluated on the mean <a href="https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient"><b>Dice coefficient</b></a> and <a href="https://github.com/scipy/scipy/blob/master/scipy/spatial/_hausdorff.pyx"><b>3D Hausdorff distance</b></a>. The Dice coefficient can be used to compare the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by:

    $$
    \frac{2 * |X \cap Y|}{|X| + |Y|}
    $$

    where $X$ is the predicted set of pixels and $Y$ is the ground truth. 
    * The Dice coefficient is defined to be $1$ when both $X$ and $Y$ are empty. 
    * The leaderboard score is the <b>mean of the Dice coefficients for each image in the test set.</b>

    Hausdorff distance is a method for calculating the distance between segmentation objects A and B, by calculating the furthest point on object A from the nearest point on object B. For 3D Hausdorff, we construct 3D volumes by combining each 2D segmentation with slice depth as the Z coordinate and then find the Hausdorff distance between them. **(In this competition, the slice depth for all scans is set to 1.)** <a href="https://github.com/scipy/scipy/blob/master/scipy/spatial/_hausdorff.pyx"><b>The scipy code for Hausdorff is linked</b></a>. The expected / predicted pixel locations are normalized by image size to create a bounded 0-1 score.

    <br>

    ---

    <b>NOTE: The two metrics are combined during evaluation!</b>

    * <b>Weight of 0.4 for the Dice metric</b>
    * <b>Weight of 0.6 for the Hausdorff distance.</b>

    ---
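
As a rough illustration of the evaluation described above, the sketch below computes the Dice coefficient for a pair of binary masks, including the convention that two empty masks score 1. The helper `combined_score` is hypothetical: it assumes the normalized Hausdorff distance (0 means a perfect match) enters the final score as `1 - distance`, which is an assumption and not something stated by the competition text quoted here.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice = 2|X ∩ Y| / (|X| + |Y|), defined as 1 when both masks are empty."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)


def combined_score(dice, hausdorff_norm, w_dice=0.4, w_hd=0.6):
    """Weighted mix of the two metrics.

    ASSUMPTION: hausdorff_norm is already scaled to [0, 1] with 0 = perfect,
    so it is converted to a "higher is better" value via (1 - hausdorff_norm).
    """
    return w_dice * dice + w_hd * (1.0 - hausdorff_norm)


pred = np.array([[1, 1, 0],
                 [0, 0, 0]])
truth = np.array([[1, 0, 0],
                  [0, 1, 0]])

# One shared pixel, two pixels in each mask: 2 * 1 / (2 + 2)
print(dice_coefficient(pred, truth))
```

Because the leaderboard averages Dice over images, slices where both the prediction and the ground truth are empty contribute a full score of 1, so predicting empty masks correctly matters as much as segmenting organs well.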


 ## To do before submission.
### Improve the existing analyses:

* Improve Unet model visualization and analyses
* Investigate which categories and images the model fails on

### Add more analyses:
* EDA: check the image size (we currently use [256, 256] px) and try to find a heuristic based on the pixel spacing.
* Try Unet++ (it may work better, since it was designed for this kind of segmentation task)
* Try DeepLabv3
* Check different optimizers.

# Data gathering 
In this section we import the relevant libraries and load the dataset.
<br><br>
In this competition we are segmenting organs (stomach, large bowel, and small bowel) in images. The training annotations are provided as RLE-encoded masks, and the images are in 16-bit grayscale PNG format.
<br><br>
Each case in this competition is represented by multiple sets of scan slices (each set is identified by the day the scan took place). Some cases are split by time (early days are in train, later days are in test) while some cases are split by case - the entirety of the case is in train or test. The goal of this competition is to be able to generalize to both partially and wholly unseen cases.
<br><br>
Note that, in this case, the test set is entirely unseen. It is roughly 50 cases, with a varying number of days and slices, as seen in the training set.

## Segmentation RLE
RLE is run-length encoding. It is used to encode the location of foreground objects in a segmentation. Instead of storing a mask image, you store a list of start pixels and, for each start, how many consecutive pixels are included in the mask.
<br>
The encoding rule is simple: the string is a sequence of pairs, where the first number of each pair is the index of the pixel at which a run of mask pixels starts and the second is the number of pixels in that run.
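
To make the rule concrete, here is a minimal worked example. It assumes 1-indexed start positions and row-major flattening, which is what the `rle_decode` helper later in this notebook uses; the function name `decode_rle_flat` and the toy string are made up for illustration.

```python
import numpy as np


def decode_rle_flat(rle, n_pixels):
    """Decode 'start length start length ...' (1-indexed starts) into a flat 0/1 array."""
    values = np.array(rle.split(), dtype=int)
    starts = values[0::2] - 1   # convert to 0-indexed positions
    lengths = values[1::2]
    flat = np.zeros(n_pixels, dtype=np.uint8)
    for start, length in zip(starts, lengths):
        flat[start:start + length] = 1
    return flat


# "2 3 9 2": a run of 3 pixels starting at pixel 2, and a run of 2 pixels starting at pixel 9
mask = decode_rle_flat("2 3 9 2", n_pixels=12).reshape(3, 4)
print(mask)
```

The two runs cover five pixels in total, so summing every second number of the string (3 + 2) gives the mask area without decoding the mask at all.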


<br>

## DataSet overview

### General information

<b>In this competition we are segmenting organs in images</b>. 

The training **annotations are provided as RLE-encoded masks**, and the images are in **16-bit**, **grayscale**, **PNG format**.

Each case in this competition is represented by multiple sets of scan slices
* Each set is identified by the day the scan took place
* Some cases are split by time
    * early days are in train
    * later days are in test
* Some cases are split by case
    * the entirety of the case is in train or test

<b>The goal of this competition is to be able to generalize to both partially and wholly unseen cases.</b>

Note that, in this case, the test set is entirely unseen.
* It is roughly 50 cases
* It contains a varying number of days and slices, (similar to the training set)

### File information

**`train.csv`** 
- IDs and masks for all training objects.
- **Columns**
    * **`id`**
        * unique identifier for object
    * **`class`**
        * the predicted class for the object
    * **`segmentation`**
        * RLE-encoded pixels for the identified object

<br>

**`sample_submission.csv`**
- A sample submission file in the correct format

<br>

**`train/`**
- a folder of case/day folders, each containing slice images for a particular case on a given day.


In [None]:
# install libraries
!pip install -q segmentation_models_pytorch

In [None]:
# Import libraries
# Operating system libraries
from glob import glob
import os
import time
import copy
import gc
from collections import defaultdict

# linear algebra and data processing
import numpy as np
import pandas as pd


# visualization
import cv2
import plotly.express as px
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from matplotlib import animation, rc
rc('animation', html='jshtml')
import seaborn as sns

# Progress bars to know cell progress in pandas apply
from tqdm import tqdm
from tqdm.notebook import tqdm_notebook
tqdm_notebook.pandas()

# PyTorch deep learning semantic segmentation
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.utils.data import Dataset, DataLoader
from torch.cuda import amp
import segmentation_models_pytorch as smp
# Sklearn ML algorithms
from sklearn.model_selection import GroupKFold

# Albumentations for image augmentations
import albumentations as A

## Basic data definitions

### EDA Constants

In [None]:
# Paths to the training data
DATA_DIR = "/kaggle/input/uw-madison-gi-tract-image-segmentation"
TRAIN_DIR = os.path.join(DATA_DIR, "train")
TRAIN_CSV = os.path.join(DATA_DIR, "train.csv")

# Submission constants
TEST_DIR = os.path.join(DATA_DIR, "test")
SS_CSV = os.path.join(DATA_DIR, "sample_submission.csv")


# Dictionary for classes
SF2LF = {"lb":"Large Bowel","sb":"Small Bowel","st":"Stomach"}
LF2SF = {v:k for k,v in SF2LF.items()}

# Directory for working
NPY_DIR = "/kaggle/working/npy_files"

### Data reading

In [None]:
# Get dataframes
train_df = pd.read_csv(TRAIN_CSV)
ss_df = pd.read_csv(SS_CSV)

# Get all training images
all_train_images = glob(os.path.join(TRAIN_DIR, "**", "*.png"), recursive=True)

print("\n... ORIGINAL TRAINING DATAFRAME... \n")
display(train_df)

# Get all testing images if there are any
all_test_images = glob(os.path.join(TEST_DIR, "**", "*.png"), recursive=True)
    

print("\n\n\n... ORIGINAL SUBMISSION DATAFRAME... \n")    
display(ss_df)


print("\n... BASIC DATA SETUP FINISHED ...\n\n")

## Update dataframe with external information
Before we start the EDA, we preprocess the data to create an informative dataframe with the data separated into features.<br>
The features we chose to create from the train df are:
* case_id
* file path
* number of segmentation masks
* the segmentation mask for each class
* slice dimensions of the file (height and width)
* class


The labels will be renamed as follows:<br>
* **large_bowel** --> **lb**
* **small_bowel** --> **sb**
* **stomach** --> **st**
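
The id and filename parsing that produces these features can be sketched in isolation as follows. The helper names are hypothetical, and the example filename is an illustrative pattern inferred from the path comments in the preprocessing code (`slice_####` followed by four numbers), not a file guaranteed to exist in the dataset.

```python
def parse_slice_id(slice_id):
    """Split an id such as 'case123_day20_slice_0082' into its parts."""
    case_str, day_str, slice_str = slice_id.split("_", 2)
    return {
        "case_id": int(case_str.replace("case", "")),
        "day_num": int(day_str.replace("day", "")),
        "slice_id": slice_str,
    }


def parse_scan_filename(filename):
    """Read the four trailing numbers of a scan filename (two sizes, two spacings)."""
    stem = filename[:-4]  # drop the '.png' extension
    _, size_1, size_2, spacing_1, spacing_2 = stem.rsplit("_", 4)
    return int(size_1), int(size_2), float(spacing_1), float(spacing_2)


print(parse_slice_id("case123_day20_slice_0082"))
# Hypothetical filename used only to show the pattern
print(parse_scan_filename("slice_0082_266_266_1.50_1.50.png"))
```

Splitting with a maximum of two splits keeps `slice_0082` intact, and splitting the filename from the right isolates the four numeric fields regardless of how many underscores precede them.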

In [None]:
def get_filepath_from_partial_identifier(_ident, file_list):
    return [x for x in file_list if _ident in x][0]


def df_preprocessing(df, globbed_file_list, is_test=False):
    """ The preprocessing steps applied to get column information """
    # 1. Get Case-ID as a column (str and int)
    df["case_id_str"] = df["id"].apply(lambda x: x.split("_", 2)[0])
    df["case_id"] = df["id"].apply(lambda x: int(x.split("_", 2)[0].replace("case", "")))

    # 2. Get Day as a column
    df["day_num_str"] = df["id"].apply(lambda x: x.split("_", 2)[1])
    df["day_num"] = df["id"].apply(lambda x: int(x.split("_", 2)[1].replace("day", "")))

    # 3. Get Slice Identifier as a column
    df["slice_id"] = df["id"].apply(lambda x: x.split("_", 2)[2])

    # 4. Get full file paths for the representative scans
    df["_partial_ident"] = (globbed_file_list[0].rsplit("/", 4)[0] + "/" +  # /kaggle/input/uw-madison-gi-tract-image-segmentation/train/
                            df["case_id_str"] + "/" +  # .../case###/
                            df["case_id_str"] + "_" + df["day_num_str"] +  # .../case###_day##/
                            "/scans/" + df["slice_id"])  # .../slice_#### 
    _tmp_merge_df = pd.DataFrame({"_partial_ident": [x.rsplit("_", 4)[0] for x in globbed_file_list],
                                  "f_path": globbed_file_list})
    df = df.merge(_tmp_merge_df, on="_partial_ident").drop(columns=["_partial_ident"])
    del _tmp_merge_df
    # 5. Get slice dimensions from filepath (int in pixels)
    df["slice_h"] = df["f_path"].apply(lambda x: int(x[:-4].rsplit("_", 4)[1]))
    df["slice_w"] = df["f_path"].apply(lambda x: int(x[:-4].rsplit("_", 4)[2]))

    # 6. Pixel spacing from filepath (float in mm)
    df["px_spacing_h"] = df["f_path"].apply(lambda x: float(x[:-4].rsplit("_", 4)[3]))
    df["px_spacing_w"] = df["f_path"].apply(lambda x: float(x[:-4].rsplit("_", 4)[4]))

    if not is_test:
        # 7. Merge 3 Rows Into A Single Row (As This/Segmentation-RLE Is The Only Unique Information Across Those Rows)
        l_bowel_df = df[df["class"] == "large_bowel"][["id", "segmentation"]].rename(
            columns={"segmentation": "lb_seg_rle"})
        s_bowel_df = df[df["class"] == "small_bowel"][["id", "segmentation"]].rename(
            columns={"segmentation": "sb_seg_rle"})
        stomach_df = df[df["class"] == "stomach"][["id", "segmentation"]].rename(columns={"segmentation": "st_seg_rle"})
        df = df.merge(l_bowel_df, on="id", how="left")
        df = df.merge(s_bowel_df, on="id", how="left")
        df = df.merge(stomach_df, on="id", how="left")
        df = df.drop_duplicates(subset=["id", ]).reset_index(drop=True)
        df["lb_seg_flag"] = df["lb_seg_rle"].apply(lambda x: not pd.isna(x))
        df["sb_seg_flag"] = df["sb_seg_rle"].apply(lambda x: not pd.isna(x))
        df["st_seg_flag"] = df["st_seg_rle"].apply(lambda x: not pd.isna(x))
        df["n_segs"] = df["lb_seg_flag"].astype(int) + df["sb_seg_flag"].astype(int) + df["st_seg_flag"].astype(int)

    # 8. Reorder columns to a new ordering (drops class and segmentation as no longer necessary)
    new_col_order = ["id", "f_path", "n_segs",
                     "lb_seg_rle", "lb_seg_flag",
                     "sb_seg_rle", "sb_seg_flag",
                     "st_seg_rle", "st_seg_flag",
                     "slice_h", "slice_w", "px_spacing_h",
                     "px_spacing_w", "case_id_str", "case_id",
                     "day_num_str", "day_num", "slice_id", ]
    if is_test: new_col_order.insert(1, "class")
    new_col_order = [_c for _c in new_col_order if _c in df.columns]
    df = df[new_col_order]

    return df

In [None]:
print(f"Old Df columns:\n{train_df.columns}")
train_df =  df_preprocessing(train_df, all_train_images)
print(f"New Df columns:\n{train_df.columns}")

print("\n... UPDATED TRAINING DATAFRAME... \n")
display(train_df)

print("\n... UPDATING DATAFRAMES WITH ACCESSIBLE INFORMATION FINISHED ...\n\n")

# EDA

## Helper functions
Before we start the data observation, we will need some functions to decode and encode the segmentation masks
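
The helpers below include two decoders that differ only in how the flat run of pixels is reshaped back into a 2D image. The toy sketch here, which is illustrative and not part of the original helpers, shows why the reshape order matters: the same flat run lands on different pixels in row-major and column-major order.

```python
import numpy as np

flat = np.zeros(6, dtype=np.uint8)
flat[0:3] = 1  # a single run covering the first three flattened pixels

row_major = flat.reshape((2, 3))             # runs go left -> right, then down
col_major = flat.reshape((2, 3), order="F")  # runs go top -> bottom, then right

print(row_major)
print(col_major)
```

If a decoded mask looks transposed or sheared relative to the scan, the reshape order (or a swapped height and width) is the first thing worth checking.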

In [None]:
# ref: https://www.kaggle.com/paulorzp/run-length-encode-and-decode
# modified from: https://www.kaggle.com/inversion/run-length-decoding-quick-start
def rle_decode(mask_rle, shape, color=1):
    """
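    Args:
        mask_rle (str): run-length encoding formatted as "start length start length ..."
        shape (tuple of ints): (height, width) of the array to return
        color (int or float): value written into the mask pixels (default 1)

    Returns: 
        Mask (np.array)
            - `color` indicating mask
            - 0 indicating background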

    """
    # Split the string by space, then convert it into a integer array
    s = np.array(mask_rle.split(), dtype=int)

    # Every even value is the start, every odd value is the "run" length
    starts = s[0::2] - 1
    lengths = s[1::2]
    ends = starts + lengths

    # The image is actually flattened since RLE is a 1D "run"
    if len(shape) == 3:
        h, w, d = shape
        img = np.zeros((h * w, d), dtype=np.float32)
    else:
        h, w = shape
        img = np.zeros((h * w,), dtype=np.float32)

    # The color here is actually just any integer you want!
    for lo, hi in zip(starts, ends):
        img[lo: hi] = color

    # Don't forget to change the image back to the original shape
    return img.reshape(shape)


# https://www.kaggle.com/namgalielei/which-reshape-is-used-in-rle
def rle_decode_top_to_bot_first(mask_rle, shape):
    """ 
    Args:
        mask_rle (str): run-length as string formated (start length)
        shape (tuple of ints): (height,width) of array to return 

    Returns:
        Mask (np.array)
            - 1 indicating mask
            - 0 indicating background

    """
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape((shape[1], shape[0]), order='F').T  # Reshape from top -> bottom first


# ref.: https://www.kaggle.com/stainsby/fast-tested-rle
def rle_encode(img):
    """
    Args:
        img (np.array): 
            - 1 indicating mask
            - 0 indicating background

    Returns: 
        run length as string formated
    """

    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)


def flatten_l_o_l(nested_list):
    """ Flatten a list of lists """
    return [item for sublist in nested_list for item in sublist]


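def load_json_to_dict(json_path):
    """ Load a JSON file and return its contents """
    import json  # imported here because it is not part of the import cell above
    with open(json_path) as json_file:
        data = json.load(json_file)
    return data


def tf_load_png(img_path):
    """ Load a PNG as a 3-channel tensor using TensorFlow """
    import tensorflow as tf  # imported lazily: TensorFlow is not imported elsewhere in this notebook
    return tf.image.decode_png(tf.io.read_file(img_path), channels=3)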


def open_gray16(_path, normalize=True, to_rgb=False):
    """ Helper to open files """
    if normalize:
        if to_rgb:
            return np.tile(np.expand_dims(cv2.imread(_path, cv2.IMREAD_ANYDEPTH) / 65535., axis=-1), 3)
        else:
            return cv2.imread(_path, cv2.IMREAD_ANYDEPTH) / 65535.
    else:
        if to_rgb:
            return np.tile(np.expand_dims(cv2.imread(_path, cv2.IMREAD_ANYDEPTH), axis=-1), 3)
        else:
            return cv2.imread(_path, cv2.IMREAD_ANYDEPTH)


def load_img(path):
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    img = np.tile(img[..., None], [1, 1, 3])  # gray to rgb
    img = img.astype('float32')  # original is uint16
    mx = np.max(img)
    if mx:
        img /= mx  # scale image to [0, 1]
    return img


def load_msk(path):
    msk = np.load(path)
    msk = msk.astype('float32')
    return msk

## Exploring the dataset
After preprocessing the datasets and adding features to investigate, we can start exploring the data.

First we define some visualization functions to help us plot the images with the segmentation masks.
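
The overlay functions rely on `cv2.addWeighted`, which computes a per-pixel weighted sum `src1 * alpha + src2 * beta + gamma`. The NumPy sketch below reproduces that arithmetic for float arrays so the effect of the weights is easy to see; the function `blend` and the toy arrays are illustrative only.

```python
import numpy as np


def blend(image_rgb, mask_rgb, alpha=0.999, beta=0.35, gamma=0.0):
    """Per-pixel weighted sum: image * alpha + mask * beta + gamma (float arrays)."""
    return image_rgb * alpha + mask_rgb * beta + gamma


image = np.full((2, 2, 3), 0.5, dtype=np.float32)  # uniform gray image in [0, 1]
mask = np.zeros((2, 2, 3), dtype=np.float32)
mask[0, 0, 0] = 1.0  # one pixel belongs to the first class (red channel)

overlay = blend(image, mask)
overlay = np.clip(overlay, 0.0, 1.0)  # keep values in the displayable [0, 1] range

print(overlay[0, 0])  # masked pixel: red channel is brightened
print(overlay[1, 1])  # unmasked pixel: essentially the original gray
```

Stacking the three class masks along the last axis maps large bowel, small bowel, and stomach to the red, green, and blue channels, which is why each class shows up as a tint of a different color.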

In [None]:
def get_overlay(img_path, rle_strs, img_shape, _alpha=0.999, _beta=0.35, _gamma=0):
    """
        A simple function to return the images with the segmentation masks.
        """
    _img = open_gray16(img_path, to_rgb=True).astype(np.float32)
    _img = ((_img - _img.min()) / (_img.max() - _img.min())).astype(np.float32)
    _seg_rgb = np.stack([rle_decode(rle_str, shape=img_shape, color=1)
                         if (rle_str is not None and not pd.isna(rle_str))
                         else np.zeros(img_shape, dtype=np.float32)
                         for rle_str in rle_strs], axis=-1).astype(np.float32)

    seg_overlay = cv2.addWeighted(src1=_img, alpha=_alpha,
                                  src2=_seg_rgb, beta=_beta, gamma=_gamma)
    return seg_overlay


def examine_id(ex_id, df=train_df, plot_overlay=True, print_meta=False, plot_grayscale=False,
               plot_binary_segmentation=False):
    """ Wrapper function to allow for easy visual exploration of an example """
    print(f"\n... ID ({ex_id}) EXPLORATION STARTED ...\n\n")
    demo_ex = df[df.id == ex_id].squeeze()

    if print_meta:
        print(f"\n... WITH ID=`{ex_id}` WE HAVE THE FOLLOWING EXAMPLE TO WORK FROM... \n\n")
        display(demo_ex.to_frame())

    if plot_grayscale:
        print(f"\n\n... GRAYSCALE IMAGE PLOT ...\n")
        plt.figure(figsize=(12, 12))
        plt.imshow(open_gray16(demo_ex.f_path), cmap="gray")
        plt.title(f"Original Grayscale Image For ID: {demo_ex.id}", fontweight="bold")
        plt.axis(False)
        plt.show()

    if plot_binary_segmentation:
        print(f"\n\n... BINARY SEGMENTATION MASKS ...\n")
        plt.figure(figsize=(20, 10))
        for i, _seg_type in enumerate(["lb", "sb", "st"]):
            if pd.isna(demo_ex[f"{_seg_type}_seg_rle"]): continue
            plt.subplot(1, 3, i + 1)
            plt.imshow(rle_decode(demo_ex[f"{_seg_type}_seg_rle"], shape=(demo_ex.slice_w, demo_ex.slice_h), color=1))
            plt.title(f"RLE Encoding For {SF2LF[_seg_type]} Segmentation", fontweight="bold")
            plt.axis(False)
        plt.tight_layout()
        plt.show()

    if plot_overlay:
        print(f"\n\n... IMAGE WITH RGB SEGMENTATION MASK OVERLAY ...\n")
        _rle_strs = [demo_ex[f"{_seg_type}_seg_rle"] if not pd.isna(demo_ex[f"{_seg_type}_seg_rle"]) else None for
                     _seg_type in ["lb", "sb", "st"]]
        seg_overlay = get_overlay(demo_ex.f_path, _rle_strs, img_shape=(demo_ex.slice_w, demo_ex.slice_h))

        plt.figure(figsize=(12, 12))
        plt.imshow(seg_overlay)
        plt.title(f"Segmentation Overlay For ID: {demo_ex.id}", fontweight="bold")
        handles = [Rectangle((0, 0), 1, 1, color=_c) for _c in
                   [(0.667, 0.0, 0.0), (0.0, 0.667, 0.0), (0.0, 0.0, 0.667)]]
        labels = ["Large Bowel Segmentation Map", "Small Bowel Segmentation Map", "Stomach Segmentation Map"]
        plt.legend(handles, labels)
        plt.axis(False)
        plt.show()

    print("\n\n... SINGLE ID EXPLORATION FINISHED ...\n\n")


In [None]:
print("\n... SINGLE ID EXPLORATION STARTED ...\n\n")

# Pick an id to visualize
DEMO_ID = "case123_day20_slice_0082"
demo_ex = train_df[train_df.id == DEMO_ID].squeeze()  # squeeze the single-row DataFrame (1, 18) into a Series (18,)

print(f"\n... WITH DEMO_ID=`{DEMO_ID}` WE HAVE THE FOLLOWING DEMO EXAMPLE TO WORK FROM... \n\n")
display(demo_ex.to_frame())  # Convert Series to DataFrame.

print(f"\n\n... LET'S PLOT THE IMAGE FIRST ...\n")
plt.figure(figsize=(12, 12))
plt.imshow(open_gray16(demo_ex.f_path), cmap="gray")
plt.title(f"Original Grayscale Image For ID: {demo_ex.id}", fontweight="bold")
plt.axis(False)
plt.show()

print(f"\n\n... LET'S PLOT THE 3 SEGMENTATION MASKS ...\n")

plt.figure(figsize=(20, 10))
for i, _seg_type in enumerate(["lb", "sb", "st"]):
    if pd.isna(demo_ex[f"{_seg_type}_seg_rle"]): continue
    plt.subplot(1, 3, i + 1)
    plt.imshow(rle_decode(demo_ex[f"{_seg_type}_seg_rle"], shape=(demo_ex.slice_w, demo_ex.slice_h), color=1))
    plt.title(f"RLE Encoding For {SF2LF[_seg_type]} Segmentation", fontweight="bold")
    plt.axis(False)
plt.tight_layout()
plt.show()

print(f"\n\n... LET'S PLOT THE IMAGE WITH AN RGB SEGMENTATION MASK OVERLAY ...\n")

# We need to normalize the loaded image values to be between 0 and 1 or else our plot will look weird
_img = open_gray16(demo_ex.f_path, to_rgb=True)
_img = ((_img - _img.min()) / (_img.max() - _img.min())).astype(np.float32)
_seg_rgb = np.stack([rle_decode(demo_ex[f"{_seg_type}_seg_rle"], shape=(demo_ex.slice_w, demo_ex.slice_h), color=1) if
                     not pd.isna(demo_ex[f"{_seg_type}_seg_rle"]) else
                     np.zeros((demo_ex.slice_w, demo_ex.slice_h)) for
                     _seg_type in ["lb", "sb", "st"]], axis=-1).astype(np.float32)
seg_overlay = cv2.addWeighted(src1=_img, alpha=0.99,
                              src2=_seg_rgb, beta=0.33, gamma=0.0)

plt.figure(figsize=(12, 12))
plt.imshow(seg_overlay)
plt.title(f"Segmentation Overlay For ID: {demo_ex.id}", fontweight="bold")
handles = [Rectangle((0, 0), 1, 1, color=_c) for _c in [(0.667, 0.0, 0.0), (0.0, 0.667, 0.0), (0.0, 0.0, 0.667)]]
labels = ["Large Bowel Segmentation Map", "Small Bowel Segmentation Map", "Stomach Segmentation Map"]
plt.legend(handles, labels)
plt.axis(False)
plt.show()

print(f"\n\n... LET'S PRINT THE RELEVANT INFORMATION ...\n")
print(f"\t--> IMAGE CASE ID              : {demo_ex.case_id}")
print(f"\t--> IMAGE DAY NUMBER           : {demo_ex.day_num}")
print(f"\t--> IMAGE SLICE WIDTH          : {demo_ex.slice_w}")
print(f"\t--> IMAGE SLICE HEIGHT         : {demo_ex.slice_h}")
print(f"\t--> IMAGE PIXEL SPACING WIDTH  : {demo_ex.px_spacing_w}")
print(f"\t--> IMAGE PIXEL SPACING HEIGHT : {demo_ex.px_spacing_h}")

print("\n\n... SINGLE ID EXPLORATION FINISHED ...\n\n")

# cleanup
del _img, _seg_rgb, seg_overlay


In [None]:
# Plot 3 random ids where all three organ masks are present (max one id per case)
N_TO_PLOT = 3
for _id in train_df[train_df.n_segs == 3].groupby("case_id")["id"].first().sample(N_TO_PLOT):
    examine_id(_id)

### Investigate the segmentation 
The count plot shows that large bowel masks are the most common, with fewer stomach and small bowel masks.<br>
Each image can have up to 3 segmentation masks; let's check how many images have no segmentation mask at all.

In [None]:
def get_seg_combo_str(row):
    seg_str_list = []
    if row["lb_seg_flag"]: seg_str_list.append("Large Bowel")
    if row["sb_seg_flag"]: seg_str_list.append("Small Bowel")
    if row["st_seg_flag"]: seg_str_list.append("Stomach")
    if len(seg_str_list) > 0:
        return ", ".join(seg_str_list)
    else:
        return "No Mask"


train_df["seg_combo_str"] = train_df.progress_apply(get_seg_combo_str, axis=1)

fig = px.histogram(train_df, train_df["n_segs"].astype(str), color="seg_combo_str",
                   title="<b>Number of Segmentation Masks Per Image</b>",
                   labels={"x": "Number of Segmentation Masks Per Image",
                           "seg_combo_str": "<b>Segmentation Masks Present</b>"})
fig.show()

The distribution shows that most of the images do not have a segmentation mask.

Now let's print the summarized data and check the numbers.

In [None]:
print("Total number of images : ", train_df.shape[0])
no_ann_df = train_df[~train_df['lb_seg_flag'] & ~train_df['sb_seg_flag'] & ~train_df['st_seg_flag']]
print("Number of images without any annotation : ", no_ann_df.shape[0], "fraction : ",
      round(no_ann_df.shape[0] / train_df.shape[0], 2))
print("Number of images with 1 annotation or more : ",
      (train_df.shape[0] - no_ann_df.shape[0]), "fraction : ",
      round((train_df.shape[0] - no_ann_df.shape[0]) / train_df.shape[0], 2))
print("Number of Stomach annotation: ", train_df[train_df['st_seg_flag']].shape[0])
print("Number of Small bowel annotation: ", train_df[train_df['sb_seg_flag']].shape[0])
print("Number of Large bowel annotation: ", train_df[train_df['lb_seg_flag']].shape[0])

From this observation we can learn that:<br>
Most of the images that have only 1 segmentation mask contain the **Stomach** mask.<br>
Images that have a small bowel segmentation mask usually also have a large bowel segmentation mask.<br>
As we saw in the previous cell, most of the images don't have any segmentation mask. <br>
Let's plot only the images that have at least one mask to see the distribution of the annotations.


In [None]:
def add_seg_str(x):
    list_seg = []
    if x["lb_seg_flag"]: list_seg.append("Large Bowel")
    if x["sb_seg_flag"]: list_seg.append("Small Bowel")
    if x["st_seg_flag"]: list_seg.append("Stomach")
    if len(list_seg) > 0:
        return ", ".join(list_seg)
    else:
        return "No Mask"


train_df["seg_str"] = train_df.apply(add_seg_str, axis=1)
fig = sns.catplot(x="n_segs", kind="count",hue = 'seg_str', data=train_df[train_df['n_segs']>0]).set(title = 'Segmentation masks per image')
fig.set_xlabels("number of segmentation masks")
fig.set_ylabels("number of images")

It is now clear that most of the images with 2 segmentation masks have Large Bowel and Small Bowel masks.
Let's investigate the images with only 1 annotation.

In [None]:
n_sb = train_df[(train_df['seg_str'] == 'Small Bowel') & ( train_df['n_segs'] == 1)].shape[0]
n_st = train_df[(train_df['seg_str'] == 'Stomach') & ( train_df['n_segs'] == 1)].shape[0]
n_lb = train_df[(train_df['seg_str'] == 'Large Bowel') & ( train_df['n_segs'] == 1)].shape[0]
bar_df = pd.DataFrame([['Small Bowel',n_sb],['Stomach',n_st],['Large Bowel',n_lb]],columns=['Type','Number of images'])
sns.barplot(x="Type", y="Number of images", data=bar_df).set_title("Only 1 mask of segmentation")

# Clean up
del n_sb
del n_st
del n_lb

Now we can clearly see that the vast majority of images with a single segmentation mask contain the stomach mask.

## Investigate Image Sizes 
While observing the images we identified different image sizes. We can list the sizes and plot their distribution.

In [None]:
# Count the examples
total_obs = train_df['id'].count()
print("Total Observations:", total_obs)

train_df_slices_subset = train_df.drop_duplicates(subset=["slice_w", "slice_h"])
color = "(" + train_df_slices_subset["slice_w"].astype(str) + "x" + train_df_slices_subset["slice_h"].astype(str) + ")"

fig = px.scatter(train_df_slices_subset, x="slice_w", y="slice_h",
                 size=train_df.groupby(["slice_w", "slice_h"])["id"].transform("count").iloc[train_df_slices_subset.index],
                 color=color,
                 title="<b>The Various Image Sizes</b>",
                 labels={"color": "<b>Size Legend</b>",
                         "size": "<b>Total Observations</b>",
                         "slice_h": "<b>Image Slice Height in pixels</b>",
                         "slice_w": "<b>Image Slice Width in pixels</b>"},
                 size_max=128)
fig.show()

<b>After observing the image sizes we can see that:</b>
* There are **38,496** total examples.
* There are **4** unique sizes:
    * $234 \times 234$
        * Least frequent image size
        * Smallest image size
        * Only 144 of the 38,496 occurrences are this size (0.37%)
    * $266 \times 266$
        * Most frequent image size
        * Second smallest image size
        * 25,920 of the 38,496 occurrences are this size (67.33%)
    * $276 \times 276$
        * Second least frequent image size
        * Second largest image size
        * 1,200 of the 38,496 occurrences are this size (3.12%)
    * $310 \times 360$
        * Second most frequent image size
        * Largest image size
        * 11,232 of the 38,496 occurrences are this size (29.18%)
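
The shares above can be re-derived with a few lines of arithmetic. The sketch assumes per-size counts of 144, 25,920, 1,200 and 11,232, which are the values consistent with the stated total of 38,496 and the listed percentages; the dictionary is typed in by hand here and is not read from the dataframe.

```python
# Counts per image size, entered by hand (assumed values, see the note above)
size_counts = {"234x234": 144, "266x266": 25920, "276x276": 1200, "310x360": 11232}

total = sum(size_counts.values())
shares = {size: round(100 * count / total, 2) for size, count in size_counts.items()}

print("total:", total)
for size, share in shares.items():
    print(f"{size}: {share}%")
```

Checking that the counts add up to the total is a cheap way to catch a mistyped figure in a hand-written summary like this one.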

## Investigate Pixel Spacing 
While observing the images we identified different pixel spacings. We can list the spacings and plot their distribution.
Pixel spacing defines the physical distance in the patient between the centers of adjacent pixels, specified by a numeric pair: adjacent row spacing, then adjacent column spacing, in mm. See Section 10.7.1.3 of the DICOM standard (PS3.3) for further explanation.
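
Since pixel spacing is given in mm per pixel, it can be used to convert a mask area from pixels to physical units. The helper below is a hypothetical illustration of that conversion; it assumes each pixel covers a rectangle of `spacing_h_mm` by `spacing_w_mm` millimetres.

```python
def mask_area_mm2(n_pixels, spacing_h_mm, spacing_w_mm):
    """Convert a mask area in pixels to square millimetres."""
    return n_pixels * spacing_h_mm * spacing_w_mm


# The same 1,000-pixel mask covers a different physical area under each spacing
area_150 = mask_area_mm2(1000, 1.50, 1.50)
area_163 = mask_area_mm2(1000, 1.63, 1.63)

print(area_150, area_163)
```

This is one reason to keep the spacing columns around: mask areas measured in pixels are not directly comparable between scans acquired with different spacings.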


In [None]:
# One representative row per unique pixel spacing pair
train_df_spacing_subset = train_df.drop_duplicates(subset=["px_spacing_w", "px_spacing_h"])
color = "(" + train_df_spacing_subset["px_spacing_w"].astype(str) + "x" + train_df_spacing_subset["px_spacing_h"].astype(str) + ")"
fig = px.scatter(train_df_spacing_subset, x="px_spacing_w", y="px_spacing_h",
                 size=train_df.groupby(["px_spacing_w", "px_spacing_h"])["id"].transform("count").iloc[train_df_spacing_subset.index],
                 color=color,
                 title="<b>The Various Pixel Spacings</b>",
                 labels={"color": "<b>Pixel Spacing Legend</b>",
                         "size": "<b>Number Of Observations</b>",
                         "px_spacing_h": "<b>Pixel Spacing Height in mm</b>",
                         "px_spacing_w": "<b>Pixel Spacing Width in mm</b>"},
                 size_max=128)
fig.show()

<b>Pixel spacing observation tells us that:</b>
* There are 38,496 total examples.
* There are only 2 unique sets of pixel spacings:
    * $1.50mm \times 1.50mm$
        * Most frequent pixel spacing
        * Smallest pixel spacing (barely)
        * 37,296 of the 38,496 occurrences have this spacing (96.88%)
    * $1.63mm \times 1.63mm$
        * Least frequent pixel spacing
        * Largest pixel spacing (barely)
        * 1,200 of the 38,496 occurrences have this spacing (3.12%)

TBD: check the predictions according to the pixel spacing.

## Investigate mask sizes/areas
Let's look at the mask areas for each class. Large differences in mask size may cause issues in segmentation, and a clear pattern could help us make better choices.

In [None]:
def get_mask_area(rle):
    "Returns the sum of area of the rle mask"
    if pd.isna(rle):
        return None
    # sum every other number in an RLE
    return sum([int(x) for x in rle.split()[1::2]])


train_df["lb_seg_area"] = train_df.lb_seg_rle.apply(get_mask_area)
train_df["sb_seg_area"] = train_df.sb_seg_rle.apply(get_mask_area)
train_df["st_seg_area"] = train_df.st_seg_rle.apply(get_mask_area)

fig = px.histogram(train_df, ["lb_seg_area", "sb_seg_area", "st_seg_area"],
                   title="<b>Mask Areas Overlaid</b>",
                   barmode="overlay",
                   labels={"value": "<b>Mask Area</b>"})
fig.show()

The distribution of mask areas is roughly bell-shaped, although it skews slightly toward smaller areas.
All three distributions are similar, although the Stomach distribution has an odd gap between 400 and 750 pixels.
It's interesting to note that, while not common, we do have some very large masks (>7500 pixels).
Surprisingly, the biggest masks belong to the small bowel.
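
As a small worked example of where these areas come from: a run-length encoding is a sequence of `start length` pairs, so the mask area is simply the sum of the lengths. The sketch below uses a made-up RLE string and assumes 1-based start positions over a flattened image; the helper names are hypothetical and only mirror the idea behind `get_mask_area`.

```python
def rle_area(rle):
    """Sum the run lengths (every second number) of a 'start length ...' RLE string."""
    return sum(int(x) for x in rle.split()[1::2])


def rle_to_flat_mask(rle, n_pixels):
    """Decode an RLE string into a flat 0/1 list, assuming 1-based start positions."""
    mask = [0] * n_pixels
    tokens = [int(x) for x in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        for i in range(start - 1, start - 1 + length):
            mask[i] = 1
    return mask


# Two runs: 2 pixels starting at position 3 and 4 pixels starting at position 8.
toy_rle = "3 2 8 4"
flat = rle_to_flat_mask(toy_rle, n_pixels=12)

print(flat)
print(rle_area(toy_rle), sum(flat))
```

Both ways of counting agree, which is why the area can be read off the RLE string without decoding the mask.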

In [None]:
print("\n\n\n... EXAMPLE WITH A LARGE AMOUNT OF SEGMENTATION MASK ...\n")
examine_id("case134_day22_slice_0102")

## Case ID's

The host described the case id as:
<br>
"Each case in this competition is represented by multiple sets of scan slices (each set is identified by the day the scan took place). Some cases are split by time (early days are in train, later days are in test) while some cases are split by case - the entirety of the case is in train or test. The goal of this competition is to be able to generalize to both partially and wholly unseen cases."

In the following section we will look at the distribution of images per case ID.<br>

In [None]:
fig = px.histogram(train_df, train_df.case_id.astype(str),
                   color="day_num_str",
                   title="<b>Distribution Of Images Per Case ID</b>",
                   labels={"x":"<b>Case ID</b>", "day_num_str": "<b>The Day The Scan Took Place</b>"},
                   text_auto=True,
                   width=2000)
fig.show()

When we colour by day, we can see that all cases are made up (mostly) of groups of **144**, or less frequently, **80**, images from different days.

Now let's see the distribution according to the segmentation masks.

In [None]:
seg_masks = ['lb_seg_flag', 'sb_seg_flag', 'st_seg_flag']

for seg_mask in seg_masks:
    sub_df = train_df[train_df[seg_mask]]
    fig = px.histogram(sub_df, sub_df.case_id.astype(str), color="day_num_str",
                       title=f"<b>Distribution Of Images Per Case ID with {seg_mask}</b>",
                       labels={"x": "<b>Case ID</b>", "day_num_str": "<b>The Day The Scan Took Place</b>"},
                       text_auto=True, width=2000)
    fig.show()

When we look at the distribution per segmentation mask, we can see that the image counts differ between the three masks.
<p>
    Now, let's look at the distribution of images per case ID, coloured by the segmentation masks present.
</p>

In [None]:
fig = px.histogram(train_df, train_df.case_id.astype(str), color="seg_str",
                   title="<b>Distribution Of Images Per Case ID and Segmentation masks</b>",
                   labels={"x": "<b>Case ID</b>", "day_num_str": "<b>The Day The Scan Took Place</b>"}, text_auto=True,
                   width=2000)
fig.show()

"No mask" appears to be the most common category, as we could estimate from the previous section.<br>
Most of the cases contain segmentation masks of the large bowel and small bowel.<br>
The majority of the cases have about 700 images, and almost 400 of those 700 have no segmentation mask.<br>
We can expect the model to cope well when both organs are in the picture and to identify both of their segmentation masks.

### Plot case id sequence data
Let's plot the sequence data to see how the organs move through the images and whether there is a pattern.

In [None]:
def plot_case(case_id,day = None,df = train_df,_figsize = (20,30),num_cols =10,max_rows = 10):
    """
    Function to plot all images of a single case including segmentation masks
    """
    print(f"----Plotting images for case {case_id}----")
    
    # initialize
    case_df = df[df['case_id']==case_id]
    # Handle specific day
    if day is not None:
        _case_df = case_df[case_df.day_num == day]
        # check if there is no pictures in that day
        if len(_case_df) == 0:
            print("there are no samples in the day specified")
        else:
            shrink_ratio = len(_case_df)/len(case_df) # save shrink ratio
            case_df = _case_df
            # change fig size to match the new ratio (original row,original col*shrink ratio)
            _figsize = (_figsize[0],int(1.25*_figsize[1]* shrink_ratio))
        del _case_df # clean up
    ex_num = len(case_df)
    
    # Get relevant data
    case_img_path = case_df.f_path.tolist()
    case_rle = [_rles for _rles in case_df[["lb_seg_rle", "sb_seg_rle", "st_seg_rle"]].values.tolist()]
    case_img_shapes = [(_w,_h) for _w,_h in case_df[["slice_w","slice_h"]].values.tolist()]
    all_overlays = [get_overlay(img_path, rle_strs, img_shape) for img_path, rle_strs, img_shape in zip(case_img_path, case_rle, case_img_shapes)]
    
    # Plot the images of the case
    num_rows = int(np.ceil(ex_num/num_cols))
    
    if num_rows > max_rows:
        num_rows = max_rows
    
    # Define the grid cells
    fig, axs = plt.subplots(ncols=num_cols, nrows = num_rows,figsize = _figsize,sharey = True)
    fig.suptitle(f'Case images - Case {case_id}')

    # Hide every axis first so that unused grid cells stay blank
    for ax in np.ravel(axs):
        ax.axis("off")
    # Draw the overlays in slice order, filling the grid row by row (extra images are skipped)
    for ax, img in zip(np.ravel(axs), all_overlays):
        ax.imshow(img, cmap="gray", aspect='auto')
    # adjust the height and width space and show figure
    plt.subplots_adjust(wspace=.001, hspace=.001)
    plt.show()
    

# Plot some cases
plot_case(7)


As shown in the plotted case, the first and last images don't contain any segmentation.
<br>Let's check the maximum and minimum number of images per case to look for outliers.

In [None]:
case_count = train_df.groupby(by=['case_id']).count().reset_index()[['case_id', "id"]]
mincase_id, min_img_per_case = case_count.iloc[case_count['id'].idxmin()]
maxcase_id, max_img_per_case = case_count.iloc[case_count['id'].idxmax()]
print(f"The MAX images per case is {max_img_per_case} in the case {maxcase_id}")
print(f"The MIN images per case is {min_img_per_case} in the case {mincase_id}")

# Clean up
del case_count

Let's plot the distribution of segmentation masks for these two cases.

In [None]:
case_df = train_df[train_df['case_id'] == mincase_id]
fig = px.histogram(case_df, case_df.seg_str.astype(str), color='seg_str',
                   title=f"<b>Distribution Of Images for min Case ID and Segmentation masks</b>",
                   labels={"x": "<b>Case ID</b>", "day_num_str": "<b>The Day The Scan Took Place</b>"}, text_auto=True,
                   width=2000)
fig.show()
case_df = train_df[train_df['case_id'] == maxcase_id]
fig = px.histogram(case_df, case_df.seg_str.astype(str), color='seg_str',
                   title=f"<b>Distribution Of Images for max Case ID and Segmentation masks</b>",
                   labels={"x": "<b>Case ID</b>", "day_num_str": "<b>The Day The Scan Took Place</b>"}, text_auto=True,
                   width=2000)
fig.show()
# Clean up
del case_df

The main difference between them is the Stomach annotation.<br>
Overall, the distribution agrees with our previous analysis.

### Mask dataset creation, class overlap.

It's important to determine whether the masks overlap one another (**multilabel**) or not (**multiclass**). To do this, we will quickly create a dataset of **`npy`** files. During this creation process we will check for overlap.

We save the masks as `npy` files because they are quick to write and read.
<br><center><img src="https://miro.medium.com/max/492/1*xwpjjSdZwiOMnPJtdp9L2w.png" width=50%></center><br>

In [None]:
def get_row_masks(row):
    _slice_shape = (row.slice_w, row.slice_h)

    if not pd.isna(row.lb_seg_rle):
        lb_mask = rle_decode(row.lb_seg_rle, _slice_shape, )
    else:
        lb_mask = np.zeros(_slice_shape)
    if not pd.isna(row.sb_seg_rle):
        sb_mask = rle_decode(row.sb_seg_rle, _slice_shape)
    else:
        sb_mask = np.zeros(_slice_shape)
    if not pd.isna(row.st_seg_rle):
        st_mask = rle_decode(row.st_seg_rle, _slice_shape)
    else:
        st_mask = np.zeros(_slice_shape)
    return lb_mask, sb_mask, st_mask


def is_overlap(_arr):
    return _arr.sum(axis=-1).max()>1


def get_mask_overlap(row, check_overlap=False):
    lb_mask, sb_mask, st_mask = get_row_masks(row)

    mask_arr = np.stack([lb_mask, sb_mask, st_mask], axis=-1).astype(np.uint8)
    # Save into NPY_DIR, the directory that is created before this function is applied
    np.save(os.path.join(NPY_DIR, f"{row.id}_mask"), mask_arr)

    if check_overlap:
        if is_overlap(mask_arr):
            return np.where(mask_arr.sum(axis=-1) > 1, 1, 0).sum()
        else:
            return 0
    
if not os.path.isdir(NPY_DIR): os.makedirs(NPY_DIR, exist_ok=True)
train_df["seg_overlap_area"] = train_df.progress_apply(lambda x: get_mask_overlap(x, check_overlap=True), axis=1)

print("\n... LET'S EXAMINE THE IMAGE WITH THE HIGHEST AMOUNT OF OVERLAP ...\n")

examine_id(train_df[train_df.seg_overlap_area==train_df.seg_overlap_area.max()].id.values[0])

fig = px.histogram(train_df[train_df.seg_overlap_area>0], "seg_overlap_area", color="seg_combo_str", nbins=50,
                   log_y=True, title="<b>Distribution of Non-Zero Segmentation Overlaps <sub>(Count Is Logarithmic)</sub></b>",  
                   labels={"seg_overlap_area":"<b>Area of Mask Overlap</b>", 
                           "seg_combo_str":"<b>Segmentation Masks In Image</b>"})
fig.update_layout(legend=dict(
    yanchor="top",
    y=0.99,
    xanchor="right",
    x=0.995
))
fig.show()

### Main observation:

* There is overlap, and while it is not that common, some images exhibit a high degree of overlap.
* This means that we cannot frame the problem as simple categorical semantic segmentation.
* We must instead frame the problem as multi-label semantic segmentation
* This means our mask will take the form --> $W \times H \times 3$
    * Where the channel dimensions are binary masks for each respective segmentation type
    * This will allow for the masks to overlap


**Note on the plotted image above:**
* In the examined image above we can see a section of the small bowel that lies completely inside a larger section of the large bowel.
* This shows why treating this as multi-label semantic segmentation is so important!
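
The overlap check can be illustrated on a toy example. The sketch below uses a made-up 1x4 "image" with one binary channel per organ and plain Python lists instead of NumPy arrays; the variable names are illustrative only.

```python
# Toy 1x4 image; one binary channel per organ (values are made up).
large_bowel = [1, 1, 0, 0]
small_bowel = [0, 1, 1, 0]
stomach = [0, 0, 0, 1]

# Stack the channels per pixel, the pure-Python analogue of a W x H x 3 mask.
multilabel_mask = list(zip(large_bowel, small_bowel, stomach))

# A pixel overlaps when more than one channel is set.
labels_per_pixel = [sum(channels) for channels in multilabel_mask]
overlap_area = sum(1 for n in labels_per_pixel if n > 1)

print(multilabel_mask)
print(f"overlapping pixels: {overlap_area}")
```

The second pixel carries two labels at once, which a single-label (categorical) mask cannot represent.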


### Pixel Values In Our Dataset

It's important to analyse the pixel values because we will need to normalize the data into a format that machine learning models expect (uint8 (0-255) or float32 (0-1)). Without knowing the value range of the images, we may accidentally reduce the intensity resolution of the data when normalizing.


In [None]:
def get_image_vals(row):
    _img = cv2.imread(row.f_path, -1)
    _nonzero_px_count = np.count_nonzero(_img)
    
    row["nonzero_num_pxs"] = _nonzero_px_count
    row["max_px_value"] = _img.max()
    row["min_px_value"] = _img.min()
    row["mean_px_value"] = _img.mean()
    row["nonzero_mean_px_value"] = _img.sum()/_nonzero_px_count
    del _img
    return row

train_df = train_df.progress_apply(get_image_vals, axis=1)

print(f"\n\n\n... UPDATED TRAIN DATAFRAME ...\n")
display(train_df.head())
print("\n\n")

for _c in ["nonzero_num_pxs", "max_px_value", "min_px_value", "mean_px_value", "nonzero_mean_px_value"]:
    print(f"\n... STATS FOR COLUMN --> `{_c}`...")
    print(f"\t--> MIN  VAL: {train_df[_c].min():.1f}")
    print(f"\t--> MEAN VAL: {train_df[_c].mean():.1f}")
    print(f"\t--> MAX  VAL: {train_df[_c].max():.1f}")

Interestingly, the maximum value in the dataset is less than half of the int16 maximum, or a quarter of the uint16 maximum.
* Max Value for UINT16
    * **65535**
* Max Value for INT16
    * **32767**
* Half of Max Value for INT16
    * **16384**
* Actual Max Value in the dataset
    * **15865**
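
To show why the value range matters, here is a minimal sketch of two ways to scale raw intensities to `[0, 1]`. The helper `to_unit_range` and the toy pixel values are assumptions for illustration; this is not necessarily how the notebook's `load_img` normalizes images.

```python
def to_unit_range(pixels, max_value=None):
    """Scale raw intensities to floats in [0, 1]; by default divide by the image's own maximum."""
    peak = max(pixels) if max_value is None else max_value
    if peak == 0:  # all-black slice: avoid division by zero
        return [0.0 for _ in pixels]
    return [p / peak for p in pixels]


raw = [0, 120, 4000, 15865]  # toy uint16-style intensities

per_image = to_unit_range(raw)                    # divide by the observed maximum
full_range = to_unit_range(raw, max_value=65535)  # divide by the uint16 maximum

print(max(per_image), max(full_range))
```

Dividing by the full uint16 range would squeeze all values into roughly the bottom quarter of `[0, 1]`, while scaling by the observed maximum uses the whole interval.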

### Identify Any Heuristics Or Rules Regarding Segmentation

Let's try to find patterns or rules in the given dataset.

In [None]:
train_df["slice_count"] = train_df.id.apply(lambda x: int(x.rsplit("_", 1)[-1]))

print("\n... CASE-ID/DAY-NUM SLICE INFORMATION ...\n")
train_df.groupby(["case_id", "day_num"])["slice_count"].max().value_counts()

In [None]:
slice_to_occurence_df = train_df.groupby("slice_count")[["lb_seg_flag", "sb_seg_flag", "st_seg_flag"]].sum().reset_index()
fig = px.bar(slice_to_occurence_df,
             x="slice_count", y=["lb_seg_flag", "sb_seg_flag", "st_seg_flag"],
             orientation="v",
             labels={
                "slice_count": "<b>Slice Number</b>",
                "value": "<b>Number Of Examples</b>",
             },
             title="<b>Number of Examples Per Slice Number For Our 3 Organs</b>")

fig.update_layout(legend_title="<b>Organ Type Legend</b>")
fig.show()

print("\n... WHICH SLICES ARE ALWAYS BLANK (NO SEG) BY LABEL ...\n")
keep_slice_blank_map = {
    _sh_lbl: slice_to_occurence_df[slice_to_occurence_df[f"{_sh_lbl}_seg_flag"] == 0].slice_count.to_list() for _sh_lbl in ["lb", "sb", "st"]
}
keep_slice_blank_map

For a given **case-id** and **day number** there are two different amounts of scans present
* 144 slices --> 259 instances
* 80 slices ---> 15 instances

Some other observations about our training dataset
* There are no examples for slices number **1, 138, 139, 140, 141, 142, 143 or 144** that have any segmentation masks
* If we break it down by organ we get the following no-value slices for each respective organ
    * Large Bowel – **1, 138, 139, 140, 141, 142, 143, 144**
    * Small Bowel – **1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 138, 139, 140, 141, 142, 143, 144**
    * Stomach – **1, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144**

## Create a 3D GIF for case with mask

Because the images of a scan form a 3D volume of the patient's organs, a good way to look at the sequence data is to animate the slices as a GIF.

In [None]:
def create_animation(case_id, day_num, df=train_df):
    
    sub_df = df[(df.case_id==case_id) & (df.day_num==day_num)]
    
    f_paths  = sub_df.f_path.tolist()
    lb_rles  = sub_df.lb_seg_rle.tolist()
    sb_rles  = sub_df.sb_seg_rle.tolist()
    st_rles  = sub_df.st_seg_rle.tolist()
    slice_ws = sub_df.slice_w.tolist()
    slice_hs = sub_df.slice_h.tolist()
    
    animation_arr = np.stack([
        get_overlay(img_path=_f, rle_strs=(_lb, _sb, _st), img_shape=(_w, _h)) \
        for _f, _lb, _sb, _st, _w, _h in \
        zip(f_paths, lb_rles, sb_rles, st_rles, slice_ws, slice_hs)
    ], axis=0)
    
    fig = plt.figure(figsize=(8,8))
    
    plt.axis('off')
    im = plt.imshow(animation_arr[0])
    plt.title(f"3D Animation for Case {case_id} on Day {day_num}", fontweight="bold")
    
    def animate_func(i):
        im.set_array(animation_arr[i])
        return [im]
    plt.close()
    
    return animation.FuncAnimation(fig, animate_func, frames = animation_arr.shape[0], interval = 1000//12)


create_animation(case_id=115, day_num=0)

## Scan with errors
While observing these cases and reading the competition forum, we noted that the following scans have wrong segmentation masks:

https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/discussion/319963

* Case 7 – Day 0
* Case 81 – Day 30

In [None]:
problem_case_1 = 7
problem_day_1 = 0
problem_case_2 = 81
problem_day_2 = 30

print("\n... PROBLEM CASE NUMBER 1 ...\n")
create_animation(case_id=problem_case_1, day_num=problem_day_1)
print("\n... PROBLEM CASE NUMBER 2 ...\n")
create_animation(case_id=problem_case_2, day_num=problem_day_2)

## Remove broken scans

In [None]:
remove_ids = ["case7_day0", "case81_day30"]
for _id in remove_ids:
    # Match the full case/day prefix so that other cases or days are never removed by accident
    train_df = train_df[~train_df.id.str.startswith(f"{_id}_")].reset_index(drop=True)

### Approximate Class Weighting (Categorical Assumption)

Let us calculate a naive approximation of the class weights based on how frequently each class occurs. For the purpose of this investigation we will treat the background as its own class (probably the most common class).


In [None]:
# Get total image area
train_df["img_px_area"] = train_df["slice_w"] * train_df["slice_h"]
# Assign the result back: fillna(inplace=True) on a column selection only modifies a temporary copy
seg_area_cols = ["lb_seg_area", "sb_seg_area", "st_seg_area"]
train_df[seg_area_cols] = train_df[seg_area_cols].fillna(0)
train_df["bg_area"] = (train_df["img_px_area"] - train_df[seg_area_cols].sum(axis=1)).astype(int)

print(f"\nALL TRAINING DATA PIXEL COUNT         : {train_df.img_px_area.sum()}")
print(f"BACKGROUND TRAINING DATA PIXEL COUNT  : {train_df.bg_area.sum()}")
print(f"LARGE BOWEL TRAINING DATA PIXEL COUNT : {train_df.lb_seg_area.sum()}")
print(f"SMALL BOWEL TRAINING DATA PIXEL COUNT : {train_df.sb_seg_area.sum()}")
print(f"STOMACH TRAINING DATA PIXEL COUNT     : {train_df.st_seg_area.sum()}\n")

print(f"\nALL TRAINING DATA PIXEL COUNT (%)         : %{100:.4f}")
print(f"BACKGROUND TRAINING DATA PIXEL COUNT (%)  : %{100 * train_df.bg_area.sum() / train_df.img_px_area.sum():.4f}")
print(f"LARGE BOWEL TRAINING DATA PIXEL COUNT (%) : %{100 * train_df.lb_seg_area.sum() / train_df.img_px_area.sum():.4f}")
print(f"SMALL BOWEL TRAINING DATA PIXEL COUNT (%) : %{100 * train_df.sb_seg_area.sum() / train_df.img_px_area.sum():.4f}")
print(f"STOMACH TRAINING DATA PIXEL COUNT (%)     : %{100 * train_df.st_seg_area.sum() / train_df.img_px_area.sum():.4f}\n")

From the results we can verify the pixel count for each label:
* Class **0** - **Background**
  * Total Pixel Count In Training Dataset = **3113614754** 
* Class **1** - **Large Bowel**
  * Total Pixel Count In Training Dataset = **21827402**
* Class **2** - **Small Bowel**
  * Total Pixel Count In Training Dataset = **19898898**
* Class **3** - **Stomach**
  * Total Pixel Count In Training Dataset = **11064002**
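
One simple way to turn these counts into class weights is inverse-frequency weighting. The sketch below is only an illustration under the categorical assumption (overlapping pixels are ignored); the normalization so that the weights sum to 1 is an arbitrary choice, and the notebook does not necessarily use these weights in training.

```python
# Pixel counts reported above for each class.
pixel_counts = {
    "background": 3113614754,
    "large_bowel": 21827402,
    "small_bowel": 19898898,
    "stomach": 11064002,
}

total = sum(pixel_counts.values())
frequencies = {name: count / total for name, count in pixel_counts.items()}

# Inverse-frequency weights, rescaled so that they sum to 1.
inverse = {name: 1.0 / freq for name, freq in frequencies.items()}
norm = sum(inverse.values())
weights = {name: value / norm for name, value in inverse.items()}

for name in pixel_counts:
    print(f"{name:12s} frequency={frequencies[name]:.4%} weight={weights[name]:.4f}")
```

The rarer a class is, the larger its weight: the stomach receives the highest weight and the background the lowest.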


In [None]:
# Remove unnecessary files
for dirname, _, filenames in os.walk(NPY_DIR):
    for filename in filenames:
        os.remove(os.path.join(dirname, filename))

# Modeling 

Before we start running ML algorithms to perform semantic segmentation, let's put the data into a format the model can consume.
That means we need to:
1. Create dataset folds - validation and training
2. Create the mask dataset

### Models constants

In [None]:
# Debug mode
DEBUGE = False
DEB_EXAMPLES = 20 # Number of training examples used in debug mode
# Number of folds for cross validation
N_FOLDS = 4
IMAGE_SHAPE = SEG_SHAPE = (256,256)
# Batch sizes, only used when we are not in debug mode
TR_BATCH = 32  # 64 and 128 are too big for the memory available on Kaggle
VALID_BATCH = TR_BATCH * 2
# Data loader constant
NUM_WORKERS = 2

# Model configuration
ENCODER = 'efficientnet-b7'
NUM_CLASSES = 3  # Number of organs.
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # Use the GPU if available
# set learning rates
LR = 2e-3
MIN_LR = 1e-6
# Epochs and weight decay
EPOCHS = 10
WD = 1e-6
# Maximum iterations
T_MAX = int(30000/TR_BATCH*EPOCHS)+50
# Number of iterations for the first restart
T_0 = 25

## Cross validation
Cross-Validation has two main steps: splitting the data into subsets (called folds) and rotating the training and validation among them. The splitting technique commonly has the following properties:

* Each fold has approximately the same size.
* Data can be randomly selected in each fold or stratified.
* All folds are used to train the model except one, which is used for validation. That validation fold should be rotated until all folds have become a validation fold once and only once.
* Each example should be contained in one and only one fold.

K-fold and CV are terms that are often used interchangeably; "k" simply describes how many folds the dataset is split into. A common choice is k=10, which sends 90% of the data to training and 10% to validation in each round. In this notebook we use `N_FOLDS = 4` and group the folds by case ID, so that all images of a case fall into the same fold.
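
The idea of a grouped split can be sketched in a few lines of plain Python. This is a simplified greedy assignment written for illustration; it is not scikit-learn's `GroupKFold` implementation, and the function name `group_k_fold` and the toy case IDs are assumptions.

```python
from collections import Counter


def group_k_fold(groups, n_splits):
    """Yield (train_idx, val_idx) pairs so that no group is split across train and validation.

    Simplified greedy assignment: the largest groups are placed first,
    each into the fold that currently holds the fewest samples.
    """
    sizes = Counter(groups)
    fold_sizes = [0] * n_splits
    group_to_fold = {}
    for group, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        lightest = fold_sizes.index(min(fold_sizes))
        group_to_fold[group] = lightest
        fold_sizes[lightest] += size
    for fold in range(n_splits):
        val_idx = [i for i, g in enumerate(groups) if group_to_fold[g] == fold]
        train_idx = [i for i, g in enumerate(groups) if group_to_fold[g] != fold]
        yield train_idx, val_idx


# Toy data: 10 images belonging to 4 cases.
case_ids = [7, 7, 7, 15, 15, 22, 22, 22, 22, 30]
splits = list(group_k_fold(case_ids, n_splits=2))
for fold, (train_idx, val_idx) in enumerate(splits):
    train_cases = sorted({case_ids[i] for i in train_idx})
    val_cases = sorted({case_ids[i] for i in val_idx})
    print(f"fold {fold}: train cases {train_cases}, val cases {val_cases}")
```

Because a case never appears on both sides of a split, the validation score measures how well the model generalizes to unseen patients rather than to unseen slices of a known patient.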

In [None]:
# Use sklearn's GroupKFold so that all images of a case stay in the same fold
gkf = GroupKFold(n_splits=N_FOLDS) 

### train only images with segmentation masks ###
# train_df = train_df[train_df.n_segs>0].reset_index(drop=True)
### train only images with segmentation masks end###

train_df["which_segs"] = train_df.lb_seg_flag.astype(int).astype(str)+\
                         train_df.sb_seg_flag.astype(int).astype(str)+\
                         train_df.st_seg_flag.astype(int).astype(str)

folds = []# list of indexes for validation and training folds.
for train_idxs, val_idxs in gkf.split(train_df["id"], train_df["which_segs"], train_df["case_id"]):
    folds.append([train_idxs,val_idxs])

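# The train/validation index arrays differ in length, so report their sizes instead of np.shape
print(f"Number of folds: {len(folds)}")
print(f"Fold sizes (train, val): {[(len(tr), len(val)) for tr, val in folds]}")

# let's display one training and one validation set to see the data
print("\nFOLD 1: TRAIN DF\n\n")
display(train_df.iloc[folds[0][0]].head())

print("\n\n\n\nFOLD 1: VAL DF\n\n")
display(train_df.iloc[folds[0][1]].head())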


## Mask dataset

We will create our masks to have a shape of $W×H×3$ where each channel is binary mask for a particular segmentation class in the order

* Channel 0 --> "Large Bowel"
* Channel 1 --> "Small Bowel"
* Channel 2 --> "Stomach"

We will first frame this problem as simple categorical segmentation and simply overwrite overlapping values in the order 2 --> 0:
* if Stomach and Small Bowel overlap, the Small Bowel mask overwrites the Stomach mask
* if Large Bowel and Small Bowel overlap, the Large Bowel mask overwrites the Small Bowel mask
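
The overwrite order can be sketched on a toy row of pixels. The helper `to_multiclass` below is hypothetical and works on plain lists; it follows the same label numbering as the multiclass branch of the mask-saving code (large bowel = 1, small bowel = 2, stomach = 3).

```python
def to_multiclass(lb, sb, st):
    """Collapse three binary channels into one label map (0 = background)."""
    out = []
    for l, s, t in zip(lb, sb, st):
        label = 3 if t else 0      # stomach = 3
        label = 2 if s else label  # small bowel overwrites stomach
        label = 1 if l else label  # large bowel overwrites small bowel
        out.append(label)
    return out


# Toy 1x4 row with two overlapping pixels (values are made up).
large_bowel = [1, 1, 0, 0]
small_bowel = [0, 1, 1, 1]
stomach = [0, 0, 0, 1]

labels = to_multiclass(large_bowel, small_bowel, stomach)
print(labels)
```

In the collapsed label map the small bowel disappears from the second pixel and the stomach from the fourth, which is the information the multilabel format keeps.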

The code below can save both versions of the dataset so that we can experiment later; by default only the multilabel version is generated.

At this point we have to determine the image size of our dataset. As most images are fairly small, let's target a size of $256×256$.

In [None]:
def save_get_mask_path(row, output_dir, resize_to):
    """
    Save the mask of a row in the output dir and return its path.
    1. Create the segmentation masks
    2. Determine from the output dir whether the style is multiclass or multilabel
    3. Save the mask with the correct shape.
    """
    lb_mask, sb_mask, st_mask = get_row_masks(row)

    _output_style = "multiclass" if "multiclass" in output_dir else "multilabel"
    if _output_style == "multiclass":
        mask_arr = st_mask * 3  # stomach = 3
        mask_arr = np.where(sb_mask == 1, 2, mask_arr)  # small bowel = 2
        mask_arr = np.where(lb_mask == 1, 1, mask_arr)  # large bowel = 1
    else:
        mask_arr = np.stack([lb_mask, sb_mask, st_mask], axis=-1)

    # resize to image shape
    mask_arr = cv2.resize(mask_arr, resize_to, interpolation=cv2.INTER_NEAREST).astype(np.uint8)
    mask_path = os.path.join(output_dir, f"{row.id}_mask")
    np.save(mask_path, mask_arr)
    return mask_path + ".npy"

# Create both multi label and multi class masks
# styles = ['multilabel','multiclass']
styles = ['multilabel']
for style in styles:
    _output_dir = f"/kaggle/working/{style}/npy_files"
    if not os.path.isdir(_output_dir): os.makedirs(_output_dir, exist_ok=True)
    train_df[f"{style}_mask_path"] = train_df.progress_apply(lambda _row: save_get_mask_path(_row, _output_dir, resize_to=SEG_SHAPE), axis=1)

## Load dataset before training
In this step we will load the data into two subsets, training and validation.

We will build a PyTorch dataset that matches the input format of the model.

Selected parameters:
* batch size = 32. We tried 64 and 128, but the Kaggle machine does not have enough memory to hold that much data.
* validation batch = batch size * 2. Validation does not really need to be split into batches, but to make sure it fits in memory we use batches twice the size of the training batch.


In [None]:
def prepare_loaders(fold, debug=False):
    """
    Get pytorch train and validation data loaders(image paths with masks).
    """
    tr_idxs, val_idxs = folds[fold]  # get train and validation indexes
    if debug:
        tr_idxs = tr_idxs[:DEB_EXAMPLES]
        val_idxs = val_idxs[:DEB_EXAMPLES]
    tr_df = train_df.iloc[tr_idxs].reset_index(drop=True)
    val_df = train_df.iloc[val_idxs].reset_index(drop=True)
    # Build folds datasets
    train_dataset = BuildDataset(tr_df, transforms=data_transforms['train'])
    valid_dataset = BuildDataset(val_df, transforms=data_transforms['valid'])

    train_loader = DataLoader(train_dataset, batch_size=TR_BATCH if not debug else DEB_EXAMPLES,
                              num_workers=NUM_WORKERS, shuffle=True, pin_memory=True, drop_last=False)
    valid_loader = DataLoader(valid_dataset, batch_size=VALID_BATCH if not debug else DEB_EXAMPLES,
                              num_workers=NUM_WORKERS, shuffle=False, pin_memory=True)

    return train_loader, valid_loader

### Image augmentation
Deep neural networks require a lot of training data to obtain good results and prevent overfitting. However, it is often very difficult to get enough training samples. Multiple reasons could make it very hard or even impossible to gather enough data:

To make a training dataset, you need to obtain images and then label them. For example, you need to assign correct class labels if you have an image classification task. For an object detection task, you need to draw bounding boxes around objects. For a **semantic segmentation** task, you need to assign a correct class to each input image pixel. This process requires manual labor, and sometimes it could be very costly to label the training data. For example, to correctly label medical images, you need expensive domain experts.

Sometimes even collecting training images could be hard. There are many legal restrictions for working with healthcare data, and obtaining it requires a lot of effort. Sometimes getting the training images is more feasible, but it will cost a lot of money. For example, to get satellite images, you need to pay a satellite operator to take those photos. To get images for road scene recognition, you need an operator that will drive a car and collect the required data.

Basic augmentation techniques were used in almost all papers that describe the state-of-the-art models for image recognition.

AlexNet was the first model that demonstrated exceptional capabilities of using deep neural networks for image recognition. For training, the authors used a set of basic image augmentation techniques. They resized original images to the fixed size of 256 by 256 pixels, and then they cropped patches of size 224 by 224 pixels as well as their horizontal reflections from those resized images. Also, they altered the intensities of the RGB channels in images.

Successive state-of-the-art models such as Inception, ResNet, and EfficientNet also used image augmentation techniques for training.

For this task we will use the open source package **albumentations**.

#### Basic image augmentation configuration:
* Resize to the target image shape.
* Horizontal flip with a probability of 0.5.
* Shift, scale and rotate the image.
* Apply one of the following transforms:
    *  GridDistortion - Splits the image into a grid of cells and randomly displaces the grid nodes, which locally stretches and compresses the image.
    *  ElasticTransform - Elastic deformation of images as described in [Simard2003]_ (with modifications). Based on https://gist.github.com/ernestum/601cdf56d2b424757de5
* CoarseDropout of rectangular regions in the image.

The validation data is only resized to fit the model, because we want the validation data to stay as close to the real data as possible.
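
One point worth illustrating for segmentation: geometric augmentations must be applied identically to the image and its mask, otherwise the labels no longer line up with the pixels. albumentations handles this when both `image=` and `mask=` are passed to the transform. The pure-Python sketch below only mimics the idea for a horizontal flip; the function `horizontal_flip_pair` is a hypothetical helper, not part of any library.

```python
import random


def horizontal_flip_pair(image, mask, p=0.5, rng=random):
    """Flip image and mask together with probability p so they stay aligned."""
    if rng.random() < p:
        image = [row[::-1] for row in image]
        mask = [row[::-1] for row in mask]
    return image, mask


image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[0, 0, 1],
        [0, 1, 1]]

# p=1.0 forces the flip so the effect is visible.
flipped_image, flipped_mask = horizontal_flip_pair(image, mask, p=1.0)
print(flipped_image)
print(flipped_mask)
```

A single random draw decides the fate of both arrays, so the bright pixels and their labels move together.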


In [None]:
data_transforms = {
    "train": A.Compose([
        A.Resize(*IMAGE_SHAPE, interpolation=cv2.INTER_NEAREST),
        A.HorizontalFlip(p=0.5),
#         A.VerticalFlip(p=0.5),
        A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.05, rotate_limit=10, p=0.5),
        A.OneOf([
            A.GridDistortion(num_steps=5, distort_limit=0.05, p=1.0),
# #             A.OpticalDistortion(distort_limit=0.05, shift_limit=0.05, p=1.0),
            A.ElasticTransform(alpha=1, sigma=50, alpha_affine=50, p=1.0)
        ], p=0.25),
        A.CoarseDropout(max_holes=8, max_height=IMAGE_SHAPE[0]//20, max_width=IMAGE_SHAPE[1]//20,
                         min_holes=5, fill_value=0, mask_fill_value=0, p=0.5),
        ], p=1.0),
    
    "valid": A.Compose([
        A.Resize(*IMAGE_SHAPE, interpolation=cv2.INTER_NEAREST),
        ], p=1.0)
}


## First model training - Unet
Before performing augmentation we will train the base U-Net model and see the results.
<br><center><img src="https://production-media.paperswithcode.com/methods/Screen_Shot_2020-07-07_at_9.08.00_PM_rpNArED.png" width=75%></center>
U-Net is an architecture for semantic segmentation. It consists of a contracting path and an expansive path. The contracting path follows the typical architecture of a convolutional network. It consists of the repeated application of two 3x3 convolutions (unpadded convolutions), each followed by a rectified linear unit (ReLU) and a 2x2 max pooling operation with stride 2 for downsampling. At each downsampling step we double the number of feature channels. Every step in the expansive path consists of an upsampling of the feature map followed by a 2x2 convolution (“up-convolution”) that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3x3 convolutions, each followed by a ReLU. The cropping is necessary due to the loss of border pixels in every convolution. At the final layer a 1x1 convolution is used to map each 64-component feature vector to the desired number of classes. In total the network has 23 convolutional layers.
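
The arithmetic in this description can be checked with a short sketch. It traces the spatial size through the unpadded architecture described above and counts the convolutional layers; the function names are illustrative. Library implementations, such as the one used later in this notebook, typically pad their convolutions, so the shrinkage shown here should not be assumed to apply to them.

```python
def unet_output_size(input_size, depth=4):
    """Trace the spatial size through an unpadded U-Net with `depth` pooling steps."""
    size = input_size
    for _ in range(depth):
        size -= 4  # two unpadded 3x3 convolutions lose 2 pixels each
        if size % 2:
            raise ValueError("size must be even before 2x2 max pooling")
        size //= 2  # 2x2 max pooling with stride 2
    size -= 4  # two 3x3 convolutions at the bottom of the "U"
    for _ in range(depth):
        size *= 2  # 2x2 up-convolution doubles the resolution
        size -= 4  # two unpadded 3x3 convolutions
    return size


def unet_conv_layers(depth=4):
    """Count convolutional layers: 3x3 convolutions, up-convolutions and the final 1x1."""
    contracting = 2 * (depth + 1)  # includes the bottom of the "U"
    expansive = 3 * depth          # one up-convolution plus two 3x3 convolutions per step
    return contracting + expansive + 1


print(unet_output_size(572), unet_conv_layers())
```

With a 572x572 input, the size used in the original U-Net paper, the output is 388x388, and the layer count matches the 23 convolutional layers mentioned above.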

#### Pros:
- Performs well even with small datasets
- Can be used with ImageNet-pretrained encoders

#### Cons:
- Struggles with edge cases
- Semantic gap between the encoder and decoder features joined by the skip connections

### Build model dataset
Using the PyTorch API we will create a dataset for training.

In [None]:
class BuildDataset(torch.utils.data.Dataset):
    def __init__(self, df, label=True, transforms=None,_style = 'multilabel'):
        self.df         = df
        self.label      = label
        self.img_paths  = df['f_path'].tolist()
        self.msk_paths  = df[f"{_style}_mask_path"].tolist()
        self.transforms = transforms
        
    def __len__(self):
        return len(self.df)
    
    def __getitem__(self, index):
        img_path  = self.img_paths[index]
        img = load_img(img_path)
        
        if self.label:
            msk_path = self.msk_paths[index]
            msk = load_msk(msk_path)
            if self.transforms:
                data = self.transforms(image=img, mask=msk)
                img  = data['image']
                msk  = data['mask']
            img = np.transpose(img, (2, 0, 1))
            msk = np.transpose(msk, (2, 0, 1))
            return torch.tensor(img), torch.tensor(msk)
        else:
            if self.transforms:
                data = self.transforms(image=img)
                img  = data['image']
            img = np.transpose(img, (2, 0, 1))
            return torch.tensor(img)

## Model configuration

For the model configuration we need to decide on:
1. Encoder - we selected efficientnet-b7. EfficientNets are designed to be parameter-efficient, and B7 is the largest of the B0-B7 variants.
2. Number of classes - there are 3 classes, one for each organ we need to detect.
3. Device - if a GPU is available we will use it.
4. Encoder weights - to use **transfer learning** we initialize the encoder with weights pre-trained on ImageNet.

In [None]:
def build_model():
    unet_model = smp.Unet(
        encoder_name=ENCODER,      # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
        encoder_weights="imagenet",     # use `imagenet` pre-trained weights for encoder initialization
        in_channels=3,                  # model input channels (1 for gray-scale images, 3 for RGB, etc.)
        classes=NUM_CLASSES,        # model output channels (number of classes in your dataset)
        activation=None,
    )
    unet_model.to(DEVICE) 
    return unet_model

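def load_model(path):
    # rebuild the architecture, then restore the saved weights onto the current device
    model = build_model()
    model.load_state_dict(torch.load(path, map_location=DEVICE))
    model.eval()  # switch to inference mode
    return model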

### Loss functions
For the loss function we will use the multilabel mode.

As a first attempt, we use a combination of TverskyLoss and BCELoss.

#### TverskyLoss
This loss was introduced in "Tversky loss function for image segmentation using 3D fully convolutional deep networks", retrievable here: https://arxiv.org/abs/1706.05721. It was designed to optimise segmentation on imbalanced medical datasets by using constants that adjust how harshly different types of error are penalised in the loss function. From the paper:

... in the case of α=β=0.5 the Tversky index simplifies to be the same as the Dice coefficient, which is also equal to the F1 score. With α=β=1, Equation 2 produces Tanimoto coefficient, and setting α+β=1 produces the set of Fβ scores. Larger βs weigh recall higher than precision (by placing more emphasis on false negatives).
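
To make the special cases in the quote concrete, here is a minimal sketch of the Tversky index computed from pixel counts. It assumes the paper's convention that α weights false positives and β weights false negatives. The function name `tversky_index` and the counts are illustrative only and are not part of this notebook or of `segmentation_models_pytorch`.

```python
def tversky_index(tp, fp, fn, alpha=0.5, beta=0.5):
    """Tversky index from true-positive, false-positive and false-negative counts."""
    return tp / (tp + alpha * fp + beta * fn)


# Illustrative counts: the prediction misses more pixels (fn) than it over-segments (fp).
tp, fp, fn = 80, 10, 30

dice = 2 * tp / (2 * tp + fp + fn)
jaccard = tp / (tp + fp + fn)

print(tversky_index(tp, fp, fn, 0.5, 0.5), dice)     # alpha = beta = 0.5 matches Dice
print(tversky_index(tp, fp, fn, 1.0, 1.0), jaccard)  # alpha = beta = 1 matches Jaccard / Tanimoto
print(tversky_index(tp, fp, fn, 0.3, 0.7))           # larger beta penalises the false negatives more
```

With these counts the β=0.7 setting gives a lower index than the Dice-equivalent setting, because the errors here are mostly false negatives.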

#### BCELoss

Binary cross-entropy is the loss function for binary classification with a single output unit, and categorical cross-entropy is the loss function for multiclass classification. In PyTorch, the categorical cross-entropy loss takes ground-truth labels as integers, for example y=2 out of the three classes 0, 1, and 2. In multilabel mode, each mask channel is instead treated as its own binary problem, so binary cross-entropy is applied to every channel independently.
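
As a rough illustration of binary cross-entropy on raw logits, the sketch below evaluates the loss for a single pixel with three independent class channels. It uses the standard numerically stable formulation `max(x, 0) - x*y + log(1 + exp(-|x|))`. The helper names and the example values are hypothetical, and this is not a copy of the library's implementation.

```python
import math


def bce_with_logits(logit, target):
    """Binary cross-entropy for one logit and one target in [0, 1], computed stably."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def bce_naive(logit, target):
    """Textbook form: apply the sigmoid first, then take logs (can overflow for large logits)."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# One pixel, three class channels; in multilabel mode each channel has its own 0/1 target.
logits = [2.0, -1.0, 0.5]
targets = [1.0, 0.0, 0.0]

losses = [bce_with_logits(x, y) for x, y in zip(logits, targets)]
mean_loss = sum(losses) / len(losses)
print(losses, mean_loss)
```

Working on logits rather than on sigmoid outputs is why `build_model` can leave `activation=None`.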

#### JaccardLoss - IoU
<br><center><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/c7/Intersection_over_Union_-_visual_equation.png/300px-Intersection_over_Union_-_visual_equation.png" width=25%></center>


The metric required in the competition is the Dice metric, so we evaluate our performance according to it.
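
IoU and Dice measure the same overlap on different scales. The sketch below computes both for two small masks represented as sets of pixel coordinates and checks the identity Dice = 2·IoU / (1 + IoU). The masks and function names are made up for illustration, and both masks are assumed to be non-empty.

```python
def iou(pred, truth):
    """Intersection over union of two sets of foreground pixel coordinates."""
    return len(pred & truth) / len(pred | truth)


def dice(pred, truth):
    """Dice coefficient of two sets of foreground pixel coordinates."""
    return 2 * len(pred & truth) / (len(pred) + len(truth))


truth = {(r, c) for r in range(0, 4) for c in range(0, 4)}  # 4x4 square, 16 pixels
pred = {(r, c) for r in range(1, 5) for c in range(1, 5)}   # same square shifted by one pixel

j = iou(pred, truth)
d = dice(pred, truth)
print(j, d, 2 * j / (1 + j))
```

Because Dice is never smaller than IoU for the same pair of masks, the two curves tend to move together during training, with Dice sitting above IoU.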


In [None]:
# Optional loss functions:
JaccardLoss = smp.losses.JaccardLoss(mode='multilabel')
DiceLoss = smp.losses.DiceLoss(mode='multilabel')
BCELoss = smp.losses.SoftBCEWithLogitsLoss()
LovaszLoss = smp.losses.LovaszLoss(mode='multilabel', per_image=False)
TverskyLoss = smp.losses.TverskyLoss(mode='multilabel', log_loss=False)


def dice_coef(y_true, y_pred, thr=0.5, dim=(2, 3), epsilon=0.001):
    y_true = y_true.to(torch.float32)
    y_pred = (y_pred > thr).to(torch.float32)
    inter = (y_true * y_pred).sum(dim=dim)
    den = y_true.sum(dim=dim) + y_pred.sum(dim=dim)
    dice = ((2 * inter + epsilon) / (den + epsilon)).mean(dim=(1, 0))
    return dice


def iou_coef(y_true, y_pred, thr=0.5, dim=(2, 3), epsilon=0.001):
    y_true = y_true.to(torch.float32)
    y_pred = (y_pred > thr).to(torch.float32)
    inter = (y_true * y_pred).sum(dim=dim)
    union = (y_true + y_pred - y_true * y_pred).sum(dim=dim)
    iou = ((inter + epsilon) / (union + epsilon)).mean(dim=(1, 0))
    return iou


def criterion(y_pred, y_true):
    return 0.5 * BCELoss(y_pred, y_true) + 0.5 * TverskyLoss(y_pred, y_true)
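
The `epsilon` term in `dice_coef` and `iou_coef` matters when a mask is empty, for instance on a slice where an organ is absent. The following plain-Python sketch mirrors the same formula on flat 0/1 lists to show the effect. The name `smoothed_dice` is hypothetical and the lists stand in for flattened masks.

```python
def smoothed_dice(y_true, y_pred, epsilon=0.001):
    """Dice with additive smoothing, following the same formula as dice_coef above."""
    inter = sum(t * p for t, p in zip(y_true, y_pred))
    den = sum(y_true) + sum(y_pred)
    return (2 * inter + epsilon) / (den + epsilon)


empty = [0, 0, 0, 0]

both_empty = smoothed_dice(empty, empty)                # no division by zero; counts as a perfect match
one_false_pos = smoothed_dice(empty, [0, 1, 0, 0])      # a single spurious pixel drops the score near 0
perfect = smoothed_dice([1, 1, 0, 0], [1, 1, 0, 0])     # identical non-empty masks

print(both_empty, one_false_pos, perfect)
```

Without the smoothing term the first case would be 0/0. With it, correctly predicting "nothing here" is rewarded, while any false positive on an empty slice is heavily penalised.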

### Train function
For training, we define a `train_one_epoch` function that runs one pass over the training data: it computes the loss, backpropagates the gradients, and updates the network weights.

In [None]:
# define one epoch train
def train_one_epoch(model, optimizer, scheduler, dataloader, device, epoch):
    model.train()
    scaler = amp.GradScaler()
    
    dataset_size = 0
    running_loss = 0.0
    
    pbar = tqdm(enumerate(dataloader), total=len(dataloader), desc='Train ')
    for step, (images, masks) in pbar:         
        images = images.to(device, dtype=torch.float)
        masks  = masks.to(device, dtype=torch.float)
        
        batch_size = images.size(0)
        
        with amp.autocast(enabled=True):
            y_pred = model(images)
            loss   = criterion(y_pred, masks)
            
        scaler.scale(loss).backward()
    
        if (step + 1) % 1 == 0:
            scaler.step(optimizer)
            scaler.update()

            # zero the parameter gradients
            optimizer.zero_grad()

            if scheduler is not None:
                scheduler.step()
                
        running_loss += (loss.item() * batch_size)
        dataset_size += batch_size
        
        epoch_loss = running_loss / dataset_size
        
        mem = torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0
        current_lr = optimizer.param_groups[0]['lr']
        pbar.set_postfix(train_loss=f'{epoch_loss:0.4f}',
                        lr=f'{current_lr:0.5f}',
                        gpu_mem=f'{mem:0.2f} GB')
    torch.cuda.empty_cache()
    gc.collect()
    
    return epoch_loss
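
The `running_loss` bookkeeping in `train_one_epoch` weights each batch loss by its batch size. The sketch below shows why, under the assumption that each batch loss is a mean over the samples in that batch. The numbers and the helper name `running_mean_loss` are illustrative.

```python
def running_mean_loss(batch_losses, batch_sizes):
    """Per-sample mean loss accumulated batch by batch, as in train_one_epoch."""
    running_loss = 0.0
    dataset_size = 0
    for loss, batch_size in zip(batch_losses, batch_sizes):
        running_loss += loss * batch_size
        dataset_size += batch_size
    return running_loss / dataset_size


# The last batch is smaller, which is common when the dataset size is not a multiple of the batch size.
batch_losses = [0.9, 0.7, 0.2]
batch_sizes = [32, 32, 4]

weighted = running_mean_loss(batch_losses, batch_sizes)
unweighted = sum(batch_losses) / len(batch_losses)
print(weighted, unweighted)
```

A plain average over batches would let the 4-sample batch count as much as a 32-sample batch and would understate the epoch loss in this example.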

### Validation function 
The validation function does not compute gradients; it only makes predictions on the validation data.

In [None]:
@torch.no_grad()  # don't compute gradients while evaluating on the validation set
def valid_one_epoch(model, dataloader, device, epoch):
    model.eval()
    
    dataset_size = 0
    running_loss = 0.0
    
    val_scores = []
    
    pbar = tqdm(enumerate(dataloader), total=len(dataloader), desc='Valid ')
    for step, (images, masks) in pbar:        
        images  = images.to(device, dtype=torch.float)
        masks   = masks.to(device, dtype=torch.float)
        
        batch_size = images.size(0)
        
        y_pred  = model(images)
        loss    = criterion(y_pred, masks)
        
        running_loss += (loss.item() * batch_size)
        dataset_size += batch_size
        
        epoch_loss = running_loss / dataset_size
        
        y_pred = nn.Sigmoid()(y_pred)
        val_dice = dice_coef(masks, y_pred).cpu().detach().numpy()
        val_jaccard = iou_coef(masks, y_pred).cpu().detach().numpy()
        val_scores.append([val_dice, val_jaccard])
        
        mem = torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0
        # NOTE: `optimizer` is not an argument here; it is read from the notebook's global scope
        current_lr = optimizer.param_groups[0]['lr']
        pbar.set_postfix(valid_loss=f'{epoch_loss:0.4f}',
                        lr=f'{current_lr:0.5f}',
                        gpu_memory=f'{mem:0.2f} GB')
    val_scores  = np.mean(val_scores, axis=0)
    torch.cuda.empty_cache()
    gc.collect()
    
    return epoch_loss, val_scores

### Run training
The following function runs training for the given number of epochs, using the arguments it receives.

In [None]:
def run_training(model, optimizer, scheduler, device, num_epochs):
    if torch.cuda.is_available():
        print("cuda: {}\n".format(torch.cuda.get_device_name()))
    
    start = time.time()
    best_model_wts = copy.deepcopy(model.state_dict())
    best_dice      = -np.inf
    best_epoch     = -1
    history = defaultdict(list)
    
    for epoch in range(1, num_epochs + 1): 
        gc.collect()
        print(f'Epoch {epoch}/{num_epochs}', end='')
        train_loss = train_one_epoch(model, optimizer, scheduler, 
                                           dataloader=train_loader, 
                                           device=device, epoch=epoch)
        
        val_loss, val_scores = valid_one_epoch(model, valid_loader,
                                                 device=device,
                                                 epoch=epoch)
        val_dice, val_jaccard = val_scores
    
        history['Train Loss'].append(train_loss)
        history['Valid Loss'].append(val_loss)
        history['Valid Dice'].append(val_dice)
        history['Valid Jaccard'].append(val_jaccard)
    
        
        print(f'Valid Dice: {val_dice:0.4f} | Valid Jaccard: {val_jaccard:0.4f}')
        
        # deep copy the model
        if val_dice >= best_dice:
            print(f"Valid Score Improved ({best_dice:0.4f} ---> {val_dice:0.4f})")
            best_dice    = val_dice
            best_jaccard = val_jaccard
            best_epoch   = epoch
            run["Best Dice"]    = best_dice
            run["Best Jaccard"] = best_jaccard
            run["Best Epoch"]   = best_epoch
            best_model_wts = copy.deepcopy(model.state_dict())
            path = f"best_epoch-{fold:02d}.bin"
            torch.save(model.state_dict(), path)
            print(f"Model Saved")
            
        last_model_wts = copy.deepcopy(model.state_dict())
        path = f"last_epoch-{fold:02d}.bin"
        torch.save(model.state_dict(), path)
            
        print(); print()
    
    end = time.time()
    time_elapsed = end - start
    print('Training complete in {:.0f}h {:.0f}m {:.0f}s'.format(
        time_elapsed // 3600, (time_elapsed % 3600) // 60, (time_elapsed % 3600) % 60))
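    # the best checkpoint is selected by Dice, so report it alongside the matching Jaccard
    print("Best Dice: {:.4f} | Best Jaccard: {:.4f}".format(best_dice, best_jaccard))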
    
    # load best model weights
    model.load_state_dict(best_model_wts)
    
    return model, history

### Optimizer
After choosing the optimizer, we add a scheduler to reduce the learning rate during training.
The schedulers we chose to experiment with are:
1. CosineAnnealingLR - https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
2. CosineAnnealingWarmRestarts - https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.html
3. ReduceLROnPlateau - https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html
4. ExponentialLR - https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ExponentialLR.html
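
To give a feel for the first option, the sketch below evaluates the closed-form cosine annealing schedule documented for `CosineAnnealingLR`, η_t = η_min + ½(η_max − η_min)(1 + cos(π·t / T_max)). It is a standalone approximation in plain Python, not the scheduler itself, and the learning-rate values are placeholders rather than the `LR`, `MIN_LR` and `T_MAX` used in this notebook.

```python
import math


def cosine_annealing_lr(step, base_lr, eta_min, t_max):
    """Closed-form cosine annealing: starts at base_lr and reaches eta_min at step t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))


base_lr, eta_min, t_max = 2e-3, 1e-6, 100  # illustrative values only

for step in (0, 25, 50, 75, 100):
    print(step, f"{cosine_annealing_lr(step, base_lr, eta_min, t_max):.6f}")
```

The learning rate falls slowly at first, fastest around the midpoint, and flattens out again as it approaches the minimum.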

In [None]:
def fetch_scheduler(optimizer,scheduler_name):
    if scheduler_name == 'CosineAnnealingLR':
        scheduler = lr_scheduler.CosineAnnealingLR(optimizer,T_max=T_MAX, 
                                                   eta_min=MIN_LR)
    elif scheduler_name == 'CosineAnnealingWarmRestarts':
        scheduler = lr_scheduler.CosineAnnealingWarmRestarts(optimizer,T_0=T_0, 
                                                             eta_min=MIN_LR)
    elif scheduler_name == 'ReduceLROnPlateau':
        scheduler = lr_scheduler.ReduceLROnPlateau(optimizer,
                                                   mode='min',
                                                   factor=0.1,
                                                   patience=7,
                                                   threshold=0.0001,
                                                   min_lr=MIN_LR)
    elif scheduler_name == 'ExponentialLR':
        scheduler = lr_scheduler.ExponentialLR(optimizer, gamma=0.85)
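    elif scheduler_name is None:
        return None
    else:
        # fail loudly instead of hitting an unbound `scheduler` variable below
        raise ValueError(f"Unknown scheduler: {scheduler_name}")

    return scheduler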

### Download the model with initial weights
In this section we download the U-Net model using the `build_model` function that we created.

After that, we set up the optimizer and check the scheduler function.

In [None]:
# Check that the model builds successfully
unet_model = build_model()
optimizer = optim.Adam(unet_model.parameters(), lr=LR, weight_decay=WD)
scheduler = fetch_scheduler(optimizer, 'CosineAnnealingLR')

### Train the model!

In [None]:
scheduler_name = 'CosineAnnealingLR'
# Clean GPU
gc.collect()
torch.cuda.empty_cache()
run = {}
runing_folds = 2  # running all N_FOLDS takes too much time on a Kaggle machine with a 12-hour limit
if DEBUGE: runing_folds = 1
for fold in range(runing_folds):
    print(f'#'*15)
    print(f'### Fold: {fold}')
    print(f'#'*15)
   
    train_loader, valid_loader = prepare_loaders(fold=fold, debug=DEBUGE)
    unet_model = build_model()
    optimizer = optim.Adam(unet_model.parameters(), lr=LR, weight_decay=WD)
    scheduler = fetch_scheduler(optimizer,scheduler_name)
    unet_model, history = run_training(
        unet_model, optimizer, scheduler,
        device=DEVICE,
        num_epochs=EPOCHS
    )
# Show model metrics
for key, value in run.items():
    print(key, ' : ', value)

## Visualize the training and validation loss and Dice

In [None]:
def plot_history(history, title, labels, subplot):
    plt.subplot(*subplot)
    plt.title(title)
    plt.xlabel("Epoch #")
    for label in labels:
        plt.plot(history[label], label=label)
    plt.legend()
    
def plot_fit_result(history):
    plt.figure(figsize=(20, 8))
    plot_history(history, "Training and Validation Loss", ['Train Loss','Valid Loss'], (1, 2, 1))
    plot_history(history, "Dice and IoU", ['Valid Dice','Valid Jaccard'], (1, 2, 2))
    plt.show()

In [None]:
plot_fit_result(history)

### Make predictions
After training the model, let's make predictions on one fold's validation set and look at the results.

In [None]:
tr_idxs, val_idxs = folds[0]  # get train and validation indexes
val_df = train_df.iloc[val_idxs].reset_index(drop=True)

test_dataset = BuildDataset(val_df, label=False, 
                            transforms=data_transforms['valid'])
test_loader  = DataLoader(test_dataset, batch_size=5, 
                          num_workers=4, shuffle=False, pin_memory=True)
imgs = next(iter(test_loader))
imgs = imgs.to(DEVICE, dtype=torch.float)

preds = []
for fold in range(1):
    model = load_model(f"best_epoch-{fold:02d}.bin")
    with torch.no_grad():
        pred = model(imgs)
        pred = (nn.Sigmoid()(pred)>0.5).double()
    preds.append(pred)
    
imgs  = imgs.cpu().detach()
preds = torch.mean(torch.stack(preds, dim=0), dim=0).cpu().detach()
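
The cell above thresholds each fold's prediction and then averages the binary masks. With a single fold this makes no difference, but with several folds a common alternative is to average the probabilities first and threshold once. The sketch below contrasts the two on one pixel in plain Python. The logits and the function names `hard_vote` and `soft_vote` are hypothetical, and this is an option to consider rather than what the notebook does.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hard_vote(fold_logits, thr=0.5):
    """Threshold each fold first, then average the 0/1 votes (the order used in the cell above)."""
    votes = [1.0 if sigmoid(x) > thr else 0.0 for x in fold_logits]
    return sum(votes) / len(votes)


def soft_vote(fold_logits, thr=0.5):
    """Average the probabilities first, then threshold once."""
    mean_prob = sum(sigmoid(x) for x in fold_logits) / len(fold_logits)
    return 1.0 if mean_prob > thr else 0.0


pixel_logits = [3.0, -0.2, -0.1]  # one confident fold, two folds just below the threshold

print(hard_vote(pixel_logits))  # fraction of folds voting foreground
print(soft_vote(pixel_logits))  # binary decision after averaging probabilities
```

Hard voting returns a fraction that still needs a second threshold before it becomes a mask, whereas soft voting keeps each fold's confidence until the final decision.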

### Visualize prediction

In [None]:
def show_img(img, mask=None):
    plt.imshow(img, cmap='bone')
    if mask is not None:
        # overlay the predicted mask (one colour channel per class) on top of the scan
        plt.imshow(mask, alpha=0.5)
    
def plot_batch(imgs, msks, size=3):
    plt.figure(figsize=(5*size, 5))
    for idx in range(size):
        plt.subplot(1, size, idx+1)
        img = imgs[idx,].permute((1, 2, 0)).numpy()*255.0
        img = img.astype('uint8')
        # masks are already in [0, 1], which is the range imshow expects for float RGB data
        msk = msks[idx,].permute((1, 2, 0)).numpy()
        show_img(img, msk)
    plt.tight_layout()
    plt.show()  

plot_batch(imgs, preds, size=5)
