<a href="https://colab.research.google.com/github/sithin42/INT-PROSTATE-Contour-Stability/blob/main/3_StabilityAnalysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Part 3 - Stability Analysis**

This notebook gives an overview of the pipeline used for the contour stability analysis.

**Step 1:** Adjust the notebook for local and Colab compatibility

In [None]:
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False

ROOT_PATH = "./"
#Loading the example data from github
if IN_COLAB: 
  ROOT_PATH = "./INT-PROSTATE-Contour-Stability"
  !git clone https://github.com/sithin42/INT-PROSTATE-Contour-Stability.git
  import sys
  sys.path.append(ROOT_PATH)
  

**Step 2:** Install and import the packages

An important package used in this notebook is "Pingouin".

Pingouin is an open-source statistical package written in Python 3 and based mostly on Pandas and NumPy. In this work, we use Pingouin to compute ICC(1,1).

More information can be found at https://pingouin-stats.org/




In [None]:
#Requirements 
!pip install pandas
!pip install pingouin
!pip install seaborn


In [None]:
import os
from tqdm import tqdm
import pandas as pd
import pingouin as pg
import seaborn as sns
import numpy as np

from ipywidgets import widgets, interact
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings("ignore")


**Step 3:** Specify the augmentation scenario to be considered

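`AUG_TYPE can take the following values ["in_plane","out_plane","inout_plane"]`

`BIAS_TYPE can take the following values ["random","systematic",""]`

*Note: When AUG_TYPE is "out_plane", BIAS_TYPE must be an empty string; otherwise it must be "random" or "systematic". These values are checked by the assertions in Step 4.*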

In [None]:
AUG_TYPE = "in_plane"
BIAS_TYPE = "random"

**Step 4:** Merge the GT radiomic features and the augmented radiomic features into a single data frame.

To do this, we read the contents of the `results/` folder, which is generated by the notebook 2.RadiomicsFeatureExtractor.

If you run locally, the folder is created when you run that notebook. On Google Colab, this path doesn't exist by default, so you can upload the results folder generated by that notebook.

If the `results/` folder is not found, we fall back to the features extracted for each augmentation scenario that are stored inside the repository. They can be found at: `INT-PROSTATE-Contour-Stability/results`
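
To illustrate what the merge produces, here is a minimal sketch on toy data. It assumes that both CSV files share an `id` column (the patient) and a `judge` column (the contour that produced the row), as the ICC computation in Step 5 expects. The column `toy_feature` and the judge labels `gt`, `aug_1`, `aug_2` are hypothetical placeholders, not values taken from the repository.

```python
import pandas as pd

# Hypothetical ground-truth features: one row per patient
org_toy = pd.DataFrame({
    "id": ["p1", "p2"],
    "judge": ["gt", "gt"],
    "toy_feature": [1.0, 2.0],
})

# Hypothetical augmented features: one row per patient and augmented contour
aug_toy = pd.DataFrame({
    "id": ["p1", "p1", "p2", "p2"],
    "judge": ["aug_1", "aug_2", "aug_1", "aug_2"],
    "toy_feature": [1.1, 0.9, 2.2, 1.9],
})

# Stack both tables into one long-format frame with a fresh 0..n-1 index
merged_toy = pd.concat([org_toy, aug_toy], ignore_index=True)

# Each patient should now be rated by the same number of judges
ratings_per_id = merged_toy.groupby("id")["judge"].nunique()
print(merged_toy)
print(ratings_per_id)
```

In this long format every row is one (patient, contour) pair, which is the layout that `pg.intraclass_corr` takes through its `targets`, `raters` and `ratings` arguments.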

In [None]:
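OUT_PATH = "./results"

# Fall back to the results stored in the repository when no local results folder exists
if not os.path.exists(OUT_PATH):
  if not IN_COLAB:
    # On Colab the repository was already cloned in Step 1
    ROOT_PATH = "./INT-PROSTATE-Contour-Stability"
    !git clone https://github.com/sithin42/INT-PROSTATE-Contour-Stability.git
  OUT_PATH = os.path.join(ROOT_PATH,"results")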
  

In [None]:
assert AUG_TYPE in ["in_plane","out_plane","inout_plane"], "Invalid aug_type!"

if AUG_TYPE!="out_plane":
  assert BIAS_TYPE in ["random","systematic"], "Invalid bias_type!"
else:
  assert BIAS_TYPE=="", "For out_plane augmentation bias_type should be an empty string"

org_df = pd.read_csv(os.path.join(OUT_PATH,"org_feats.csv"))
aug_df = pd.read_csv(os.path.join(OUT_PATH,f"{AUG_TYPE}_{BIAS_TYPE}","aug_feats.csv"))

merged_df = pd.concat([org_df,aug_df],ignore_index=True)

merged_df.head()

**Step 5:** Compute ICC(1,1)

ICC(1,1): The intraclass correlation coefficient (ICC) is a widely used reliability index in test-retest, intrarater, and interrater reliability analyses.

The two sources of variability modeled in ICC(1,1) are “between-subject” and “within-subject” variability. In this work, “within-subject” variability corresponds to variations in the segmentation of the same target as rated by different observers. The “between-subject” variability is attributed to the intrinsic difference in radiomic feature values between patients within the population.

More information can be found at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4913118/pdf/main.pdf
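
As a minimal sketch of how these two sources of variability combine, ICC(1,1) can be written in terms of the one-way ANOVA mean squares as (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB is the between-subject mean square, MSW is the within-subject mean square, and k is the number of raters per subject. The helper `icc_1_1` and the toy ratings below are illustrative assumptions and are not part of the pipeline, which relies on Pingouin in the cells that follow.

```python
import numpy as np

def icc_1_1(ratings):
    """One-way random-effects, single-rater ICC.

    ratings: 2D array of shape (n_subjects, k_raters), no missing values.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    subject_means = ratings.mean(axis=1)

    # Between-subject mean square: spread of the subject means
    msb = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    # Within-subject mean square: spread of the raters around each subject mean
    msw = np.sum((ratings - subject_means[:, None]) ** 2) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)

# Toy ratings: 6 subjects (rows) rated by 4 raters (columns)
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
toy_icc = icc_1_1(toy_ratings)
print(round(toy_icc, 2))  # low agreement: the raters disagree strongly within each subject

# Perfect agreement between raters gives an ICC of 1
perfect_icc = icc_1_1([[1, 1], [2, 2], [3, 3]])
print(perfect_icc)
```

When raters agree closely on each subject, MSW shrinks relative to MSB and the coefficient approaches 1. When the disagreement between raters is as large as the differences between subjects, it drops toward 0.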


In [None]:
NON_FEAT_COLS = ["diagnostics","id","judge","Unnamed","dice"]

In [None]:
def compute_icc(df, features):
  """Compute ICC(1,1) and its 95% confidence interval for every feature."""

  out_icc = {"feature":[],"icc_value":[],"ci_down":[],"ci_up":[]}

  # Let tqdm drive the loop so the progress bar advances once per feature
  for feature in tqdm(features,position=0,desc="Computing ICC"):

    # Pingouin returns one row per ICC type; row 0 is used here as ICC(1,1)
    icc = pg.intraclass_corr(data=df,targets='id',raters='judge',ratings=feature)

    icc_value = np.round(icc['ICC'][0],2)
    ci_down = np.round(icc['CI95%'][0][0],2)
    ci_up = np.round(icc['CI95%'][0][1],2)

    # Negative estimates and bounds are clipped to 0
    out_icc["feature"].append(feature)
    out_icc["icc_value"].append(icc_value if icc_value>=0 else 0)
    out_icc["ci_down"].append(ci_down if ci_down>=0 else 0)
    out_icc["ci_up"].append(ci_up if ci_up>=0 else 0)

  out_icc = pd.DataFrame.from_dict(out_icc)

  return out_icc


In [None]:
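def getICC(df):

    # Keep only the feature columns. A column is dropped when its name contains
    # any of the NON_FEAT_COLS substrings. Filtering in a single pass avoids
    # calling remove() twice on a column that matches more than one substring.
    features = [
        column for column in df.columns
        if not any(ignore_column in column for ignore_column in NON_FEAT_COLS)
    ]

    out_icc = compute_icc(df,features)

    return out_icc, features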
    


In [None]:
icc_df, features = getICC(merged_df)

icc_df.head()

**Step 6:** Visualize the ICC values and highlight the stable features based on the stability threshold

A feature is considered stable when the lower bound of the 95% confidence interval of its ICC estimate is greater than or equal to the STABILITY_THRESHOLD.
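
The selection rule can be sketched as a simple filter on the lower confidence bound. The helper `select_stable` and the toy values below are hypothetical and only mirror the column names produced by `compute_icc` in Step 5.

```python
import pandas as pd

def select_stable(icc_df, threshold=0.90):
    """Return the names of features whose lower 95% CI bound reaches the threshold."""
    return icc_df.loc[icc_df["ci_down"] >= threshold, "feature"].tolist()

# Toy ICC table with the same columns as the output of compute_icc
toy_icc_df = pd.DataFrame({
    "feature": ["feat_a", "feat_b", "feat_c"],
    "icc_value": [0.97, 0.93, 0.60],
    "ci_down": [0.94, 0.88, 0.45],
    "ci_up": [0.99, 0.96, 0.72],
})

stable = select_stable(toy_icc_df, threshold=0.90)
print(stable)
```

Note that `feat_b` has a point estimate above 0.90 but is not selected, because the rule is applied to the lower bound of the interval rather than to the ICC value itself.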

In [None]:
STABILITY_THRESHOLD = 0.90

In [None]:
def visualize(icc_df, threshold):

  df = icc_df.copy()
  # Flag features whose lower CI bound reaches the stability threshold
  df["hue"] = [0] * len(df)
  df.loc[df["ci_down"]>=threshold,["hue"]] = 1

  xerr = [df["icc_value"]-df["ci_down"],df["ci_up"]-df["icc_value"]]

  fig = plt.figure(figsize = (3,20))
  ax = fig.gca()

  sns.set_theme(style="whitegrid")
  g = sns.barplot(x="icc_value",y="feature",data=df, ax=ax,color='#2ca25f',hue='hue',dodge=False,xerr=xerr)
  g.legend_.remove()

  plt.margins(0,0)

  plt.savefig(f"./{AUG_TYPE}_{BIAS_TYPE}.png", bbox_inches = 'tight',
      pad_inches = 0,transparent=True,dpi=300)



In [None]:
# Only the first 80 features are visualized here; plotting all of them takes too much time
df = icc_df.head(80)
visualize(df,STABILITY_THRESHOLD)