<a href="https://colab.research.google.com/github/pszemraj/ml4hc-s22-project01/blob/add-results-p1/notebooks/colab/ensemble/Compile_Trained_Results.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Analyze Trained Results

> PURPOSE: this notebook re-loads all of the `.h5` model files generated by the model training notebook, generates predictions on the test sets, stores each model's predictions as a column in a dataframe, and exports the result.

- TODO: add systematic metric computation for all the models

---


**Notes on required metrics:**

- For the binary task (PTB), report accuracy, AUROC and AUPRC.
- For the non-binary task (MITBIH), report accuracy.

**Notes on classification in general:**

- sklearn.metrics docs [here](https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics)
- (_outside of requirements_) a useful guide on which classification metrics to use and when is [here](https://neptune.ai/blog/evaluation-metrics-binary-classification#:~:text=Simply%20put%20a%20classification%20metric,to%20classes%3A%20positive%20and%20negative.)
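
As a minimal sketch of the required metrics (not the project's evaluation code): the arrays below are synthetic stand-ins, and `y_prob_bin` is assumed to be the predicted probability of the positive class. AUROC and AUPRC are ranking metrics, so they take scores rather than thresholded labels; `average_precision_score` is used here as one common AUPRC estimate.

```python
import numpy as np
from sklearn.metrics import accuracy_score, average_precision_score, roc_auc_score

rng = np.random.default_rng(0)

# Binary task: synthetic labels and hypothetical positive-class probabilities.
y_true_bin = rng.integers(0, 2, size=500)
y_prob_bin = 0.35 * y_true_bin + 0.65 * rng.random(500)

binary_report = {
    # accuracy needs hard labels, so threshold the probabilities at 0.5
    "accuracy": accuracy_score(y_true_bin, (y_prob_bin >= 0.5).astype(int)),
    # AUROC and AUPRC are computed from the scores themselves
    "auroc": roc_auc_score(y_true_bin, y_prob_bin),
    "auprc": average_precision_score(y_true_bin, y_prob_bin),
}

# Multi-class task: accuracy only, taken from the arg-max of class scores.
y_true_multi = rng.integers(0, 5, size=500)
probs_multi = rng.random((500, 5))
probs_multi[np.arange(500), y_true_multi] += 0.5  # make the true class more likely
multi_accuracy = accuracy_score(y_true_multi, probs_multi.argmax(axis=1))

print(binary_report, multi_accuracy)
```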

# setup

In [2]:
#@markdown add auto-Colab formatting with `IPython.display`
from IPython.display import HTML, display
# colab formatting
def set_css():
    display(
        HTML(
            """
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  """
        )
    )

get_ipython().events.register("pre_run_cell", set_css)

In [1]:
!nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.



In [3]:
!pip install -U plotly orca kaleido -q
import plotly.express as px


In [4]:
import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

from tensorflow.keras import optimizers, losses, activations, models
from keras.callbacks import ModelCheckpoint, EarlyStopping, LearningRateScheduler, ReduceLROnPlateau
from keras.layers import (Dense, Input, Dropout, Convolution1D, MaxPool1D, GlobalMaxPool1D,
                          GlobalAveragePooling1D, concatenate, Flatten, LSTM, Masking,
                          Bidirectional, GRU, SimpleRNN, TimeDistributed, BatchNormalization,
                          Activation, MaxPooling1D, GlobalMaxPooling1D, Conv1D)
from keras.models import Sequential,Model
import h5py
from sklearn.metrics import f1_score,accuracy_score, roc_auc_score, average_precision_score, classification_report
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
import lightgbm as lgb
from xgboost import XGBClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.utils import class_weight
import collections


In [5]:
#@title mount drive
from pathlib import Path
from google.colab import drive

drive_base_str = '/content/drive'
drive.mount(drive_base_str)


Mounted at /content/drive


In [6]:
#@markdown determine root
import os
from pathlib import Path
peter_base = Path('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/')

if peter_base.exists() and peter_base.is_dir():
    path = str(peter_base.resolve())
else:
    # original
    path = '/content/drive/MyDrive/ETH/'

print(f"base drive dir is:\n{path}")

base drive dir is:
/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1


## define folder for outputs

In [7]:
_out_dir_name = "Trained-Results-Analysis" #@param {type:"string"}

output_path = os.path.join(path, _out_dir_name)
os.makedirs(output_path, exist_ok=True)
print(f"notebook outputs will be stored in:\n{output_path}")

notebook outputs will be stored in:
/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/Trained-Results-Analysis


In [10]:
#@markdown create output directory for re-computed performance metrics

metric_comparison_dir = Path(output_path) / "single-model-performance"
metric_comparison_dir.mkdir(exist_ok=True)

print(f"computed model metrics will be stored in:\n\t{metric_comparison_dir.resolve()}")


computed model metrics will be stored in:
	/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/Trained-Results-Analysis/single-model-performance


## load data

In [11]:
def load_mitbih(base_path):
    df_train = pd.read_csv(os.path.join(base_path,"data/mitbih_train.csv"),
                           header=None)
    df_train = df_train.sample(frac=1)
    df_test = pd.read_csv(os.path.join(base_path,"data/mitbih_test.csv"),
                          header=None)

    Y = np.array(df_train[187].values).astype(np.int8)
    X = np.array(df_train[list(range(187))].values)[..., np.newaxis]

    Y_test = np.array(df_test[187].values).astype(np.int8)
    X_test = np.array(df_test[list(range(187))].values)[..., np.newaxis]

    return X, X_test, Y, Y_test

In [12]:
backup_peter = '/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/project-handouts/'

In [13]:
try:
    X, X_test, Y, Y_test = load_mitbih(base_path=path)
except Exception as e:
    print(f"unable to load data from base path in drive folder because:\n\t{e}")
    print(f"\ngoing to try backup:\n\t{backup_peter}")
    X, X_test, Y, Y_test = load_mitbih(base_path=backup_peter)

## load directories with trained models

In [14]:
import pprint as pp
project_root = Path(path)
weight_dirs = [d for d in project_root.iterdir() if d.is_dir() and "weight" in d.name]

pp.pprint(weight_dirs)

[PosixPath('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/MITBIH_weights'),
 PosixPath('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/PTB_weights'),
 PosixPath('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/MITBIH_biGRU_weights')]


# get predictions for all MITBIH models

In [15]:
mit_out_path = Path(output_path) / "MIT_ensemble"
mit_out_path.mkdir(exist_ok=True)

In [16]:
MIT_df = pd.DataFrame(Y_test, columns=['actual_class'])
MIT_df.head()

Unnamed: 0,actual_class
0,0
1,0
2,0
3,0
4,0


In [17]:
mitbih_dirs = [d for d in weight_dirs if "mitbih" in d.name.lower()]
pp.pprint(mitbih_dirs)


[PosixPath('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/MITBIH_weights'),
 PosixPath('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/MITBIH_biGRU_weights')]


In [18]:
mitbih_fitted_models = []

for weight_dir in mitbih_dirs:

    model_paths = [f for f in weight_dir.iterdir() if f.is_file() and f.suffix == ".h5"]
    mitbih_fitted_models.extend(model_paths)

print(f"found {len(mitbih_fitted_models)} total models to compute preds for MIT")


found 18 total models to compute preds for MIT
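
The discovery step above filters directory entries by suffix. A minimal equivalent sketch using `Path.glob`, run against a throwaway directory whose folder and file names are hypothetical; sorting is added because `iterdir` does not guarantee an order, and the order determines the column order of the prediction table.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout mirroring "<dataset>_weights" folders.
    for rel in ["MITBIH_weights/CNN_mitbih.h5",
                "MITBIH_weights/notes.txt",
                "PTB_weights/CNN_PTB.h5"]:
        f = root / rel
        f.parent.mkdir(parents=True, exist_ok=True)
        f.touch()

    # Keep only .h5 files that live in folders whose name mentions "mitbih".
    found = sorted(
        p
        for d in root.iterdir()
        if d.is_dir() and "mitbih" in d.name.lower()
        for p in d.glob("*.h5")
    )
    found_names = [p.name for p in found]

print(found_names)
```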


### loop through all files, store filename as column

In [19]:
import gc
from tqdm.auto import tqdm
from keras.models import load_model

pbar = tqdm(desc="computing model predictions", 
            total=len(mitbih_fitted_models)
        )

for mpath in mitbih_fitted_models:

    try:
        model_name = mpath.stem
        src_dir = mpath.parent.name
        model = load_model(mpath)
        # model outputs -> predicted class label per test sample
        pred_test = model.predict(X_test)
        pred_test = np.argmax(pred_test, axis=-1)

        # include the source directory so identically named files do not collide
        MIT_df[f"{model_name}_{src_dir}_preds"] = pred_test

        # free memory before loading the next model
        del model
        del pred_test
    except Exception as e:
        print(f"\nUnable to generate predictions for {mpath.name}, skipping")
        print(f"Error printout as follows: {e}")
    gc.collect()
    pbar.update(1)
pbar.close()


In [20]:

mit_df_base = mit_out_path / "MITBIH_testset_model_predictions"

MIT_df.to_csv(mit_df_base.with_suffix('.csv'), index=False)
MIT_df.to_excel(mit_df_base.with_suffix('.xlsx'), index=False)

In [21]:
MIT_df.head()

Unnamed: 0,actual_class,BIRNN10_mitbih_MITBIH_weights_preds,BILSTM187_mitbih_MITBIH_weights_preds,BILSTM187_ptb_MITBIH_weights_preds,BIRNN187_mitbih_MITBIH_weights_preds,BILSTM_mitbih_MITBIH_weights_preds,CNN_mitbih_MITBIH_weights_preds,LTSM_mitbih_MITBIH_weights_preds,BidirGRU_MITBIH_weights_preds,RNN_mitbih_MITBIH_weights_preds,GRU_mitbih_MITBIH_weights_preds,SIMPLE_RNN_mitbih_MITBIH_weights_preds,CNN_mitbih_MITBIH_biGRU_weights_preds,RNN_mitbih_MITBIH_biGRU_weights_preds,BidirGRU_MITBIH_biGRU_weights_preds,BidirGRU_BS=1024_MITBIH_biGRU_weights_preds,BILSTM10_mitbih_MITBIH_biGRU_weights_preds,BILSTM187_mitbih_MITBIH_biGRU_weights_preds,BIRNN187_mitbih_MITBIH_biGRU_weights_preds
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### re-compute metrics for MITBIH



In [23]:
from sklearn.metrics import balanced_accuracy_score, roc_auc_score, accuracy_score
from sklearn.metrics import f1_score, matthews_corrcoef

mit_metric_cols = ["model_filename", "balanced_accuracy_score", "accuracy",
                   "f1_score", "matthews_corrcoef"]
MIT_metrics = pd.DataFrame(columns=mit_metric_cols)

trained_models = list(MIT_df.columns)
trained_models.remove('actual_class')

for trained_model in trained_models:

    # compare each model's predicted labels against the ground truth
    y_true = MIT_df.actual_class.values
    y_pred = MIT_df[trained_model].values
    _bas = balanced_accuracy_score(y_true, y_pred)
    _acc = accuracy_score(y_true, y_pred)
    _f1 = f1_score(y_true, y_pred, average='macro')  # macro: unweighted mean over classes
    _mcc = matthews_corrcoef(y_true, y_pred)
    _this_result = [trained_model, _bas, _acc, _f1, _mcc]
    _df = pd.DataFrame(columns=mit_metric_cols)
    _df.loc[0] = _this_result
    MIT_metrics = pd.concat([MIT_metrics, _df], axis=0)


MIT_metrics.sort_values(
                        by='balanced_accuracy_score',
                        ascending=False,
                        inplace=True,
                    )
MIT_metrics.head(10)

Unnamed: 0,model_filename,balanced_accuracy_score,accuracy,f1_score,matthews_corrcoef
0,BidirGRU_BS=1024_MITBIH_biGRU_weights_preds,0.90385,0.987119,0.924779,0.957312
0,BILSTM187_mitbih_MITBIH_weights_preds,0.897573,0.985931,0.916335,0.953345
0,BILSTM187_mitbih_MITBIH_biGRU_weights_preds,0.885654,0.984561,0.910498,0.948679
0,GRU_mitbih_MITBIH_weights_preds,0.856317,0.978988,0.888228,0.929742
0,CNN_mitbih_MITBIH_biGRU_weights_preds,0.847873,0.975745,0.873976,0.918878
0,CNN_mitbih_MITBIH_weights_preds,0.832914,0.97442,0.86383,0.914186
0,LTSM_mitbih_MITBIH_weights_preds,0.82611,0.970857,0.852444,0.902062
0,SIMPLE_RNN_mitbih_MITBIH_weights_preds,0.753695,0.954961,0.771623,0.847488
0,BILSTM_mitbih_MITBIH_weights_preds,0.657343,0.952448,0.716062,0.835487
0,BIRNN187_mitbih_MITBIH_weights_preds,0.577934,0.93002,0.616283,0.754012
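
The gap between `accuracy` and `balanced_accuracy_score` in the table is what one would expect on a test set with uneven class sizes: balanced accuracy is the unweighted mean of per-class recall, so rare classes count as much as the majority class. A small sketch on made-up labels (not MITBIH data):

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score

# Made-up, imbalanced labels: 90 samples of class 0, 10 of class 1.
y_true = np.array([0] * 90 + [1] * 10)
# A classifier that gets every class-0 sample right but only 4 of 10 class-1 samples.
y_pred = np.array([0] * 90 + [1] * 4 + [0] * 6)

acc = accuracy_score(y_true, y_pred)                           # 94 correct out of 100
per_class_recall = recall_score(y_true, y_pred, average=None)  # recall of class 0, class 1
bal_acc = balanced_accuracy_score(y_true, y_pred)              # mean of the per-class recalls

print(acc, per_class_recall, bal_acc)
```

Plain accuracy is dominated by the majority class here, while balanced accuracy exposes the weak recall on the minority class.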


In [24]:
mit_metrics_base = metric_comparison_dir / "MITBIH_trained_model_metrics"
MIT_metrics.to_csv(mit_metrics_base.with_suffix('.csv'), index=False)
MIT_metrics.to_excel(mit_metrics_base.with_suffix('.xlsx'), index=False)
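
The metric loop above concatenates a one-row frame per model, which is why every row in the displayed table carries index `0`. A sketch of an alternative that collects plain dicts and builds the frame once; the prediction table here is a toy stand-in and its column names are hypothetical.

```python
import pandas as pd
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score, matthews_corrcoef)

# Toy stand-in for the prediction table (one column per model).
preds = pd.DataFrame({
    "actual_class": [0, 0, 1, 1, 2, 2],
    "model_a_preds": [0, 0, 1, 1, 2, 2],
    "model_b_preds": [0, 1, 1, 0, 2, 2],
})

y_true = preds["actual_class"].to_numpy()
rows = []
for col in preds.columns.drop("actual_class"):
    y_pred = preds[col].to_numpy()
    rows.append({
        "model_filename": col,
        "balanced_accuracy_score": balanced_accuracy_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
        "f1_score": f1_score(y_true, y_pred, average="macro"),
        "matthews_corrcoef": matthews_corrcoef(y_true, y_pred),
    })

# One DataFrame construction, numeric dtypes, and a clean 0..n-1 index.
metrics = (
    pd.DataFrame(rows)
    .sort_values("balanced_accuracy_score", ascending=False)
    .reset_index(drop=True)
)
print(metrics)
```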

# PTB

In [25]:
def load_ptb(base_path):
    df_1 = pd.read_csv(os.path.join(base_path,"data/ptbdb_normal.csv"),
                           header=None)
    df_2 = pd.read_csv(os.path.join(base_path,"data/ptbdb_abnormal.csv"),
                          header=None)
    df = pd.concat([df_1, df_2])

    df_train, df_test = train_test_split(df, test_size=0.2, random_state=1337, stratify=df[187])

    Y = np.array(df_train[187].values).astype(np.int8)
    X = np.array(df_train[list(range(187))].values)[..., np.newaxis]

    Y_test = np.array(df_test[187].values).astype(np.int8)
    X_test = np.array(df_test[list(range(187))].values)[..., np.newaxis]
    

    return X, X_test, Y, Y_test

In [26]:
try:
    X_ptb, X_test_ptb, Y_ptb, Y_test_ptb = load_ptb(base_path=path)
except Exception as e:
    print(f"unable to load data from base path in drive folder because:\n\t{e}")
    print(f"\ngoing to try backup:\n\t{backup_peter}")
    X_ptb, X_test_ptb, Y_ptb, Y_test_ptb = load_ptb(base_path=backup_peter)

In [27]:
X_test_ptb.shape

(2911, 187, 1)
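
`load_ptb` recreates the hold-out set by repeating a stratified split with a fixed `random_state`. This only reproduces the training notebook's test set if that notebook used the same arguments and the same row order, which is an assumption here. A small sketch on synthetic data showing the two properties being relied on (reproducibility and a preserved class ratio):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1000, 4))
y = np.array([0] * 300 + [1] * 700)  # synthetic 30/70 class ratio

# Same arguments twice -> identical partitions.
_, X_test_a, _, y_test_a = train_test_split(
    X, y, test_size=0.2, random_state=1337, stratify=y)
_, X_test_b, _, y_test_b = train_test_split(
    X, y, test_size=0.2, random_state=1337, stratify=y)

same_split = np.array_equal(X_test_a, X_test_b)
# Stratification keeps the share of class 1 in the test set at the overall share.
test_positive_fraction = y_test_a.mean()

print(same_split, test_positive_fraction)
```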

In [28]:
ptb_out_path = Path(output_path) / "PTB_ensemble"
ptb_out_path.mkdir(exist_ok=True)

In [29]:
PTB_df = pd.DataFrame(Y_test_ptb, columns=['actual_class'])
PTB_df.head()

Unnamed: 0,actual_class
0,0
1,1
2,0
3,1
4,1


In [30]:
PTB_df.shape

(2911, 1)

In [31]:
ptb_dirs = [d for d in weight_dirs if "ptb" in d.name.lower()]
pp.pprint(ptb_dirs)


[PosixPath('/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/PTB_weights')]


In [32]:
ptb_fitted_models = []

for weight_dir in ptb_dirs:

    model_paths = [f for f in weight_dir.iterdir() if f.is_file() and f.suffix == ".h5"]
    ptb_fitted_models.extend(model_paths)

print(f"found {len(ptb_fitted_models)} total models to compute preds for PTB")


found 5 total models to compute preds for PTB


### loop through all files, store filename as column

In [33]:
import gc
from tqdm.auto import tqdm
from keras.models import load_model

pbar = tqdm(desc="computing model predictions", 
            total=len(ptb_fitted_models)
        )

for mpath in ptb_fitted_models:

    try:
        model_name = mpath.stem
        model = load_model(mpath)
        # model outputs -> predicted class label per test sample
        PTB_pred_test = model.predict(X_test_ptb)
        pred_test = np.argmax(PTB_pred_test, axis=-1)

        PTB_df[f"{model_name}_preds"] = pred_test

        # free memory before loading the next model
        del model
        del PTB_pred_test
        del pred_test
    except Exception as e:
        print(f"\nUnable to generate predictions for {mpath.name}, skipping")
        print(f"Error printout as follows: {e}")
    gc.collect()
    pbar.update(1)
pbar.close()


In [34]:

PTB_df_base = ptb_out_path / "ptb_testset_model_predictions"

PTB_df.to_csv(PTB_df_base.with_suffix('.csv'), index=False)
PTB_df.to_excel(PTB_df_base.with_suffix('.xlsx'), index=False)

In [35]:
PTB_df.head()

Unnamed: 0,actual_class,BILSTM187_ptb-2_preds,BidirGRU_ptb_preds,CNN_PTB_preds,GRU_ptb_preds,RNN_PTB_preds
0,0,0,0,0,0,1
1,1,1,1,1,1,1
2,0,0,0,0,0,1
3,1,1,1,1,1,1
4,1,1,1,1,1,1


## re-compute the metrics 


In [39]:
from sklearn.metrics import balanced_accuracy_score, roc_auc_score, accuracy_score
from sklearn.metrics import f1_score, matthews_corrcoef

ptb_metric_cols = ["model_filename", "balanced_accuracy_score", "accuracy",
                    "roc_auc_score", "f1_score", "matthews_corrcoef"]
ptb_metrics = pd.DataFrame(columns=ptb_metric_cols)

ptb_trained_models = list(PTB_df.columns)
ptb_trained_models.remove('actual_class')

for trained_model in ptb_trained_models:

    y_true = PTB_df.actual_class.values
    y_pred = PTB_df[trained_model].values
    _bas = balanced_accuracy_score(y_true, y_pred)
    _acc = accuracy_score(y_true, y_pred)
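    # NOTE: computed on hard 0/1 labels rather than probabilities, so this value
    # equals balanced accuracy; pass predicted probabilities for a true AUROC
    _rocauc = roc_auc_score(y_true, y_pred)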
    _f1 = f1_score(y_true, y_pred)
    _mcc = matthews_corrcoef(y_true, y_pred)
    _this_result = [trained_model, _bas, _acc, _rocauc, _f1, _mcc]
    _df = pd.DataFrame(columns=ptb_metric_cols)
    _df.loc[0] = _this_result
    ptb_metrics = pd.concat([ptb_metrics, _df], axis=0)

ptb_metrics.sort_values(
                        by='roc_auc_score',
                        ascending=False,
                        inplace=True,
                    )
ptb_metrics.head(10)

Unnamed: 0,model_filename,balanced_accuracy_score,accuracy,roc_auc_score,f1_score,matthews_corrcoef
0,GRU_ptb_preds,0.983929,0.98832,0.983929,0.991928,0.970833
0,BidirGRU_ptb_preds,0.98312,0.986603,0.98312,0.990725,0.966607
0,BILSTM187_ptb-2_preds,0.980836,0.983854,0.980836,0.988807,0.959857
0,CNN_PTB_preds,0.898384,0.918585,0.898384,0.943639,0.797071
0,RNN_PTB_preds,0.595476,0.726554,0.595476,0.82467,0.231881
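
In the table above the `roc_auc_score` and `balanced_accuracy_score` columns are identical. That is expected when `roc_auc_score` receives hard 0/1 labels: the ROC curve then has a single operating point and its area reduces to (TPR + TNR) / 2, which is balanced accuracy. A sketch on synthetic scores (hypothetical values, not the PTB predictions) showing what changes when probabilities are kept:

```python
import numpy as np
from sklearn.metrics import (average_precision_score, balanced_accuracy_score,
                             roc_auc_score)

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=1000)
# Hypothetical positive-class probabilities with some overlap between classes.
y_prob = 0.3 * y_true + 0.7 * rng.random(1000)
y_label = (y_prob >= 0.5).astype(int)

auc_from_labels = roc_auc_score(y_true, y_label)    # collapses to balanced accuracy
bal_acc = balanced_accuracy_score(y_true, y_label)
auc_from_probs = roc_auc_score(y_true, y_prob)      # uses the full ranking of scores
auprc_from_probs = average_precision_score(y_true, y_prob)

print(auc_from_labels, bal_acc, auc_from_probs, auprc_from_probs)
```

Reporting AUROC and AUPRC in that sense would require storing the raw `model.predict(...)` outputs (for example the positive-class column) alongside the arg-max labels, which the prediction table here does not do.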


In [40]:
ptb_metrics_base = metric_comparison_dir  / "ptb_trained_model_metrics"
ptb_metrics.to_csv(ptb_metrics_base.with_suffix('.csv'), index=False)
ptb_metrics.to_excel(ptb_metrics_base.with_suffix('.xlsx'), index=False)

---

_NOTE: ensembling tests moved to separate notebooks_


# Plots - Visual Comparison


**Note: this section only compares the individual models. It does not include AutoML baselines or ensembles built on the individual model predictions for a given dataset.**



In [67]:
plot_height =  480#@param {type:"number"}
plot_width = int(plot_height * 1.61)

## MITBIH plots

In [46]:
MIT_metrics.columns

MIT_plot = MIT_metrics[MIT_metrics['balanced_accuracy_score'] >= 0.4]

In [68]:
mit_scatter = px.scatter(MIT_plot,
                         x='f1_score',
                         y='balanced_accuracy_score', 
                         color='model_filename',
                         title="MITBIH - F1 Score & Balanced Accuracy - Individual Models - Scatter (All)",
                         template='ggplot2', 
                         hover_data=list(MIT_plot.columns),
                         height=plot_height, width=plot_width,
                         )

mit_scatter.show()

In [69]:
MIT_top = MIT_plot[MIT_plot.balanced_accuracy_score >= 0.75]
mit_scatter_best = px.scatter(MIT_top,
                         x='f1_score',
                         y='balanced_accuracy_score', 
                         color='model_filename',
                         title="MITBIH - F1 Score & Balanced Accuracy - Individual Models - Scatter (Best)",
                         template='seaborn', 
                         hover_data=list(MIT_top.columns),
                         height=plot_height, width=plot_width,
                         )

mit_scatter_best.show()

In [71]:
#@title model performance - balanced acc

# explicit copy so the column assignments below do not trigger SettingWithCopyWarning
mit_t5_df = MIT_top.iloc[:5, :].copy()

def fix_names(text: str):
    """Strip the '_weights_preds' suffix so the bar labels stay short."""
    return text.replace('_weights_preds', '')

mit_t5_df['balanced_accuracy_score'] = mit_t5_df['balanced_accuracy_score'].apply(round, args=(3,))
mit_t5_df['model_filename'] = mit_t5_df['model_filename'].apply(fix_names)

top5_mit = px.bar(mit_t5_df,
                  x='model_filename',
                  y='balanced_accuracy_score',
                  color='model_filename',
                  template='seaborn',
                  hover_data=list(mit_t5_df.columns),
                  text='balanced_accuracy_score',
                  title="Top Single-Model Performance on MITBIH - Balanced Accuracy",
                  height=plot_height, width=plot_width,
                  )
top5_mit.show()


In [73]:
_top5base = metric_comparison_dir / 'MITBIH_top5_models'
top5_mit.write_html(_top5base.with_suffix('.html'))
top5_mit.write_image(_top5base.with_suffix('.png'))

## PTB plots


In [72]:
ptb_plot = ptb_metrics.copy()  # work on a copy so the metrics table itself is not modified
ptb_plot['balanced_accuracy_score'] = ptb_plot['balanced_accuracy_score'].apply(round, args=(3,))
top5_ptb = px.bar(ptb_plot,
                  x='model_filename',
                  y='balanced_accuracy_score',
                  color='model_filename',
                  template='ggplot2',
                  # the bars show balanced accuracy, so the title names that metric
                  title="Top Single-Model Performance on PTB DB - Balanced Accuracy",
                  hover_data=list(ptb_plot.columns),
                  text='balanced_accuracy_score',
                  height=plot_height, width=plot_width,
                  )
top5_ptb.show()

In [74]:
_top5base_PTB = metric_comparison_dir / 'PTBDB_top_models'
top5_ptb.write_html(_top5base_PTB.with_suffix('.html'))
top5_ptb.write_image(_top5base_PTB.with_suffix('.png'))

# End - where the outputs are stored

In [76]:
#@markdown <font color="orange"> NOTE: the locations of this notebook's outputs are printed below </font>

print(f"TOP-LEVEL folder for outputs is:\n\t{Path(output_path).resolve()}")

print(f"plots and data for INDIVIDUALLY trained models are in\n\t{metric_comparison_dir.resolve()}")

TOP-LEVEL folder for outputs is:
	/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/Trained-Results-Analysis
plots and data for INDIVIDUALLY trained models are in
	/content/drive/MyDrive/ETHZ-2022-S/ML-healthcare-projects/project1/Trained-Results-Analysis/single-model-performance
