<a href="https://colab.research.google.com/github/ICBI/AIMAHEAD_GU/blob/main/Courses/ML_Concepts/Module_05_Neural_Networks/Module_05_Neural_Networks_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/ICBI/AIMAHEAD_GU_publicCourseData/main/AAlogo1.jpg" alt="Powered by" width="150"/>

# AI/ML for Healthcare Applications: Lab 5 Neural Networks Demo

Based on material from the Georgetown [Health Informatics and Data Science](https://healthinformatics.georgetown.edu) program and licensed under [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/).


In this demo, we will explore how to use neural networks for classification, first with sklearn and then with a popular deep learning library, Keras/TensorFlow.

Some imports

In [None]:
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.rcParams['figure.figsize'] = (10, 10)

from collections import Counter

from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from mlxtend.plotting import plot_confusion_matrix


from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn import svm
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_validate

from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, precision_score, recall_score
from sklearn.metrics import roc_curve, precision_recall_curve, auc

## Helper functions

In [None]:
def get_predictions(predictions_proba, threshold=0.5):
  predictions = np.where(predictions_proba <= threshold, 0, 1)
  return predictions

#Function that calculates and prints metrics
def show_metrics(testy, predictions):
  print('====================')
  accuracy = accuracy_score(testy, predictions)
  print('Accuracy: %.3f' % accuracy)
  recall = recall_score(testy, predictions)
  print('Recall: %.3f' % recall)
  precision = precision_score(testy, predictions)
  print('Precision: %.3f' % precision)
  f1 = f1_score(testy, predictions)
  print('F1: %.3f' % f1)
  print('====================')

#Function to plot ROC Curve
def plot_roc(testy, predictions, title):
    fpr, tpr, thresholds = roc_curve(testy, predictions)
    roc_auc = auc(fpr, tpr)
    print('AUROC: %.3f' % roc_auc)
    plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], '--')
    plt.xlim([0.0, 1.05])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title(title)
    plt.legend(loc="lower right")
    plt.show()

#Function to plot PR Curve
def plot_prc(testy, predictions, title):
    precision, recall, thresholds = precision_recall_curve(testy, predictions)
    auc_score = auc(recall, precision)
    plt.plot(recall,precision, label='PR curve (area = %0.2f)' % auc_score)
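    # No-skill baseline: a random classifier's precision equals the positive-class prevalence
    no_skill = np.mean(testy)
    plt.plot([0, 1], [no_skill, no_skill], linestyle='--')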
    plt.xlabel('Recall')
    plt.ylabel('Precision')
    plt.xlim([0, 1.02])
    plt.ylim([0, 1.02])
    plt.title(title)
    plt.legend(loc="lower right")
    plt.show()

#Function to plot precision and recall vs all thresholds
def plot_prec_recall_vs_thresh(testy, predictions, title):
    precision, recall, thresholds = precision_recall_curve(testy, predictions)
    plt.plot(thresholds, precision[:-1], 'b--', label='precision')
    plt.plot(thresholds, recall[:-1], 'g--', label = 'recall')
    plt.xlabel('Threshold')
    plt.ylim([0,1])
    plt.legend(loc="lower right")
    plt.title(title)
    plt.show()
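
The thresholding helper is easiest to understand on a few hand-picked values. The sketch below repeats `get_predictions` so that it runs on its own; the probabilities are hypothetical and chosen only for illustration. Note that a probability exactly equal to the threshold is assigned to class 0.

```python
import numpy as np

def get_predictions(predictions_proba, threshold=0.5):
    # Probabilities at or below the threshold map to class 0, the rest to class 1
    return np.where(predictions_proba <= threshold, 0, 1)

# Hypothetical probabilities, for illustration only
example_proba = np.array([0.20, 0.50, 0.51, 0.90])

default_labels = get_predictions(example_proba)                # threshold = 0.5
strict_labels = get_predictions(example_proba, threshold=0.8)  # stricter cut-off

print(default_labels)  # [0 0 1 1]
print(strict_labels)   # [0 0 0 1]
```

Raising the threshold makes the classifier more conservative about predicting the positive class.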

## Dataset

The dataset we will be using in this demo is the UCI Breast Cancer Wisconsin (Diagnostic) Data Set: https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic)


Researchers obtained a fine needle aspirate (FNA) of a breast mass and generated digitized images of it. The dataset contains instances describing the characteristics of the cell nuclei in those images. Every instance is labeled with one of two diagnoses: 'M' (malignant) or 'B' (benign). Our task is to train a neural network on this data to diagnose breast cancer from the characteristics listed below.

Attribute Information:

1) ID number

2) Diagnosis (M = malignant, B = benign)

3-32)

Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)

b) texture (standard deviation of gray-scale values)

c) perimeter

d) area

e) smoothness (local variation in radius lengths)

f) compactness (perimeter^2 / area - 1.0)

g) concavity (severity of concave portions of the contour)

h) concave points (number of concave portions of the contour)

i) symmetry

j) fractal dimension ("coastline approximation" - 1)
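
Attributes 3-32 amount to 30 columns because each of the ten features above is summarized three ways per image (mean, standard error, and "worst" value). A quick way to check this is scikit-learn's bundled copy of the same dataset. This is only a sketch: the column names in the bundled copy may differ from those in the CSV used in this notebook, and the bundled copy encodes the target as 0 = malignant, 1 = benign, which is the opposite of the encoding used later in this demo.

```python
from collections import Counter
from sklearn.datasets import load_breast_cancer

# scikit-learn's bundled copy of the Wisconsin Diagnostic dataset
bunch = load_breast_cancer()
feature_names = list(bunch.feature_names)

# Count how many columns belong to each of the three summary statistics
n_mean = sum(name.startswith("mean ") for name in feature_names)
n_error = sum(name.endswith(" error") for name in feature_names)
n_worst = sum(name.startswith("worst ") for name in feature_names)

print(bunch.data.shape)               # (rows, feature columns)
print(n_mean, n_error, n_worst)       # columns per summary statistic
print(list(bunch.target_names))       # index 0 and index 1 of the target
print(Counter(bunch.target.tolist())) # class counts under sklearn's encoding
```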

Read the dataset using pandas

In [None]:
breast_cancer_file = "/content/drive/MyDrive/Work/HIDS_506_2022/lecture6_draft/data/breast_cancer.csv"

In [None]:
breast_cancer_df = pd.read_csv(breast_cancer_file)


In [None]:
breast_cancer_df.shape

In [None]:
breast_cancer_df.head()

Features and Outcome

In [None]:
breast_cancer_df.columns

Let's check the diagnosis (outcome)

In [None]:
breast_cancer_df['diagnosis']

In [None]:
Counter(breast_cancer_df['diagnosis'])

We need to convert the outcome values to 1 (for M: Malignant) and 0 (for B: Benign)

In [None]:
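# Map labels explicitly and assign the result back; in-place replacement on a
# selected column is deprecated in recent pandas versions
breast_cancer_df['diagnosis'] = breast_cancer_df['diagnosis'].map({'M': 1, 'B': 0})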
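
As a small illustration of this encoding, the sketch below applies an explicit dictionary with `Series.map` to a toy frame (`toy_df` is hypothetical, not the real data). One caveat of `map` is that any label missing from the dictionary becomes `NaN`, which makes unexpected values easy to spot.

```python
import pandas as pd

# Toy stand-in for the diagnosis column
toy_df = pd.DataFrame({"diagnosis": ["M", "B", "B", "M"]})

# Explicit mapping: malignant -> 1, benign -> 0
toy_df["diagnosis"] = toy_df["diagnosis"].map({"M": 1, "B": 0})

encoded = toy_df["diagnosis"].tolist()
print(encoded)  # [1, 0, 0, 1]
```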

**Split the dataset into feature matrix (X) and outcome (y)**

In [None]:
y_df = breast_cancer_df['diagnosis']
X_df = breast_cancer_df.drop(columns=["diagnosis","id"])

In [None]:
Counter(y_df)

In [None]:
X_df.shape

Let's check some statistics about the features

In [None]:
breast_cancer_df.describe()

The features span very different ranges, so they need to be scaled before training. More about this later.

**Train and test split**

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_df, y_df, test_size = 0.2, random_state = 42)

In [None]:
X_train.shape, X_test.shape

In [None]:
Counter(y_train), Counter(y_test)
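
The class counts above can differ slightly between the two sets because the split is purely random. An optional variation, not used in this demo, is to pass `stratify` to `train_test_split` so that both sets keep roughly the same class proportions. The sketch below uses synthetic data from `make_classification`; the sample size and class weights are assumptions for illustration.

```python
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic, mildly imbalanced binary problem (illustrative only)
X_toy, y_toy = make_classification(n_samples=1000, n_features=10,
                                   weights=[0.63, 0.37], random_state=42)

# stratify=y_toy keeps the class proportions similar in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy)

print(Counter(y_tr.tolist()), Counter(y_te.tolist()))
print(round(float(y_tr.mean()), 3), round(float(y_te.mean()), 3))  # positive-class share
```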

In [None]:
X_df.describe()

**Feature Scaling**

We will use sklearn's `StandardScaler` to fit a scaler and then transform the training and test sets.



https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html


In [None]:
from sklearn.preprocessing import StandardScaler

In [None]:
scaler = StandardScaler()

Important: fit the scaler on the training data only.

Short reason: as a general principle, anything that is learned (here, each feature's mean and standard deviation) must be learned from the model's training data alone.

Nice article: https://sebastianraschka.com/faq/docs/scale-training-test.html
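
To make the principle concrete, here is a minimal sketch on synthetic data (the arrays `toy_train` and `toy_test` are hypothetical). The scaler's mean and standard deviation come from the training rows only, so the scaled training set has mean 0 and standard deviation 1 exactly, while the scaled test set is only approximately standardized.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
toy_train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
toy_test = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

# Statistics are estimated from the training rows only
toy_scaler = StandardScaler().fit(toy_train)

# The same training statistics are applied to both sets
train_scaled = toy_scaler.transform(toy_train)
test_scaled = toy_scaler.transform(toy_test)

print(train_scaled.mean(axis=0).round(3), train_scaled.std(axis=0).round(3))
print(test_scaled.mean(axis=0).round(3), test_scaled.std(axis=0).round(3))
```

The test set is treated like any future, unseen data: it is transformed with what was learned from training and never contributes to those statistics.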

In [None]:
scaler.fit(X_train)

Actual scaling: Transforming train and test

In [None]:
# Now apply the transformations to the data:

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
X_train.shape, X_test.shape

## Train and test Neural Network using sklearn

We will use `MLPClassifier()`, sklearn's implementation of a neural network (a multi-layer perceptron), which lets users define the network architecture.

https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html
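
As an optional alternative to scaling by hand, the scaler and the network can be chained in a scikit-learn pipeline, so the scaler is always fitted on whatever data is passed to `fit`. This is a sketch on synthetic data, not the approach used in this demo; the dataset size and `random_state` values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with 30 features (illustrative only)
X_toy, y_toy = make_classification(n_samples=400, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.2, random_state=0)

# Scaling and the two-hidden-layer network become one estimator
pipe = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
)
pipe.fit(X_tr, y_tr)

toy_accuracy = pipe.score(X_te, y_te)
print("Toy test accuracy: %.3f" % toy_accuracy)

# One weight matrix per connection between consecutive layers
print([w.shape for w in pipe[-1].coefs_])
```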

In [None]:
from sklearn.neural_network import MLPClassifier

In [None]:
mlp = MLPClassifier(hidden_layer_sizes=(32,16), max_iter=500)

Fit

In [None]:
mlp.fit(X_train,y_train)

Predict

In [None]:
y_pred = mlp.predict(X_test)

Print classification report

In [None]:
print(classification_report(y_test, y_pred))

Some other evaluation metrics

Confusion matrix

In [None]:
cm = confusion_matrix(y_test, y_pred)

In [None]:
cm

In [None]:
plot_confusion_matrix(cm, show_absolute = True, show_normed = True)

Get prediction probabilities to plot the ROC and precision-recall curves

In [None]:
y_pred_prob = mlp.predict_proba(X_test)[:,1]
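
The `[:,1]` index selects the probability of the positive class. A small sketch on synthetic data shows why: `predict_proba` returns one column per class, ordered as in `classes_`, and each row sums to 1. The model `toy_mlp` and its settings are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X_toy, y_toy = make_classification(n_samples=200, n_features=5, random_state=0)
toy_mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000,
                        random_state=0).fit(X_toy, y_toy)

proba = toy_mlp.predict_proba(X_toy[:5])

print(toy_mlp.classes_)    # column order of predict_proba
print(proba.shape)         # one row per sample, one column per class
print(proba.sum(axis=1))   # each row sums to 1

# Column 1 holds the probability of class 1 (the positive class)
positive_proba = proba[:, 1]
```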

Plot the ROC curve

In [None]:
plot_roc(y_test, y_pred_prob, 'ROC')

Plot the PR curve

In [None]:
plot_prc(y_test, y_pred_prob, 'PRC')

Plot precision and recall against all thresholds

In [None]:
plot_prec_recall_vs_thresh(y_test, y_pred_prob, 'PRvThresh')
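
The trade-off in this plot can be reproduced by hand on a tiny example. The labels and scores below are hypothetical; the thresholding rule mirrors the `get_predictions` helper defined earlier.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels and predicted probabilities
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
scores = np.array([0.10, 0.40, 0.35, 0.80, 0.65, 0.55, 0.90, 0.20])

results = {}
for threshold in (0.3, 0.5, 0.7):
    # Same rule as get_predictions: at or below the threshold -> class 0
    labels = np.where(scores <= threshold, 0, 1)
    results[threshold] = (precision_score(y_true, labels),
                          recall_score(y_true, labels))
    print("threshold=%.1f  precision=%.3f  recall=%.3f"
          % (threshold, results[threshold][0], results[threshold][1]))
# threshold=0.3  precision=0.667  recall=1.000
# threshold=0.5  precision=0.750  recall=0.750
# threshold=0.7  precision=1.000  recall=0.500
```

A lower threshold catches every positive case at the cost of more false alarms, while a higher threshold does the opposite. In a screening setting, missing a malignant case is usually the costlier error, which argues for favoring recall.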

## Neural Network using Keras

We will use Keras, a popular deep learning package, with the TensorFlow backend.

Learn more about Keras: https://keras.io/

Imports

In [None]:
import keras
import tensorflow as tf
import datetime, os
from keras.callbacks import TensorBoard

In [None]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout

There are many variants of deep neural networks.

We will start with the simplest one, a feedforward neural network, which is similar to the architecture shown in the following figure.

![nn](https://cdn-images-1.medium.com/max/1600/1*QVIyc5HnGDWTNX3m-nIm9w.png)

[Source] https://medium.com/@curiousily/tensorflow-for-hackers-part-iv-neural-network-from-scratch-1a4f504dfa8

Our first neural network model will:

- take the input and pass it into the 32-dimension first hidden layer,
- take the output of the first layer and pass it into the 8-dimension second hidden layer,
- take the output of the second layer and pass it into the last layer for prediction,
- return the output of the last layer as the prediction.

This is one more hidden layer than in the figure above.

In Keras, we use `Sequential()` as the skeleton of the neural network model and add layers to it one after another.
After adding the layers, we need to compile the model, defining the optimizer, the loss function, and the evaluation metrics.
In this example, we use the `adam` optimizer to minimize the `binary_crossentropy` loss (for a regression problem, remember to change the loss to `mse`), and we track accuracy as the evaluation metric.
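
To give a feel for what the binary cross-entropy loss measures, here is a minimal NumPy sketch. The function `binary_crossentropy` below is a hypothetical helper written for illustration, not the Keras implementation, and the example probabilities are made up. Confident correct predictions give a small loss, and confident wrong predictions give a large one.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip probabilities so that log(0) never occurs
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(per_sample))

y_true = np.array([1, 0, 1])
mostly_right = np.array([0.9, 0.2, 0.6])   # leans toward the true labels
mostly_wrong = np.array([0.1, 0.8, 0.4])   # leans away from the true labels

loss_good = binary_crossentropy(y_true, mostly_right)
loss_bad = binary_crossentropy(y_true, mostly_wrong)

print("%.4f" % loss_good)  # 0.2798
print("%.4f" % loss_bad)   # 1.6094
```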

**1. Initialize the Neural Network**

In [None]:
clf = Sequential()

**2. Define the architecture**

In [None]:
# first hidden layer for input data
clf.add(Dense(units=32,
              kernel_initializer='uniform',
              activation='relu',
              input_dim=X_train.shape[1]))

In [None]:
# second hidden layer
clf.add(Dense(units=8,
              kernel_initializer='uniform',
              activation='relu'))

In [None]:
#Adding dropout to prevent overfitting
dropout = 0.3
clf.add(Dropout(dropout))

In [None]:
# the last  layer for output
clf.add(Dense(units=1,
              kernel_initializer='uniform',
              activation='sigmoid'))

**3. Compile/build the network**

In [None]:
clf.compile(optimizer='adam',
            loss='binary_crossentropy',
            metrics=['accuracy'])

Check the architecture

In [None]:
clf.summary()
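
The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and a dropout layer has no trainable parameters. The NumPy sketch below rebuilds the same layer shapes and runs a forward pass. It assumes 30 input features (attributes 3-32 of the dataset description), uses the layer widths from the `Dense(units=...)` calls above, and picks a small uniform weight range for illustration only.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_features = 30            # assumption: 30 feature columns after dropping id and diagnosis
layer_units = [32, 8, 1]   # widths of the three Dense layers defined above

rng = np.random.default_rng(0)
weights, biases = [], []
fan_in = n_features
for units in layer_units:
    weights.append(rng.uniform(-0.05, 0.05, size=(fan_in, units)))
    biases.append(np.zeros(units))
    fan_in = units

# Weights plus biases per dense layer
param_counts = [w.size + b.size for w, b in zip(weights, biases)]
print(param_counts, sum(param_counts))  # [992, 264, 9] 1265

def forward(x):
    # Hidden layers use ReLU; dropout is inactive at prediction time, so it is omitted
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ w + b)
    # The single sigmoid output unit yields a probability per sample
    return sigmoid(h @ weights[-1] + biases[-1])

probs = forward(rng.normal(size=(5, n_features)))
print(probs.shape)  # (5, 1)
```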

Better visualization

In [None]:
# keras.utils.vis_utils is not available in recent Keras releases; import from keras.utils instead
from keras.utils import plot_model

In [None]:
plot_model(clf, show_shapes=True, show_layer_names=True)

**4. Train the model**

Optional: this cell is only needed to visualize the training in TensorBoard.

In [None]:
%load_ext tensorboard
logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir)

In [None]:
EPOCHS = 100
BATCH_SIZE = 16

In [None]:
history = clf.fit(X_train,
                  y_train,
                  validation_split = 0.2,
                  batch_size = BATCH_SIZE,
                  epochs = EPOCHS,
                  callbacks=[tensorboard_callback])
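
With `validation_split = 0.2`, Keras holds out the last 20% of the rows passed to `fit` (taken before any shuffling) and trains on the rest. The arithmetic sketch below estimates how many samples are used for fitting and validation and how many batches make up one epoch. It assumes 455 training rows (an 80/20 split of the 569 instances in the UCI dataset) and a floor-based split point, which may differ slightly between Keras versions; `keras_style_split` is a hypothetical helper, not a Keras function.

```python
import math

def keras_style_split(n_samples, validation_split, batch_size):
    # Assumed rule: the first floor(n * (1 - split)) rows are used for fitting,
    # and the remaining rows at the end are used for validation
    split_at = int(math.floor(n_samples * (1.0 - validation_split)))
    n_fit = split_at
    n_val = n_samples - split_at
    steps_per_epoch = math.ceil(n_fit / batch_size)
    return n_fit, n_val, steps_per_epoch

# 455 training rows is an assumption based on the dataset's 569 instances
n_fit, n_val, steps = keras_style_split(455, validation_split=0.2, batch_size=16)
print(n_fit, n_val, steps)  # 364 91 23
```

Because the held-out rows are taken from the end of the array, the validation set is only representative if the rows are not ordered by class. Here the preceding `train_test_split` already shuffled them.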

### Visualizing Loss using TensorBoard

In [None]:
%tensorboard --logdir logs

In [None]:
plt.style.use('ggplot')

def plot_history(history):
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    x = range(1, len(acc) + 1)

    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(x, acc, 'b', label='Training acc')
    plt.plot(x, val_acc, 'r', label='Validation acc')
    plt.title('Training and validation accuracy')
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(x, loss, 'b', label='Training loss')
    plt.plot(x, val_loss, 'r', label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()

In [None]:
plot_history(history)

Make predictions and test

In [None]:
# Predicting the Test set results
y_pred_prob = clf.predict(X_test)

In [None]:
y_pred_prob[:5]

In [None]:
# Threshold the probabilities computed above instead of calling predict() a second time
y_pred = (y_pred_prob > 0.5).astype("int32")
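
A model with a single sigmoid output unit returns predictions as a column of shape `(n_samples, 1)`, so the thresholded labels have that shape too. The sketch below uses a hypothetical array in place of real model output to show the thresholding and how `ravel()` flattens the result when a 1-D label vector is more convenient.

```python
import numpy as np

# Hypothetical probabilities shaped like the output of a single sigmoid unit
toy_prob = np.array([[0.03], [0.97], [0.50], [0.62]])

# Strictly greater than 0.5 -> class 1 (so exactly 0.5 maps to class 0)
toy_labels = (toy_prob > 0.5).astype("int32")
print(toy_labels.shape)   # (4, 1)

# Flatten to a 1-D vector of labels
flat_labels = toy_labels.ravel()
print(flat_labels)        # [0 1 0 1]
```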

In [None]:
y_pred[:5]

Evaluation metrics

In [None]:
plot_confusion_matrix(confusion_matrix(y_test, y_pred),
                      show_absolute = True,
                      show_normed = True)

In [None]:
print(classification_report(y_test,y_pred))

How can we make the network more robust and prevent overfitting?

One option is a Dropout layer, as used in the model above: https://keras.io/api/layers/regularization_layers/dropout/
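
Conceptually, dropout randomly zeroes a fraction of the activations during training and rescales the survivors by `1 / (1 - rate)` so that their expected sum is unchanged. At prediction time it does nothing. The NumPy sketch below illustrates that idea with the rate of 0.3 used in this demo; `dropout_forward` is a hypothetical helper for illustration, not the Keras layer itself.

```python
import numpy as np

def dropout_forward(x, rate, training, rng):
    # At prediction time dropout is the identity
    if not training or rate == 0.0:
        return x
    # Keep each activation with probability (1 - rate)
    keep_mask = rng.random(x.shape) >= rate
    # Rescale the kept activations so the expected value is unchanged
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 8))

train_out = dropout_forward(activations, rate=0.3, training=True, rng=rng)
test_out = dropout_forward(activations, rate=0.3, training=False, rng=rng)

print("fraction zeroed: %.3f" % np.mean(train_out == 0))   # close to 0.3
print("mean activation: %.3f" % train_out.mean())          # close to 1.0
print(np.array_equal(test_out, activations))               # True
```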

**The END** <br>
**Authors: Dr. Samir Gupta, Dr. Matthew McCoy & ICBI AIM-AHEAD Team**

<img src="https://raw.githubusercontent.com/ICBI/AIMAHEAD_GU_publicCourseData/main/HIDSLOGO.AA1.jpg" alt="Powered by" width="500"/>