# Lab 6 - Convolutional Neural Networks
by Rebecca Kuhlman, Sam Yao, and Michael Amberg



## Business Understanding
Identifying the type of brain tumor a patient has is an important step in deciding on a treatment plan. Brain tumors can be diagnosed via MRI imaging, which has led to interest in using machine learning to support diagnosis. A second opinion on brain tumor diagnoses would help improve patient care and outcomes and lessen the stress on doctors. A machine learning model could also speed up analysis and flag which patients need urgent treatment.
This dataset contains glioma, meningioma, and pituitary tumors, as well as MRI images with no tumor. Glioma tumors are usually malignant, while meningioma and pituitary tumors are usually benign. Different types of tumors are made of different types of cells and tend to occur in different locations. More information can be found at: https://www.mayoclinic.org/diseases-conditions/brain-tumor/symptoms-causes/syc-20350084
There are many other types of tumors that future algorithms will need to address. Most of those other types are more common in children, while the images in our dataset are all adult brain scans.
Because the model deals with health conditions that have extreme effects on the patient, model accuracy is extremely important. Furthermore, the model must be fine-tuned to avoid fatal misdiagnoses. While incorrectly marking a patient with a benign tumor as malignant is wasteful, the adverse effects are minimal. Conversely, misdiagnosing a malignant tumor as benign may be fatal for the patient. Therefore, the model must minimize the rate of false negatives while reaching an accuracy of 95% or more.
It should be noted that the majority of brain tumor misdiagnoses happen before a brain scan or related test is ordered. https://paulandperkins.com/brain-tumors/

## Preparation

[1.5 points] 
Choose and explain what metric(s) you will use to evaluate your algorithm’s performance. You should give a detailed argument for why this (these) metric(s) are appropriate on your data. That is, why is the metric appropriate for the task (e.g., in terms of the business case for the task). Please note: rarely is accuracy the best evaluation metric to use. Think deeply about an appropriate measure of performance.

Because we are identifying brain tumors, we will use recall as our evaluation metric. Recall is TP / (TP + FN), so it directly penalizes false negatives, and a false negative here means clearing a patient who does in fact have a tumor, which is the worst outcome. This is a high-stakes identification, so at the very least the model should reach a recall of 85% before it could be deployed.
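As a minimal sketch of how this metric could be computed, the snippet below uses scikit-learn's `recall_score` on a tiny set of made-up labels (the class order in `CLASS_NAMES`, the label arrays, and the `NO_TUMOR` index are all hypothetical, not taken from our results). It reports per-class recall, macro-averaged recall, and a "tumor vs. no tumor" recall in which a tumor scan predicted as tumor-free counts as the false negative.

```python
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical class order; the real order depends on the dataset folder names
CLASS_NAMES = ["glioma", "meningioma", "notumor", "pituitary"]
NO_TUMOR = 2  # index of the "no tumor" class in CLASS_NAMES (assumption)

# Made-up labels purely for illustration
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3])
y_pred = np.array([0, 0, 2, 1, 1, 2, 2, 2, 3, 1])

# Recall of each class on its own: TP / (TP + FN)
per_class_recall = recall_score(y_true, y_pred, average=None)

# Unweighted mean of the per-class recalls
macro_recall = recall_score(y_true, y_pred, average="macro")

# Collapse to a binary question: does the scan contain any tumor?
# A false negative here is a tumor scan predicted as "notumor".
tumor_recall = recall_score(y_true != NO_TUMOR, y_pred != NO_TUMOR)

for name, value in zip(CLASS_NAMES, per_class_recall):
    print(f"{name}: {value:.3f}")
print(f"macro recall: {macro_recall:.3f}")
print(f"tumor-vs-no-tumor recall: {tumor_recall:.3f}")
```

The binary tumor-vs-no-tumor recall is the number that maps most directly onto the "falsely cleared" case described above, while the per-class values show which tumor type is being missed.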

[1.5 points] Choose the method you will use for dividing your data into training and testing (i.e., are you using Stratified 10-fold cross validation? Shuffle splits? Why?). Explain why your chosen method is appropriate or use more than one method as appropriate. Convince me that your cross validation method is a realistic mirroring of how an algorithm would be used in practice.

We could try stratified 10-fold cross-validation, because it was effective in the results we had in the last lab. We are not sure whether this carries over to image data, but it is worth a try.
Stratified 10-fold cross-validation is most effective with small amounts of imbalanced data. We have to think about balance a lot in our data, because our tumor categories have differing likelihoods and we have many different types of MRI photos.
In a deployment setting, different tumors (and tumor-free scans) will come up at different rates. There are many types of tumors with different subcategories, but we will only be training our program on 3 types of brain tumors. Our program must be robust under these uneven circumstances, and stratified 10-fold cross-validation is one way to address this.
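The stratification idea can be sketched with scikit-learn's `StratifiedKFold`. This is only an illustration under stated assumptions: the label counts and the random `features` array below are synthetic stand-ins, not our MRI data. The point it shows is that every test fold keeps the same class proportions as the full set.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)

# Synthetic, deliberately imbalanced labels for 4 classes (not the real counts)
labels = np.repeat([0, 1, 2, 3], [400, 300, 200, 100])
features = rng.random((labels.size, 8))  # stand-in for image data

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

fold_fractions = []
for train_idx, test_idx in skf.split(features, labels):
    # Class proportions inside this test fold
    counts = np.bincount(labels[test_idx], minlength=4)
    fold_fractions.append(counts / counts.sum())
fold_fractions = np.array(fold_fractions)

print("overall class fractions:", np.bincount(labels) / labels.size)
print("per-fold class fractions:")
print(fold_fractions.round(3))
```

In a real run, the model would be re-created and trained inside the loop on `train_idx` and scored with recall on `test_idx`, giving ten recall values per model to compare.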

https://www.analyseup.com/python-machine-learning/stratified-kfold.html
https://www.aans.org/en/Patients/Neurosurgical-Conditions-and-Treatments/Brain-Tumors

## Modeling

[1.5 points]  Setup the training to use data expansion in Keras (also called data augmentation). Explain why the chosen data expansion techniques are appropriate for your dataset. You can use the keras ImageGenerator as a pre-processing step OR in the optimization loop. You can also use the Keras-cv augmenter (a separate package: https://keras.io/keras_cv/)

In [1]:
import pandas as pd
import tensorflow as tf
import keras
import numpy as np

df_training = keras.utils.image_dataset_from_directory("./Training",
                                                      image_size=(512, 512))
# A tf.data.Dataset has no .astype(); cast each image batch with map() instead
df_training = df_training.map(lambda images, labels: (tf.cast(images, tf.int32), labels))

Found 1876 files belonging to 4 classes.

In [4]:
tf.config.list_physical_devices('GPU')

[]

In [2]:
import pandas as pd
import numpy as np
import os
from PIL import Image  # Utilized Source [2]
import matplotlib.pyplot as plt


# This method creates the data, whether training or testing, in the form we desire
# Uses code from source [2] to create the training datasets
def create_dataset(img_folder):
    # Collect (path, class) pairs for every file under img_folder, e.g. "./Training"
    image_paths = []
    class_name = []
    for dir1 in os.listdir(img_folder):
        for file in os.listdir(os.path.join(img_folder, dir1)):
            image_paths.append(os.path.join(img_folder, dir1, file))
            class_name.append(dir1)

    # Preallocate one float32 array instead of building a list and converting it
    # with dtype=np.ndarray, which created an object array and ran out of memory
    img_data_array = np.empty((len(image_paths), 512 * 512), dtype=np.float32)
    for i, image_path in enumerate(image_paths):
        # Load as grayscale and resize to 512x512
        image = np.array(Image.open(image_path).convert("L").resize((512, 512)))
        # Vectorize each image (262144 values) and normalize values to [0, 1]
        img_data_array[i] = image.reshape(-1).astype(np.float32) / 255
    return img_data_array, class_name


df_training, training_classes = create_dataset("./Training")
df_testing, testing_classes = create_dataset("./Testing")

In [None]:
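import numpy as np
import tensorflow.keras as keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Reshape, Input
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
NUM_CLASSES = 4

# one-hot encode the string class names returned by create_dataset
class_names, y_train_int = np.unique(training_classes, return_inverse=True)
y_train = keras.utils.to_categorical(y_train_int, NUM_CLASSES)

# make a 3 layer keras MLP
mlp = Sequential()
mlp.add( Input(shape=(df_training.shape[1],)) ) # one vectorized image per row
mlp.add( Flatten() ) # no-op here, the images are already flat
mlp.add( Dense(units=30, activation='relu') )
mlp.add( Dense(units=15, activation='relu') )
mlp.add( Dense(NUM_CLASSES) )
mlp.add( Activation('softmax') )

# use a Recall metric object rather than the string 'recall'
mlp.compile(loss='mean_squared_error',
              optimizer='rmsprop',
              metrics=[keras.metrics.Recall(name='recall')])

# fit needs the targets as well as the inputs
mlp.fit(df_training, y_train,
        batch_size=32, epochs=150,
        shuffle=True, verbose=0)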

In [None]:
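# make a CNN with conv layer and max pooling
from tensorflow.keras.layers import Input

cnn = Sequential()
cnn.add( Input(shape=(512*512,)) )  # each row of df_training is one vectorized image
cnn.add( Reshape((512, 512, 1)) )   # restore the 512x512 grayscale image for the conv layer
cnn.add( Conv2D(filters=16, kernel_size= (2, 2), padding='same') )

cnn.add( MaxPooling2D(pool_size=(2, 2)) )
cnn.add( Activation('relu') )
# add one layer on flattened output
cnn.add( Flatten() )
cnn.add( Dense(NUM_CLASSES) )
cnn.add( Activation('softmax') )

cnn.summary()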

Explain why the chosen data expansion techniques are appropriate for your dataset:
We used __ data expansion because __
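Since the specific techniques are still to be filled in, here is one possible sketch of data expansion using Keras preprocessing layers (rather than the older image generator). The particular transforms and their ranges are assumptions chosen for illustration: small rotations, shifts, and zooms mimic differences in how a patient's head is positioned in the scanner, and a horizontal flip assumes that left/right orientation does not change the tumor class. The `augment` name and the random 64x64 batch are hypothetical stand-ins for real MRI batches.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical augmentation pipeline; every range here is an assumption
augment = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),        # assumes laterality does not affect the label
        layers.RandomRotation(factor=0.03),     # factor is a fraction of a full turn (about 11 degrees)
        layers.RandomTranslation(height_factor=0.05, width_factor=0.05),
        layers.RandomZoom(height_factor=0.1),
    ],
    name="augment",
)

# Stand-in batch: 2 grayscale images with values in [0, 1]
images = np.random.default_rng(0).random((2, 64, 64, 1)).astype("float32")

# training=True is required, otherwise the random layers pass images through unchanged
augmented = augment(images, training=True)
print(tuple(augmented.shape))
```

Placing these layers at the front of a model would apply the augmentation inside the optimization loop, so each epoch sees a slightly different version of every image.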

    Create a convolutional neural network to use on your data using Keras. Investigate at least two different convolutional network architectures (and investigate changing some parameters of each architecture such as the number of filters--at minimum have two variations of each network for a total of four models trained). Use the method of train/test splitting and evaluation metric that you argued for at the beginning of the lab. Visualize the performance of the training and validation sets per iteration (use the "history" parameter of Keras). Be sure that models converge.
    [1.5 points] Visualize the final results of the CNNs and interpret/compare the performances. Use proper statistics as appropriate, especially for comparing models.
    [1 points] Compare the performance of your convolutional network to a standard multi-layer perceptron (MLP) using the receiver operating characteristic and area under the curve. Use proper statistical comparison techniques.
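For the ROC/AUC comparison, a minimal sketch with scikit-learn could look like the following. The probabilities here are synthetic: `fake_probabilities` is a hypothetical helper that stands in for `cnn.predict(X_test)` and `mlp.predict(X_test)`, so the printed numbers say nothing about our actual models. The sketch computes a macro-averaged one-vs-rest AUC for each model and one per-class ROC curve.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)
n_classes = 4
y_true = rng.integers(0, n_classes, size=200)  # synthetic integer labels


def fake_probabilities(y, sharpness):
    """Hypothetical stand-in for model.predict: softmax of noisy logits."""
    logits = rng.normal(size=(y.size, n_classes))
    logits[np.arange(y.size), y] += sharpness  # boost the true class
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


proba_cnn = fake_probabilities(y_true, sharpness=2.0)
proba_mlp = fake_probabilities(y_true, sharpness=1.0)

# Macro-averaged one-vs-rest AUC over the 4 classes
auc_cnn = roc_auc_score(y_true, proba_cnn, multi_class="ovr", average="macro")
auc_mlp = roc_auc_score(y_true, proba_mlp, multi_class="ovr", average="macro")
print(f"CNN AUC: {auc_cnn:.3f}  MLP AUC: {auc_mlp:.3f}")

# ROC curve for a single class (class 0 vs. the rest)
y_bin = label_binarize(y_true, classes=np.arange(n_classes))
fpr, tpr, thresholds = roc_curve(y_bin[:, 0], proba_cnn[:, 0])
```

With cross-validation, the same AUC could be computed once per fold for each model, and the two sets of per-fold values compared with a paired statistical test instead of a single point estimate.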



In [None]:
from sklearn import metrics as mt
from matplotlib import pyplot as plt
import seaborn as sns
%matplotlib inline

def compare_mlp_cnn(cnn, mlp, X_test, y_test, labels='auto'):
    plt.figure(figsize=(15,5))
    if cnn is not None:
        yhat_cnn = np.argmax(cnn.predict(X_test), axis=1)
        acc_cnn = mt.accuracy_score(y_test,yhat_cnn)
        plt.subplot(1,2,1)
        cm = mt.confusion_matrix(y_test,yhat_cnn)
        cm = cm/np.sum(cm,axis=1)[:,np.newaxis]
        sns.heatmap(cm, annot=True, fmt='.2f',xticklabels=labels,yticklabels=labels)
        plt.title('CNN: '+str(acc_cnn))

    if mlp is not None:
        yhat_mlp = np.argmax(mlp.predict(X_test), axis=1)
        acc_mlp = mt.accuracy_score(y_test,yhat_mlp)
        plt.subplot(1,2,2)
        cm = mt.confusion_matrix(y_test,yhat_mlp)
        cm = cm/np.sum(cm,axis=1)[:,np.newaxis]
        sns.heatmap(cm,annot=True, fmt='.2f',xticklabels=labels,yticklabels=labels)
        plt.title('MLP: '+str(acc_mlp))

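# Evaluate on the held-out ./Testing images; encode the string labels as integers.
# np.unique sorts the class names, so the codes line up with a training encoding
# built the same way (assuming every class appears in both folders)
class_names, y_test = np.unique(testing_classes, return_inverse=True)
compare_mlp_cnn(cnn, mlp, df_testing, y_test, labels=list(class_names))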


Exceptional Work (1 points total)
Use transfer learning to pre-train the weights of your initial layers of your CNN. Compare the performance when using transfer learning to training without transfer learning (i.e., compare to your best model from above) in terms of classification performance.
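One way the transfer learning step might be set up is sketched below. Everything specific in it is an assumption rather than our chosen approach: MobileNetV2 as the pre-trained base, a 224x224 input size, grayscale inputs already scaled to [0, 1], and the hypothetical helper name `build_transfer_model`. The base network is frozen so that only the new classification head is trained.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_transfer_model(num_classes=4, image_size=224, weights="imagenet"):
    """Hypothetical helper: frozen MobileNetV2 base plus a new softmax head."""
    base = keras.applications.MobileNetV2(
        input_shape=(image_size, image_size, 3),
        include_top=False,   # drop the ImageNet classifier
        weights=weights,     # "imagenet" downloads pre-trained weights
    )
    base.trainable = False   # keep the pre-trained filters fixed

    inputs = keras.Input(shape=(image_size, image_size, 1))
    # Repeat the grayscale channel 3 times because the base expects RGB
    x = layers.Concatenate()([inputs, inputs, inputs])
    # Map [0, 1] pixels to [-1, 1], the range MobileNetV2 was trained on
    x = layers.Rescaling(scale=2.0, offset=-1.0)(x)
    x = base(x, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        loss="categorical_crossentropy",
        optimizer="rmsprop",
        metrics=[keras.metrics.Recall(name="recall")],
    )
    return model
```

Calling `build_transfer_model()` loads the ImageNet weights, and the resulting model can be trained and scored with the same splits and recall metric as the CNNs above so the comparison is like for like.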