# Brain Tumor Detection (Exploratory Project)

*Disclaimer: this is an exploratory project for my own learning. I have been learning and practicing ML concepts and techniques for a little while, and when I came across this image dataset on Kaggle, I decided to give the image classification task a go!*

There is certainly plenty of room for improvement (I say this often, but I am only a beginner), and if you have an idea or would like to recommend something, do get in touch with me! I am always happy to learn from others: therein lies the *beauty of sharing*.

Back to this notebook: the dataset is composed of 253 images, 98 of which are labelled 'no', i.e. they are not from patients with a brain tumor. Kaggle does not give much detail on the context. The dataset is two years old, and you can find it here: https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection

I preprocessed the images and created an HDF5 file from them to use here, since that is how I learned to do computer vision tasks. The file can be found in the GitHub repo containing this notebook, and you will also find the images there (renamed for consistency, e.g. N1.jpg is an image with label = 0/No Tumor, and Y63.jpg is an image with label = 1/Tumor).
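
To make the naming convention concrete, here is a minimal sketch of how a label could be read back from a file name. The `label_from_filename` helper and its regular expression are hypothetical illustrations of the N/Y prefix rule described above; they are not the preprocessing code actually used to build the HDF5 file.

```python
import re

# Hypothetical helper: only the N/Y prefix convention comes from the notebook text.
# The pattern assumes names of the form N<number>.jpg or Y<number>.jpg.
_NAME_PATTERN = re.compile(r"^([NY])(\d+)\.jpg$", re.IGNORECASE)


def label_from_filename(filename):
    """Return 0 for 'N<number>.jpg' (no tumor) and 1 for 'Y<number>.jpg' (tumor)."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return 1 if match.group(1).upper() == "Y" else 0


print(label_from_filename("N1.jpg"))   # 0
print(label_from_filename("Y63.jpg"))  # 1
```

Raising on an unexpected name is a deliberate choice here: a silently mislabelled image would be much harder to spot later than an early error.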

Refer to the end of this notebook for a detailed Acknowledgements section (a little taste: this notebook may look familiar to you, and that is because it is in part taken from the 'Residual Network' Programming Assignment of the Deep Learning Specialisation! More details, again, at the bottom of this notebook).

Here, I will use a Residual Network architecture to build a binary classifier, with y = 0 (no brain tumor in the image) and y = 1 (brain tumor in the image).

In [12]:
import numpy as np
from keras import layers
from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D
from keras.models import Model, load_model
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import preprocess_input
from utils import *
from keras.initializers import glorot_uniform

import keras.backend as K
K.set_image_data_format('channels_last')
K.set_learning_phase(1)

# Building the Residual Networks

## 1. The Identity Block

In [13]:
def identity_block(X, f, filters, stage, block):
    """
    Implementation of the identity block 
    
    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    
    Returns:
    X -- output of the identity block, tensor of shape (m, n_H, n_W, n_C)
    """
    
    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    
    # retrieve Filters
    F1, F2, F3 = filters
    
    # save the input value 
    X_shortcut = X
    
    # First component of main path
    X = Conv2D(filters = F1, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = conv_name_base + '2a', kernel_initializer = glorot_uniform(seed = 0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
    X = Activation('relu')(X)
    
    # second component of main path
    X = Conv2D(filters = F2, kernel_size = (f, f), strides = (1,1), padding = 'same', name = conv_name_base + '2b', kernel_initializer = glorot_uniform(seed = 0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # third component of main path
    X = Conv2D(filters = F3, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = conv_name_base + '2c', kernel_initializer = glorot_uniform(seed = 0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2c')(X)

    # final step: Add shortcut value to main path, and pass it through a RELU activation 
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)
    
    return X
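
One detail of the identity block is easy to miss: `Add()` only works when the main path and the shortcut have exactly the same shape, so the last filter count F3 has to equal the number of channels of the input. The sketch below is plain shape bookkeeping (no Keras involved); `identity_block_output_shape` is a hypothetical helper, and the 31 x 31 x 256 input is just an illustrative value.

```python
def identity_block_output_shape(input_shape, filters):
    """Shape bookkeeping for an identity block, batch axis omitted.

    input_shape -- (n_H, n_W, n_C) of the incoming activation
    filters     -- [F1, F2, F3], as passed to identity_block
    """
    n_H, n_W, n_C = input_shape
    F1, F2, F3 = filters
    # All three convolutions use stride 1: the 1x1 'valid' and f x f 'same'
    # convolutions leave n_H and n_W unchanged, so only the channels change.
    main_path_shape = (n_H, n_W, F3)
    # Add() requires the main path and the untouched shortcut to match.
    if main_path_shape != (n_H, n_W, n_C):
        raise ValueError(
            f"F3 = {F3} does not match the {n_C} input channels; "
            "a convolutional block would be needed instead"
        )
    return main_path_shape


print(identity_block_output_shape((31, 31, 256), [64, 64, 256]))  # (31, 31, 256)
```

When the channel count or the spatial size has to change, the shortcut needs its own convolution, which is the job of the convolutional block.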

## 2. The Convolutional Block

In [14]:
def convolutional_block(X, f, filters, stage, block, s = 2):
    """
    Implementation of the convolutional block as defined in Figure 4
    
    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    s -- Integer, specifying the stride to be used
    
    Returns:
    X -- output of the convolutional block, tensor of shape (m, n_H, n_W, n_C)
    """
    
    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    
    # retrieve Filters
    F1, F2, F3 = filters
    
    # save the input value
    X_shortcut = X


    ##### MAIN PATH #####
    # First component of main path 
    X = Conv2D(F1, (1, 1), strides = (s,s), padding = 'valid', name = conv_name_base + '2a', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    # Second component of main path 
    X = Conv2D(filters = F2, kernel_size = (f, f), strides = (1,1), padding = 'same', name = conv_name_base + '2b', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path
    X = Conv2D(filters = F3, kernel_size = (1, 1), strides = (1,1), padding = 'valid', name = conv_name_base + '2c', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis = 3, name = bn_name_base + '2c')(X)
    
    ##### SHORTCUT PATH ####
    X_shortcut = Conv2D(filters = F3, kernel_size = (1, 1), strides = (s,s), padding = 'valid', name = conv_name_base + '1', kernel_initializer = glorot_uniform(seed=0))(X_shortcut)
    X_shortcut = BatchNormalization(axis = 3, name = bn_name_base + '1')(X_shortcut)

    # Final step: Add shortcut value to main path, and pass it through a RELU activation 
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)
    
    return X
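
As a rough sanity check on what the convolutional block adds, the weights of its four `Conv2D` layers can be counted by hand. The helpers below are hypothetical and only cover the convolutions (BatchNormalization parameters are left out); the example uses the arguments of the first convolutional block of the network, assuming it receives 64 input channels.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def convolutional_block_conv_params(in_channels, f, filters):
    """Parameter count of each Conv2D in a convolutional block, keyed by branch name."""
    F1, F2, F3 = filters
    return {
        "2a": conv2d_params(1, in_channels, F1),   # 1x1, reduces channels
        "2b": conv2d_params(f, F1, F2),            # f x f, 'same' padding
        "2c": conv2d_params(1, F2, F3),            # 1x1, expands channels
        "1": conv2d_params(1, in_channels, F3),    # 1x1 projection on the shortcut
    }


params = convolutional_block_conv_params(in_channels=64, f=3, filters=[64, 64, 256])
print(params)                # {'2a': 4160, '2b': 36928, '2c': 16640, '1': 16640}
print(sum(params.values()))  # 74368
```

The stride `s` does not appear in the count: it changes the spatial size of the output, not the number of weights. The same formula gives 7 * 7 * 3 * 64 + 64 = 9472 for a 7x7 convolution with 3 input and 64 output channels, which is the figure `model.summary()` reports for `conv1`.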

## Putting It Together: Building the Model

In [15]:
def ResNet50(input_shape = (128, 128, 3), classes = 2):
    """
    Implementation of the popular ResNet50 with the following architecture:
    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER

    Arguments:
    input_shape -- shape of the images of the dataset
    classes -- integer, number of classes

    Returns:
    model -- a Model() instance in Keras
    """
    
    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    
    # Zero-Padding
    X = ZeroPadding2D((3, 3))(X_input)
    
    # Stage 1
    X = Conv2D(64, (7, 7), strides = (2, 2), name = 'conv1', kernel_initializer = glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis = 3, name = 'bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f = 3, filters = [64, 64, 256], stage = 2, block='a', s = 1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')
    
    # Stage 3 
    X = convolutional_block(X, f = 3, filters = [128, 128, 512], stage = 3, block = 'a', s = 2)
    X = identity_block(X, f = 3, filters = [128, 128, 512], stage = 3, block = 'b')
    X = identity_block(X, f = 3, filters = [128, 128, 512], stage = 3, block = 'c')
    X = identity_block(X, f = 3, filters = [128, 128, 512], stage = 3, block = 'd')

    # Stage 4 
    X = convolutional_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block = 'a', s = 2)
    X = identity_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block = 'b')
    X = identity_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block = 'c')
    X = identity_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block = 'd')
    X = identity_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block = 'e')
    X = identity_block(X, f = 3, filters = [256, 256, 1024], stage = 4, block = 'f')

    # Stage 5 
    X = convolutional_block(X, f = 3, filters = [512, 512, 2048], stage = 5, block = 'a', s = 2)
    X = identity_block(X, f = 3, filters = [512, 512, 2048], stage = 5, block = 'b')
    X = identity_block(X, f = 3, filters = [512, 512, 2048], stage = 5, block = 'c')

    # AVGPOOL 
    X = AveragePooling2D(pool_size = (2, 2), name = 'avg_pool')(X)

    # Output layer 
    # Binary Classification (1=Yes, 0=No)
    X = Flatten()(X)
    X = Dense(classes, activation='sigmoid', name='fc' + str(classes), kernel_initializer = glorot_uniform(seed=0))(X)
    
    # create model
    model = Model(inputs = X_input, outputs = X, name='ResNet50')

    return model
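
To see how the default 128 x 128 input shrinks on its way through the network, the spatial size can be traced layer by layer. This is a sketch under the assumption that Keras' defaults apply wherever the code above does not set them ('valid' padding for `Conv2D` and the pooling layers, and pooling strides equal to `pool_size` when not given); `valid_output_size` and `trace_resnet50_sizes` are hypothetical helper names.

```python
def valid_output_size(n, window, stride):
    """Output length along one axis for a 'valid' convolution or pooling."""
    return (n - window) // stride + 1


def trace_resnet50_sizes(n=128):
    """Height (= width) of the activation after each part of the model."""
    sizes = {}
    n = n + 2 * 3                          # ZeroPadding2D((3, 3)) pads 3 pixels per side
    sizes["zero_pad"] = n
    n = valid_output_size(n, 7, 2)         # conv1: 7x7, stride 2
    sizes["conv1"] = n
    n = valid_output_size(n, 3, 2)         # max pool: 3x3, stride 2
    sizes["max_pool"] = n
    for stage, s in [(2, 1), (3, 2), (4, 2), (5, 2)]:
        # Only the 1x1 stride-s convolution that opens each convolutional
        # block changes the spatial size; identity blocks keep it.
        n = valid_output_size(n, 1, s)
        sizes[f"stage{stage}"] = n
    n = valid_output_size(n, 2, 2)         # avg pool: 2x2, stride defaults to pool size
    sizes["avg_pool"] = n
    return sizes


sizes = trace_resnet50_sizes(128)
print(sizes)
flattened = sizes["avg_pool"] ** 2 * 2048  # 2048 channels after stage 5
print(flattened)                           # 8192
```

Under these assumptions the sizes go 134, 64, 31, 31, 16, 8, 4 and finally 2, so `Flatten()` hands 8192 values to the final `Dense` layer. The 134 and 64 agree with the `zero_padding2d` and `conv1` rows printed by `model.summary()`.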

In [16]:
# creating the model
model = ResNet50()  # using the default arguments

**Quick Note**: considering the substantial class imbalance in the dataset (No = 98 out of a total of 253 images), I decided to use the AUC as the evaluation metric for the model (instead of the typical 'accuracy'). Other metrics could have been chosen, and other techniques for dealing with this imbalance exist (such as resampling strategies), which I leave for another time. A useful tutorial link has been added to the *Acknowledgements* section, for further considerations and discussion on the matter.
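
To put a number on why accuracy can mislead here, the counts quoted above are enough for a quick back-of-the-envelope check. The class weights at the end are only an illustration of one common heuristic (total / (number of classes * class count)); they are an assumption of this sketch and are not used anywhere in this notebook.

```python
n_total = 253            # images in the dataset
n_no = 98                # images labelled 'no'
n_yes = n_total - n_no   # images labelled 'yes'

# Accuracy of a "model" that ignores the image and always predicts the majority class.
majority_baseline = max(n_no, n_yes) / n_total
print(f"yes = {n_yes}, no = {n_no}")
print(f"majority-class accuracy: {majority_baseline:.3f}")  # 0.613

# Assumption: 'balanced' class-weight heuristic, shown for illustration only.
class_weight = {
    0: n_total / (2 * n_no),   # 'no' is rarer, so it gets the larger weight
    1: n_total / (2 * n_yes),
}
print(class_weight)
```

A classifier that learned nothing would already score about 61% accuracy, which is why a threshold-independent metric such as the AUC is a more informative choice.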

In [17]:
# compiling the model
# Note: because of the imbalance in the dataset (discussed earlier),
# the AUC is tracked here instead of accuracy
import tensorflow as tf  # made explicit; otherwise `tf` is only defined if `from utils import *` exposes it

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[tf.keras.metrics.AUC()])

# Loading the dataset

In [18]:
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig = load_dataset()

In [19]:
# Normalize image vectors
X_train = X_train_orig/255.
X_test = X_test_orig/255.

In [20]:
# Convert training and test labels to one hot matrices
Y_train = convert_to_one_hot(Y_train_orig, 2).T
Y_test = convert_to_one_hot(Y_test_orig, 2).T
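
`convert_to_one_hot` comes from `utils` and is not shown in this notebook. As a hedged guess at what it does, based only on the `(202, 2)` label shape printed below, a NumPy version could look like the following; the `one_hot` name and the `(1, m)` layout of the raw labels are assumptions.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return an array of shape (num_classes, m), one column per example."""
    labels = np.asarray(labels).reshape(-1)   # flatten (1, m) to (m,)
    return np.eye(num_classes)[labels].T      # pick identity rows, then transpose


Y = np.array([[0, 1, 1, 0]])     # assumed raw layout: shape (1, m)
Y_one_hot = one_hot(Y, 2).T      # transpose back to (m, 2), one row per example
print(Y_one_hot.shape)           # (4, 2)
print(Y_one_hot)
```

With this layout, column 0 is 1 for 'No Tumor' and column 1 is 1 for 'Tumor'.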

In [21]:
print ("number of training examples = " + str(X_train.shape[0]))
print ("number of test examples = " + str(X_test.shape[0]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))

number of training examples = 202
number of test examples = 51
X_train shape: (202, 128, 128, 3)
Y_train shape: (202, 2)
X_test shape: (51, 128, 128, 3)
Y_test shape: (51, 2)


# Fitting the Model

In [22]:
model.fit(X_train, Y_train, epochs = 20, batch_size = 32)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.callbacks.History at 0x7f88847122d0>

# Model Evaluation

In [23]:
# evaluate on the test set; returns [loss, AUC], as configured in model.compile
preds = model.evaluate(X_test, Y_test)



In [24]:
# loss and AUC on the test set
print ("Loss = " + str(preds[0]))
print ("Test AUC = " + str(preds[1]))

Loss = 1.1505577634362614
Test AUC = 0.9640340805053711
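
For intuition about what an AUC close to 1 means: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes that definition exactly on a toy example with made-up scores. As far as I understand, Keras approximates the AUC over a fixed set of thresholds instead, so its value can differ slightly; `pairwise_auc` is a hypothetical helper, not what the metric runs internally.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (made up): one positive (0.35) is ranked below one negative (0.4).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(labels, scores))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly, hence 0.75. The metric only looks at the ranking of the scores, which is why it does not depend on the class proportions or on a decision threshold.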


In [28]:
# saving the model
model.save('ResNet50')

# Some Information on the Model

In [29]:
# detailed info about the model
model.summary()

Model: "ResNet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, 128, 128, 3)  0                                            
__________________________________________________________________________________________________
zero_padding2d_2 (ZeroPadding2D (None, 134, 134, 3)  0           input_2[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 64, 64)   9472        zero_padding2d_2[0][0]           
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 64, 64)   256         conv1[0][0]                      
___________________________________________________________________________________________

# Acknowledgements

- I'm beyond grateful for the amazing teachers at deeplearning.ai and Coursera, including the leading AI educator: Andrew Ng.

- Deep Learning Specialisation Course 4 (CNN): Week 2 Programming Assignment 2 on Residual Networks

- [this](https://github.com/feiyuhuahuo/create-a-hdf5-data-set-for-deep-learning) awesome repo that contains a tutorial on how to create an HDF5 dataset for deep learning and simple image classification tasks like the one in this notebook

- [this](https://github.com/tensorflow/tensorflow/issues/9829) life-saving thread, without which my kernel would have kept dying for 'seemingly' no reason

- [online resource](https://www.kdnuggets.com/2017/06/7-techniques-handle-imbalanced-data.html) on techniques for handling imbalanced data

- the [source](https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection) of the image dataset, of course!

- the Stack Overflow community in general, for being such a huge support, every single time!