# Detecting Glioma, Meningioma, and Pituitary Brain Tumors Using Convolutional Neural Networks

Advancements in medical image classification hold the potential to streamline the tumor detection process. While not unerring, technology-based classification offers a preliminary categorization of tumors, facilitating quicker decision making for professionals.

In this project we will harness the power of Convolutional Neural Networks to classify human brain MRI images into four classes: glioma, meningioma, pituitary, and no tumor. We will explore a basic CNN and several pre-trained CNNs to find the most robust classifier for brain tumor detection.


<details>
  <summary> More on Brain Tumors </summary>

Gliomas:
- Gliomas are tumors that arise from glial cells, which are supportive cells in the brain. Symptoms of gliomas vary depending on the location and size of the tumor. Treatment options include surgery, radiation therapy, and chemotherapy.

Meningiomas:
- Meningiomas are tumors that originate in the meninges, the layer of tissue covering the brain and spinal cord. These tumors are usually benign and slow-growing, but they can cause issues if they grow or press on nearby structures.

Pituitary Tumors:
- Pituitary tumors are growths that develop in the pituitary gland, a small gland at the base of the brain that regulates hormone production. Symptoms of pituitary tumors can include hormonal imbalances, changes in growth, and visual disturbances.


</details>



# Loading the Data

In [59]:
# imports

# OS libs
import os
# import shutil
# import itertools
import pathlib
from PIL import Image

# Data processing tools
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
sns.set_style('whitegrid')
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix , classification_report
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

# Deep learning libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Flatten, Activation, Dense, Dropout,
                                     BatchNormalization, Input, GlobalAveragePooling2D)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.keras.applications import ResNet50, InceptionV3, EfficientNetB0


#Warnings
import warnings
warnings.filterwarnings('ignore')

In [60]:
train_data_path = 'Training'
test_data_path = 'Testing'


# loading the training and testing data into a data frame
def load_data(data_path):
    filepaths = []
    labels = []

    tumor_classes = os.listdir(data_path)

    for tumor_class in tumor_classes:
        class_dir = os.path.join(data_path, tumor_class)

        for file in os.listdir(class_dir):
            filepaths.append(os.path.join(class_dir, file))
            labels.append(tumor_class)

    return pd.DataFrame({'filepaths': filepaths, 'labels': labels})

train_df = load_data(train_data_path)
test_df = load_data(test_data_path)


In [61]:
# looking at the image categories
train_df['labels'].unique()
test_df['labels'].unique()

array(['pituitary', 'notumor', 'glioma', 'meningioma'], dtype=object)

In [62]:
# checking the distribution of images
label_counts = test_df['labels'].value_counts()
print(label_counts)


notumor       405
meningioma    306
pituitary     300
glioma        300
Name: labels, dtype: int64


In [63]:
# splitting testing data into validation data and testing data
valid_df,test_df=train_test_split(test_df,train_size=0.5,shuffle=True,random_state=42)

In [64]:
print(len(train_df))
print(len(valid_df))
print(len(test_df))

5712
655
656


In [65]:
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go

# Assuming 'train_df', 'valid_df', and 'test_df' are your DataFrames
class_distribution_train = train_df['labels'].value_counts().reset_index()
class_distribution_train.columns = ['Tumor Class', 'Count']

class_distribution_valid = valid_df['labels'].value_counts().reset_index()
class_distribution_valid.columns = ['Tumor Class', 'Count']

class_distribution_test = test_df['labels'].value_counts().reset_index()
class_distribution_test.columns = ['Tumor Class', 'Count']

# Create subplots with three bar charts side by side
fig = make_subplots(rows=1, cols=3, subplot_titles=('Training Data', 'Validation Data', 'Test Data'))

# Define a color map for each tumor class
color_map = {'notumor': 'blue', 'meningioma': 'orange', 'pituitary': 'green', 'glioma': 'red'}

# Add bar chart for training data
fig.add_trace(
    go.Bar(
        x=class_distribution_train['Tumor Class'],
        y=class_distribution_train['Count'],
        marker_color=class_distribution_train['Tumor Class'].map(color_map),
        text=class_distribution_train['Count'],
        hoverinfo='text+x+y',
    ),
    row=1, col=1
)

# Add bar chart for validation data
fig.add_trace(
    go.Bar(
        x=class_distribution_valid['Tumor Class'],
        y=class_distribution_valid['Count'],
        marker_color=class_distribution_valid['Tumor Class'].map(color_map),
        text=class_distribution_valid['Count'],
        hoverinfo='text+x+y',
    ),
    row=1, col=2
)

# Add bar chart for test data
fig.add_trace(
    go.Bar(
        x=class_distribution_test['Tumor Class'],
        y=class_distribution_test['Count'],
        marker_color=class_distribution_test['Tumor Class'].map(color_map),
        text=class_distribution_test['Count'],
        hoverinfo='text+x+y',
    ),
    row=1, col=3
)

# Update layout
fig.update_layout(title_text='Distribution of Tumor Classes in Training, Validation, and Test Data', showlegend=False)

# Show the plot
fig.show()

In [66]:
# loading the images as arrays to feed into NN
def load_images_and_labels(data_df):
    images = []
    labels = []

    img_size = (256, 256)

    for path, label in zip(data_df['filepaths'], data_df['labels']):
        img = cv2.imread(path)
        img = cv2.resize(img, img_size)
        images.append(img)
        labels.append(label)

    return np.array(images), np.array(labels)

# loading train images
train_images, train_labels = load_images_and_labels(train_df)

# loading test images
test_images, test_labels = load_images_and_labels(test_df)

# loading valid images
valid_images, valid_labels = load_images_and_labels(valid_df)


For neural networks, we need to one-hot encode the target variables. This means that instead of representing each class with a text label, we represent it with a vector of 0s and 1s. This is what we are trying to achieve:



<div style="display: flex; justify-content: center; align-items: center; height: 200px;">
  <img src="https://drive.google.com/uc?id=100PPZ96PKkpp75LT1IWtQvaaTkC7aPRW" width="400">
</div>

A 1 in a column indicates that the image belongs to that class; every other column is 0.
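As a minimal sketch of the idea, in plain Python and independent of the `LabelEncoder` and `to_categorical` calls used in the notebook, each label is first mapped to an integer index and then to a vector with a single 1. The helper name `one_hot` and the sample labels are illustrative assumptions, not part of the original code.

```python
# Illustrative sketch: one-hot encoding by hand (the notebook itself uses
# LabelEncoder + to_categorical for this step).
def one_hot(labels):
    # Sort the distinct class names so the index assignment is deterministic.
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    # Each label becomes a vector of zeros with a single 1 at its class index.
    vectors = [[1 if index[label] == i else 0 for i in range(len(classes))]
               for label in labels]
    return classes, vectors

classes, vectors = one_hot(["pituitary", "notumor", "glioma", "meningioma", "glioma"])
print(classes)     # ['glioma', 'meningioma', 'notumor', 'pituitary']
print(vectors[0])  # [0, 0, 0, 1]
```

Sorting the class names mirrors the alphabetical ordering that a label encoder typically produces, which is why `pituitary` lands in the last column here.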



In [67]:
# one-hot encoding the target variables
# instead of representing everything as labels, we represent them as 0s and 1s.
label_encoder=LabelEncoder()
label_encoder.fit(train_labels)

# encodes categorical data as numbers
train_labels_encoded = label_encoder.transform(train_labels)
valid_labels_encoded = label_encoder.transform(valid_labels)
test_labels_encoded = label_encoder.transform(test_labels)
num_classes = len(label_encoder.classes_)

# one-hot encode the encoded labels
train_labels_encoded = to_categorical(train_labels_encoded, num_classes=num_classes)
valid_labels_encoded = to_categorical(valid_labels_encoded, num_classes=num_classes)
test_labels_encoded= to_categorical(test_labels_encoded, num_classes=num_classes)


In [68]:
# checking shape of arrays
print(train_labels_encoded.shape)
print(valid_labels_encoded.shape)
print(test_labels_encoded.shape)

(5712, 4)
(655, 4)
(656, 4)


# Evaluation Metrics
We must choose metrics to evaluate the performance of our models.
Here are the evaluation metrics we will consider:

**Accuracy**
- The number of correct predictions divided by the total number of predictions. Note that accuracy alone can be misleading in evaluating how well our classifier recognizes observations that fall into minority classes.
- A simple example: suppose we have a group of images where 90% of the images fall into the no tumor class and 10% fall into the tumor class. A classifier that predicts that every observation falls into the no tumor class has 90% accuracy. However, this is still a poor classifier. It cannot recognize any observations that fall into the tumor class, which are the most critical observations.
- Looking at precision and recall can help us address these issues.

**Precision**
- How often our model is correct when it makes a prediction that an observation falls into a certain class.
- Ex: If we have 90% precision in the pituitary tumor class, that means that 90% of the images that were classified as pituitary tumors were actually pituitary tumors.

**Recall**
- How good our model is at detecting the true positives of a class.
- Ex: If we have 90% recall in the pituitary tumor class, then the classifier was able to identify 90% of the pituitary tumor images.


*In our problem domain, **we favor a classifier with high recall in the tumor classes**. Why is this? Missing a true positive (a tumor case) carries more significant consequences than misclassifying a non-tumor case as a tumor case. Prioritizing high recall in the tumor classes will allow us to capture as many true tumor cases as possible. This approach may lead to lower precision in the tumor classes, because the classifier may assign non-tumor cases to tumor classes more liberally, but it minimizes the risk of overlooking true tumor cases.*
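The imbalanced example from the accuracy discussion can be checked with a short sketch. The data below is hypothetical (90 no-tumor images, 10 tumor images, and a classifier that always answers "notumor"), and the helper names are assumptions made for illustration; the notebook itself relies on scikit-learn's `classification_report`.

```python
# Illustrative sketch: why accuracy can mislead on imbalanced data.
# Hypothetical data: 90 "notumor" images, 10 "tumor" images, and a
# classifier that always predicts "notumor".
y_true = ["notumor"] * 90 + ["tumor"] * 10
y_pred = ["notumor"] * 100

def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def precision_recall(y_true, y_pred, positive):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

acc = accuracy(y_true, y_pred)
tumor_precision, tumor_recall = precision_recall(y_true, y_pred, "tumor")
print(acc)           # 0.9
print(tumor_recall)  # 0.0
```

The classifier scores 90% accuracy while finding none of the tumor cases, which is exactly the failure that per-class recall exposes.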



**A Simplified Introduction to the Convolutional Model Using Conv2D layers and Max Pooling**
-  Click [here](https://docs.google.com/document/d/1rzPfFhG4TEHhzyGpa0jx905BPoGSowLiE98WUArPvhM/edit?usp=sharing) for a simplified introduction to 2D convolutional networks


# Types of Models
We will explore 4 CNN types.
- 1) A basic CNN with three convolutional layers and max pooling
- 2) Pretrained ResNet50
- 3) Pretrained InceptionNet
- 4) Pretrained EfficientNet

To learn more about the pretrained models, click [here](https://docs.google.com/document/d/1tONf-wTolvjdriQJEn178vQQC4XlJ6PoGlZtUC6LbvA/edit?usp=sharing).



### Basic CNN model
- For our simplest model, we use three conv2D layers with 32 filters and max pooling. We then flatten the 3D output from the Conv2D layers and pass the flattened 1D output into the fully connected dense layers. Finally, we use softmax activation to convert the raw output of the previous dense layer into probabilities for each class. We assign the observation to the class with the highest probability.

 Click [here](https://docs.google.com/document/d/1rzPfFhG4TEHhzyGpa0jx905BPoGSowLiE98WUArPvhM/edit?usp=sharing) to learn more about the basic CNN model.
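To make the layer description concrete, here is a rough sketch of how the spatial size and parameter counts evolve through the three Conv2D and max pooling stages. It assumes the Keras defaults (3x3 kernels with 'valid' padding and stride 1, 2x2 pooling with stride 2) and a 256x256x3 input; the helper names are hypothetical.

```python
# Illustrative sketch: output sizes and parameter counts for the basic CNN,
# assuming 'valid' padding and stride 1 for Conv2D and 2x2 pooling with
# stride 2 (the Keras defaults).
def conv_out(size, kernel=3):
    return size - kernel + 1

def pool_out(size, pool=2):
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    # One kernel x kernel x in_channels weight block plus one bias per filter.
    return (kernel * kernel * in_channels + 1) * filters

size, channels = 256, 3
shapes, params = [], []
for _ in range(3):
    params.append(conv_params(channels, 32))
    size = conv_out(size)
    shapes.append(size)          # after Conv2D
    size = pool_out(size)
    shapes.append(size)          # after MaxPooling2D
    channels = 32

flattened = size * size * channels
print(shapes)     # [254, 127, 125, 62, 60, 30]
print(params)     # [896, 9248, 9248]
print(flattened)  # 28800
```

The first values agree with the shapes and parameter counts in the `cnn_model.summary()` output shown later in the notebook. The flattened length is what this sketch computes for the input to the first dense layer under the stated assumptions.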

In [69]:
# creating layers
cnn_model = tf.keras.Sequential([
    # Conv2D layers and max pooling
    layers.experimental.preprocessing.Rescaling(1./255),
    layers.Conv2D(filters=32, kernel_size=3, activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    # Flattens 3 dimensional output of Conv2D layer to a one dimensional vector that can be passed into the next fully connected dense layer
    layers.Flatten(),
    # Adds a dense layer with 128 units using ReLU activation to introduce non-linearity
    # Each of the 128 units captures a specific pattern from the previous layer and produces 1 output
    layers.Dense(128, activation='relu'),
    # Computes probability over the four classes, should sum to 1
    layers.Dense(4, activation='softmax')
])



In [70]:
# compiling model
cnn_model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

In [71]:
# training models
epochs = 10

history = cnn_model.fit(
    train_images,  # training data
    train_labels_encoded,  # training labels
    epochs=epochs,
    validation_data=(valid_images, valid_labels_encoded),  # Validation data and labels as a tuple
)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [72]:
cnn_model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 rescaling_2 (Rescaling)     (None, 256, 256, 3)       0         
                                                                 
 conv2d_97 (Conv2D)          (None, 254, 254, 32)      896       
                                                                 
 max_pooling2d_7 (MaxPooling  (None, 127, 127, 32)     0         
 2D)                                                             
                                                                 
 conv2d_98 (Conv2D)          (None, 125, 125, 32)      9248      
                                                                 
 max_pooling2d_8 (MaxPooling  (None, 62, 62, 32)       0         
 2D)                                                             
                                                                 
 conv2d_99 (Conv2D)          (None, 60, 60, 32)       

In [73]:
# checking loss and accuracy
test_loss, test_accuracy = cnn_model.evaluate(test_images, test_labels_encoded)

# print the test loss and accuracy
print(f'Test Loss: {test_loss:.4f}')
print(f'Test Accuracy: {test_accuracy * 100:.2f}%')


Test Loss: 0.2112
Test Accuracy: 95.43%


In [74]:
y_pred_cnn=cnn_model.predict(test_images)



In [75]:
test_labels_true = [label_encoder.classes_[np.argmax(label)] for label in test_labels_encoded]
pred_labels_cnn = [label_encoder.classes_[np.argmax(pred)] for pred in y_pred_cnn]
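The decoding step above relies on two ideas: softmax turns the raw scores of the last layer into probabilities that sum to 1, and argmax picks the most probable class. The following sketch shows both in plain Python; the scores are made up and the class order is assumed to be alphabetical, matching the classification report.

```python
import math

# Illustrative sketch: softmax turns raw scores into probabilities, and
# argmax picks the predicted class. The scores below are made up.
def softmax(scores):
    # Subtract the max for numerical stability; the result is unchanged.
    shifted = [s - max(scores) for s in scores]
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

class_names = ["glioma", "meningioma", "notumor", "pituitary"]
probs = softmax([2.0, 0.5, -1.0, 0.1])
predicted = class_names[max(range(len(probs)), key=probs.__getitem__)]
print(round(sum(probs), 6))  # 1.0
print(predicted)             # glioma
```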


In [76]:
# creating a classification report and a data frame
report_cnn = classification_report(test_labels_true, pred_labels_cnn)
report_dict_cnn = classification_report(test_labels_true, pred_labels_cnn, output_dict=True)
report_df_cnn = pd.DataFrame(report_dict_cnn).transpose()

In [77]:
(report_df_cnn)

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| glioma | 0.879747 | 0.965278 | 0.92053 | 144.0 |
| meningioma | 0.942857 | 0.862745 | 0.901024 | 153.0 |
| notumor | 0.995169 | 0.990385 | 0.992771 | 208.0 |
| pituitary | 0.986755 | 0.986755 | 0.986755 | 151.0 |
| accuracy | 0.954268 | 0.954268 | 0.954268 | 0.954268 |
| macro avg | 0.951132 | 0.951291 | 0.95027 | 656.0 |
| weighted avg | 0.955695 | 0.954268 | 0.95413 | 656.0 |


In [78]:
import plotly.graph_objects as go
import pandas as pd

# Assuming your DataFrame is named report_df_cnn
# Extract precision, recall, and tumor types
precision = report_df_cnn['precision'][:-3].astype(float)
recall = report_df_cnn['recall'][:-3].astype(float)
tumor_types = ['glioma', 'meningioma', 'notumor', 'pituitary']

# Create a DataFrame
data = {'Precision': precision, 'Recall': recall, 'Tumor Type': tumor_types}
df = pd.DataFrame(data)

# Create scatter plot with hoverable labels
fig = go.Figure()
colors = ['blue', 'orange', 'green', 'red']
for i in range(len(tumor_types)):
    fig.add_trace(go.Scatter(
        x=[recall[i]],  # Swap x and y axes
        y=[precision[i]],  # Swap x and y axes
        mode='markers',
        name=tumor_types[i],
        text=[f'{tumor_types[i]}<br>Precision: {precision[i]:.4f}<br>Recall: {recall[i]:.4f}'],
        hoverinfo='text+x+y',
        marker=dict(size=10),
        line=dict(color=colors[i], width=1)
    ))

# Add annotations above each point
for i in range(len(tumor_types)):
    fig.add_annotation(
        x=recall[i],
        y=precision[i] + 0.005,  # Adjust the y-offset for the label position
        text=tumor_types[i],
        showarrow=False,
        font=dict(size=13),
    )

# Update layout
fig.update_layout(
    title='Precision vs Recall CNN',
    xaxis=dict(title='Recall'),  # Update x-axis title
    yaxis=dict(title='Precision'),  # Update y-axis title
)

# Show the plot
fig.show()



CNN Results:
- Note that we do not judge the models on how well they detect the no tumor class; our focus is recall in the three tumor classes.
- Our simple CNN is a relatively effective classifier. It has high recall and precision for pituitary tumors (both about 98.7%).
- It also detected about 96.5% of the glioma tumors in the test images. The precision for this class is lower: only about 88% of the images the CNN classified as glioma were actually gliomas. Since we prioritize recall, we accept this trade-off.
- The model had the lowest recall for meningioma tumors, about 86%. While this number may seem relatively high, an 86% recall means that for every 100 people with meningiomas, about 14 go undetected. That is a costly, potentially life-threatening mistake.
- Let us see if our other models produce more favorable results.

### ResNet50
- Residual Networks (ResNets) are designed to address challenges in training deep neural networks, particularly the vanishing gradient problem. ResNets mitigate this issue through shortcut connections that allow gradients to flow directly to earlier layers without passing through all intermediate layers.
- The building block of a ResNet is the residual block. This block includes both a shortcut path and a main path. The shortcut path bypasses the convolutional layers and directly connects the input to the output, while the main path processes the input through the convolutional layers. The output of the main path F(x) is then added to the output of the shortcut path (x), yielding the output of the residual block y=F(x)+x.
- Because the shortcut carries x forward unchanged, the main path only has to learn the residual F(x), that is, the difference between the desired output and the input. When the best a block can do is pass its input through, the network can drive F(x) toward zero so that y approximates x. During backpropagation, the addition also gives the gradient a direct route through the shortcut, which helps prevent vanishing gradients.
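The residual computation y = F(x) + x can be sketched on a plain list of numbers. This is a toy stand-in and not the ResNet50 implementation: `main_path` is a hypothetical scale-and-shift followed by ReLU that plays the role of the convolutional layers.

```python
# Illustrative sketch of a residual block on a plain list of numbers.
# `main_path` stands in for the convolutional layers F(x); here it is a
# single hypothetical scale-and-shift followed by ReLU.
def main_path(x, weight, bias):
    return [max(0.0, weight * v + bias) for v in x]

def residual_block(x, weight, bias):
    fx = main_path(x, weight, bias)
    # Shortcut connection: add the unchanged input to the main-path output.
    return [f + v for f, v in zip(fx, x)]

x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, weight=0.0, bias=0.0)  # F(x) = 0
learned_out = residual_block(x, weight=0.5, bias=0.0)   # F(x) != 0
print(identity_out)  # [1.0, -2.0, 3.0]
print(learned_out)   # [1.5, -2.0, 4.5]
```

With the main path switched off, the block reproduces its input exactly, which illustrates why stacking such blocks does not force a deep network to distort its signal. Since y = F(x) + x, the derivative of y with respect to x contains an identity term in addition to the derivative of F, so part of the gradient always reaches earlier layers directly.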


In [79]:
from tensorflow.keras.applications.resnet50 import preprocess_input
preprocessed_r50_train_images = preprocess_input(train_images)
preprocessed_r50_valid_images = preprocess_input(valid_images)
preprocessed_r50_test_images = preprocess_input(test_images)

We begin by preprocessing the data. Conveniently, each of the pretrained models we use has its own preprocessing function.

In [80]:
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(256, 256, 3))


From keras, we import ResNet50, which is a residual network with 50 layers.
- weights='imagenet': specifies that we are using the pre-trained weights of the model trained on the ImageNet dataset
- include_top=False: indicates that we do not want to use the final dense layer of the ResNet model, which is responsible for classifying images into the original ImageNet categories
  - Since we are using the ResNet50 model to classify images for our own task, we will add our own top layers.
- input_shape=(256,256,3): specifies the input data shape

In [81]:
# prevents the weights from being updated at each layer
# want to leverage the knowledge captured from pre-trained model
for layer in base_model.layers:
    layer.trainable=False

Now we build on top of the ResNet model to specify the network for our specific task

In [82]:
# adding global pooling layer to reduce spatial dimensions and provide global summary of the features
# avg pooling computes the average of each feature map in the last conv layer
x = GlobalAveragePooling2D()(base_model.output)
# adding a fully connected layer with 128 units and relu activation
x = Dense(128, activation='relu')(x)
# a final softmax layer with four units
output_resnet = Dense(4, activation='softmax')(x)
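To illustrate what the global average pooling layer does, the sketch below averages each channel of a tiny, made-up feature tensor over all spatial positions. The tensor layout `[row][col][channel]` and the function name are assumptions for illustration; in the notebook the same operation is performed by `GlobalAveragePooling2D` on the base model's final feature maps.

```python
# Illustrative sketch: global average pooling over a tiny feature tensor.
# `features` is a hypothetical 2x2 spatial grid with 3 channels,
# laid out as [row][col][channel] like a single Keras image tensor.
features = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]],
]

def global_average_pool(features):
    height, width = len(features), len(features[0])
    channels = len(features[0][0])
    # Average each channel over all spatial positions.
    return [
        sum(features[r][c][k] for r in range(height) for c in range(width))
        / (height * width)
        for k in range(channels)
    ]

pooled = global_average_pool(features)
print(pooled)  # [4.0, 1.0, 2.0]
```

The output has one value per channel regardless of the spatial size, which is what lets a small dense head sit on top of a large convolutional base.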


In [85]:
# creates a new model by specifying inputs and outputs
resnet50 = Model(inputs=base_model.input, outputs=output_resnet)

# configures the learning process by specifying the optimizer, loss function, and metrics
resnet50.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

# preprocessed_r50_train_images, train_labels_encoded: input data and labels
# epochs: number of times the model will iterate over the entire training set
# validation_data: used to evaluate the performance of the model after each epoch
resnet50.fit(preprocessed_r50_train_images,
             train_labels_encoded,
             epochs=10,
             validation_data=(preprocessed_r50_valid_images, valid_labels_encoded))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f859289aac0>

**resnet50 = Model(inputs=base_model.input, outputs=output_resnet)**
- Creates a new model resnet50 by specifying inputs and outputs
  - inputs=base_model.input: the new model resnet50 has the same inputs as base_model (256x256x3)
  - outputs=output_resnet: the output tensor obtained by passing the input through the original ResNet layers and the layers we added on top of the base model (GlobalAveragePooling2D, Dense layer, and softmax layer)

**resnet50.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])**
- Configures the learning process for the model by specifying the optimizer, loss function, and metrics
  - optimizer=Adam: a commonly used optimizer that updates the weights to reduce the loss
  - loss='categorical_crossentropy': a common loss function for multi-class classification problems
     - measures the difference between the predicted probability distribution and the true distribution
  - metrics=['accuracy']
    - reports the accuracy during training
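As a worked example of the loss, the sketch below computes categorical cross-entropy for a single image by hand. The predicted probabilities are made up, and the clipping constant `eps` is an assumption added here to keep the logarithm finite; Keras applies its own clipping internally.

```python
import math

# Illustrative sketch: categorical cross-entropy for one image.
# y_true is a one-hot vector; y_pred is a made-up softmax output.
def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 so log() stays finite.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0, 0, 0, 1]                      # the image is a pituitary tumor
confident = [0.01, 0.01, 0.01, 0.97]       # mostly correct prediction
unsure = [0.25, 0.25, 0.25, 0.25]          # uniform guess
print(round(categorical_crossentropy(y_true, confident), 4))  # 0.0305
print(round(categorical_crossentropy(y_true, unsure), 4))     # 1.3863
```

Only the probability assigned to the true class contributes to the loss, so a confident correct prediction is rewarded with a value near zero while a uniform guess over four classes costs ln(4).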

In [86]:
# evaluates model on the given dataset
resnet50.evaluate(preprocessed_r50_test_images, test_labels_encoded)



[0.12943577766418457, 0.9649389982223511]

In [87]:
# uses resnet to predict the image class
y_pred_res50=resnet50.predict(preprocessed_r50_test_images)
# assigns the image to be the class with the largest probability
pred_labels_res50 = [label_encoder.classes_[np.argmax(pred)] for pred in y_pred_res50]




In [88]:
# creates a classification report of the data
report_res50 = classification_report(test_labels_true, pred_labels_res50)
report_dict_res50= classification_report(test_labels_true, pred_labels_res50, output_dict=True)
report_df_res50= pd.DataFrame(report_dict_res50).transpose()

In [89]:
report_df_res50

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| glioma | 1.0 | 0.875 | 0.933333 | 144.0 |
| meningioma | 0.89759 | 0.973856 | 0.934169 | 153.0 |
| notumor | 1.0 | 1.0 | 1.0 | 208.0 |
| pituitary | 0.961538 | 0.993377 | 0.977199 | 151.0 |
| accuracy | 0.964939 | 0.964939 | 0.964939 | 0.964939 |
| macro avg | 0.964782 | 0.960558 | 0.961175 | 656.0 |
| weighted avg | 0.967262 | 0.964939 | 0.964764 | 656.0 |


In [90]:
import plotly.graph_objects as go
import pandas as pd

# report_df_res50 holds the ResNet50 classification report
# Extract precision, recall, and tumor types
precision = report_df_res50['precision'][:-3].astype(float)
recall = report_df_res50['recall'][:-3].astype(float)
tumor_types = ['glioma', 'meningioma', 'notumor', 'pituitary']

# Create a DataFrame
data = {'Precision': precision, 'Recall': recall, 'Tumor Type': tumor_types}
df = pd.DataFrame(data)

# Create scatter plot with hoverable labels
fig = go.Figure()
colors = ['blue', 'orange', 'green', 'red']
for i in range(len(tumor_types)):
    fig.add_trace(go.Scatter(
        x=[recall[i]],  # Swap x and y axes
        y=[precision[i]],  # Swap x and y axes
        mode='markers',
        name=tumor_types[i],
        text=[f'{tumor_types[i]}<br>Precision: {precision[i]:.4f}<br>Recall: {recall[i]:.4f}'],
        hoverinfo='text+x+y',
        marker=dict(size=10),
        line=dict(color=colors[i], width=1)
    ))

# Add annotations above each point
for i in range(len(tumor_types)):
    fig.add_annotation(
        x=recall[i],
        y=precision[i] + 0.006,  # Adjust the y-offset for the label position
        text=tumor_types[i],
        showarrow=False,
        font=dict(size=13),
    )

# Update layout
fig.update_layout(
    title='Precision vs Recall ResNet50',
    xaxis=dict(title='Recall'),  # Update x-axis title
    yaxis=dict(title='Precision'),  # Update y-axis title
)

# Show the plot
fig.show()


ResNet results
- Similar to our basic CNN, the ResNet has high recall in the pituitary class, over 99%.
- The model also produces high recall in the meningioma class, over 97%.
- Recall in the glioma class is lower at 87.5%, which is below the basic CNN's glioma recall of about 96.5%.
- While not as important, precision is also relatively high in all classes, with a minimum of about 90% (meningioma).
- Note that for both classifiers, pituitary tumors are the easiest tumor class to detect. The hardest class differs: meningioma for the basic CNN and glioma for ResNet50.


### InceptionNet
- InceptionNet (its first version is also known as GoogLeNet) uses inception modules, which apply parallel filters of different sizes at the same layer to capture features at various spatial scales. An inception module typically applies 1x1, 3x3, and 5x5 convolutions and a max pooling operation in parallel to the same input. The branch outputs are concatenated before being passed to the next module in the network.
- The core concept behind the inception modules is to leverage filters of different sizes to capture information at both global and local scales: larger kernels are useful for information that is distributed globally, while smaller kernels are better for information that is distributed locally.
- By combining filters of various sizes, the InceptionNet network can represent a wide range of features, making it ideal for computer vision applications.
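The bookkeeping behind an inception module can be sketched without any deep learning library: every branch sees the same input, and the outputs are stacked along the channel axis. The branch widths and the 28x28 input size below are hypothetical values chosen only to show the arithmetic; they are not read from the InceptionV3 model used in this notebook.

```python
# Illustrative sketch: an inception module runs several branches on the
# same input and concatenates their outputs along the channel axis.
# Branch widths are hypothetical, chosen only to show the bookkeeping.
branches = {
    "1x1 conv": 64,
    "3x3 conv": 128,
    "5x5 conv": 32,
    "3x3 max pool + 1x1 conv": 32,
}

def inception_output_shape(height, width, branches):
    # With 'same' padding and stride 1, every branch keeps the spatial size,
    # so only the channel counts add up after concatenation.
    return (height, width, sum(branches.values()))

out_shape = inception_output_shape(28, 28, branches)
print(out_shape)  # (28, 28, 256)
```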


In [91]:
from tensorflow.keras.applications.inception_v3 import preprocess_input
preprocessed_iv3_train_images = preprocess_input(train_images)
preprocessed_iv3_valid_images = preprocess_input(valid_images)
preprocessed_iv3_test_images = preprocess_input(test_images)


Here we use the preprocessing function specific to InceptionV3.

In [92]:
# using the same steps as for ResNet
# this time we are using the pretrained InceptionV3 model
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(256, 256, 3))
for layer in base_model.layers:
    layer.trainable=False
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(128, activation='relu')(x)
output_iv3 = Dense(4, activation='softmax')(x)


In [93]:
# creating an iv3 model with base_model inputs and the tailored output
iv3 = Model(inputs=base_model.input, outputs=output_iv3)

# compiling the model
iv3.compile(optimizer=Adam(learning_rate=0.001),
            loss='categorical_crossentropy',
            metrics=['accuracy'])

# fitting model
iv3.fit(preprocessed_iv3_train_images,
       train_labels_encoded,
       epochs=10,
       validation_data=(preprocessed_iv3_valid_images,valid_labels_encoded))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f855286cb50>

In [94]:
# evaluating model
iv3.evaluate(preprocessed_iv3_test_images, test_labels_encoded)



[0.22605623304843903, 0.9283536672592163]

In [95]:
# letting the model make predictions
y_pred_iv3=iv3.predict(preprocessed_iv3_test_images)
pred_labels_iv3 = [label_encoder.classes_[np.argmax(pred)] for pred in y_pred_iv3]




In [96]:
# creating a classification report and storing the output in a dataframe
report_iv3 = classification_report(test_labels_true, pred_labels_iv3)
report_dict_iv3= classification_report(test_labels_true, pred_labels_iv3, output_dict=True)
report_df_iv3= pd.DataFrame(report_dict_iv3).transpose()

In [97]:
report_df_iv3

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| glioma | 0.974576 | 0.798611 | 0.877863 | 144.0 |
| meningioma | 0.8125 | 0.934641 | 0.869301 | 153.0 |
| notumor | 0.990476 | 1.0 | 0.995215 | 208.0 |
| pituitary | 0.940789 | 0.94702 | 0.943894 | 151.0 |
| accuracy | 0.928354 | 0.928354 | 0.928354 | 0.928354 |
| macro avg | 0.929585 | 0.920068 | 0.921568 | 656.0 |
| weighted avg | 0.934039 | 0.928354 | 0.928275 | 656.0 |


In [98]:
import plotly.graph_objects as go
import pandas as pd

# Assuming your DataFrame is named report_df_iv3
# Extract precision, recall, and tumor types
precision = report_df_iv3['precision'][:-3].astype(float)
recall = report_df_iv3['recall'][:-3].astype(float)
tumor_types = ['glioma', 'meningioma', 'notumor', 'pituitary']

# Create a DataFrame
data = {'Precision': precision, 'Recall': recall, 'Tumor Type': tumor_types}
df = pd.DataFrame(data)

# Create scatter plot with hoverable labels
fig = go.Figure()
colors = ['blue', 'orange', 'green', 'red']
for i in range(len(tumor_types)):
    fig.add_trace(go.Scatter(
        x=[recall[i]],  # Swap x and y axes
        y=[precision[i]],  # Swap x and y axes
        mode='markers',
        name=tumor_types[i],
        text=[f'{tumor_types[i]}<br>Precision: {precision[i]:.4f}<br>Recall: {recall[i]:.4f}'],
        hoverinfo='text+x+y',
        marker=dict(size=10),
        line=dict(color=colors[i], width=1)
    ))

# Add annotations above each point with dynamic y-offset
y_offsets = [0.025, 0.025, 0.05, 0.03]  # Adjust these values based on your preference
for i in range(len(tumor_types)):
    fig.add_annotation(
        x=recall[i],
        y=precision[i] + y_offsets[i],  # Dynamic y-offset
        text=tumor_types[i],
        showarrow=False,
        font=dict(size=13),
    )

# Update layout
fig.update_layout(
    title='Precision vs Recall Inception V3',
    xaxis=dict(title='Recall'),  # Update x-axis title
    yaxis=dict(title='Precision'),  # Update y-axis title
)

# Show the plot
fig.show()


Inception Net Results:
- Here, the model detects pituitary tumors best among the tumor classes (about 95% recall and 94% precision).
- Meningioma recall is about 93%, but the precision is significantly lower, only about 81%.
- The model has its poorest recall for gliomas, only about 80%.
- So far the inception network is the poorest classifier for our task.

### EfficientNet
- EfficientNet uses a compound scaling approach: instead of scaling one dimension at a time, it scales three key dimensions together (network depth, network width, and input image resolution), each by its own fixed ratio.
- The compound scaling is controlled by a single coefficient phi, which determines the model size. Larger phi values yield larger models that need more computation and tend to be more accurate.
- This technique has resulted in increased performance in various computer vision tasks.
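A small sketch can make compound scaling concrete. It assumes the base coefficients reported in the EfficientNet paper (1.2 for depth, 1.1 for width, 1.15 for resolution) and computes the multipliers for a few values of phi. The function name is hypothetical, and the released EfficientNet variants use rounded settings, so treat the numbers as an illustration of the rule and not as the exact configuration of any particular model.

```python
# Illustrative sketch of compound scaling. The base coefficients are the
# ones reported in the EfficientNet paper (alpha=1.2, beta=1.1, gamma=1.15);
# treat them as an assumption if you are working from another source.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution

def compound_scale(phi):
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    return depth, width, resolution

# Depth scales FLOPs linearly; width and resolution scale them quadratically,
# so each unit increase of phi multiplies FLOPs by about this factor.
flops_factor = ALPHA * BETA**2 * GAMMA**2

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(phi, round(d, 3), round(w, 3), round(r, 3))
print(round(flops_factor, 2))  # 1.92
```

With phi = 0 all multipliers equal 1, which corresponds to the baseline network (EfficientNetB0, the variant used in this notebook).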



In [99]:
from tensorflow.keras.applications import EfficientNetB0  # You can choose the specific EfficientNet variant you need
from tensorflow.keras.applications.efficientnet import preprocess_input

preprocessed_eB0_train_images = preprocess_input(train_images)
preprocessed_eB0_valid_images = preprocess_input(valid_images)
preprocessed_eB0_test_images = preprocess_input(test_images)


In [100]:
# using the same steps as ResNet and IV3
# this time using EfficientNetB0
base_model = EfficientNetB0(weights='imagenet', include_top=False, input_shape=(256, 256, 3))
for layer in base_model.layers:
    layer.trainable=False
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(128, activation='relu')(x)
output_eff_net = Dense(4, activation='softmax')(x)


In [101]:
# creating an eff_net model with base_model inputs and the tailored output
eff_net = Model(inputs=base_model.input, outputs=output_eff_net)

# compiling the model
eff_net.compile(optimizer=Adam(learning_rate=0.001),
            loss='categorical_crossentropy',
            metrics=['accuracy'])

# fitting model
eff_net.fit(preprocessed_eB0_train_images,
       train_labels_encoded,
       epochs=10,
       validation_data=(preprocessed_eB0_valid_images,valid_labels_encoded))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f8560a073d0>

In [102]:
# evaluating the model
eff_net.evaluate(preprocessed_eB0_test_images, test_labels_encoded)



[0.07534898817539215, 0.9695122241973877]

In [103]:
# letting the model make predictions
y_pred_effB0=eff_net.predict(preprocessed_eB0_test_images)
pred_labels_effB0 = [label_encoder.classes_[np.argmax(pred)] for pred in y_pred_effB0]




In [104]:
# creating a classification report
report_effB0 = classification_report(test_labels_true, pred_labels_effB0)
report_dict_effB0= classification_report(test_labels_true, pred_labels_effB0, output_dict=True)
report_df_effB0= pd.DataFrame(report_dict_effB0).transpose()

In [105]:
report_df_effB0

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| glioma | 0.931973 | 0.951389 | 0.941581 | 144.0 |
| meningioma | 0.935897 | 0.954248 | 0.944984 | 153.0 |
| notumor | 1.0 | 0.995192 | 0.99759 | 208.0 |
| pituitary | 1.0 | 0.966887 | 0.983165 | 151.0 |
| accuracy | 0.969512 | 0.969512 | 0.969512 | 0.969512 |
| macro avg | 0.966968 | 0.966929 | 0.96683 | 656.0 |
| weighted avg | 0.970116 | 0.969512 | 0.969706 | 656.0 |


In [106]:
import plotly.graph_objects as go

# Extract per-class precision and recall from the EfficientNet report
# (the last three rows are accuracy, macro avg, and weighted avg)
precision = report_df_effB0['precision'][:-3].astype(float).to_numpy()
recall = report_df_effB0['recall'][:-3].astype(float).to_numpy()
tumor_types = ['glioma', 'meningioma', 'notumor', 'pituitary']

# Create scatter plot with hoverable labels, one trace per class
fig = go.Figure()
colors = ['blue', 'orange', 'green', 'red']
for i in range(len(tumor_types)):
    fig.add_trace(go.Scatter(
        x=[recall[i]],
        y=[precision[i]],
        mode='markers',
        name=tumor_types[i],
        text=[f'{tumor_types[i]}<br>Precision: {precision[i]:.4f}<br>Recall: {recall[i]:.4f}'],
        hoverinfo='text',
        marker=dict(size=10, color=colors[i])  # in markers-only mode the color belongs on the marker
    ))

# Add annotations above each point
for i in range(len(tumor_types)):
    fig.add_annotation(
        x=recall[i],
        y=precision[i] + 0.004,  # Adjust the y-offset for the label position
        text=tumor_types[i],
        showarrow=False,
        font=dict(size=13),
    )

# Update layout
fig.update_layout(
    title='Precision vs Recall Efficient Net',
    xaxis=dict(title='Recall'),  # Update x-axis title
    yaxis=dict(title='Precision'),  # Update y-axis title
)

# Show the plot
fig.show()

Efficient Net Results:
- This model yields nearly 97% recall in the pituitary class with perfect precision.
- It also yields relatively high recall for both meningiomas and gliomas (95.4% and 95.1%, respectively).
- The precision for all three tumor classes is over 93%.
- This is our most balanced classifier! While its recall is slightly below the best model's in each class (for example, 95.1% versus the CNN's 96.5% for gliomas), it is the only model with at least 95% recall in all three tumor classes.
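
As a quick sanity check on how the report's precision and recall are defined, here is a small sketch using the glioma row. The counts below (137 true positives, 10 false positives, 7 false negatives) are inferred from the reported precision, recall, and support of 144, not read from a confusion matrix, so treat them as an assumption.

```python
# Glioma counts inferred from the EfficientNet report (assumption, see above)
true_positives = 137   # gliomas correctly labeled glioma
false_positives = 10   # other scans incorrectly labeled glioma
false_negatives = 7    # gliomas labeled as some other class

# precision: of everything predicted glioma, how much really was glioma
precision = true_positives / (true_positives + false_positives)
# recall: of all real gliomas, how many the model found
recall = true_positives / (true_positives + false_negatives)
# F1: harmonic mean of precision and recall
f1_score = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.6f}, recall={recall:.6f}, f1={f1_score:.6f}")
```

For screening, recall is the number to watch: each false negative is a tumor that was labeled as something else.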


# Conclusions
For easier comparison, we create a dataframe with the classification reports of all four models.

In [107]:
reports=[report_dict_cnn,report_dict_res50,report_dict_iv3,report_dict_effB0]
report_names=['CNN','ResNet50','IV3','EffB0']

In [108]:
all_data = []
for model_name, report in zip(report_names, reports):
    # one row per label (classes, accuracy, macro avg, weighted avg)
    report_df = pd.DataFrame(report).transpose()
    report_df.reset_index(inplace=True)
    report_df = report_df.rename(columns={'index': 'labels'})
    report_df['Model Name'] = model_name

    # pivot to a single row per model with columns like 'precision (glioma)'
    pivot_df = report_df.pivot(index='Model Name', columns='labels')
    pivot_df.columns = [f'{col[0]} ({col[1]})' if col[1] else col[0] for col in pivot_df.columns]

    # the accuracy row repeats the same value in every column, so keep one copy
    # (renamed to 'Accuracy') and drop the rest along with the redundant support totals
    columns_to_drop = ['precision (accuracy)', 'recall (accuracy)', 'f1-score (accuracy)',
                       'support (macro avg)', 'support (weighted avg)']
    final_df = pivot_df.drop(columns=columns_to_drop)
    final_df.rename(columns={'support (accuracy)': 'Accuracy'}, inplace=True)
    final_df = final_df[['Accuracy'] + [col for col in final_df.columns if col != 'Accuracy']]
    all_data.append(final_df)



In [109]:
all_data_df=pd.concat(all_data)
all_data_df.reset_index(drop=False, inplace=True)


In [110]:
all_data_df

| | Model Name | Accuracy | precision (glioma) | precision (macro avg) | precision (meningioma) | precision (notumor) | precision (pituitary) | precision (weighted avg) | recall (glioma) | recall (macro avg) | ... | f1-score (glioma) | f1-score (macro avg) | f1-score (meningioma) | f1-score (notumor) | f1-score (pituitary) | f1-score (weighted avg) | support (glioma) | support (meningioma) | support (notumor) | support (pituitary) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CNN | 0.954268 | 0.879747 | 0.951132 | 0.942857 | 0.995169 | 0.986755 | 0.955695 | 0.965278 | 0.951291 | ... | 0.92053 | 0.95027 | 0.901024 | 0.992771 | 0.986755 | 0.95413 | 144.0 | 153.0 | 208.0 | 151.0 |
| 1 | ResNet50 | 0.964939 | 1.0 | 0.964782 | 0.89759 | 1.0 | 0.961538 | 0.967262 | 0.875 | 0.960558 | ... | 0.933333 | 0.961175 | 0.934169 | 1.0 | 0.977199 | 0.964764 | 144.0 | 153.0 | 208.0 | 151.0 |
| 2 | IV3 | 0.928354 | 0.974576 | 0.929585 | 0.8125 | 0.990476 | 0.940789 | 0.934039 | 0.798611 | 0.920068 | ... | 0.877863 | 0.921568 | 0.869301 | 0.995215 | 0.943894 | 0.928275 | 144.0 | 153.0 | 208.0 | 151.0 |
| 3 | EffB0 | 0.969512 | 0.931973 | 0.966968 | 0.935897 | 1.0 | 1.0 | 0.970116 | 0.951389 | 0.966929 | ... | 0.941581 | 0.96683 | 0.944984 | 0.99759 | 0.983165 | 0.969706 | 144.0 | 153.0 | 208.0 | 151.0 |
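
The display above elides the middle columns (the `...`), which include the meningioma and pituitary recall. Since F1 = 2PR / (P + R), recall can be recovered from the precision and F1 columns that are shown. Below is a minimal sketch with the meningioma values copied from the table; the helper name `recall_from_precision_f1` is hypothetical.

```python
def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    return f1 * precision / (2 * precision - f1)


# (precision, f1-score) for the meningioma class, copied from the table above
meningioma_scores = {
    'CNN': (0.942857, 0.901024),
    'ResNet50': (0.89759, 0.934169),
    'IV3': (0.8125, 0.869301),
    'EffB0': (0.935897, 0.944984),
}

# recovered recall per model (approximate, since the inputs are rounded)
meningioma_recall = {
    model: recall_from_precision_f1(p, f1)
    for model, (p, f1) in meningioma_scores.items()
}

for model, value in meningioma_recall.items():
    print(f"{model}: recall (meningioma) = {value:.3f}")
```

For EffB0 this gives about 0.954, which matches the recall in its individual report and is a useful check on the formula. Alternatively, `pd.set_option('display.max_columns', None)` makes pandas show every column directly.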


In [111]:
import plotly.express as px  # needed here if not already imported above

def create_scatter_plot(df, class_name):
    # Create a scatter plot using plotly express
    fig = px.scatter(
        df, x=f"recall ({class_name})", y=f"precision ({class_name})",
        color='Model Name', text='Model Name',
        title=f"Precision vs. Recall {class_name}",
        labels={f"recall ({class_name})": f"Recall ({class_name})",
                f"precision ({class_name})": f"Precision ({class_name})"})

    # Add hover information, increase marker size, and move the text up
    fig.update_traces(
        mode='markers+text',
        marker=dict(size=10),  # Adjust the marker size
        textposition='top center',
        hoverinfo='text'
    )

    # Remove the legend
    fig.update_layout(showlegend=False)

    # Show the plot
    fig.show()
# Call the function for each class
class_names = ['glioma', 'meningioma', 'pituitary', 'notumor']
for class_name in class_names:
    create_scatter_plot(all_data_df, class_name)


Which models detect each tumor type best (ranked from highest to lowest recall)?
- Glioma: CNN, EffB0, ResNet50, IV3
- Meningioma: ResNet50, EffB0, IV3, CNN
- Pituitary: ResNet50, CNN, EffB0, IV3


These are not entirely satisfying results! It seems that different models have different strengths in which tumors they are best able to classify.

To get a single summary measurement, let us average the recall across all three tumor types for each model.

In [112]:
# create two new columns, tumor precision and tumor recall, for a bird's-eye view
all_data_df['tumor precision'] = all_data_df[['precision (glioma)', 'precision (pituitary)', 'precision (meningioma)']].mean(axis=1)
all_data_df['tumor recall'] = all_data_df[['recall (glioma)', 'recall (pituitary)', 'recall (meningioma)']].mean(axis=1)


In [113]:
fig = px.scatter(all_data_df, x='tumor recall', y='tumor precision', color='Model Name',
                 labels={'tumor precision': 'Average Tumor Precision', 'tumor recall': 'Average Tumor Recall'},
                 title='Average Tumor Precision vs Average Tumor Recall of the 3 tumor Classes',
                 hover_name='Model Name')

# Display the plot
fig.show()


- **Our best classifier based on average tumor recall is the EfficientNet (96% average recall and precision)**.
- ResNet50 is a close second, with slightly lower recall and precision (both about 95%).
- The basic CNN is not so bad: 94% average tumor recall and 94% average tumor precision.
- However, the recall for IV3 is noticeably worse: 89%.

**So which classifier is the best?**
- It's probably easier to determine which classifier is not the best: Inception V3. This classifier has significantly lower average tumor recall due to its low recall in the glioma class. The remaining classifiers are harder to rank, because the best one depends on which tumor type we are trying to detect!
- The CNN and EffB0 are best for gliomas, ResNet50 is best for meningiomas, and ResNet50 and the CNN are best for pituitary tumors.
- *So our choice of classifier depends on the type of tumor we are screening for.*
- However, if we want a single answer based on average tumor recall, EffB0 is the best classifier.
- For every tumor class, we have found a classifier that detects it with at least 95% recall. Not too bad!