# MRI Brain Tumor Classification

### Introduction

This is a personal project to practice image classification with TensorFlow. Run it in WSL so that training can use GPU acceleration.
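Before training, it can be worth checking that TensorFlow actually sees a GPU inside WSL. The snippet below is a minimal sanity check, not part of the original notebook; the helper name `gpu_device_names` is my own, and it returns an empty list when TensorFlow is not installed or no GPU is visible.

```python
import importlib.util


def gpu_device_names():
    """Return the names of GPUs visible to TensorFlow, or [] if there are none."""
    # Avoid an ImportError on machines where TensorFlow is not installed.
    if importlib.util.find_spec("tensorflow") is None:
        return []
    import tensorflow as tf

    # list_physical_devices returns PhysicalDevice entries with a .name field.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(gpu_device_names())
```

An empty list means training will fall back to the CPU.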

The project uses the following __[dataset](https://www.kaggle.com/datasets/masoudnickparvar/brain-tumor-mri-dataset/data)__. The dataset's creator has already split it into training and testing sets, so I use that split as is.


### Package Imports

In [None]:
import tensorflow as tf

import pathlib
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

#### Packages Used
* TensorFlow for building and training the neural network
* pathlib for locating the dataset on disk
* PIL.Image for image processing
* NumPy for array operations
* matplotlib.pyplot for data visualization

### Image Preprocessing

In [None]:
# The dataset ships with separate Training and Testing folders.
data_dir = pathlib.Path('Dataset')
train_dir = data_dir / 'Training'
test_dir = data_dir / 'Testing'

batch_size = 64
img_height = 256
img_width = 256

# Labels are inferred from the class subfolder names.
train_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    seed=0,
    color_mode='grayscale',
    image_size=(img_height, img_width),
    batch_size=batch_size)

# The Testing folder doubles as the validation set during training.
val_ds = tf.keras.utils.image_dataset_from_directory(
    test_dir,
    seed=0,
    color_mode='grayscale',
    image_size=(img_height, img_width),
    batch_size=batch_size)

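# Read class_names now: the datasets returned by cache()/prefetch() no longer expose it.
num_classes = len(train_ds.class_names)

# Cache decoded images after the first pass and overlap loading with training.
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)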
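Since the labels come from the folder names, it helps to check how many images each class folder holds before training. The sketch below uses only the standard library; `count_images_per_class` and `IMAGE_SUFFIXES` are names I made up for illustration, and it assumes the images sit directly inside one subfolder per class.

```python
import pathlib

# File types to count; an assumption about what the class folders contain.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images_per_class(split_dir):
    """Map each class subfolder name to the number of image files inside it."""
    split_dir = pathlib.Path(split_dir)
    counts = {}
    # Sorted folder names match the order in which label indices are assigned.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class('Dataset/Training')` and `count_images_per_class('Dataset/Testing')` shows whether the classes are roughly balanced in each split.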

### Model Training

In [None]:
model = tf.keras.Sequential([
  # Scale pixel values from [0, 255] to [0, 1].
  tf.keras.layers.Rescaling(1./255),
  # Three identical blocks: 3x3 convolution followed by 2x2 max pooling.
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  # One raw score (logit) per class; the loss below expects logits.
  tf.keras.layers.Dense(num_classes)
])

model.compile(
  optimizer='adam',
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['accuracy'])

model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=10
)

This run was executed directly on Windows, without the GPU acceleration that WSL provides. With GPU acceleration, a single epoch takes around 11 s on my machine, which is roughly a 6x speed-up.

The model finished training with an accuracy of 96.64% on the test set, which is used here as the validation data.
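To get a feel for the size of this network, the layer shapes can be traced by hand. The sketch below is plain arithmetic, not TensorFlow code. It assumes the Keras defaults used in the model (3x3 convolutions with `'valid'` padding and stride 1, 2x2 max pooling with stride 2) and the four classes of this dataset. The function names are hypothetical, and `model.summary()` remains the authoritative source.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after max pooling with pool size 2, stride 2, 'valid' padding."""
    return (size - pool) // pool + 1


def trace_model(input_size=256, channels=1, filters=32, conv_blocks=3,
                dense_units=128, num_classes=4):
    """Return (flattened feature count, total trainable parameters)."""
    size, in_ch, params = input_size, channels, 0
    for _ in range(conv_blocks):
        params += 3 * 3 * in_ch * filters + filters    # kernel weights + biases
        size = pool_out(conv_out(size))                # conv, then pooling
        in_ch = filters
    flat = size * size * in_ch                         # output of Flatten
    params += flat * dense_units + dense_units         # hidden Dense layer
    params += dense_units * num_classes + num_classes  # output Dense layer
    return flat, params


flat, total = trace_model()
print(flat, total)  # 28800 3705860
```

Under these assumptions, almost all of the parameters sit in the first dense layer, because it is connected to all 28,800 flattened features.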

### Results

In [None]:
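# Display names; they must follow the alphabetical order of the class folders,
# because that order determines the integer labels.
class_names = ["Glioma", "Meningioma", "No Tumor", "Pituitary"]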

plt.figure(figsize=(10, 10))
for images, labels in val_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy())
        plt.title(class_names[labels[i]])
        plt.axis("off")
        plt.colorbar()

plt.show()

Here are some example images from the validation dataset. They are grayscale but are displayed with a color map for higher contrast.

In [None]:
# Append a softmax layer so the model outputs probabilities instead of logits.
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

class_names_short = ['G', 'M', 'N', 'P']

# Take one batch and predict on it directly, so that images, labels, and
# predictions all share the same index.
images, labels = next(iter(val_ds))
predictions = probability_model.predict(images)

for j in range(20):
    true_label = int(labels[j])
    predicted_label = int(np.argmax(predictions[j]))
    print(class_names[true_label])
    print(class_names[predicted_label])

    figure, (axis1, axis2) = plt.subplots(1, 2)
    axis1.imshow(images[j].numpy())
    axis1.set_title(class_names[true_label])

    # Blue marks the true class; every other bar is red.
    colors = ['blue' if k == true_label else 'red'
              for k in range(len(class_names_short))]
    axis2.bar(class_names_short, predictions[j], width=0.3, color=colors)
    axis2.set_xlabel("Prediction")
    axis2.set_ylabel("Certainty")
    axis2.set_title("Tumor Predictions")
    plt.show()
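The `Softmax` layer wrapped around the model is what turns the four raw outputs (logits) into the "certainty" values shown in the bar charts. As a minimal illustration in plain Python, with made-up logits rather than real model output:

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the classes G, M, N, P (not taken from the model).
example_logits = [2.0, 0.5, -1.0, 0.1]
probs = softmax(example_logits)

# The predicted class is the index of the largest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```

Softmax preserves the ranking of the logits, so `np.argmax` gives the same predicted class whether it is applied to the logits or to the probabilities.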
