# **Brain Tumor MRI : Classification Assignment**

---

This is a hands-on assignment in **medical image analysis** using deep learning. The objective is to build a robust classification model to accurately identify the type of brain tumor from MRI scans. This is a crucial task that can assist medical professionals in diagnosis.

## **Dataset Overview**

The dataset contains over 7,000 brain MRI images. It is divided into two primary folders, `Training` and `Testing`. Each of these folders contains four subfolders, one per class (three tumor types and healthy brains):

* `glioma` : Images of tumors originating from glial cells.
* `meningioma` : Images of tumors arising from the meninges.
* `notumor` : Images of healthy brains with no tumor present.
* `pituitary` : Images of tumors located in the pituitary gland.

The images are in JPEG format and vary in size and resolution, which will require preprocessing before being used for model training.

## **Assignment Objectives**

The main objective of this assignment is to build and evaluate a **Convolutional Neural Network (CNN)** for multi-class image classification. The key goals are:

1.  **Data Loading and Preprocessing**  
Effectively load images from the directory structure and perform necessary preprocessing steps, such as resizing, converting to grayscale, and normalizing pixel values.
2.  **Model Architecture**  
Design and implement a CNN model suitable for image classification.
3.  **Model Training**  
Train the CNN model on the `Training` dataset.
4.  **Model Evaluation**  
Evaluate the trained model's performance on the `Testing` dataset using a confusion matrix and classification report.
5.  **Bonus (If time permits)**  
Implement data augmentation techniques to improve model generalization.
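
As a rough illustration of the bonus goal, the sketch below applies a random horizontal flip and a small random shift to a single image using only NumPy. The function name `augment_image`, the `max_shift` parameter, and the random input are assumptions made for this example; whether flips are appropriate for brain MRI is itself a judgment call.

```python
import numpy as np

def augment_image(image, rng, max_shift=4):
    """Return a randomly flipped and shifted copy of an (H, W, 1) image."""
    augmented = image
    # Horizontal flip with probability 0.5
    if rng.random() < 0.5:
        augmented = augmented[:, ::-1, :]
    # Random shift of up to max_shift pixels per axis (pixels wrap around the edges)
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    augmented = np.roll(augmented, shift=(int(dy), int(dx)), axis=(0, 1))
    return augmented

rng = np.random.default_rng(seed=0)
sample = rng.random((128, 128, 1))      # stand-in for one preprocessed image
augmented = augment_image(sample, rng)
print(augmented.shape)
```

Both operations only move pixels around, so the image shape and the set of pixel values stay the same while the layout changes.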

## **Key Concepts**

* **Deep Learning**  
Understanding the fundamentals of neural networks and their application in image analysis.
* **Convolutional Neural Networks (CNNs)**  
Learning how convolutional, pooling, and dense layers work together to extract features from images.
* **Image Preprocessing**  
Knowledge of techniques to prepare image data for a neural network.
* **Multi-class Classification**  
Training a model to predict one of four possible classes.
* **Model Evaluation**  
Using appropriate metrics to assess model performance on unseen data.

# **Initial Setup**

In [None]:
!pip install ipython-autotime

In [None]:
#
# Magic command to get the elapsed time for each cell
#
%load_ext autotime

In [None]:
#
# Import Libraries
#
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import os
import zipfile
from PIL import Image

import random

from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import classification_report, confusion_matrix

from google.colab import files

#
# Suppress the UserWarning from Pillow (PIL)
#
import warnings
warnings.filterwarnings("ignore", category = UserWarning)

In [None]:
#
# Set Pandas display format for floats
#
pd.options.display.float_format = '{:.2f}'.format

#
# Set NumPy's print options to suppress scientific notation and set precision.
# 'suppress  = True' - Prevents the use of scientific notation.
# 'precision = 2'    - Sets the number of decimal places to display.
#
np.set_printoptions(suppress = True, precision = 2)

In [None]:
#
# Upload the dataset 'BrainTumourMRI.zip'
#
# Absolute Filename = /content/BrainTumourMRI.zip
#
uploadOutput = files.upload()

# **Load Data**
**Note**  
Before running this cell, please ensure you have uploaded the `BrainTumourMRI.zip` file to your working directory.

This script first defines the directory structure and the categories for the dataset. It then proceeds to load the images and their corresponding labels for both the training and testing sets. The images are loaded as raw PIL (Pillow) objects and their labels are stored as integers representing the category index.

In [None]:
#
# Extract zip file
#

#
# Define the path to the zipped dataset file
#
ZIP_FILE_PATH  = '/content/BrainTumourMRI.zip'
EXTRACTION_DIR = '/content/'

#
# Unzip the dataset file
#
try:
  with zipfile.ZipFile(ZIP_FILE_PATH, 'r') as zip_ref:
    print(f"Extracting contents from '{ZIP_FILE_PATH}'...")
    zip_ref.extractall(EXTRACTION_DIR)

  print("Extraction complete!")
except FileNotFoundError:
  print(f"Error: The file '{ZIP_FILE_PATH}' was not found.")
  print("Please ensure the file is uploaded to the specified path.")
  raise # Re-raise to stop this cell; exit() can shut down the notebook kernel

In [None]:
#
# Load Data
#
EXTRACTED_DIR = 'Brain Tumour MRI'
BASE_DATA_DIR = os.path.join(EXTRACTION_DIR, EXTRACTED_DIR)
CATEGORIES    = ['glioma', 'meningioma', 'notumor', 'pituitary']

print(BASE_DATA_DIR)
print(CATEGORIES)

In [None]:
#
# Function to load the raw images and labels
#
def loadRawData(dataType):
    #
    # Loads raw PIL Image objects and their labels from a specified directory
    #
    # Parameters:
    #     dataType (str): Either 'Training' or 'Testing'
    #
    # Returns:
    #     tuple: A tuple containing two lists (images, labels)
    #
    images = []
    labels = []

    dataPath = os.path.join(BASE_DATA_DIR, dataType)

    #
    # Check, if the data path directory exists
    #
    if not os.path.isdir(dataPath):
      print(f"Directory not found: {dataPath}. Skipping.")
      return images, labels

    #
    # Check and load images and labels for categories
    #
    for category in CATEGORIES:
      categoryPath = os.path.join(dataPath, category)
      labelIndex   = CATEGORIES.index(category)

      #
      # Check, if the category directory exists
      #
      if not os.path.isdir(categoryPath):
        print(f"Directory not found: {categoryPath}. Skipping.")
        continue # Continue with the next category

      #
      # Loop through the category and load every JPEG file
      #
      for fileName in os.listdir(categoryPath):
        #
        # Check, if it is a JPEG file ('.jpg' or '.jpeg', in any letter case)
        #
        if fileName.lower().endswith(('.jpg', '.jpeg')):
          filePath = os.path.join(categoryPath, fileName)

          try:
            #
            # Open the image as a raw PIL object
            #
            thisImage = Image.open(filePath)
            images.append(thisImage)
            labels.append(labelIndex)
          except Exception as e:
            print(f"Could not load image {filePath}: {e}")

    return images, labels

In [None]:
#
# Load Training Data
#
DATATYPE = 'Training'
X_train, y_train = loadRawData(DATATYPE)

#
# Load Testing Data
#
DATATYPE = 'Testing'
X_test, y_test = loadRawData(DATATYPE)

print("Dataset Loading Summary")
print("=======================")
print(f"Number of training images loaded : {len(X_train)}")
print(f"Number of training labels loaded : {len(y_train)}")
print(f"Number of testing images loaded  : {len(X_test)}" )
print(f"Number of testing labels loaded  : {len(y_test)}" )

In [None]:
X_train[:5]

In [None]:
X_test[:5]

In [None]:
y_train[:5]

In [None]:
y_test[:5]
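
Besides peeking at the first few items, it can help to check how many images were loaded per class. The sketch below is one way to do that; `count_labels` and the example labels are made up for illustration.

```python
from collections import Counter

def count_labels(labels, categories):
    """Map each category name to the number of labels that point at it."""
    counts = Counter(labels)
    return {name: counts.get(index, 0) for index, name in enumerate(categories)}

example_categories = ['glioma', 'meningioma', 'notumor', 'pituitary']
example_labels = [0, 0, 1, 2, 2, 2, 3]   # made-up labels, not the real dataset
print(count_labels(example_labels, example_categories))
# {'glioma': 2, 'meningioma': 1, 'notumor': 3, 'pituitary': 1}
```

In this notebook the equivalent call would be `count_labels(y_train, CATEGORIES)`.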

# **Visualisation**  
Now that the data is loaded, it is good practice to visualize some samples to ensure that everything is correct. The following blocks will display a random image from each of the four categories, along with its label.


In [None]:
#
# First image
#
X_train[0]

In [None]:
#
# Create a dictionary to store one random image per category for visualization
#
randomImages = {}

#
# OPTIMIZATION
# Create a more efficient lookup table for each category
#
# This avoids iterating through all labels multiple times. Instead, we
# build this mapping once and then use it for fast lookups
#
# This is a much more performant way to handle large datasets
#
categoryIndices = {category: [] for category in CATEGORIES}

for i, label in enumerate(y_train):
    categoryName = CATEGORIES[label]
    categoryIndices[categoryName].append(i)

#
# Loop through our predefined CATEGORIES list to ensure we get an image
# for each one
#
for category, indices in categoryIndices.items():
  #
  # If the list of indices for the current category is not empty,
  # we select a random index from it.
  #
  # If indices is a non-empty list, Python considers it "truthy" and the 'if'
  # is executed
  #
  if indices:
    randomIndex = random.choice(indices)
    randomImages[category] = X_train[randomIndex] # Populate the dictionary

#
# Create a figure to display the images
#
plt.figure(figsize = (15, 8))
plt.suptitle("Random Sample Images from Each Category", fontsize = 18, fontweight = 'bold')

#
# Loop through the dictionary and display each image
#
for i, (category, image) in enumerate(randomImages.items()):
    ax = plt.subplot(1, len(CATEGORIES), i + 1)
    ax.imshow(image, cmap = 'gray') # Brain MRIs are typically grayscale
    ax.set_title(f"Label: {category}\nSize (W x H): {image.size}", fontsize = 12) # PIL reports (width, height)
    ax.axis('off') # Hide the axes

#
# Adjust padding between subplots; this only has an effect once the
# subplots exist, so it is called after the loop
#
plt.tight_layout(pad = 3.0)
plt.show()

# **Pre-Processing**
**This also includes Normalisation**

In [None]:
#
# Define image size
#
IMAGE_SIZE = 128

In [None]:
#
# Function to pre-process the images
#
# NOTE - This dataset is not in the same format as CIFAR-10, which is already
#        a NumPy array
#
def preProcessImages(imageList):
  #
  # Preprocesses a list of PIL Image objects for model training
  #
  # Parameters:
  #   imageList (list): A list of PIL Image objects
  #
  # Returns:
  #   np.array: A NumPy array of pre-processed images
  #

  #
  # Create an empty list to hold the pre-processed images
  #
  processedImages = []

  for thisImage in imageList:
    #
    # Resize the image to a consistent size (128x128 pixels)
    #
    # This is a critical step because CNNs require all input images to have
    # the exact same dimensions. Our raw images vary in size.
    #
    imageResized = thisImage.resize((IMAGE_SIZE, IMAGE_SIZE))

    #
    # Convert the image to a single-channel grayscale format ('L')
    #
    # An MRI scan is fundamentally a grayscale image
    # The different shades of gray reflect differences in signal intensity
    # between tissues, which is what allows a doctor (or a CNN) to see a tumor
    #
    # If we converted these images to a binary "black and white" format,
    # we would lose all of that rich, subtle information and the model would
    # likely perform very poorly
    #
    # The conversion also prevents a ValueError when the images are stacked
    # into one array. Some of our images are likely in RGB format, which would
    # result in a shape of (128, 128, 3). By converting them all to grayscale,
    # we ensure a consistent shape of (128, 128) for every image before we add
    # the channel dimension.
    #
    imageGrayscale = imageResized.convert('L')

    #
    # Convert the resized PIL Image object into a NumPy array
    #
    # This is necessary because TensorFlow models operate on NumPy arrays,
    # not on PIL Image objects. This is the main difference from datasets
    # like CIFAR-10 which are already provided in NumPy array format.
    #
    imageArray = np.array(imageGrayscale)

    #
    # Expand dimensions to add the channel axis
    #
    # Our images are grayscale, with a shape of (128, 128). CNNs, especially
    # in frameworks like TensorFlow, expect a channel dimension
    # For grayscale images, the shape should be (128, 128, 1)
    # This line adds that last dimension.
    #
    imageArray = np.expand_dims(imageArray, axis = -1)

    #
    # Add the processed image to the list
    #
    processedImages.append(imageArray)

  #
  # Convert the list of arrays to a single NumPy array.
  #
  # We now have a list of individual image arrays, which we need to stack
  # into a single, cohesive NumPy array to feed into the model.
  #
  processedImages = np.array(processedImages)

  #
  # Normalize the pixel values to a range of [0, 1]
  #
  # This is a standard and crucial step in deep learning. Normalizing the
  # data helps the model converge faster and improves its overall performance
  #
  # We are dividing by 255.0 because the 8-bit pixel values are in the range
  # of 0 to 255
  #
  # Dividing by 255 gives the same result in Python 3; writing 255.0 simply
  # makes the floating-point division explicit
  #
  return processedImages / 255.0



In [None]:
#
# Preprocess the training and testing images
# The resulting layout (samples, height, width, channels) matches the one
# used by the CIFAR-10 arrays
#
X_train_processed = preProcessImages(X_train)
X_test_processed  = preProcessImages(X_test)

print("Data Preprocessing Summary")
print("==========================")
print(f"Shape of preprocessed training images : {X_train_processed.shape}")
print(f"Shape of preprocessed testing images  : {X_test_processed.shape}")
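
The channel and normalisation steps can be checked on a tiny made-up array before trusting them on the full dataset. This is only a sketch: `fake_gray` stands in for one grayscale image after `np.array(...)`, and the 2x2 size is chosen for readability.

```python
import numpy as np

# Stand-in for one grayscale image as 8-bit pixel values
fake_gray = np.array([[0, 128], [255, 64]], dtype=np.uint8)

# Add the trailing channel axis: (2, 2) -> (2, 2, 1)
with_channel = np.expand_dims(fake_gray, axis=-1)

# Stack two images into a batch: (2, 2, 2, 1)
batch = np.array([with_channel, with_channel])

# Scale to [0, 1]; dividing a uint8 array by a float yields float64
normalised = batch / 255.0

print(normalised.shape)                    # (2, 2, 2, 1)
print(normalised.min(), normalised.max())  # 0.0 1.0
print(normalised.dtype)                    # float64
```

If memory becomes a concern, casting the result with `.astype(np.float32)` halves its size compared with float64.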

In [None]:
#
# One-Hot Encode the labels
#
# We are using 'to_categorical' to convert our integer labels (e.g., 0, 1, 2, 3)
# into a one-hot encoded format (e.g., [1,0,0,0], [0,1,0,0], etc.).
#
# This is a requirement for the 'categorical_crossentropy' loss function
# which is commonly used for multi-class classification
#
# The alternative, 'sparse_categorical_crossentropy', accepts the integer
# labels directly and would let us skip this step. We use 'to_categorical'
# because it makes the expected output shape of the model explicit and
# matches the approach we used for CIFAR-10
#
# For our four categories:
#   glioma     (which we represent as 0) becomes [1., 0., 0., 0.]
#   meningioma (which is 1)              becomes [0., 1., 0., 0.]
#   notumor    (which is 2)              becomes [0., 0., 1., 0.]
#   pituitary  (which is 3)              becomes [0., 0., 0., 1.]
#
y_train_encoded = to_categorical(y_train, num_classes = len(CATEGORIES))
y_test_encoded  = to_categorical(y_test,  num_classes = len(CATEGORIES))

print("Data Preprocessing Summary")
print("==========================")
print(f"Shape of one-hot encoded training labels : {y_train_encoded.shape}")
print(f"Shape of one-hot encoded testing labels  : {y_test_encoded.shape}")
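
To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding does, and how `argmax` recovers the integer label. The helper `one_hot` is written for this example and is not part of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Pick one row of the identity matrix per integer label."""
    return np.eye(num_classes, dtype=np.float32)[np.asarray(labels)]

encoded = one_hot([0, 3, 1], num_classes=4)
print(encoded)
# [[1. 0. 0. 0.]
#  [0. 0. 0. 1.]
#  [0. 1. 0. 0.]]

# argmax along the class axis turns one-hot rows back into integer labels
decoded = np.argmax(encoded, axis=1)
print(decoded)   # [0 3 1]
```

The same `argmax` step is what converts softmax outputs into predicted class indices at evaluation time.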


In [None]:
#
# Print the shapes of the preprocessed data to confirm
#
print("Data Preprocessing Summary")
print("==========================")
print(f"Shape of preprocessed training images    : {X_train_processed.shape}")
print(f"Shape of one-hot encoded training labels : {y_train_encoded.shape}")
print(f"Shape of preprocessed testing images     : {X_test_processed.shape}")
print(f"Shape of one-hot encoded testing labels  : {y_test_encoded.shape}")

# **Model Building - Part 01**

In [None]:
#
# Define the CNN architecture
#
NUMBER_OF_NEURONS = 512
numberOfClasses   = len(CATEGORIES)

cnnModelInitial = models.Sequential(
                                     [
                                       #
                                       # First Convolutional Block
                                       # Convolution -> Activation -> Pooling
                                       #
                                       # This block learns to detect basic features like edges and curves
                                       #
                                       # The Conv2D layer applies 32 filters of size 3x3 to the input images
                                       # 'relu' is the activation function that introduces non-linearity
                                       # 'input_shape' is specified for the first layer to define the
                                       # dimensions of our input images (128x128 pixels, 1 channel for grayscale)
                                       #
                                       layers.Conv2D(
                                                      filters     = 32,
                                                      kernel_size = (3, 3),
                                                      activation  = 'relu',
                                                      input_shape = (IMAGE_SIZE, IMAGE_SIZE, 1)
                                                    ),
                                       #
                                       # MaxPooling2D downsamples the feature maps, reducing the dimensionality
                                       #
                                       # This helps in reducing computation and preventing overfitting
                                       # It takes the maximum value in each 2x2 window.
                                       #
                                       # NOTE - MaxPooling2D and MaxPool2D are exactly the same
                                       #        MaxPool2D belongs to old Keras and now it's an
                                       #        alias for MaxPooling2D
                                       #
                                       layers.MaxPooling2D((2, 2)),

                                       #
                                       # Second Convolutional Block
                                       # Convolution -> Activation -> Pooling
                                       #
                                       # This block learns more complex features. We double the number of filters
                                       # to allow the model to learn a richer representation.
                                       #
                                       layers.Conv2D(
                                                      filters     = 64,
                                                      kernel_size = (3, 3),
                                                      activation  = 'relu'
                                                    ),
                                       layers.MaxPooling2D((2, 2)),

                                       #
                                       # Third Convolutional Block
                                       # Convolution -> Activation -> Pooling
                                       #
                                       # We add a third block with 128 filters to capture even more abstract features.
                                       #
                                       layers.Conv2D(
                                                      filters     = 128,
                                                      kernel_size = (3, 3),
                                                      activation  = 'relu'
                                                    ),
                                       layers.MaxPooling2D((2, 2)),

                                       #
                                       # Classification Head: Flatten -> Dense -> Output
                                       #

                                       #
                                       # Flatten Layer
                                       #
                                       # This layer converts the 2D feature maps from the previous layers into
                                       # a 1D vector. This is necessary because the dense layers expect
                                       # a one-dimensional input.
                                       #
                                       layers.Flatten(),

                                       #
                                       # Dense Layers / Fully Connected Layers
                                       #
                                       # The dense layers perform the actual classification based on the features
                                       # extracted by the convolutional blocks.
                                       #
                                       # The first dense layer has 512 neurons with 'relu' activation.
                                       # It acts as a hidden layer.
                                       #
                                       layers.Dense(
                                                     NUMBER_OF_NEURONS,
                                                     activation = 'relu'
                                                   ),

                                       #
                                       # Output Layer
                                       #
                                       # This is the final layer.
                                       # It has 4 neurons, one for each of our tumor categories
                                       #
                                       # The 'softmax' activation function converts the raw output
                                       # scores into a probability distribution, where the sum of
                                       # probabilities for all categories equals 1.
                                       # The highest probability indicates the model's
                                       # predicted class.
                                       #
                                       layers.Dense(
                                                     numberOfClasses,
                                                     activation = 'softmax'
                                                   )
                                     ]
                                   )

In [None]:
#
# Compile the CNN Model
#

#
# 'adam' is a popular and efficient optimizer.
# 'categorical_crossentropy' is the standard loss function for multi-class
# classification problems with one-hot encoded labels.
# 'accuracy' is the metric we'll use to monitor performance during training.
#
cnnModelInitial.compile(
                         optimizer = 'adam', # 'Adam'/'ADAM' works too, but like 'relu' lowercase is the standard
                         loss      = 'categorical_crossentropy',
                         metrics   = ['accuracy']
                       )

In [None]:
#
# Print the model summary
#
# This gives us a detailed view of the model's architecture, including the
# output shape and the number of parameters for each layer.
#
cnnModelInitial.summary()
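
The numbers in the summary can be reproduced by hand. The sketch below recomputes the feature-map sizes and parameter counts for the architecture defined above, assuming the Keras defaults used there ('valid' padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`). The helper names are made up for this example.

```python
def conv_output_size(size, kernel=3):
    # 'valid' padding with stride 1 trims (kernel - 1) pixels
    return size - kernel + 1

def pool_output_size(size, pool=2):
    # Non-overlapping pooling keeps the floor of size / pool
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    # One kernel x kernel x in_channels weight block plus one bias per filter
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    # One weight per input-unit pair plus one bias per unit
    return inputs * units + units

size, channels, total = 128, 1, 0
for filters in (32, 64, 128):
    total += conv_params(channels, filters)
    size = pool_output_size(conv_output_size(size))
    channels = filters

flattened = size * size * channels
total += dense_params(flattened, 512)   # hidden dense layer
total += dense_params(512, 4)           # softmax output layer

print(size, flattened, total)   # 14 25088 12940292
```

Roughly 99% of the parameters sit in the first dense layer (25,088 inputs times 512 neurons), which is worth keeping in mind when judging the model's capacity.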

# **Model Training (Fitting)**

In [None]:
#
# Define key training parameters
#
EPOCHS     = 20 # Number of times the entire dataset is passed through the network
BATCH_SIZE = 32 # Number of samples processed before the model's weights are updated

In [None]:
print(f"Starting model training for {EPOCHS} epochs...")
print("===============================================")

#
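# Shuffle the training data before fitting
#
# 'validation_split' takes the last 20% of the samples, before any shuffling
# loadRawData() appends the images category by category, so without this
# step the validation set would contain only the last categories
#
shuffledIndices   = np.random.permutation(len(X_train_processed))
X_train_processed = X_train_processed[shuffledIndices]
y_train_encoded   = y_train_encoded[shuffledIndices]

#
# Fit the model to the training data
#
# model.fit() executes the training process
#
# The input arguments are:
# - X_train_processed : The preprocessed image data (features).
# - y_train_encoded   : The one-hot encoded labels (targets).
# - epochs            : The total number of iterations over the training set.
# - batch_size        : The number of samples per gradient update.
# - validation_split  : We reserve 20% of the training data to monitor
#                       the model's performance on unseen data during training,
#                       which helps detect overfitting early.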
# - verbose           : Sets the level of detail displayed during training (1 = progress bar).
#
trainingHistoryInitial = cnnModelInitial.fit(
                                              X_train_processed,
                                              y_train_encoded,
                                              epochs           = EPOCHS,
                                              batch_size       = BATCH_SIZE,
                                              validation_split = 0.2, # Use 20% of training data for validation
                                              verbose          = 1
                                            )

print("Model training complete!")
print("Training history is stored in the 'trainingHistoryInitial' variable")

In [None]:
#
# Displaying a brief summary of the final training results
#
print(trainingHistoryInitial.history)
print("======================")
print("Final Training Summary")
print("======================")
print(f"Final Training Loss       : {trainingHistoryInitial.history['loss'][-1]:.4f}"        )
print(f"Final Training Accuracy   : {trainingHistoryInitial.history['accuracy'][-1]:.4f}"    )
print(f"Final Validation Loss     : {trainingHistoryInitial.history['val_loss'][-1]:.4f}"    )
print(f"Final Validation Accuracy : {trainingHistoryInitial.history['val_accuracy'][-1]:.4f}")
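
The last epoch is not necessarily the best one. As a small sketch, the helper below finds the epoch with the lowest validation loss in a history dictionary shaped like `trainingHistoryInitial.history`; the function name and the numbers in `example_history` are invented for illustration.

```python
def best_epoch(history, monitor='val_loss'):
    """Return (1-based epoch, value) where the monitored metric is lowest."""
    values = history[monitor]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Made-up curves: training loss keeps falling, validation loss turns upward
example_history = {
    'loss':     [1.10, 0.60, 0.30, 0.10, 0.02],
    'val_loss': [0.95, 0.55, 0.40, 0.48, 0.61],
}

epoch, value = best_epoch(example_history)
print(f"Lowest val_loss {value:.2f} at epoch {epoch}")   # Lowest val_loss 0.40 at epoch 3
```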

## **Observation on Model**


---


These results show a **classic case of overfitting**: the model fits the training examples perfectly but generalizes noticeably worse to new, unseen data.

---

### **Key Observation: The Large Performance Gap (High Variance)**

The critical issue is the gap between **Training Accuracy** and **Validation Accuracy**:

- **Training Accuracy:** `1.0000` (100%)  
- **Validation Accuracy:** `0.9116` (91.16%)  

A 100% training accuracy indicates the model has essentially **memorized the entire training set**, including noise or irrelevant details.  
When tested on validation data, performance drops by nearly **9 percentage points**, showing high variance and poor generalization.

---

### **Interpretation of Loss Values**

The loss values confirm the severity of the overfitting:

- **Training Loss:** `0.0001` (near zero)  
  → The model is extremely confident in its predictions on the training set.  

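- **Validation Loss:** `0.6109` (high)  
  → With about 91% validation accuracy, a loss this high suggests that the model is often **confidently wrong** on the samples it misclassifies.  
  → Cross-entropy penalizes such predictions heavily, so a small number of them can dominate the average.  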

This contrast between training and validation losses clearly demonstrates poor generalization ability.
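
A worked example may help with reading these loss values. The sketch below computes categorical cross-entropy per sample with NumPy for three hand-picked predictions; the probabilities are invented and the function is a simplified stand-in for the Keras loss, not its actual implementation.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: minus the log of the probability given to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# The true class is index 0 for all three samples
y_true = np.array([[1, 0, 0, 0]] * 3, dtype=float)
y_pred = np.array([
    [0.97, 0.01, 0.01, 0.01],   # confident and correct
    [0.40, 0.30, 0.20, 0.10],   # hesitant but correct
    [0.01, 0.97, 0.01, 0.01],   # confident and wrong
])

losses = categorical_cross_entropy(y_true, y_pred)
print(np.round(losses, 3))   # [0.03  0.916 4.605]
```

A single confident mistake costs about 150 times more than a confident correct prediction here, which is how the average loss can be high even when accuracy is above 90%.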

---

### **Conclusion**

The model likely has **too much capacity** or has been **trained for too many epochs** relative to the size and complexity of the dataset.

---

### **Next Steps to Fix Overfitting**

1. **Implement Early Stopping**  
   - Stop training automatically when `val_loss` begins to rise.  
   - Save weights from the **best epoch**.  

2. **Add Dropout Layers**  
   - Introduce dropout after convolution/pooling layers.  
   - Forces the network to learn more **robust, generalizable features**.  

3. **Data Augmentation**  
   - Expand dataset variability with random **rotations, flips, and shifts**.  
   - Makes the model more robust to real-world variations.  
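
The early stopping rule from the first step can be sketched in plain Python. This illustrates the idea only and is not how Keras implements it; the function name, the `patience` value, and the validation losses are assumptions for the example.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (epoch where training stops, epoch with the best val_loss), both 1-based."""
    best_loss = float('inf')
    best_epoch = 0
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the patience counter
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    # Patience never ran out: training used every epoch
    return len(val_losses), best_epoch

val_losses = [0.95, 0.55, 0.40, 0.48, 0.61, 0.70, 0.82]   # made-up values
stopped_at, best = early_stopping_epoch(val_losses, patience=3)
print(stopped_at, best)   # 6 3
```

In Keras the same behaviour is available through the `EarlyStopping` callback already imported above, for example with `monitor='val_loss'`, a `patience` value, and `restore_best_weights=True`.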

In [None]:
#
# Evaluate model for comparison
#
lossInitial, accuracyInitial = cnnModelInitial.evaluate(
                                                         X_test_processed,
                                                         y_test_encoded,
                                                         verbose = 1
                                                       )

print("Initial Model Evaluation")
print("========================")
print(f"Test Loss    : {lossInitial:.4f}"    )
print(f"Test Accuracy: {accuracyInitial:.4f}")

# **Model Building - Part 02**

In [None]:
#
# Define the CNN architecture
#
NUMBER_OF_NEURONS = 512
numberOfClasses   = len(CATEGORIES)

cnnModelRegularised = models.Sequential(
                                         [
                                           #
                                           # First Convolutional Block
                                           # Convolution -> Activation -> Pooling
                                           #
                                           # This block learns to detect basic features like edges and curves
                                           #
                                           # The Conv2D layer applies 32 filters of size 3x3 to the input images
                                           # 'relu' is the activation function that introduces non-linearity
                                           # 'input_shape' is specified for the first layer to define the
                                           # dimensions of our input images (128x128 pixels, 1 channel for grayscale)
                                           #
                                           layers.Conv2D(
                                                          filters     = 32,
                                                          kernel_size = (3, 3),
                                                          activation  = 'relu',
                                                          input_shape = (IMAGE_SIZE, IMAGE_SIZE, 1)
                                                        ),
                                           #
                                           # MaxPooling2D downsamples the feature maps, reducing the dimensionality
                                           #
                                           # This helps in reducing computation and preventing overfitting
                                           # It takes the maximum value in each 2x2 window.
                                           #
                                           # NOTE - MaxPooling2D and MaxPool2D are exactly the same
                                           #        MaxPool2D belongs to old Keras and now it's an
                                           #        alias for MaxPooling2D
                                           #
                                           layers.MaxPooling2D((2, 2)),

                                           #
                                           # NEW REGULARIZATION: Dropout Layer
                                           # Randomly sets 25% of the input units to 0 at each update
                                           # during training. This prevents neurons from co-adapting,
                                           # forcing the network to learn more robust features.
                                           #
                                           layers.Dropout(0.25),

                                           #
                                           # Second Convolutional Block
                                           # Convolution -> Activation -> Pooling
                                           #
                                           # This block learns more complex features. We double the number of filters
                                           # to allow the model to learn a richer representation.
                                           #
                                           layers.Conv2D(
                                                          filters     = 64,
                                                          kernel_size = (3, 3),
                                                          activation  = 'relu'
                                                        ),
                                           layers.MaxPooling2D((2, 2)),

                                           #
                                           # NEW REGULARIZATION: Second Dropout Layer
                                           #
                                           layers.Dropout(0.25),

                                           #
                                           # Third Convolutional Block
                                           # Convolution -> Activation -> Pooling
                                           #
                                           # We add a third block with 128 filters to capture even more abstract features.
                                           #
                                           layers.Conv2D(
                                                          filters     = 128,
                                                          kernel_size = (3, 3),
                                                          activation  = 'relu'
                                                        ),
                                           layers.MaxPooling2D((2, 2)),

                                           #
                                           # Classification Head: Flatten -> Dense -> Output
                                           #

                                           #
                                           # Flatten Layer
                                           #
                                           # This layer converts the 2D feature maps from the previous layers into
                                           # a 1D vector. This is necessary because the dense layers expect
                                           # a one-dimensional input.
                                           #
                                           layers.Flatten(),

                                           #
                                           # Dense Layers / Fully Connected Layers
                                           #
                                           # The dense layers perform the actual classification based on the features
                                           # extracted by the convolutional blocks.
                                           #
                                           # The first dense layer has 512 neurons with 'relu' activation.
                                           # It acts as a hidden layer.
                                           #
                                           layers.Dense(
                                                         NUMBER_OF_NEURONS,
                                                         activation = 'relu'
                                                       ),

                                           #
                                           # Output Layer
                                           #
                                           # This is the final layer.
                                           # It has 4 neurons, one for each of our tumor categories.
                                           #
                                           # The 'softmax' activation function converts the raw output
                                           # scores into a probability distribution, where the sum of
                                           # probabilities for all categories equals 1.
                                           # The highest probability indicates the model's
                                           # predicted class.
                                           #
                                           layers.Dense(
                                                         numberOfClasses,
                                                         activation = 'softmax'
                                                       )
                                         ]
                                       )
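
To make the softmax step in the output layer concrete, the sketch below applies it to a made-up vector of four raw scores using NumPy. The `softmax` helper and the score values are illustrative assumptions and are not part of the notebook; Keras performs the equivalent operation inside the final `Dense` layer.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps np.exp from overflowing without changing the result.
    shifted = logits - np.max(logits)
    exponentials = np.exp(shifted)
    return exponentials / np.sum(exponentials)

# Hypothetical raw scores for the four classes (values invented for illustration)
example_logits = np.array([2.0, 1.0, 0.1, -1.0])
example_probabilities = softmax(example_logits)

print("Probabilities  :", np.round(example_probabilities, 3))
print("Sum            :", example_probabilities.sum())
print("Predicted index:", int(np.argmax(example_probabilities)))
```

The largest raw score always maps to the largest probability, which is why `np.argmax` on the softmax output gives the predicted class.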

In [None]:
#
# Compile the CNN Model
#

#
# 'adam' is a popular and efficient optimizer.
# 'categorical_crossentropy' is the standard loss function for multi-class
# classification problems with one-hot encoded labels.
# 'accuracy' is the metric we'll use to monitor performance during training.
#
cnnModelRegularised.compile(
                             optimizer = 'adam',                     # 'Adam'/'ADAM' works too, but like 'relu' lowercase is the standard
                             loss      = 'categorical_crossentropy', # Standard for multi-class classification
                             metrics   = ['accuracy']
                           )

In [None]:
#
# Print the model summary
#
# This gives us a detailed view of the model's architecture, including the
# output shape and the number of parameters for each layer.
#
cnnModelRegularised.summary()
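
The parameter counts reported by `summary()` can be checked by hand. This is a minimal sketch, assuming the usual formulas for `Conv2D` and `Dense` layers with biases enabled. The helper names are made up for illustration, and 512 is the hidden-layer width quoted in the model comments (the assumed value of `NUMBER_OF_NEURONS`).

```python
def conv2d_parameter_count(kernel_height, kernel_width, input_channels, filters):
    # Each filter holds one weight per kernel cell per input channel, plus one bias.
    return (kernel_height * kernel_width * input_channels + 1) * filters

def dense_parameter_count(input_units, output_units):
    # Each output neuron holds one weight per input unit, plus one bias.
    return (input_units + 1) * output_units

# Third convolutional block: 3x3 kernels, 64 input channels (output of the
# second block), 128 filters.
third_block_parameters = conv2d_parameter_count(3, 3, 64, 128)

# Output layer: 512 hidden units (assumed NUMBER_OF_NEURONS) feeding 4 classes.
output_layer_parameters = dense_parameter_count(512, 4)

print("Third Conv2D block parameters:", third_block_parameters)
print("Output layer parameters      :", output_layer_parameters)
```

Pooling, dropout, and flatten layers have no trainable parameters, so they appear with a count of 0 in the summary.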

# **Model Training (Fitting) and Early Stopping**

In [None]:
#
# Define key training parameters
#
EPOCHS     = 20 # Number of times the entire dataset is passed through the network
BATCH_SIZE = 32 # Number of samples processed before the model's weights are updated

In [None]:
#
# NEW REGULARIZATION: Early Stopping Callback
#
# This callback is essential for mitigating overfitting. It monitors the
# 'val_loss' (validation loss) and stops training if it doesn't improve
# for a specified number of epochs ('patience'). This ensures we save the
# model state from the epoch with the best generalization performance.
#
earlyStoppingCallback = EarlyStopping(
                                       monitor              = 'val_loss', # The metric to watch
                                       patience             = 3,          # Number of epochs with no improvement after which training will be stopped
                                       restore_best_weights = True,       # Restore model weights from the epoch with the best 'val_loss'
                                       verbose              = 1           # Log when stopping occurs
                                     )


In [None]:
print(earlyStoppingCallback)

In [None]:
print(f"Starting model training for {EPOCHS} epochs...")
print("===============================================")

#
# Fit the model to the training data
#
# model.fit() executes the training process
#
# The input arguments are:
# - X_train_processed : The preprocessed image data (features).
# - y_train_encoded   : The one-hot encoded labels (targets).
# - epochs            : The total number of iterations over the training set.
# - batch_size        : The number of samples per gradient update.
# - validation_data   : We explicitly pass our prepared testing data
#                       (X_test_processed, y_test_encoded) to monitor
#                       generalization.
# - callbacks         : Monitors the 'val_loss' (validation loss) and stops
#                       training if it doesn't improve
# - verbose           : Sets the level of detail displayed during training (1 = progress bar).
#
# We include the new EarlyStopping callback here.
# The 'epochs' parameter is set to EPOCHS (20), but the callback may stop
# training sooner if the validation loss stops improving.
#

#
# Train the CNN Model
#
# Note on validation_split: It is NOT used because we pass separate testing data
# to the 'validation_data' parameter, which is the preferred method here.
#
trainingHistoryRegularised = cnnModelRegularised.fit(
                                                      X_train_processed,
                                                      y_train_encoded,
                                                      epochs           = EPOCHS,                             # Set to 20. However, the EarlyStopping callback will decide the final number of epochs
                                                      batch_size       = BATCH_SIZE,
                                                      validation_data  = (X_test_processed, y_test_encoded), # Use test set for validation
                                                      callbacks        = [earlyStoppingCallback],            # Add the EarlyStopping callback
                                                      verbose          = 1                                   # Display a progress bar for each epoch
                                                    )

print("Model training complete!")
print("Training history is stored in the 'trainingHistoryRegularised' variable")

In [None]:
#
# Displaying a brief summary of the final training results
#
print(trainingHistoryRegularised.history)
print("======================")
print("Final Training Summary")
print("======================")
print(f"Final Training Loss       : {trainingHistoryRegularised.history['loss'][-1]:.4f}"        )
print(f"Final Training Accuracy   : {trainingHistoryRegularised.history['accuracy'][-1]:.4f}"    )
print(f"Final Validation Loss     : {trainingHistoryRegularised.history['val_loss'][-1]:.4f}"    )
print(f"Final Validation Accuracy : {trainingHistoryRegularised.history['val_accuracy'][-1]:.4f}")

## **Observation on Model**


---


The final training summary for the **regularized model** is extremely positive, indicating a successful resolution to the initial overfitting problem and achieving **high performance** for the brain tumor classification task.

---

### **Key Metrics and Interpretation**

- **Final Training Accuracy:** `0.9890` (98.90%)  
  → Confirms the model has effectively mastered the training dataset.  

- **Final Validation Accuracy:** `0.9710` (97.10%)  
  → The most crucial metric. Accuracy this high on unseen data is **outstanding** for a medical classification task and shows strong reliability.  

- **Final Training Loss:** `0.0364`  
  → Very low, suggesting confident predictions on the training set.  

- **Final Validation Loss:** `0.1315`  
  → Higher than training loss but still very low in absolute terms, indicating good generalization.  

---

### **Conclusion on Generalization**

The primary observation is the successful achievement of **strong generalization with minimal overfitting**:

- **High Absolute Accuracy:** Both training and validation accuracies are above **97%**.  
- **Minimal Generalization Gap:** Training accuracy (98.90%) vs validation accuracy (97.10%) shows a gap of only **1.8 percentage points**.  
- This confirms that **Dropout layers** and **Early Stopping** worked as intended, preventing overfitting and ensuring robust performance on new samples.  

---
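
The gap quoted above can be recomputed from the history dictionary. The same sketch also locates the epoch with the lowest validation loss: with `restore_best_weights = True` the model keeps the weights from that epoch, which is not necessarily the last entry (`[-1]`) printed in the final summary. The per-epoch values below are hypothetical; only the last accuracy entries match the figures reported above, and `summarise_history` is an illustrative helper name.

```python
import numpy as np

# Hypothetical per-epoch values for illustration only. In the notebook, pass
# trainingHistoryRegularised.history instead.
example_history = {
    "accuracy":     [0.8000, 0.9300, 0.9700, 0.9890],
    "val_accuracy": [0.8500, 0.9400, 0.9750, 0.9710],
    "val_loss":     [0.4000, 0.1800, 0.1100, 0.1315],
}

def summarise_history(history):
    """Report the best epoch by validation loss and the final accuracy gap."""
    best_index = int(np.argmin(history["val_loss"]))  # zero-based position
    final_gap = history["accuracy"][-1] - history["val_accuracy"][-1]
    return {
        "best_epoch": best_index + 1,  # one-based, matching the Keras training log
        "best_val_loss": history["val_loss"][best_index],
        "best_val_accuracy": history["val_accuracy"][best_index],
        "final_gap_percentage_points": 100 * final_gap,
    }

summary = summarise_history(example_history)
for name, value in summary.items():
    print(f"{name:<28}: {value}")
```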

### **Final Note**
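On these metrics the model performs strongly. Because the test set also served as the validation set for early stopping, the figures should be confirmed on separately held-out data before the model is treated as **production-ready**.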


# **Model Evaluation and Visualization**

In [None]:
#
# Evaluate the model on the testing set
#
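# This reports the final performance metrics. Because this same test set was
# passed as validation_data during training (and drove early stopping), the
# figures are slightly optimistic rather than strictly unbiased.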
#
lossRegularised, accuracyRegularised = cnnModelRegularised.evaluate(
                                                                     X_test_processed,
                                                                     y_test_encoded,
                                                                     verbose = 1
                                                                   )

print("Regularised Model Evaluation")
print("============================")
print(f"Test Loss    : {lossRegularised:.4f}"    )
print(f"Test Accuracy: {accuracyRegularised:.4f}")

In [None]:
#
# Plot Model Comparison (Loss)
#
plt.figure(figsize = (15, 6))
plt.subplot(1, 2, 1)
plt.plot(trainingHistoryInitial.history['loss'],         label = 'Initial Model Training Loss',     linestyle = '--', color = 'red' )
plt.plot(trainingHistoryInitial.history['val_loss'],     label = 'Initial Model Validation Loss',                     color = 'red' )
plt.plot(trainingHistoryRegularised.history['loss'],     label = 'Regularized Model Training Loss', linestyle = '--', color = 'blue')
plt.plot(trainingHistoryRegularised.history['val_loss'], label = 'Regularized Model Validation Loss',                 color = 'blue')
plt.title('Training vs. Validation Loss (Model Comparison)')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)

#
# Plot Model Comparison (Accuracy)
#
plt.subplot(1, 2, 2)
plt.plot(trainingHistoryInitial.history['accuracy'],         label = 'Initial Model Training Acc',     linestyle = '--', color = 'red' )
plt.plot(trainingHistoryInitial.history['val_accuracy'],     label = 'Initial Model Validation Acc',                     color = 'red' )
plt.plot(trainingHistoryRegularised.history['accuracy'],     label = 'Regularized Model Training Acc', linestyle = '--', color = 'blue')
plt.plot(trainingHistoryRegularised.history['val_accuracy'], label = 'Regularized Model Validation Acc',                 color = 'blue')
plt.title('Training vs. Validation Accuracy (Model Comparison)')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)

plt.show()


## **Observation on Model Performance**

---

### **Exceptional Performance and Generalization**

The results of the final model evaluation are **outstanding** and confirm that the Convolutional Neural Network (CNN) is highly effective and reliable for the brain tumor classification task.

* **Test Accuracy: 0.9718 (97.18%)**  
    This is an extremely high score, indicating that the model correctly classifies the tumor type (or the absence of a tumor) in over **97 out of 100** independent test images. For a medical image analysis task, this level of accuracy is highly desirable.

* **Test Loss: 0.1085**  
    The loss value is remarkably low. A low categorical cross-entropy loss signifies that the model's predictions are not just correct, but are made with **very high confidence**. The predicted probabilities for the correct class are close to 1.0, while the probabilities for the incorrect classes are close to 0.0.

### **Key Conclusion: Excellent Generalization**

The most crucial observation is the confirmation of **excellent generalization**. The test set result of $97.18\%$ demonstrates that:

1.  **The model is robust.** It performs equally well on data it has never encountered before.
2.  **There is no significant overfitting.** The model has successfully learned the core features of the tumor types rather than memorizing noise or specific artifacts from the training set.

This model is performing at a high standard, indicating a successful conclusion to the modeling phase.
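
The link between cross-entropy and confidence can be made explicit. For a single sample the categorical cross-entropy is $-\ln(p)$, where $p$ is the probability assigned to the true class, so a mean loss of $0.1085$ corresponds to a geometric-mean probability of $e^{-0.1085} \approx 0.897$ for the correct class. The sketch below checks this with NumPy; the labels and predicted probabilities are invented for illustration, and `categorical_cross_entropy` is a hand-written helper rather than the Keras implementation.

```python
import numpy as np

def categorical_cross_entropy(true_one_hot, predicted_probabilities):
    """Mean of -ln(probability assigned to the true class) over all samples."""
    true_class_probabilities = np.sum(true_one_hot * predicted_probabilities, axis=1)
    return float(np.mean(-np.log(true_class_probabilities)))

# Hypothetical batch: three samples, four classes
true_one_hot = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
], dtype=float)
predicted = np.array([
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.01, 0.01, 0.97, 0.01],
])

example_loss = categorical_cross_entropy(true_one_hot, predicted)

# Geometric-mean probability of the correct class implied by a mean loss
implied_probability_example = float(np.exp(-example_loss))
implied_probability_reported = float(np.exp(-0.1085))

print(f"Example loss                       : {example_loss:.4f}")
print(f"Implied probability (example)      : {implied_probability_example:.4f}")
print(f"Implied probability (loss = 0.1085): {implied_probability_reported:.4f}")
```

Because the loss is an average, a handful of confidently wrong predictions can dominate it, so the loss and the accuracy are best read together.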

In [None]:
#
# Plot Training History (Accuracy and Loss)
#
plt.figure(figsize = (12, 5))

#
# Plot Training and Validation Accuracy
#
plt.subplot(1, 2, 1)
plt.plot(trainingHistoryRegularised.history['accuracy'],     label = 'Training Accuracy'  )
plt.plot(trainingHistoryRegularised.history['val_accuracy'], label = 'Validation Accuracy')
plt.title('Accuracy Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)

#
# Plot Training and Validation Loss
#
plt.subplot(1, 2, 2)
plt.plot(trainingHistoryRegularised.history['loss'],     label = 'Training Loss'  )
plt.plot(trainingHistoryRegularised.history['val_loss'], label = 'Validation Loss')
plt.title('Loss Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)

plt.show()

## **Analysis of Training History (Accuracy and Loss)**

---

### **Accuracy Over Epochs (Left Plot)**

The accuracy plot shows the model's performance on both the training and validation datasets across 10 epochs.

* **Rapid Initial Learning:** The model exhibits **rapid learning** in the first 3-4 epochs, with both training and validation accuracy increasing sharply from around $0.70$ to over $0.90$. This indicates that the Convolutional Neural Network (CNN) is highly effective at extracting relevant features from the Brain MRI images.

* **High Saturation Point:** Both accuracies stabilize at a very high level, with the **Training Accuracy** reaching nearly $1.00$ (or $100\%$) and the **Validation Accuracy** peaking around $0.98$. Achieving $\sim 98\%$ on unseen validation data is an excellent result for this task.

* **Minor Overfitting/Variance:** After **Epoch 6**, the curves begin to diverge slightly. The training accuracy remains extremely high and flat, while the validation accuracy shows **more variance** (a slight drop followed by a rise in Epochs 8 and 9). This divergence suggests **minor overfitting** beginning in the later epochs. The model is still improving its fit to the training data without necessarily improving its generalization ability further on the validation set.

### **Loss Over Epochs (Right Plot)**

The loss plot, specifically using *categorical cross-entropy*, tracks how close the model's predictions are to the true labels.

* **Consistent Decrease:** Both the **Training Loss** and **Validation Loss** decrease quickly and consistently up to **Epoch 6**. This perfectly mirrors the rapid increase in accuracy, confirming that the model is converging well.

* **Optimal Point and Rebound:** The **Validation Loss** reaches its minimum point around **Epoch 6** (a value just above $0.10$). Crucially, after this point (Epochs 7, 8, and 9), the Validation Loss begins to **increase and becomes volatile** while the Training Loss continues its slow decline.

* **Clear Sign of Overfitting:** The increasing validation loss coupled with flat or continuing decreasing training loss is the **textbook sign of overfitting**. After Epoch 6, the model is likely learning noise or highly specific features unique to the training set, which harms its performance on the generalization set.

### **Overall Recommendation**

Based on the combined plots:

1.  **Stop Training Early:** The **ideal stopping point** for this model appears to be **Epoch 6 or 7**, where the validation loss is at its minimum and validation accuracy is near its peak.
2.  **Excellent Model:** Despite the minor overfitting after Epoch 6, the overall training process was highly successful, resulting in a model with a **peak validation accuracy of approximately $98\%$**.
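
The stopping rule behind this recommendation can be traced by hand. Below is a simplified re-implementation of patience-based early stopping in plain Python. It is a sketch of the idea rather than the Keras source, and the validation-loss values are hypothetical, chosen so that the minimum falls at epoch index 6.

```python
def simulate_early_stopping(validation_losses, patience):
    """Return (stopping_epoch, best_epoch) as zero-based epoch indices."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    # Patience never ran out: training used every available epoch
    return len(validation_losses) - 1, best_epoch

# Hypothetical validation losses (illustration only)
example_validation_losses = [0.60, 0.35, 0.22, 0.16, 0.13, 0.12, 0.11, 0.14, 0.13, 0.15]

stopping_epoch, best_epoch = simulate_early_stopping(example_validation_losses, patience=3)
print("Training stops after epoch index :", stopping_epoch)
print("Best weights come from epoch index:", best_epoch)
```

With `patience = 3`, training always runs three epochs past the best one before stopping, which is why restoring the best weights matters.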

# **Model Prediction**

In [None]:
#
# Make predictions
#
y_PredictedProbalities = cnnModelRegularised.predict(X_test_processed)

#
# Convert probabilities to class indices (0, 1, 2, 3)
# argmax returns the index of the highest probability
#
# y_PredictedClasses = [np.argmax(element) for element in y_PredictedProbalities]
y_PredictedClasses = np.argmax(y_PredictedProbalities, axis = 1)

#
# Get true class indices from the one-hot encoded array
#
y_TrueClasses = np.argmax(y_test_encoded, axis = 1)

print("Shape of Predicted Classes :", y_PredictedClasses.shape)
print("Shape of True Classes      :", y_TrueClasses.shape)

In [None]:
y_TrueClasses[:5]

In [None]:
y_PredictedClasses[:5]

## **Evaluation Metrics**

### **Error Analysis : Displaying Misclassified Images**
This section identifies all instances where the model's prediction does not match the true label and plots the first few examples.

In [None]:
#
# Identify Incorrect Predictions
#
# We compare the predicted and true classes element-wise to get a boolean
# mask; np.where(...)[0] then returns the integer indices where it is True.
#
FIRST_N_MISMATCH = 10
incorrectIndices = np.where(y_PredictedClasses != y_TrueClasses)[0]
print(f"Total misclassified images found: {len(incorrectIndices)}")

#
# Select a subset of the first n incorrect indices to plot
#
# We limit this to prevent the output from becoming too long
#
plotIndices   = incorrectIndices[:FIRST_N_MISMATCH]
numberOfPlots = len(plotIndices)

if numberOfPlots > 0:
  #
  # Create a figure to display the misclassified images
  #
  plt.figure(figsize = (16, 4 * ((numberOfPlots + 3) // 4))) # Dynamic sizing
  plt.suptitle(f"First {numberOfPlots} Misclassified Images", fontsize = 18, fontweight = 'bold')

  #
  # Loop through the selected incorrect indices and plot the data
  #
  for i, index in enumerate(plotIndices):
    #
    # Retrieve the necessary data for the misclassified image
    # X_test holds the original PIL image data needed for visualization
    #
    originalImage  = X_test[index]
    trueLabel      = CATEGORIES[y_TrueClasses[index]]
    predictedLabel = CATEGORIES[y_PredictedClasses[index]]

    ax = plt.subplot((numberOfPlots + 3) // 4, 4, i + 1)
    ax.imshow(originalImage, cmap = 'gray') # Display the original raw image

    #
    # Set the title to clearly show the error
    #
    titleText = f"True: {trueLabel}\nPredicted: {predictedLabel}"

    #
    # Change the text color to red to highlight the error
    #
    ax.set_title(titleText, fontsize = 10, color = 'red', fontweight = 'bold')
    ax.axis('off')

  #
  # Adjust the padding once all subplots exist, then render the figure
  #
  plt.tight_layout(pad = 3.0, rect = [0, 0.03, 1, 0.95])
  plt.show() # Comment out if there are any display issues
else:
  print("No misclassified images to display (This is unlikely with real data)")

### **Classification Report**

In [None]:
#
# Generate the Classification Report
#
print("="*55)
print("Classification Report")
print("="*55)
print(classification_report(y_TrueClasses, y_PredictedClasses, target_names = CATEGORIES))
print("="*55)

### **Confusion Matrix**

In [None]:
#
# Generate the Confusion Matrix
#
confusionMatrix = confusion_matrix(y_TrueClasses, y_PredictedClasses)

#
# Plot the Confusion Matrix using a heatmap for better visualization
#
plt.figure(figsize = (10, 8))

sns.heatmap(
             confusionMatrix,
             annot       = True,
             fmt         = 'd',
             cmap        = 'coolwarm',
             linewidths  = .5,
             xticklabels = CATEGORIES,
             yticklabels = CATEGORIES
           )
plt.title('Confusion Matrix', fontsize = 16)
plt.ylabel('True Label')
plt.xlabel('Predicted Label')

plt.show() # Comment out if there are any display issues

### **Observations on the Confusion Matrix**

The confusion matrix provides a detailed breakdown of the model's performance on the test set for the four classes: **glioma, meningioma, notumor, and pituitary**.  

- **Rows = True Labels (actual classes)**  
- **Columns = Predicted Labels (model's predictions)**  

---

### **Exceptional Performance on Healthy and Pituitary Cases**

The model demonstrates **near-perfect accuracy** in two key areas:

- **Notumor (404 / 405)**  
  - 404 correctly classified  
  - 1 misclassified (as meningioma)  
  - Extremely reliable at identifying a healthy brain  

- **Pituitary (299 / 300)**  
  - 299 correctly classified  
  - 1 misclassified (as meningioma)  
  - Very strong identification of pituitary tumors  

These high True Positive rates suggest that the features for **Notumor** and **Pituitary** are highly distinct and easily learned by the CNN.

---

### **Primary Area of Confusion: Meningioma vs. Glioma**

Most classification errors occur between the **glioma** and **meningioma** classes:

- **Meningioma → Glioma (20 cases)**  
  - Largest single error type  
  - Indicates difficulty distinguishing meningioma from glioma features  

- **Glioma → Meningioma (6 cases)**  
  - Fewer misclassifications compared to the reverse direction  

- **Total Meningioma Misclassifications = 28**  
  - 20 misclassified as glioma  
  - 3 misclassified as notumor  
  - 5 misclassified as pituitary  
  - The fact that **20 out of 28 errors** are meningioma → glioma highlights the critical weakness.  

---

### **Conclusion and Key Takeaway**
The confusion matrix supports four takeaways:

- The classifier is **highly accurate overall**  
- Excellent ability to rule out tumors (**Notumor class**)  
- Outstanding identification of **Pituitary tumors**  
- **Main challenge:** Overlap between **Meningioma** and **Glioma** features  

**Improvement Opportunity:** Focus on better feature extraction or domain-specific augmentation to reduce confusion between **meningioma** and **glioma**
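
Per-class recall and precision follow directly from the confusion matrix. The sketch below assumes the class order glioma, meningioma, notumor, pituitary. The off-diagonal counts are the ones quoted above, while the glioma and meningioma diagonal entries (294 and 278) and the zero entries in the glioma row are placeholders, since the text does not state them. Substituting `confusionMatrix` for `example_matrix` gives the actual figures.

```python
import numpy as np

class_names = ["glioma", "meningioma", "notumor", "pituitary"]

# Rows = true labels, columns = predicted labels
example_matrix = np.array([
    [294,   6,   0,   0],  # glioma     (diagonal and zeros are placeholders)
    [ 20, 278,   3,   5],  # meningioma (diagonal is a placeholder)
    [  0,   1, 404,   0],  # notumor
    [  0,   1,   0, 299],  # pituitary
])

correct_counts = np.diag(example_matrix)
recall = correct_counts / example_matrix.sum(axis=1)     # per true class (row)
precision = correct_counts / example_matrix.sum(axis=0)  # per predicted class (column)

for name, class_recall, class_precision in zip(class_names, recall, precision):
    print(f"{name:<11} recall = {class_recall:.4f}   precision = {class_precision:.4f}")
```

Reading rows and columns separately shows the asymmetry: meningioma → glioma errors lower meningioma recall and glioma precision at the same time.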
