# Brain Tumor MRI Image Classification



##### **Project Type**    - Classification
##### **Contribution**    - Individual

# **Project Summary -**

Brain tumor diagnosis is a critical medical task that requires accurate and timely interpretation of MRI scans. Manual diagnosis can be time-consuming and prone to human error, especially when distinguishing between different types of tumors. This project aims to build a deep learning-based system to classify brain MRI images into four categories: **Glioma**, **Meningioma**, **Pituitary Tumor**, and **No Tumor**.

We started by exploring and preprocessing the dataset, which included MRI scans labeled according to tumor type. Preprocessing steps included image resizing, normalization, and augmentation to enhance model performance and reduce overfitting. The dataset was split into training, validation, and test sets to ensure unbiased model evaluation.

Two models were developed:

1. **Custom CNN** – A convolutional neural network built from scratch using layers like Conv2D, MaxPooling, Dropout, and Dense. This model served as a baseline and helped us understand the effectiveness of a simple architecture.

2. **MobileNetV2 (Transfer Learning)** – A lightweight model pretrained on ImageNet. We froze the pretrained base and trained a custom classification head suited to our 4-class problem. This model leveraged pretrained weights and significantly improved performance.

Both models were trained using **EarlyStopping** and **ModelCheckpoint** to prevent overfitting and retain the best-performing versions. Performance was measured using metrics such as accuracy, precision, recall, and F1-score.

* The **Custom CNN** achieved 71% test accuracy, with relatively lower recall on meningioma cases.
* **MobileNetV2** outperformed the custom model with **81% accuracy** and higher macro precision, recall, and F1-score, although meningioma remained the hardest class.

To make the model accessible, we deployed the MobileNetV2 model using **Streamlit**, a Python-based web framework. The app allows users to upload brain MRI images and view predictions along with confidence scores. The interface is simple, intuitive, and displays both the uploaded image and the predicted tumor type (e.g., Glioma, Pituitary).

We tested the app manually with five MRI images, and it predicted four of them correctly, most with high confidence. A tool like this can help streamline the diagnostic process by giving healthcare professionals a fast first-pass prediction.

In summary, this project showcases how deep learning, especially **transfer learning**, can be applied effectively to real-world medical imaging problems. With further improvements like larger datasets or tumor localization, this tool could become even more helpful in clinical settings.

# **GitHub Link -**

https://github.com/Vignesha-S/brain-tumor-mri-classification

# **Problem Statement**


Brain tumors are among the most fatal forms of cancer. Early detection and accurate classification are crucial for timely treatment. However, manual analysis of MRI scans is time-consuming and prone to human error.
This project focuses on building an automated classification system using deep learning that can assist healthcare professionals by identifying tumor types from MRI images with high accuracy and confidence.

# ***Let's Begin !***

## ***1. Understand the Dataset***

### Import Libraries

In [None]:
# Import Libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, GlobalAveragePooling2D, Flatten, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import classification_report, confusion_matrix

### Mount Google Drive & Define Paths and Parameters

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')

# Define dataset directories
train_dir = '/content/drive/MyDrive/Labmentix_internship/Brain_Tumor_Image_Classification/dataset/train'
val_dir = '/content/drive/MyDrive/Labmentix_internship/Brain_Tumor_Image_Classification/dataset/valid'
test_dir = '/content/drive/MyDrive/Labmentix_internship/Brain_Tumor_Image_Classification/dataset/test'

IMG_SIZE = (224, 224)
BATCH_SIZE = 32
NUM_CLASSES = 4

## ***2. Data Visualization***

#### Chart - 1

In [None]:
# Class Distribution Plot

image = ImageDataGenerator(rescale=1./255).flow_from_directory(train_dir, target_size=IMG_SIZE)
labels = list(image.class_indices.keys())

# Count images per class
train_counts = pd.Series(image.classes).value_counts().sort_index()

plt.figure(figsize=(10, 6))
plt.bar(labels, train_counts)
plt.title('Class Distribution in Training Set')
plt.xlabel('Tumor Class')
plt.ylabel('Image Count')
plt.show()

##### 1. Why did you pick the specific chart?

We used a bar chart to visualize the number of images in each class (glioma, pituitary, meningioma, no_tumor) to quickly assess class balance in the training dataset. Bar charts are ideal for comparing category-wise frequencies, making them a clear choice for understanding how the data is distributed across different tumor categories.

##### 2. What is/are the insight(s) found from the chart?

- The Glioma class has the highest number of training samples (approx. 550+).

- The Pituitary and Meningioma classes follow with moderate counts.

- The No Tumor class has the lowest number of images (approx. 300–350).

This shows a moderate class imbalance, with the Glioma class being the most represented and No Tumor the least.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights are important for both model performance and business impact:

*Positive Impact:*

Understanding the class distribution allows us to proactively apply strategies like data augmentation or class weighting. This helps ensure that all tumor types, especially the underrepresented ones, are equally learned by the model. In the medical domain, balanced detection across all tumor types is essential for **accurate diagnosis and patient safety.**

*Negative Growth if Ignored:*

If the class imbalance is not addressed, the model may become biased toward the more common classes (e.g., Glioma) and perform worse on the least represented class (No Tumor). Healthy scans could then be flagged as tumors (**false positives**), and bias toward frequent classes can also cause **missed diagnoses** of rarer tumor types. Both are unacceptable in a healthcare scenario.
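One way to act on this insight is class weighting. The sketch below computes inverse-frequency weights using the common "balanced" heuristic, weight = N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the count for class c. The label vector is an illustrative placeholder, not the real dataset, and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class_index: weight} using weight = N / (K * n_c)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in sorted(counts.items())}

# Illustrative label vector (placeholder counts, NOT the real dataset):
# class 0 -> 6 images, class 1 -> 4, class 2 -> 2, class 3 -> 4
example_labels = [0] * 6 + [1] * 4 + [2] * 2 + [3] * 4
class_weights = balanced_class_weights(example_labels)

# Rarer classes receive larger weights
for class_index, weight in class_weights.items():
    print(class_index, round(weight, 3))
```

In this notebook, the class indices stored on a training generator (its `classes` attribute) would likely take the place of `example_labels`, and the resulting dictionary could be passed to Keras `fit` through its `class_weight` argument.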

#### Chart - 2

In [None]:
# Sample MRI Images from Each Class

plt.figure(figsize=(12, 10))
for label, idx in image.class_indices.items():
  folder = os.path.join(train_dir, label)
  img_file = random.choice(os.listdir(folder))
  img_path = os.path.join(folder, img_file)
  img = plt.imread(img_path)
  plt.subplot(2, 2, idx + 1)
  plt.imshow(img, cmap='gray')  # cmap applies to single-channel scans; ignored for RGB images
  plt.title(label)
  plt.axis('off')
plt.suptitle("Sample MRI Images from Each Class", fontsize=16)
plt.tight_layout(rect=[0, 0, 1, 0.95])  # leave room at the top for the suptitle
plt.show()

##### 1. Why did you pick the specific chart?

We used this image grid to display one sample MRI image from each tumor class. This chart helps us visually inspect the dataset and understand what kind of variations exist across different tumor types. In image classification tasks, it's important to get a feel for the visual **complexity, contrast**, and **features** that models might learn.

##### 2. What is/are the insight(s) found from the chart?

- Each tumor class has **distinct visual features** (size, shape, location, and intensity patterns in the MRI).

- The **No Tumor** class typically has more uniform brain structures, while **tumor classes** show clear irregularities.

- Some classes like **Glioma** and **Meningioma** appear visually similar in some cases, which could be challenging for the model.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

*Positive Impact:*

These insights help us understand what the model will be learning from.

It tells us that **image resolution, contrast, and preprocessing** are important for success.

It also guides the need for **data augmentation** to teach the model invariance to position, brightness, etc.

*Possible Negative Growth if Ignored:*

If the images are not **preprocessed consistently**, or if class-specific features are subtle and not augmented well, the model may struggle to generalize.

Misclassifying visually similar tumors like Glioma and Meningioma could delay diagnosis or cause incorrect treatment, leading to **clinical risks**.

## ***3. Data Pre-processing & Augmentation***

### 1. Image Preprocessing (Normalization & Resizing)

In [None]:
# Preprocessing for validation and test sets (rescaling)
val_test_datagen = ImageDataGenerator(rescale=1./255)

We used the *ImageDataGenerator* class to normalize all image pixel values from the range [0, 255] to [0, 1] (`rescale=1./255`). Images are resized to a fixed shape of 224×224 when the generators are created (`target_size=IMG_SIZE`), which ensures uniformity and compatibility with the model input layer.

### 2. Data Augmentation

In [None]:
# Data augmentation + rescaling for training set
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
)

To improve generalization and reduce overfitting, we applied real-time data augmentation techniques on the training dataset such as rotation, zoom, shift, and horizontal flip. These techniques simulate real-world variability in MRI scans.
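To make two of these transforms concrete, here is a toy sketch on a tiny 2×3 "image" stored as nested lists. It is not how Keras implements augmentation (Keras applies random amounts and interpolates), and the helper names `horizontal_flip` and `shift_right` are hypothetical. The nearest-edge fill imitates the `fill_mode='nearest'` default of `ImageDataGenerator`.

```python
def horizontal_flip(image):
    """Mirror each row left-to-right."""
    return [list(row[::-1]) for row in image]

def shift_right(image, pixels):
    """Shift columns right by `pixels`, filling the gap with the nearest edge value."""
    shifted = []
    for row in image:
        keep = max(len(row) - pixels, 0)   # number of original pixels that remain visible
        shifted.append([row[0]] * (len(row) - keep) + list(row[:keep]))
    return shifted

tiny = [[1, 2, 3],
        [4, 5, 6]]

flipped = horizontal_flip(tiny)
shifted = shift_right(tiny, 1)
print(flipped)
print(shifted)
```

The pixel values themselves are unchanged, only their positions move, which is why such transforms can preserve the tumor pattern while varying its placement.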

### 3. Create Image Generators

In [None]:
# Create image generators
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical'
)

val_generator = val_test_datagen.flow_from_directory(
    val_dir,
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical'
)

test_generator = val_test_datagen.flow_from_directory(
    test_dir,
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=False
)

We created image generators for train, validation, and test sets using the above preprocessing pipelines. These generators efficiently load and preprocess images in batches during training and evaluation.
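As a quick check of what batching means in practice, the sketch below computes how many batches a generator yields per pass. It assumes the test set holds 246 images, the sum of the per-class totals in the confusion matrix comparison in section 8; `batches_per_epoch` is a hypothetical helper.

```python
import math

def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to cover every image once."""
    return math.ceil(num_images / batch_size)

BATCH_SIZE = 32
test_images = 80 + 63 + 49 + 54          # per-class test totals from section 8

num_batches = batches_per_epoch(test_images, BATCH_SIZE)
last_batch_size = test_images - (num_batches - 1) * BATCH_SIZE
print(num_batches, last_batch_size)
```

The final batch is smaller than `BATCH_SIZE` whenever the image count is not an exact multiple of it.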

#### 4. Visualize Augmentation Results

In [None]:
# Show 5 augmented images
x_batch, y_batch = next(train_generator)
plt.figure(figsize=(12, 3))
for i in range(5):
  plt.subplot(1, 5, i+1)
  plt.imshow(x_batch[i])
  plt.axis('off')
plt.suptitle("Augmented Training Images", fontsize=16)
plt.show()

To confirm the correctness of our augmentation pipeline, we visualized augmented samples from the training set. As shown above, each image preserves the original tumor pattern while introducing realistic variations in orientation, position, and scale.

## ***4. Model Building - Custom CNN***

In [None]:
# Build custom CNN
custom_cnn = Sequential()

# Layer 1
custom_cnn.add(Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
custom_cnn.add(BatchNormalization())
custom_cnn.add(MaxPooling2D((2, 2)))

# Layer 2
custom_cnn.add(Conv2D(64, (3, 3), activation='relu'))
custom_cnn.add(BatchNormalization())
custom_cnn.add(MaxPooling2D((2, 2)))

# Layer 3
custom_cnn.add(Conv2D(128, (3, 3), activation='relu'))
custom_cnn.add(BatchNormalization())
custom_cnn.add(MaxPooling2D((2, 2)))

# Flatten + Dense
custom_cnn.add(Flatten())
custom_cnn.add(Dense(128, activation='relu'))
custom_cnn.add(Dropout(0.5))
custom_cnn.add(Dense(NUM_CLASSES, activation='softmax')) # 4 output classes

In [None]:
# Compile Model
custom_cnn.compile(optimizer='adam',
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])

# Show summary
custom_cnn.summary()

We built a custom Convolutional Neural Network (CNN) using three convolutional blocks followed by dense layers. Batch normalization was applied after each convolutional layer to speed up convergence. Dropout was used before the final dense layer to prevent overfitting. The model outputs probabilities for 4 tumor classes using a softmax activation.
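The arithmetic below traces the spatial size of the feature maps through the three blocks and shows why the first dense layer dominates the parameter count. It assumes the Keras defaults used in the code above: 3×3 convolutions with `'valid'` padding and stride 1, and 2×2 max pooling with stride 2. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool

size = 224
trace = [size]
for _ in range(3):                      # three Conv2D + MaxPooling2D blocks
    size = pool_out(conv_out(size))
    trace.append(size)

flatten_units = size * size * 128       # last block has 128 filters
dense_params = flatten_units * 128 + 128  # weights + biases of Dense(128)

print(trace)          # spatial size after each block
print(flatten_units)
print(dense_params)
```

Roughly 11 million of the model's parameters sit in that single dense layer, which is one reason a `Flatten`-based head can overfit on a small dataset.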


## ***5. Transfer Learning (MobilenetV2)***

#### 1. Load & Customize MobileNetV2

In [None]:
# Load MobileNetV2 base model without the top layer
base_model = MobileNetV2(input_shape=(224, 224, 3),
                         include_top=False,
                         weights='imagenet')

# Freeze the base model (only the new classification head is trained in this notebook)
base_model.trainable = False

# Add custom layers on top
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(NUM_CLASSES, activation='softmax')(x)

# Final model
mobilenet_model = Model(inputs=base_model.input, outputs=output)

# Compile the model
mobilenet_model.compile(optimizer=Adam(learning_rate=0.0001),
                        loss='categorical_crossentropy',
                        metrics=['accuracy'])

# Summary
mobilenet_model.summary()

### Transfer Learning with MobileNetV2

We used **MobileNetV2**, a lightweight CNN pretrained on the ImageNet dataset, as a base for transfer learning. The top layers were removed and replaced with a global average pooling layer, a dense ReLU layer, a dropout layer, and a final softmax layer to classify into 4 tumor categories. Initially, we froze the base model to train only the top layers. Fine-tuning can be done later if needed.
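With the base frozen, only the new head is trained. The sketch below counts those trainable parameters, assuming the standard MobileNetV2 (width multiplier 1.0) whose final feature map has 1280 channels. Global average pooling and dropout add no parameters. `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

FEATURE_CHANNELS = 1280   # assumed: MobileNetV2 (alpha=1.0) final feature channels
NUM_CLASSES = 4

hidden_params = dense_param_count(FEATURE_CHANNELS, 128)   # Dense(128, relu)
output_params = dense_param_count(128, NUM_CLASSES)        # Dense(4, softmax)
trainable_head_params = hidden_params + output_params
print(trainable_head_params)
```

This is far fewer trainable parameters than the custom CNN's dense layer alone, which helps explain why the transfer learning model can be trained on a modest dataset.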

## ***6. Model Training***

### 1. Set Up Callbacks

In [None]:
# Stop training when validation loss stops improving and restore the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Save the best model (lowest validation loss) for each architecture
checkpoint_custom = ModelCheckpoint('custom_cnn_best.h5', monitor='val_loss', save_best_only=True)
checkpoint_mobilenet = ModelCheckpoint('mobilenetv2_best.h5', monitor='val_loss', save_best_only=True)

#### 🔁 Checkpoints and Early Stopping

To avoid overfitting and save the best performing model based on validation loss, we implemented:
- **EarlyStopping**: Stops training if validation loss has not improved for 5 consecutive epochs (`patience=5`) and restores the best weights.
- **ModelCheckpoint**: Saves the model with the lowest validation loss for future evaluation or deployment.
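The patience rule can be illustrated with a small simulation. This is a simplified sketch of the idea, not the Keras implementation, and the validation losses are made-up values chosen for illustration. `simulate_early_stopping` is a hypothetical helper.

```python
def simulate_early_stopping(val_losses, patience=5):
    """Return (stopped_epoch, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = None
    wait = 0                                # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch    # training would stop here
    return len(val_losses) - 1, best_epoch  # ran all epochs without stopping

# Made-up losses: the best value occurs at epoch 2, then no improvement follows
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.74, 0.73, 0.69]
stopped_epoch, best_epoch = simulate_early_stopping(losses, patience=5)
print(stopped_epoch, best_epoch)
```

In this example training stops before the final value is ever evaluated, and the weights from the best epoch are the ones that would be kept.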

#### 2. Train Custom CNN Model

In [None]:
history_custom = custom_cnn.fit(
    train_generator,
    epochs=20,
    validation_data=val_generator,
    callbacks=[early_stop, checkpoint_custom]
)

#### 3. Train MobileNetV2 Model

In [None]:
history_mobilenet = mobilenet_model.fit(
    train_generator,
    epochs=20,
    validation_data=val_generator,
    callbacks=[early_stop, checkpoint_mobilenet]
)

In [None]:
# Save trained models permanently to Google Drive
from google.colab import drive
drive.mount('/content/drive')

!mkdir -p /content/drive/MyDrive/Labmentix_internship/Brain_Tumor_Image_Classification/
!cp custom_cnn_best.h5 /content/drive/MyDrive/Labmentix_internship/Brain_Tumor_Image_Classification/
!cp mobilenetv2_best.h5 /content/drive/MyDrive/Labmentix_internship/Brain_Tumor_Image_Classification/

#### 4. Plot Accuracy & Loss

In [None]:
def plot_history(history, model_name):
  acc = history.history['accuracy']
  val_acc = history.history['val_accuracy']
  loss = history.history['loss']
  val_loss = history.history['val_loss']

  epochs_range = range(len(acc))

  plt.figure(figsize=(14, 5))

  plt.subplot(1, 2, 1)
  plt.plot(epochs_range, acc, label='Training Accuracy')
  plt.plot(epochs_range, val_acc, label='Validation Accuracy')
  plt.legend(loc='lower right')
  plt.title(f'{model_name} - Accuracy')

  plt.subplot(1, 2, 2)
  plt.plot(epochs_range, loss, label='Training Loss')
  plt.plot(epochs_range, val_loss, label='Validation Loss')
  plt.legend(loc='upper right')
  plt.title(f'{model_name} - Loss')

  plt.show()

In [None]:
plot_history(history_custom, "Custom CNN")
plot_history(history_mobilenet, "MobileNetV2")

We trained both our Custom CNN and the MobileNetV2 transfer learning model using EarlyStopping and ModelCheckpoint to monitor validation loss. Each model was trained for up to 20 epochs, with performance tracked across training and validation sets.

Training/validation accuracy and loss curves are plotted above for each model to evaluate learning behavior and potential overfitting or underfitting.


#### ✅ Summary of Model Training

- Both Custom CNN and MobileNetV2 were trained using the defined callbacks.
- The best models were saved as `.h5` files: `custom_cnn_best.h5` and `mobilenetv2_best.h5`.
- These saved models will be used later for **evaluation**, **comparison**, and **deployment**.

## ***7. Model Evaluation***

#### 1. Load Best Model

In [None]:
from tensorflow.keras.models import load_model

# Load best saved models
custom_cnn = load_model('custom_cnn_best.h5')
mobilenet_model = load_model('mobilenetv2_best.h5')

#### 2. Prepare Predictions for Evaluation

In [None]:
# Predict labels
y_pred_cnn = custom_cnn.predict(test_generator)
y_pred_mobilenet = mobilenet_model.predict(test_generator)

# Convert predictions from probabilities to class indices
y_pred_cnn_labels = np.argmax(y_pred_cnn, axis=1)
y_pred_mobilenet_labels = np.argmax(y_pred_mobilenet, axis=1)

# True labels
y_true = test_generator.classes

#### 3. Print Classification Report

In [None]:
# Get class labels
class_labels = list(test_generator.class_indices.keys())

# For Custom CNN
print("Custom CNN Classification Report:")
print(classification_report(y_true, y_pred_cnn_labels, target_names=class_labels))

# For MobileNetV2
print("\nMobileNetV2 Classification Report:")
print(classification_report(y_true, y_pred_mobilenet_labels, target_names=class_labels))

#### 4. Plot Confusion Matrix

In [None]:
def plot_conf_matrix(y_true, y_pred, model_name):
  cm = confusion_matrix(y_true, y_pred)
  plt.figure(figsize=(6,5))
  sns.heatmap(cm, annot=True, fmt='d', xticklabels=class_labels, yticklabels=class_labels, cmap='Blues')
  plt.xlabel('Predicted')
  plt.ylabel('Actual')
  plt.title(f'{model_name} - Confusion Matrix')
  plt.show()

plot_conf_matrix(y_true, y_pred_cnn_labels, "Custom CNN")
plot_conf_matrix(y_true, y_pred_mobilenet_labels, "MobileNetV2")

We evaluated both models using test set predictions. The classification report includes metrics such as accuracy, precision, recall, and F1-score for each of the four tumor classes: glioma, meningioma, pituitary, and no_tumor. Confusion matrices visualize the model's prediction performance for each category.
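To show how the per-class metrics relate to a confusion matrix, here is a minimal sketch that derives precision, recall, and F1-score by hand. It follows the scikit-learn convention of rows as actual classes and columns as predicted classes. The 3×3 matrix is a toy example, not output from either model, and `per_class_metrics` is a hypothetical helper.

```python
def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        actual = sum(cm[k])                          # row sum: true members of class k
        predicted = sum(cm[r][k] for r in range(n))  # column sum: predictions of class k
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics

# Toy confusion matrix (rows = actual, columns = predicted)
toy_cm = [[5, 1, 0],
          [2, 3, 1],
          [0, 0, 4]]

metrics = per_class_metrics(toy_cm)
macro_f1 = sum(m[2] for m in metrics) / len(metrics)
for k, (p, r, f1) in enumerate(metrics):
    print(k, round(p, 3), round(r, 3), round(f1, 3))
```

The macro average weights every class equally, so a single weak class pulls it down even when overall accuracy looks acceptable.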

#### 1. Explain the ML Model Used and Its Performance Using Evaluation Metric Score Chart

**Custom CNN**
- A model built from scratch using multiple *Conv2D, MaxPooling, Dropout,* and *Dense layers*.

- **Evaluation on test data:**

    - **Accuracy:** 71%

    - **Macro F1-Score:** 66%

    - **Precision:** Varies across classes; poor on meningioma.
- **Observations:**

    - Performs reasonably well on **glioma** and **no_tumor**.

    - Shows very low recall for **meningioma**, indicating difficulty in distinguishing that class.

    - Performance is unbalanced and affected by class overlap or limited representational power.


**MobileNetV2 (Transfer Learning)**
- Pretrained on ImageNet; the frozen base is extended with custom dense layers for brain tumor classification.

- **Evaluation on test data:**

    - **Accuracy:** 81%

    - **Macro F1-Score:** 80%

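    - **Precision & Recall:** Higher macro precision and recall than the custom CNN; meningioma remains the weakest class.

- **Observations:**

    - Excellent recall on **pituitary** (1.00).

    - High recall on **glioma** (72 / 80).

    - **Meningioma** recall (36 / 63) is well above the custom CNN's (12 / 63) but is still the lowest of the four classes, and **no_tumor** recall (37 / 49) is slightly below the custom CNN's (42 / 49).

    - Overall, this is the more balanced of the two models and the better candidate for deployment.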

#### 2. Which hyperparameter optimization technique have you used and why?

- **Used Techniques:**

    - EarlyStopping: Stops training when validation loss doesn't improve.

    - ModelCheckpoint: Saves the best-performing model based on validation loss.

    - Manual selection of epochs and batch size.

- **Why Not Advanced Tuning?**

  - Time was limited, and the project focused on comparing architectures rather than tuning them.

  - For future work, techniques like **Keras Tuner, Optuna,** or **GridSearchCV** can be used for deeper optimization.

#### 3. Have you seen any improvement? Note down the improvement with updated Evaluation metric Score Chart.

Yes. MobileNetV2 outperformed the custom CNN on the test set: accuracy rose from 71% to 81%, and macro F1-score rose from 66% to 80%.

## ***8. Model Comparison***

We compared the performance of two models: a Custom Convolutional Neural Network (CNN) and a Transfer Learning-based model using MobileNetV2.

### Evaluation Summary

| Metric        | Custom CNN | MobileNetV2 |
|---------------|------------|-------------|
| Accuracy      | 71%        | 81%         |
| Precision     | 70%        | 82%         |
| Recall        | 72%        | 81%         |
| F1-Score      | 66%        | 80%         |

### Confusion Matrix Comparison

| Class       | CNN Correct Predictions | MobileNetV2 Correct Predictions |
|-------------|--------------------------|---------------------------------|
| Glioma      | 67 / 80                  | 72 / 80                         |
| Meningioma  | 12 / 63                  | 36 / 63                         |
| No Tumor    | 42 / 49                  | 37 / 49                         |
| Pituitary   | 53 / 54                  | 54 / 54                         |
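The accuracy and recall figures in the evaluation summary can be reproduced from the counts in this table. The sketch below does so, treating the reported recall as a macro average (an assumption that the arithmetic supports). `summarize` is a hypothetical helper.

```python
# Per-class test totals and correct predictions, copied from the table above
test_counts = {"Glioma": 80, "Meningioma": 63, "No Tumor": 49, "Pituitary": 54}
correct = {
    "Custom CNN":  {"Glioma": 67, "Meningioma": 12, "No Tumor": 42, "Pituitary": 53},
    "MobileNetV2": {"Glioma": 72, "Meningioma": 36, "No Tumor": 37, "Pituitary": 54},
}

def summarize(correct_by_class):
    """Return (accuracy, macro_recall, per_class_recall) from correct counts."""
    recalls = {c: correct_by_class[c] / test_counts[c] for c in test_counts}
    accuracy = sum(correct_by_class.values()) / sum(test_counts.values())
    macro_recall = sum(recalls.values()) / len(recalls)
    return accuracy, macro_recall, recalls

summary = {name: summarize(counts) for name, counts in correct.items()}
for name, (accuracy, macro_recall, recalls) in summary.items():
    print(name, f"accuracy={accuracy:.1%}", f"macro recall={macro_recall:.1%}")
    for cls, recall in recalls.items():
        print(f"  {cls}: {recall:.1%}")
```

Per-class recall makes the trade-off visible: the gain comes mainly from Meningioma, while No Tumor is the one class where the custom CNN scores higher.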

### Finding:

MobileNetV2 outperformed the custom CNN on all four aggregate metrics (accuracy, precision, recall, and F1-score). The largest gain was on Meningioma (36 vs. 12 correct out of 63); the only class where the custom CNN did better was No Tumor (42 vs. 37 correct out of 49). Given its higher accuracy and better generalization, **MobileNetV2** was selected as the model for deployment.


## ***9. Streamlit Application Deployment***



We built an interactive web application using Streamlit that allows users to upload brain MRI images and get predictions about tumor type.

### 💡 Features of the Web App

- **Upload Interface**

Users can upload MRI images (`.jpg`, `.jpeg`, `.png`) via a drag-and-drop file uploader.

- **Image Display**

The uploaded image is displayed on the screen for confirmation.

- **Tumor Prediction**

Once an image is uploaded, the model classifies the tumor type as one of:

  - `Glioma`

  - `Meningioma`

  - `Pituitary`

  - `No Tumor`

- **Confidence Score**

The app also displays the model’s confidence (in percentage) for the predicted class.

- **Model Used**

The predictions are made using the MobileNetV2 transfer learning model loaded from `mobilenetv2_best.h5`.
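The step from the model's softmax output to the label and confidence shown in the app can be sketched as follows. This is an illustration, not the code in `utils.py`: the function name `top_prediction` is hypothetical, and the class order assumes the alphabetical folder order that `flow_from_directory` assigns (glioma, meningioma, no_tumor, pituitary).

```python
# Assumed class order: alphabetical by folder name, as flow_from_directory assigns indices
CLASS_NAMES = ["glioma", "meningioma", "no_tumor", "pituitary"]

def top_prediction(probabilities, class_names=CLASS_NAMES):
    """Return (label, confidence_percent) for one softmax output vector."""
    if len(probabilities) != len(class_names):
        raise ValueError("expected one probability per class")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best] * 100

# Made-up softmax output for a single image
label, confidence = top_prediction([0.02, 0.03, 0.01, 0.94])
print(f"{label}: {confidence:.2f}%")
```

If the class order used at inference differs from the order used during training, predictions are silently mislabeled, so the list should be checked against the `class_indices` of the training generator.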

### ⚙️ Files Used in the App

- `app.py` – Main Streamlit application.
- `utils.py` – Helper functions for image preprocessing and prediction.
- `mobilenetv2_best.h5` – Trained model used for prediction.
- `requirements.txt` – Contains all the dependencies needed to run the app.

### ▶️ To Run the App

```bash
streamlit run app.py
```

**Web App Interface – Screenshot**

Below is the interface of our deployed **Brain Tumor MRI Classifier** built using Streamlit:

![Streamlit UI](https://drive.google.com/uc?export=view&id=1IzG0kXP8xaLyH5PSsNLImRxGuuY-ty_p)

# **Manual Testing**

To evaluate the real-world performance of our deployed brain tumor classification model, we manually tested it by uploading sample images from each tumor class through the Streamlit application.

## **Manual Test Results:**

| Test No. | Actual Label | Predicted Label | Confidence | Status      |
| -------- | ------------ | --------------- | ---------- | ----------- |
| 1        | MENINGIOMA   | MENINGIOMA      | 67.25%     | ✅ Correct   |
| 2        | GLIOMA       | GLIOMA          | 96.35%     | ✅ Correct   |
| 3        | NO\_TUMOR    | NO\_TUMOR       | 93.45%     | ✅ Correct   |
| 4        | PITUITARY    | MENINGIOMA      | 52.14%     | ❌ Incorrect |
| 5        | PITUITARY    | PITUITARY       | 94.15%     | ✅ Correct   |


**Observations:**
- The model classified 4 of the 5 test images correctly, with confidence above 93% for the correct **GLIOMA**, **NO_TUMOR**, and **PITUITARY** predictions.

- One **PITUITARY** image was misclassified as **MENINGIOMA** at low confidence (52.14%), possibly due to overlapping visual features.

- Confidence scores provided additional insight into model certainty.
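One way confidence scores could be put to use is to flag uncertain predictions for closer review. The sketch below applies this to the five manual test results from the table above. The 70% threshold is a hypothetical value chosen for illustration, not something the app implements.

```python
# (test number, actual label, predicted label, confidence in percent), from the table above
results = [
    (1, "MENINGIOMA", "MENINGIOMA", 67.25),
    (2, "GLIOMA", "GLIOMA", 96.35),
    (3, "NO_TUMOR", "NO_TUMOR", 93.45),
    (4, "PITUITARY", "MENINGIOMA", 52.14),
    (5, "PITUITARY", "PITUITARY", 94.15),
]

REVIEW_THRESHOLD = 70.0   # hypothetical cut-off, chosen only for this example

manual_accuracy = sum(actual == predicted for _, actual, predicted, _ in results) / len(results)
flagged = [number for number, _, _, confidence in results if confidence < REVIEW_THRESHOLD]

print(f"Manual accuracy: {manual_accuracy:.0%}")
print("Flagged for review:", flagged)
```

With this threshold, the one misclassified image is among those flagged, along with a correct but low-confidence Meningioma prediction. Five images are too few to judge whether such a threshold would hold up in general.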

# **Conclusion**

In this project, we successfully developed a Brain Tumor MRI Classification system using deep learning techniques. A custom CNN and a pretrained MobileNetV2 model were implemented and evaluated. Based on accuracy and performance metrics, MobileNetV2 significantly outperformed the custom model.

We then deployed our best model using a **Streamlit web application**, which allows users to upload MRI scans and get real-time tumor type predictions along with confidence scores. In manual testing, the app classified four of five sample images correctly through a simple, user-friendly interface.

### **Key Takeaways**

* Custom CNN provided foundational understanding but showed limited generalization.
* Transfer learning with MobileNetV2 improved both accuracy and model robustness.
* The deployed Streamlit app enables accessible and practical tumor classification.
* The entire pipeline from data preprocessing to deployment was completed successfully.

This end-to-end solution can aid medical professionals by providing fast and reliable tumor classification support.