# 🧠 Brain Tumor Classification using VGG16

## 📌 Project Overview

In this notebook, we develop a deep learning model that classifies brain MRI images. The goal is to assist medical professionals by assigning each scan to one of four categories:

- **Glioma Tumor**
- **Meningioma Tumor**
- **Pituitary Tumor**
- **No Tumor**

Accurate and early detection of brain tumors is crucial for effective treatment planning and improved patient outcomes. Manual diagnosis from MRI scans can be time-consuming and subjective, so our goal is to build a model that can support this process with high accuracy and consistency.

---

## 🧰 What We'll Do

- Load and preprocess a labeled dataset of brain MRI images.
- Visualize example images from each tumor category.
- Use **transfer learning** with the **VGG16** architecture, a pre-trained Convolutional Neural Network originally trained on ImageNet.
- Fine-tune the model for multi-class classification specific to our problem.
- Evaluate the model’s performance using accuracy, confusion matrix, and classification report.
- Test the model on new images to validate its predictions.

---

## 🧠 Why VGG16?

VGG16 is a widely used CNN architecture known for its simplicity and strong performance in image classification tasks. By leveraging a pre-trained version of VGG16, we can reduce training time and achieve better performance, especially when working with a limited dataset.

---

Let's get started with loading the data and exploring the dataset!

## Load data and import libraries

In [1]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("masoudnickparvar/brain-tumor-mri-dataset")

print("Path to dataset files:", path)

Path to dataset files: /kaggle/input/brain-tumor-mri-dataset


In [2]:
import os
import cv2
from matplotlib import pyplot as plt
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import confusion_matrix, classification_report
from tensorflow.keras.callbacks import EarlyStopping

os.listdir(path)
training_data=path+'/Training'
testing_data=path+'/Testing'
os.listdir(training_data)

['pituitary', 'notumor', 'meningioma', 'glioma']

In [3]:
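x = []
y = []

def load_data(path, x, y):
  # Start from empty lists so re-running the cell does not duplicate samples.
  x.clear()
  y.clear()
  # Each subfolder of `path` is one class; its name is used as the label.
  for i in os.listdir(path):
    for j in os.listdir(path + '/' + i):
      img = cv2.imread(path + '/' + i + '/' + j)  # OpenCV loads images in BGR order
      img = cv2.resize(img, (224, 224))  # VGG16 input size
      x.append(img)
      y.append(i)


load_data(training_data, x, y)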

In [4]:
len(x),len(y)

(5712, 5712)

In [6]:
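# Encode folder names as integer labels:
# glioma=0, meningioma=1, pituitary=2, anything else (notumor)=3
y=np.array(y)
y=[0 if i=='glioma' else 1 if i=='meningioma' else 2 if i=='pituitary' else 3 for i in y]
y = np.array(y)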
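
The chained conditional above sends any folder name other than the three tumor classes to label 3. A minimal sketch of an explicit lookup instead, assuming the four folder names printed by `os.listdir` earlier. `CLASS_TO_INDEX` and `encode_labels` are hypothetical names, not part of the notebook:

```python
import numpy as np

# Hypothetical mapping; folder names taken from the os.listdir output above.
CLASS_TO_INDEX = {"glioma": 0, "meningioma": 1, "pituitary": 2, "notumor": 3}

def encode_labels(names):
    """Map folder names to integer labels, raising KeyError on unknown names."""
    return np.array([CLASS_TO_INDEX[name] for name in names])

example = encode_labels(["pituitary", "notumor", "glioma"])
print(example)
```

Unlike the original expression, this version fails loudly on an unexpected folder name instead of silently labeling it as "no tumor".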


## Split into training and validation sets

In [7]:
from sklearn.model_selection import train_test_split
x_train,x_vald,y_train,y_vald=train_test_split(x,y,test_size=0.2,random_state=42)
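
The split above shuffles with a fixed seed but does not pass `stratify`, so class proportions in the validation set can drift slightly from those of the full dataset. A small sketch on toy labels (not the notebook's data) of how `stratify` keeps the proportions fixed:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the image labels: 4 classes with unequal counts.
labels = np.repeat([0, 1, 2, 3], [40, 30, 20, 10])
features = np.arange(len(labels)).reshape(-1, 1)

x_tr, x_va, y_tr, y_va = train_test_split(
    features, labels, test_size=0.2, random_state=42, stratify=labels
)

# Count how many validation samples each class received.
val_counts = np.bincount(y_va, minlength=4)
print(val_counts)
```

In the notebook this would amount to adding `stratify=y` to the existing call, at the cost of changing which images land in each split.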

## Load VGG16

In [8]:
VGG16_model=VGG16(weights='imagenet',include_top=False,input_shape=(224,224,3))
for layer in VGG16_model.layers:
  layer.trainable=False
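
The notebook imports `preprocess_input` but passes raw 0-255 pixel values to the network. For reference, the ImageNet weights of VGG16 were trained on BGR images with the per-channel ImageNet mean subtracted. A minimal NumPy sketch of that step for images already in BGR order (as `cv2.imread` returns them). `vgg16_preprocess_bgr` is a hypothetical helper, and applying it would change the training results reported in this notebook:

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the "caffe"-style
# preprocessing that the VGG16 ImageNet weights expect.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_preprocess_bgr(images_bgr):
    """Subtract the ImageNet channel means from a batch of BGR images."""
    batch = np.asarray(images_bgr, dtype=np.float32)
    return batch - IMAGENET_BGR_MEAN

# Dummy batch of two mid-gray images, for illustration only.
dummy = np.full((2, 224, 224, 3), 128, dtype=np.uint8)
processed = vgg16_preprocess_bgr(dummy)
print(processed.shape, processed[0, 0, 0])
```

Keras's own `preprocess_input` expects RGB input and flips the channels itself, so with OpenCV images one would first convert with `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` before calling it.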

In [9]:
model=Sequential()
model.add(VGG16_model)
model.add(Flatten())
model.add(Dense(4,activation='softmax'))
model.summary()

# Stop training once val_loss has not improved for 5 consecutive epochs.
# Named early_stopping so the imported EarlyStopping class is not overwritten.
early_stopping=EarlyStopping(monitor='val_loss',patience=5,verbose=1,mode='min')


In [10]:
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['Precision'])

## Data Augmentation for Generalization

In [11]:
datagen = ImageDataGenerator(
    rotation_range=25,
    brightness_range=[0.5, 1.5]
)

batch_size = 32

# Convert x_train to a NumPy array and one-hot encode the training and validation labels
x_train_np = np.array(x_train)
y_train_encoded = to_categorical(y_train, num_classes=4)
y_vald_encoded = to_categorical(y_vald, num_classes=4)


generator = datagen.flow(x_train_np, y_train_encoded, batch_size=batch_size)

model.fit(generator, epochs=10, steps_per_epoch=len(x_train)//batch_size, validation_data=(np.array(x_vald), y_vald_encoded),callbacks=[early_stopping])



Epoch 1/10
142/142 ━━━━━━━━━━━━━━━━━━━━ 104s 649ms/step - Precision: 0.7517 - loss: 4.7241 - val_Precision: 0.8591 - val_loss: 3.6221
Epoch 2/10
  1/142 ━━━━━━━━━━━━━━━━━━━━ 24s 173ms/step - Precision: 0.9062 - loss: 2.3101
142/142 ━━━━━━━━━━━━━━━━━━━━ 10s 73ms/step - Precision: 0.9062 - loss: 2.3101 - val_Precision: 0.8346 - val_loss: 4.5142
Epoch 3/10
142/142 ━━━━━━━━━━━━━━━━━━━━ 104s 531ms/step - Precision: 0.9057 - loss: 1.9509 - val_Precision: 0.9046 - val_loss: 2.4077
Epoch 4/10
142/142 ━━━━━━━━━━━━━━━━━━━━ 10s 73ms/step - Precision: 0.8438 - loss: 5.6922 - val_Precision: 0.9204 - val_loss: 2.1144
Epoch 5/10
142/142 ━━━━━━━━━━━━━━━━━━━━ 131s 528ms/step - Precision: 0.9446 - loss: 1.0453 - val_Precision: 0.9134 - val_loss: 2.6267
Epoch 6/10
142/142 ━━━━━━━━━━━━━━━━━━━━ 10s 73ms/step - Precision: 0.9375 - loss: 0.9882 - val_Precision: 0.9046 - val_loss: 2.7373
Epoch 7/10
142/142 ━━━━━━━━━━━━━━━━━━━━ 128s 507ms/step - Precis

<keras.src.callbacks.history.History at 0x7efa6d6eb890>
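
The log above alternates between full epochs and epochs that finish in about 10 seconds with the metrics of a single step. One plausible explanation, stated here as an assumption rather than something the notebook confirms, is that `steps_per_epoch=len(x_train)//batch_size` rounds down and leaves one partial batch in the generator, which the following epoch consumes before the data runs out. The arithmetic, assuming scikit-learn rounds the validation size up:

```python
import math

n_samples = 5712          # images loaded from the Training folder
batch_size = 32

# Assumption: train_test_split rounds the validation count up.
n_val = math.ceil(0.2 * n_samples)
n_train = n_samples - n_val

floor_steps = n_train // batch_size             # value passed as steps_per_epoch
ceil_steps = math.ceil(n_train / batch_size)    # batches the generator can yield
leftover = n_train - floor_steps * batch_size   # samples in the final partial batch

print(n_train, floor_steps, ceil_steps, leftover)
```

The floor value matches the 142 steps shown in the log. Omitting `steps_per_epoch` lets Keras take the step count from the generator's length, which would likely avoid the short epochs.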

## Evaluation

### Load test data

In [35]:
x_test=[]
y_test=[]

# Same folder-per-class loader as for training, filling the test lists
def load_data(path):
  for i in os.listdir(path):
    for j in os.listdir(path+'/'+i):
      img=cv2.imread(path+'/'+i+'/'+j)
      img=cv2.resize(img,(224,224))
      x_test.append(img)
      y_test.append(i)



load_data(testing_data)
y_test=np.array(y_test)
print(len(x_test),len(y_test))
# Same integer encoding as the training labels
y_test=[0 if i=='glioma' else 1 if i=='meningioma' else 2 if i=='pituitary' else 3 for i in y_test]
y_test = np.array(y_test)

1311 1311


In [36]:
y_pred=model.predict(np.array(x_test))

[1m41/41[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 418ms/step


In [37]:
from sklearn.metrics import confusion_matrix,classification_report
y_pred=np.argmax(y_pred,axis=1)
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

           0       0.86      0.96      0.91       300
           1       0.96      0.78      0.86       306
           2       0.93      0.99      0.96       300
           3       0.99      0.99      0.99       405

    accuracy                           0.93      1311
   macro avg       0.93      0.93      0.93      1311
weighted avg       0.94      0.93      0.93      1311
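
The overview lists a confusion matrix among the evaluation tools, and `confusion_matrix` is already imported. A minimal sketch of how it could be computed and printed per class, using toy labels as stand-ins for `y_test` and `y_pred` (the values here are invented for illustration, not the notebook's results):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Index order follows the integer encoding used in the notebook.
class_names = ["glioma", "meningioma", "pituitary", "notumor"]

# Toy stand-ins for y_test and the argmax'ed y_pred (not the notebook's data).
y_true_demo = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred_demo = np.array([0, 0, 0, 1, 2, 2, 3, 3])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2, 3])

for name, row in zip(class_names, cm):
    print(f"{name:>10}: {row}")
```

In the notebook the call would be `confusion_matrix(y_test, y_pred)`. Passing `target_names=class_names` to `classification_report` would likewise replace the numeric row labels with class names.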



## Save model

In [39]:
model.save('model.keras')
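
To test the saved model on a new image, the file can be reloaded with `tensorflow.keras.models.load_model('model.keras')` and its softmax output mapped back to a class name. A minimal sketch of that last step, using a made-up probability vector in place of a real `model.predict` result. `decode_prediction` is a hypothetical helper:

```python
import numpy as np

# Index order follows the integer encoding used for training.
CLASS_NAMES = ["glioma", "meningioma", "pituitary", "notumor"]

def decode_prediction(probs):
    """Return (class name, confidence) for one softmax output row."""
    probs = np.asarray(probs, dtype=float)
    idx = int(np.argmax(probs))
    return CLASS_NAMES[idx], float(probs[idx])

# Made-up softmax output for a single image, for illustration only.
label, confidence = decode_prediction([0.05, 0.10, 0.80, 0.05])
print(label, round(confidence, 2))
```

A new image would need the same loading steps as the training data (read with OpenCV, resize to 224x224, add a batch dimension) before being passed to `model.predict`.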