# Medical Image Classification Analysis with YOLOv8
**By: Rithvik Burri**

## Project Summary
This project applies YOLOv8, a real-time object detection model, to brain MRI scans, sorting them into four categories: **Pituitary, Meningioma, Glioma, and No Tumor**. The main goal was to evaluate how well a model originally designed for general object recognition tasks performs in the medical imaging space.

### Dataset

The dataset I used is the **Labeled MRI Brain Tumor dataset** in YOLO format, containing four categories: **glioma**, **meningioma**, **pituitary tumor**, and **no tumor**. It consists of MRI brain scans from multiple imaging modalities and was obtained from Kaggle.

The dataset includes:
- 1594 healthy (normal) brain samples  
- 1321 glioma samples  
- 1339 meningioma samples  
- 1457 pituitary tumor samples  
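
As a quick sanity check on class balance, the counts above can be turned into per-class shares. This is only a small sketch using the numbers listed; the dictionary keys are informal labels, not necessarily the folder or class names used in the dataset.

```python
# Per-class image counts as listed above.
counts = {
    "no tumor": 1594,
    "glioma": 1321,
    "meningioma": 1339,
    "pituitary": 1457,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Ratio of the largest to the smallest class; values near 1 mean balanced classes.
imbalance_ratio = max(counts.values()) / min(counts.values())

print(f"total images: {total}")
for name, share in shares.items():
    print(f"{name:>12}: {share:.1%}")
print(f"largest/smallest class ratio: {imbalance_ratio:.2f}")
```

The four classes add up to 5,711 images, and the largest class is only about 1.2 times the size of the smallest, so the dataset is reasonably balanced.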

Each class includes multiple views: **axial, coronal, and sagittal**, which refer to the planes in which the brain was imaged. Axial slices are horizontal cross-sections (as if viewed from above), coronal slices are vertical cross-sections viewed from the front, and sagittal slices are vertical cross-sections viewed from the side. Including all three views lets the model learn from different perspectives, which should improve its ability to generalize.

Link to **Labeled MRI Brain Tumor dataset**: https://www.kaggle.com/datasets/ammarahmed310/labeled-mri-brain-tumor-dataset/data


I created a YAML configuration file that defines the dataset’s path and class names, which YOLOv8 requires for training. I trained the `YOLOv8n` model with a 640×640 image size, 80 epochs, and a batch size of 16. Early stopping was enabled with a patience value of 15.
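
The setup described above could look roughly like the sketch below, which uses the `ultralytics` Python package. The hyperparameters come from the text; the dataset root, the split folder names, the class order, the `data.yaml` file name, and starting from the pretrained `yolov8n.pt` checkpoint are all assumptions made for illustration, not a record of the exact configuration used.

```python
from pathlib import Path

# Assumed class order; it must match the class ids used in the label files.
CLASS_NAMES = ["glioma", "meningioma", "no_tumor", "pituitary"]

# Hyperparameters stated in the write-up.
TRAIN_ARGS = {"imgsz": 640, "epochs": 80, "batch": 16, "patience": 15}


def build_dataset_yaml(root: str, names: list[str]) -> str:
    """Return the text of a YOLO dataset config (all paths are hypothetical)."""
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(names)]
    return "\n".join(lines) + "\n"


def train(yaml_path: str = "data.yaml", root: str = "datasets/brain_tumor"):
    """Write the config and start training (needs `pip install ultralytics`)."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    Path(yaml_path).write_text(build_dataset_yaml(root, CLASS_NAMES))
    model = YOLO("yolov8n.pt")  # nano checkpoint (assumed pretrained start)
    return model.train(data=yaml_path, **TRAIN_ARGS)


if __name__ == "__main__":
    # Only print the config here; call train() to launch an actual run.
    print(build_dataset_yaml("datasets/brain_tumor", CLASS_NAMES))
```

With `patience=15`, training stops early if the validation metrics have not improved for 15 consecutive epochs, so a run may finish before reaching epoch 80.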

## Results
The final model performance was measured using a 100-image test set. Key results include:
- **Precision**: 87.6%
- **Recall**: 92.0%
- **Mean Average Precision (mAP)** was evaluated using two metrics:
    - **mAP@0.5** measures how well the model classifies and localizes when a prediction counts as correct only if its Intersection over Union (IoU) with the ground-truth box is at least 0.5. It is a standard metric in YOLO papers and is called “AP50” in COCO.
    - **mAP@0.50-0.95** is stricter: it averages AP over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. It rewards models that produce tight bounding boxes and penalizes predictions that are too loose, even when the class is right.
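
To make the IoU thresholds behind these two metrics concrete, here is a minimal sketch that computes IoU for one pair of boxes and checks which thresholds the prediction would pass. The two boxes are made-up values for illustration, not predictions from this project, and a full mAP computation (ranking by confidence, precision-recall curves, per-class averaging) is deliberately left out.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes get an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by mAP@0.50-0.95: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

ground_truth = (100, 100, 200, 200)  # made-up ground-truth box
prediction = (110, 110, 210, 210)    # made-up prediction, shifted by 10 px

score = iou(ground_truth, prediction)
passed = [t for t in THRESHOLDS if score >= t]

print(f"IoU = {score:.3f}")
print(f"counts as a match at thresholds: {passed}")
```

The shifted box has an IoU of about 0.68, so it counts as a correct detection at thresholds 0.50 through 0.65 but as a miss from 0.70 upward. This is why a model can score high on mAP@0.5 and noticeably lower on mAP@0.50-0.95.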


Below is the validation performance across epochs, showing consistent improvement in metrics over time:

![image.png](attachment:1251496b-4bc1-41f8-ba3a-5257dfbd0d4f.png)

The model’s precision and recall stay above 85%, showing that it reliably detects and labels tumors. The mAP@0.5 score is over 90%, meaning the model does very well when a predicted box only needs an IoU of at least 0.5 with the actual tumor box.
The red line, mAP@0.50-0.95, stays around 0.55 (55%). This is a tougher test that checks how tightly the boxes match the tumors. A score of 0.55 means the model localizes reasonably well, but its boxes aren’t always very precise.

The loss curves shown below also indicate healthy training behavior with no major signs of overfitting:

![image.png](attachment:88d1efa5-3f05-452c-8f30-f927440df9aa.png)

Loss values drop consistently, especially for classification (train_cls and val_cls). The gap between training and validation losses is minimal, suggesting that the model generalizes well.

## Discussion
One challenge was reorganizing the dataset into YOLO format. While the original dataset was comprehensive and well-labeled for classification tasks, YOLO requires specific folder structures and label formats. This step was essential to allow successful training.
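
For reference, a YOLO detection label file has one line per object in the form `class x_center y_center width height`, with all four box values normalized to the image size, and the `ultralytics` loader looks for label files in a `labels/` folder that mirrors the `images/` folder. The helper below is a small sketch of that conversion; the function name and the example box are hypothetical and are not taken from the project code.

```python
# Expected layout (folder names under the dataset root are an assumption):
#   images/train/scan_001.jpg   ->   labels/train/scan_001.txt
#   images/val/...              ->   labels/val/...


def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to one YOLO label line.

    Output format: `class x_center y_center width height`,
    with the four box values normalized to the range [0, 1].
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Made-up example: a 160x160 px box in a 640x640 image, class id 0.
line = to_yolo_label(0, 160, 240, 320, 400, img_w=640, img_h=640)
print(line)  # 0 0.375000 0.500000 0.250000 0.250000
```

Because the coordinates are normalized, the same label stays valid if the image is resized, which is convenient when training at a fixed 640×640 input size.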

Another challenge involved understanding YOLOv8’s configuration requirements, such as setting up the YAML file and formatting labels correctly. Once this setup was complete, the model trained smoothly and produced strong results.

## Conclusion
My results indicate that YOLOv8, although originally developed for general object detection, can be applied effectively to brain tumor MRI images, which is consistent with the prior findings I reviewed. The model achieved high precision and recall on the test set, showing its adaptability.

In the future, I want to experiment with alternative models such as ResNet, vision-language models (VLMs), or different versions of YOLO to further boost performance. Exploring models that are either domain-specific or offer stronger feature extraction could improve both classification accuracy and localization precision. These improvements could make the model more reliable in complex medical scenarios.