# Multiclass Brain Tumor Classifier — Runner Notebook

> **Purpose:** This notebook serves as the execution script for the **Multiclass brain tumor classification model**.  
> All core components (model architecture, data processing, training pipeline) are defined in separate Python modules to maintain a clean and modular project structure.


## Overview

This notebook is part of the **Brain Tumor AI** project, focusing on **multiclass classification** of medical images **(notumor, pituitary, meningioma, glioma)**.  
It is designed to:
- Load and configure the modular components (model, data module, transforms, helpers, callbacks, loggers).
- Execute the training process using **PyTorch Lightning**.
- Save the trained model for inference.

By separating logic into `.py` files, the project ensures:
- **Reusability:** Components can be reused across multiple experiments.
- **Maintainability:** Easier debugging and updates.
- **Clarity:** The notebook focuses on workflow and results, not implementation details.


> **Note:** This project is for learning and portfolio purposes only — not for clinical use.


## 1. Install Dependencies & Import Libraries

### 1.1 Install Dependencies
Install the required packages to ensure the notebook runs without missing dependencies.

- **`datasets`** — Dataset handling and loading utilities.  
- **`fsspec`** — File system interface for remote/local storage.  
- **`pytorch-lightning`** — High-level PyTorch framework for training.  
- **`albumentations`** — Advanced image augmentation library.  
- **`torchmetrics`** — Standardized metrics for PyTorch.

> Skip this step if the environment already has these packages installed.


In [None]:
!pip install -q -U datasets fsspec pytorch-lightning albumentations torchmetrics

### 1.2 Import Required Libraries

Below are the required libraries and modules used in this notebook:

- **sys**: For adding the project directory to the Python import path.
- **torch**: PyTorch core library for deep learning operations.
- **numpy**: For array handling when computing class weights.
- **pytorch_lightning**: High-level wrapper for PyTorch to simplify training loops.
- **scikit-learn (train_test_split, compute_class_weight)**: For dataset splitting and class weight computation.
- **google.colab.drive**: To mount Google Drive and access stored datasets/models.
- **datasets.load_dataset**: To load datasets in various formats from the Hugging Face Datasets library.


In [None]:
import sys
import torch
import numpy as np
import pytorch_lightning as pl

from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight


from google.colab import drive

from datasets import load_dataset

- **Mount Google Drive**  
  Use `drive.mount('/content/drive')` to connect the Colab environment to your Google Drive so that files can be saved and accessed persistently.



In [None]:
drive.mount('/content/drive')

## 2. Define Project Paths and Import Custom Modules

### 2.1 Configure Directory Paths  
Here, we define the key directory paths used throughout the project:  

- **`CHECKPOINT_PATH`** — Location where model checkpoints will be saved and loaded from.  
- **`PROJECT_PATH`** — Root path of the project, used as a base reference for file operations.  
- **`SAVE_PATH`** — Directory for storing final outputs, such as trained models.  

The `PROJECT_PATH` is appended to `sys.path` to make sure Python can locate and import the project modules without issues.


In [None]:
CHECKPOINT_PATH = "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Multiclass/checkpoint"

PROJECT_PATH = "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Multiclass"

SAVE_PATH = "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Multiclass/save_models"

- **Add Project Directory to Python Path**

    To ensure that custom modules can be imported without issues, the project directory (`PROJECT_PATH`) is appended to the Python system path (`sys.path`).  
    This step allows Python to locate and load modules defined in your project folder.

In [None]:
if PROJECT_PATH not in sys.path:
  sys.path.append(PROJECT_PATH)
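
As an optional alternative, the three paths could be derived from a single root so that they cannot drift apart. The sketch below is only an illustration: `build_paths` is a hypothetical helper that is not part of the project, and the directory names are copied from the constants defined above.

```python
from pathlib import Path


def build_paths(project_root):
    """Derive the checkpoint and save directories from one project root (hypothetical helper)."""
    root = Path(project_root)
    return {
        "project": root,
        "checkpoint": root / "checkpoint",
        "save": root / "save_models",
    }


paths = build_paths("/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Multiclass")
print(paths["checkpoint"])
```

Calling `str(paths["project"])` gives a plain string that can be appended to `sys.path` in the same way as `PROJECT_PATH`.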

### 2.2 Import Custom Modules

This section imports the custom Python modules that define the model architecture, data pipeline, training callbacks, and helper functions.  
By keeping these components in separate files, the project maintains a clean and modular structure.

- **DenseNetClassifierMulticlass** — Custom PyTorch Lightning model for multiclass brain tumor classification.  
- **BrainTumorDataModule** — Handles data loading, preprocessing, and batching using PyTorch Lightning's DataModule structure.  
- **get_callbacks** — Retrieves predefined training callbacks such as model checkpointing and early stopping.  
- **get_logger** — Logger utility for experiment tracking and visualization.  
- **utils** — Utility functions to ensure reproducibility and dataset conversion.

> Keeping core logic in separate `.py` files improves reusability, maintainability, and clarity.

In [None]:
from module import DenseNetClassifierMulticlass
from datamodule import BrainTumorDataModule
from callbacks import get_callbacks
from logger import get_logger
from utils import set_seed, hf_dataset_to_tuple

## 3. Define Seed, Load and Prepare Raw Dataset


### 3.1 Set Random Seed

To ensure reproducibility of results, a fixed random seed is set at the beginning of the data preparation process.  
By setting the seed, all operations involving randomness (such as data shuffling, train-test splitting, and weight initialization) will produce the same outcome each time the notebook is executed. This step is crucial for debugging and for achieving consistent experimental results.


In [None]:
set_seed(42)
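
The implementation of `set_seed` lives in `utils.py` and is not shown in this notebook. As a rough idea of what such a helper usually does, here is a minimal sketch. `set_seed_sketch` is a hypothetical name, and the NumPy and PyTorch calls are guarded so the snippet also runs where those packages are missing.

```python
import random


def set_seed_sketch(seed):
    """Seed the common sources of randomness (illustrative, not the project's `set_seed`)."""
    random.seed(seed)  # Python's built-in RNG
    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global RNG
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)           # PyTorch RNG
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    except ImportError:
        pass


set_seed_sketch(42)
first = random.random()
set_seed_sketch(42)
second = random.random()
print(first == second)  # True: the same seed reproduces the same draw
```

PyTorch Lightning also provides `pl.seed_everything(seed)`, which covers a similar set of generators in a single call.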

### 3.2 Load Dataset from Hugging Face

We load the dataset directly using the `datasets` library. The dataset contains labeled 2D brain MRI scans across **four classes:** **notumor**, **pituitary**, **meningioma**, **glioma**.

In [None]:
ds = load_dataset("Cayanaaa/BrainTumorDatasets", name="multiclass")

### 3.3 View Class Label Mapping

This command reveals the label names and their corresponding integer encodings used internally by the dataset.


In [None]:
print(ds['train'].features['label'].names)

### 3.4 Convert Hugging Face Dataset to Tuples

The raw Hugging Face dataset is converted into Python tuples containing image data and corresponding labels.  
This step simplifies further processing and integration with PyTorch and custom data modules.

- **`hf_dataset_to_tuple`** — Utility function that extracts images and labels from the dataset, returning them as separate lists or arrays for easy manipulation.

In [None]:
train_data = ds['train']

images, labels = hf_dataset_to_tuple(train_data, image_key='image', label_key='label')
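
The body of `hf_dataset_to_tuple` is defined in `utils.py` and is not reproduced here. Assuming it simply walks over the rows and collects the two columns, the idea can be sketched as follows. `dataset_to_tuple_sketch` and the toy rows are illustrative stand-ins, not the project's code.

```python
def dataset_to_tuple_sketch(dataset, image_key="image", label_key="label"):
    """Collect images and labels from an iterable of dict-like rows into two lists."""
    images_out, labels_out = [], []
    for row in dataset:
        images_out.append(row[image_key])
        labels_out.append(row[label_key])
    return images_out, labels_out


# Stand-in rows; a Hugging Face split also yields one dict per example when iterated.
toy_rows = [
    {"image": "img_a", "label": 0},
    {"image": "img_b", "label": 2},
    {"image": "img_c", "label": 1},
]
toy_images, toy_labels = dataset_to_tuple_sketch(toy_rows)
print(toy_labels)  # [0, 2, 1]
```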

### 3.5 Stratified Train-Validation Split

To keep the class distribution the same in the training and validation sets, we perform a **stratified split** using `train_test_split` from scikit-learn.  
This way, validation metrics are computed on the same class mix the model sees during training, and no class ends up under-represented in either split by chance.

- **`train_imgs`, `val_imgs`** — Image data for training and validation.
- **`train_labels`, `val_labels`** — Corresponding labels for each split.

The split uses a fixed `random_state` for reproducibility and the `stratify` parameter to maintain class proportions.

In [None]:
train_imgs, val_imgs, train_labels, val_labels = train_test_split(
    images, labels,
    test_size = 0.2,
    random_state = 42,
    stratify = labels
)
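
A quick way to confirm that stratification worked is to compare the class proportions of the two splits. The sketch below uses toy label lists in place of `train_labels` and `val_labels`; `class_proportions` is a hypothetical helper introduced only for this check.

```python
from collections import Counter


def class_proportions(label_list):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(label_list)
    total = len(label_list)
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Toy labels standing in for `train_labels` and `val_labels`.
toy_train = [0] * 8 + [1] * 4 + [2] * 4
toy_val = [0] * 2 + [1] * 1 + [2] * 1
print(class_proportions(toy_train))  # {0: 0.5, 1: 0.25, 2: 0.25}
print(class_proportions(toy_val))    # {0: 0.5, 1: 0.25, 2: 0.25}
```

On the real data, the two dictionaries should match closely rather than exactly, because class counts rarely divide evenly by the split ratio.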

### 3.6 Compute Class Weights

To address potential class imbalance in the dataset, we calculate class weights using `compute_class_weight` from scikit-learn.  
These weights are passed to the model so that under-represented classes carry proportionally more weight in the loss, which keeps the majority class from dominating training.

In [None]:
labels = np.array(labels)

class_weight = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(labels),
    y=labels
)

class_weight = torch.tensor(class_weight, dtype=torch.float32)
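
With `class_weight='balanced'`, scikit-learn computes each weight as `n_samples / (n_classes * count_of_that_class)`. The plain-Python sketch below reproduces that formula on a toy label list so the numbers are easy to verify by hand. `balanced_class_weights` is a hypothetical helper, not a scikit-learn function.

```python
from collections import Counter


def balanced_class_weights(label_list):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(label_list)
    n_samples = len(label_list)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * counts[cls]) for cls in sorted(counts)}


toy_y = [0, 0, 0, 1]  # class 0 is three times as frequent as class 1
toy_weights = balanced_class_weights(toy_y)
print(toy_weights)
```

Here class 1 receives a weight of 4 / (2 × 1) = 2.0 and class 0 a weight of 4 / (2 × 3) ≈ 0.67, so the rarer class counts three times as much per sample.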

## 4. Data Module 

### 4.1 Initialize Data Module

The `BrainTumorDataModule` is instantiated to handle data loading, preprocessing, and batching for both training and validation sets.  
This module streamlines the data pipeline, ensuring efficient and reproducible data handling throughout the training process.

- **`train_data`** — Tuple containing training images and labels.
- **`val_data`** — Tuple containing validation images and labels.
- **`batch_size`** — Number of samples per batch during training.
- **`num_worker`** — Number of subprocesses used for data loading.

Using a custom DataModule improves modularity and simplifies integration with PyTorch Lightning.

In [None]:
data_module = BrainTumorDataModule(
    train_data = (train_imgs, train_labels),
    val_data = (val_imgs, val_labels),
    batch_size = 64,
    num_worker = 4
)

## 5. Warm-Up Training Phase

### 5.1 Model Initialization for the Warm-Up Phase

In this step, we initialize a multi-class brain tumor classifier using the DenseNet121 backbone.
For the warm-up phase, all pre-trained layers remain frozen—only the final classifier head can be trained.
This strategy allows the model to adapt its output layer to our specific tumor classes while preserving robust features learned from large-scale datasets (e.g., ImageNet).

Why freeze the backbone?

Freezing the backbone during warm-up:
- Prevents the perturbation of valuable pre-trained features.
- Allows the classifier head to specialize for the new task.
- Stabilizes training before deeper fine-tuning.

Configuration:
- `learning_rate`: Controls the learning rate for the classification head.
- `weight_decay`: Regularization to prevent overfitting.
- `unfreeze_layers`: Set to `None` to keep all backbone layers frozen **(except the classifier).**
- `class_weight`: Handles class imbalance during training.

This initialization establishes the foundation for effective transfer learning and prepares the model for further fine-tuning.

In [None]:
model_warmup = DenseNetClassifierMulticlass(
    learning_rate = 1e-3,
    weight_decay = 1e-5,
    unfreeze_layers = None,
    class_weight = class_weight
)
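
How `DenseNetClassifierMulticlass` interprets `unfreeze_layers` is defined in `module.py` and not shown here. One plausible approach, assumed for this sketch, is to match parameter names by prefix and toggle `requires_grad` accordingly. `apply_freezing` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real parameters so the snippet runs without PyTorch.

```python
from types import SimpleNamespace


def apply_freezing(named_params, unfreeze_layers=None, head_prefix="classifier"):
    """Keep the head and any listed layer prefixes trainable; freeze everything else."""
    prefixes = (head_prefix,) + tuple(unfreeze_layers or ())
    for name, param in named_params:
        param.requires_grad = name.startswith(prefixes)


# Stand-ins for `model.named_parameters()`; names follow torchvision's DenseNet-121 layout.
toy_params = [
    ("features.denseblock1.denselayer1.conv1.weight", SimpleNamespace(requires_grad=True)),
    ("features.denseblock4.denselayer16.conv2.weight", SimpleNamespace(requires_grad=True)),
    ("features.norm5.weight", SimpleNamespace(requires_grad=True)),
    ("classifier.weight", SimpleNamespace(requires_grad=True)),
]

apply_freezing(toy_params, unfreeze_layers=None)  # warm-up: head only
print([name for name, p in toy_params if p.requires_grad])  # ['classifier.weight']
```

The same function accepts the pairs returned by `model.named_parameters()` on a real model, since it only relies on the `requires_grad` attribute.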

### 5.2 Configuring Callbacks

In this step, we set up the **callbacks** that will be used during model training.  
Callbacks in PyTorch Lightning provide a mechanism to inject custom behavior at various stages of the training loop — such as saving checkpoints, early stopping, or scheduling learning rates.

Here, we use the custom function `get_callbacks()` to create and configure the following:

- **Model Checkpointing**  
  Automatically saves the model's weights whenever the monitored metric (`val_loss`) improves.  
  - **`dirpath`**: Path to store checkpoint files.  
  - **`monitor`**: Metric used to decide if a new checkpoint should be saved (`val_loss` in this case).  
  - **`mode`**: Set to `"min"` so that lower values of `val_loss` are considered better.  

- **Early Stopping**  
  Stops training early if the monitored metric does not improve after a defined patience period (`patience=3` here), preventing overfitting and saving time.

> *By modularizing callbacks into a separate function (`get_callbacks()`), we maintain cleaner code and make it easier to reuse and adjust the configuration across multiple experiments.*


In [None]:
callbacks_warmup = get_callbacks(
    dirpath = CHECKPOINT_PATH,
    monitor = 'val_loss',
    mode = 'min',
    patience = 3
)
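
To make the effect of `patience` concrete, the sketch below simulates the stopping rule on a list of validation losses. It is a simplified model of `mode='min'` early stopping (no minimum delta, one check per epoch), not Lightning's actual implementation, and `early_stop_epoch` is a hypothetical helper.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss  # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1    # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


toy_losses = [1.00, 0.80, 0.90, 0.85, 0.81, 0.70]
print(early_stop_epoch(toy_losses, patience=3))  # 4
```

The best loss (0.80) occurs at epoch 1. Epochs 2, 3, and 4 fail to beat it, so training stops at epoch 4 and never reaches the lower loss at epoch 5. This illustrates the trade-off of a small patience value.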

### 5.3 Configure Logger for Warm-Up Phase

To track training progress and save experiment logs, we initialize a logger using the custom `get_logger()` function.  
This logger records metrics, checkpoints, and other relevant information for the warm-up phase.

- **`log_dir`** — Directory where logs will be stored.
- **`name`** — Unique identifier for the logger, useful for organizing multiple experiments.

Consistent logging ensures reproducibility and simplifies analysis of model performance across different training runs.

In [None]:
logger_warmup = get_logger(
    log_dir = f"{PROJECT_PATH}/logs",  # PROJECT_PATH is a string, so build the path explicitly
    name = "best_warmup_model_checkpoint"
)

### 5.4 Configure Trainer for Warm-Up Phase

This cell sets up the **PyTorch Lightning Trainer** which orchestrates the training loop.

Configuration:
- **`max_epochs`**: The maximum number of training epochs.
- **`accelerator`**: Set to `'gpu'` so that training runs on a GPU.
- **`precision`**: `'16-mixed'` enables mixed-precision training for efficiency.
- **`callbacks`**: Includes checkpointing and early stopping to optimize training.
- **`logger`**: Enables logging of metrics to TensorBoard.
- **`log_every_n_steps`**: Logs training metrics every 10 batches for timely monitoring.


In [None]:
trainer_warmup = pl.Trainer(
    max_epochs = 200,
    accelerator = 'gpu',
    precision = '16-mixed',
    callbacks = callbacks_warmup,
    logger = logger_warmup,
    log_every_n_steps = 10,  # Trainer argument is `log_every_n_steps`
    devices = 1              # Trainer argument is `devices`
)

### 5.5 Execute Warm-Up Training

In this step, the training process for the warm-up phase is started using the configured Trainer.

- The model (**`model_warmup`**) is trained with the prepared data module (**`data_module`**).
- The training loop runs for up to **`max_epochs`** epochs or until early stopping criteria are met.
- Training progress, metrics, and checkpoints are automatically handled by the Trainer and callbacks.


In [None]:
trainer_warmup.fit(model_warmup, datamodule = data_module)

## 6. Fine-Tuning Training Phase

### 6.1 Model Initialization for the Fine-Tuning Phase

In this phase, the multi-class brain tumor classifier model is initialized for fine-tuning.

Unlike the warm-up phase, some backbone layers (e.g., `features.denseblock4`, `features.norm5`) will be unfrozen to allow their weights to be updated during training.

**Goals of fine-tuning:**
- Improve feature representation to be more specific to the brain tumor dataset.
- Improve model accuracy and generalization.

**Configuration:**
- `learning_rate`: Smaller to avoid drastic changes to pre-trained weights.
- `weight_decay`: Regularization to prevent overfitting.
- `unfreeze_layers`: List of backbone layers to be unfrozen.
- `class_weight`: Class imbalance handling.

This strategy allows the model to leverage pre-trained knowledge while optimally tailoring features to the brain tumor classification task.

In [None]:
model_finetune = DenseNetClassifierMulticlass(
    learning_rate = 1e-5,
    weight_decay = 1e-6,
    unfreeze_layers = ["features.denseblock4", "features.norm5"],
    class_weight = class_weight  # same class weights as in the warm-up phase
)

### 6.2 Configuring Callbacks for Fine-Tuning Phase

In this step, we set up the **callbacks** for the fine-tuning phase of model training.  
Callbacks in PyTorch Lightning allow us to automate important tasks such as saving model checkpoints and stopping training early when improvements plateau.

We use the custom `get_callbacks()` function to configure:

- **Model Checkpointing**  
    Automatically saves the model's weights whenever the monitored metric (`val_loss`) improves during fine-tuning.  
    - **`dirpath`**: Directory to store checkpoint files.  
    - **`monitor`**: Metric used to determine if a new checkpoint should be saved (`val_loss`).  
    - **`mode`**: `"min"` so that lower values of `val_loss` are considered better.  

- **Early Stopping**  
    Stops training if `val_loss` does not improve after a set patience period (`patience=3`), helping to prevent overfitting and save resources.

> *By modularizing callbacks into a separate function (`get_callbacks()`), we keep the code clean and make it easy to reuse or adjust callback settings for different training phases.*

In [None]:
callbacks_finetune = get_callbacks(
    dirpath = CHECKPOINT_PATH,
    monitor = 'val_loss',
    mode = 'min',
    patience = 3
)

### 6.3 Configure Logger for Fine-Tuning Phase

To monitor and record the training progress during the fine-tuning phase, we initialize a logger using the custom `get_logger()` function.  
This logger will save experiment logs, metrics, and checkpoints specific to the fine-tuning process.

- **`log_dir`** — Directory where logs for the fine-tuning phase will be stored.
- **`name`** — Unique identifier for the logger, helping organize and distinguish between different training runs.

Consistent logging during fine-tuning ensures reproducibility and simplifies the analysis and comparison of model performance across different experiments.

In [None]:
logger_finetune = get_logger(
    log_dir = f"{PROJECT_PATH}/logs",  # PROJECT_PATH is a string, so build the path explicitly
    name = "best_finetune_model_checkpoint"
)

### 6.4 Configure the Trainer for Fine-Tuning

In this step, we configure the PyTorch Lightning Trainer for fine-tuning.

Configuration:
- **`max_epochs`**: Maximum number of training epochs.
- **`accelerator`**: Set to `'gpu'` so that training runs on a GPU.
- **`precision`**: Mixed precision (`'16-mixed'`) for training efficiency.
- **`callbacks`**: Includes checkpointing and early stopping to optimize training.
- **`logger`**: Logs metrics to TensorBoard.
- **`log_every_n_steps`**: Logs every 10 batches for better monitoring.

This trainer will run the fine-tuning process on the model with configurations adjusted to improve performance on the multi-class brain tumor dataset.

In [None]:
trainer_finetune = pl.Trainer(
    max_epochs = 200,
    accelerator = 'gpu',
    precision = '16-mixed',
    callbacks = callbacks_finetune,
    logger = logger_finetune,
    log_every_n_steps = 10,  # Trainer argument is `log_every_n_steps`
    devices = 1              # Trainer argument is `devices`
)

### 6.5 Execute Fine-Tuning Training

In this step, the training process for the fine-tuning phase is run using the configured Trainer.

- The model (**`model_finetune`**) will be trained using the prepared data module (**`data_module`**).
- The training process runs until **`max_epochs`** is reached or until the early stopping criterion is met.
- Training progress, metrics, and checkpoints will be automatically handled by the Trainer and callbacks.

The purpose of this fine-tuning phase is to optimize the model's feature representation to be more specific to the multiclass brain tumor dataset.

In [None]:
trainer_finetune.fit(model_finetune, datamodule=data_module)

## 7. Load the Best Checkpoint and Save the Final Model

### 7.1 Load Best Model Checkpoint

After training, we retrieve the path to the best model checkpoint saved during fine-tuning.  
This checkpoint contains the weights of the model that achieved the lowest validation loss.

- **Print Best Checkpoint Path:**  
    Display the file path of the best checkpoint for reference and verification.

- **Assign Best Checkpoint Path:**  
    Store the best checkpoint path in a variable for subsequent loading and inference.

This step ensures that we use the best-performing model for evaluation and deployment.

In [None]:
print(callbacks_finetune[0].best_model_path)

In [None]:
best_checkpoint_model_path = callbacks_finetune[0].best_model_path

### 7.2 Load Best Model and Save Final Weights

After identifying the best checkpoint from the fine-tuning phase, we proceed with the following steps:

- **Load Best Model from Checkpoint:**  
    The model is reconstructed using the weights stored in the best checkpoint file (`best_checkpoint_model_path`).  
    This ensures that all parameters reflect the optimal state achieved during training.

- **Save Final Model Weights:**  
    The loaded model's weights are saved as a `.pth` file in the project directory.  
    This file can be used for future inference, deployment, or further experimentation.

By saving the final model weights, we simplify downstream usage for research and further experimentation.

In [None]:
best_model = DenseNetClassifierMulticlass.load_from_checkpoint(best_checkpoint_model_path)

In [None]:
torch.save(best_model.state_dict(), f"{PROJECT_PATH}/best_ft_braTS_multiclass.pth")

## Conclusion

> This notebook marks a significant milestone in my personal learning journey as I transitioned from standard PyTorch to PyTorch Lightning.

Through this experience, I gained valuable insights into:
- How to manage deep learning projects modularly for increased clarity and maintainability.
- The practical benefits of PyTorch Lightning in simplifying training workflows, including built-in support for checkpointing, logging, and callbacks.
- Implementing a two-phase training strategy (warm-up and fine-tune) to effectively adapt pre-trained models to new tasks.
- Recognizing and correcting previous errors in the binary runner, resulting in cleaner logic and more reliable metric tracking.

> While focused on model training and storage, this notebook lays a solid foundation for the future evaluation and inference phases, which will be handled separately to keep the workflow clean and manageable.

This hands-on exploration not only deepened my understanding of deep learning engineering best practices but also established a professional and reproducible body of work that can be extended or adapted for other projects and users.