# Binary Brain Tumor Classifier — Runner Notebook

> **Purpose:** This notebook serves as the execution script for the **binary brain tumor classification model**.  
> All core components (model architecture, data processing, training pipeline) are defined in separate Python modules to maintain a clean and modular project structure.


## Overview

This notebook is part of the **Brain Tumor AI** project, focusing on **binary classification** of medical images (tumor vs. no tumor).  
It is designed to:
- Load and configure the modular components (model, data module, transforms, helpers, callbacks, loggers).
- Execute the training process using **PyTorch Lightning**.
- Save the trained model for inference.

By separating logic into `.py` files, the project ensures:
- **Reusability:** Components can be reused across multiple experiments.
- **Maintainability:** Easier debugging and updates.
- **Clarity:** The notebook focuses on workflow and results, not implementation details.


> **Note:** This project is for learning and portfolio purposes only — not for clinical use.


## 1. Install Dependencies & Import Libraries

### 1.1 Install Dependencies
Install the required packages to ensure the notebook runs without missing dependencies.

- **`datasets`** — Dataset handling and loading utilities.  
- **`fsspec`** — File system interface for remote/local storage.  
- **`pytorch-lightning`** — High-level PyTorch framework for training.  
- **`tensorboard`** — Visualization of training logs.  
- **`albumentations`** — Advanced image augmentation library.  
- **`torchmetrics`** — Standardized metrics for PyTorch.

> Skip this step if the environment already has these packages installed.


In [None]:
!pip install -q -U datasets fsspec pytorch-lightning tensorboard albumentations torchmetrics

### 1.2 Import Required Libraries

Below are the required libraries and modules used in this notebook (`numpy` is also imported alongside them for array handling):

- **os, sys** — For file and system path handling.
- **torch** — PyTorch core library for deep learning operations.
- **pytorch_lightning** — High-level wrapper for PyTorch to simplify training loops.
- **TensorBoardLogger** — For logging training metrics to TensorBoard.
- **scikit-learn (train_test_split)** — For dataset splitting.
- **google.colab.drive** — To mount Google Drive and access stored datasets/models.
- **datasets.load_dataset** — To load datasets in various formats from the Hugging Face Datasets library.


In [None]:
import os
import sys
import torch
import numpy as np
import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger

from sklearn.model_selection import train_test_split

from google.colab import drive

from datasets import load_dataset

**Mount Google Drive**  
Call `drive.mount('/content/drive')` to connect the Colab environment to your Google Drive, so that files can be saved and accessed persistently across sessions.



In [3]:
drive.mount('/content/drive')

Mounted at /content/drive
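
The mount call above only works inside Google Colab. As a minimal sketch, assuming the notebook might also be opened in a local environment, the mount can be guarded by a small check. `in_colab` is a hypothetical helper introduced here for illustration and is not part of the project modules.

```python
def in_colab() -> bool:
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only present on Colab runtimes)
    except ImportError:
        return False
    return True


if in_colab():
    from google.colab import drive
    drive.mount('/content/drive')
else:
    print("Not running in Colab; skipping the Drive mount.")
```

Outside Colab, the Drive-based paths used later would need to point at a local directory instead.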


## 2. Define Project Paths and Import Custom Modules

### 2.1 Configure Directory Paths  
Here, we define the key directory paths used throughout the project:  

- **`CHECKPOINT_PATH`** — Location where model checkpoints will be saved and loaded from.  
- **`PROJECT_PATH`** — Root path of the project, used as a base reference for file operations.  
- **`SAVE_PATH`** — Directory for storing final outputs, such as trained models.  

The `PROJECT_PATH` is appended to `sys.path` to make sure Python can locate and import the project modules without issues.


In [4]:
CHECKPOINT_PATH = "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Binary/checkpoint"

PROJECT_PATH = "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Binary"

SAVE_PATH = "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Binary/save_models"
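
Since both other paths sit under `PROJECT_PATH`, they could also be derived from it so that only one string needs to change when the project moves. The sketch below is one possible way to do that; `build_paths` is a hypothetical helper, and the subfolder names simply mirror the ones used in the cell above.

```python
import os


def build_paths(project_path: str) -> dict:
    """Derive the checkpoint and save directories from the project root."""
    return {
        "PROJECT_PATH": project_path,
        "CHECKPOINT_PATH": os.path.join(project_path, "checkpoint"),
        "SAVE_PATH": os.path.join(project_path, "save_models"),
    }


paths = build_paths(
    "/content/drive/MyDrive/MyProject/brain-tumor-ai/Models/2D_Classifier_Binary"
)
print(paths["CHECKPOINT_PATH"])
print(paths["SAVE_PATH"])
```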

In [5]:
if PROJECT_PATH not in sys.path:
    sys.path.append(PROJECT_PATH)

### 2.2 Import Custom Modules

This section imports the custom Python modules that define the model architecture, data pipeline, training callbacks, and helper functions.  
By keeping these components in separate files, the project maintains a clean and modular structure.

- **DenseNetClassifierBinary** → Custom PyTorch Lightning model for binary brain tumor classification.  
- **BrainTumorDataModule** → Handles data loading, preprocessing, and batching using PyTorch Lightning's DataModule structure.  
- **get_callbacks** → Retrieves predefined training callbacks such as model checkpointing and early stopping.  
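- **set_seed** → Utility function to ensure reproducibility across runs.
- **hf_dataset_to_tuple** → Utility function that converts a Hugging Face dataset split into an `(images, labels)` tuple.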


In [6]:
from module import DenseNetClassifierBinary
from datamodule import BrainTumorDataModule
from callbacks import get_callbacks
from utils import set_seed, hf_dataset_to_tuple

## 3. Define Seed, Load and Prepare Raw Dataset


### 3.1 Set Random Seed

To ensure reproducibility of results, a fixed random seed is set at the beginning of the data preparation process.  
By setting the seed, all operations involving randomness (such as data shuffling, train-test splitting, and weight initialization) will produce the same outcome each time the notebook is executed. This step is crucial for debugging and for achieving consistent experimental results.


In [7]:
set_seed(42)
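
The implementation of `set_seed` lives in `utils.py` and is not shown here. As a rough sketch of what such a helper typically does, the hypothetical `set_seed_sketch` below seeds Python's `random` module and, when they are installed, NumPy and PyTorch. It is an assumption for illustration, not the project's actual code.

```python
import os
import random


def set_seed_sketch(seed: int) -> None:
    """Seed the common random number generators (stand-in for utils.set_seed)."""
    # Only affects hash randomization in subprocesses started after this point
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch is not installed


set_seed_sketch(42)
first = random.random()
set_seed_sketch(42)
second = random.random()
print(first == second)  # True: the same seed reproduces the same draw
```

PyTorch Lightning also ships `pl.seed_everything(42)`, which covers similar ground in one call.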

### 3.2 Load Dataset from Hugging Face

We load the dataset directly using the `datasets` library. The dataset contains labeled 2D brain MRI scans across two classes: **tumor** and **no tumor**.

In [None]:
ds = load_dataset("Cayanaaa/BrainTumorDatasets", name="binary")

### 3.3 View Class Label Mapping

This command reveals the label names and their corresponding integer encodings used internally by the dataset.


In [None]:
print(ds['train'].features['label'].names)
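
A `ClassLabel` feature encodes each class name as its position in the `names` list. The sketch below builds lookup tables in both directions. The two names are placeholders chosen for illustration; the real list is whatever the cell above prints.

```python
# Placeholder names for illustration only; use the list printed above instead.
names = ["no_tumor", "tumor"]

# Each name is encoded as its index in the list
label2id = {name: idx for idx, name in enumerate(names)}
id2label = {idx: name for name, idx in label2id.items()}

print(label2id)
print(id2label[1])
```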

### 3.4 Extract the Training Split from the Hugging Face Dataset

We select the `train` split, which holds the raw image and label pairs used in the following steps.


In [10]:
train_data = ds['train']

### 3.5 Convert the Hugging Face Dataset to a Tuple

In [11]:
images, labels = hf_dataset_to_tuple(train_data, image_key='image', label_key='label')

In [None]:
type(images), type(labels)
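
The body of `hf_dataset_to_tuple` is defined in `utils.py` and is not reproduced in this notebook. The sketch below shows one plausible shape for such a conversion, assuming each row behaves like a dictionary, which is how rows appear when iterating over a Hugging Face dataset. `dataset_to_tuple` is a hypothetical name, and the real helper may additionally convert images to arrays.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def dataset_to_tuple(
    records: Iterable[Mapping[str, Any]],
    image_key: str = "image",
    label_key: str = "label",
) -> Tuple[List[Any], List[int]]:
    """Collect images and labels from an iterable of dict-like rows."""
    collected_images, collected_labels = [], []
    for record in records:
        collected_images.append(record[image_key])
        collected_labels.append(int(record[label_key]))
    return collected_images, collected_labels


# Toy rows standing in for dataset records (strings stand in for image objects)
toy_rows = [{"image": "img_a", "label": 0}, {"image": "img_b", "label": 1}]
toy_images, toy_labels = dataset_to_tuple(toy_rows)
print(toy_images, toy_labels)
```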

### 3.6 Stratified Train-Validation Split

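To keep the class distribution the same in the training and validation sets, we perform a stratified split by passing `stratify = labels`. This prevents either subset from over- or under-representing a class relative to the full dataset. With `test_size = 0.2`, 20% of the samples are held out for validation.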


In [13]:
train_imgs, val_imgs, train_labels, val_labels = train_test_split(
    images, labels,
    test_size = 0.2,
    random_state = 42,
    stratify = labels
)

In [None]:
type(train_imgs), type(train_labels)
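
A quick way to confirm that the split is stratified is to compare the class proportions of both subsets. The sketch below uses toy label lists so that it runs on its own; `class_proportions` is a hypothetical helper, and in the notebook it would be called with `train_labels` and `val_labels`.

```python
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in sorted(counts.items())}


# Toy labels mimicking an 80/20 stratified split of a 3:1 class ratio
toy_train = [0] * 60 + [1] * 20
toy_val = [0] * 15 + [1] * 5
print(class_proportions(toy_train))  # {0: 0.75, 1: 0.25}
print(class_proportions(toy_val))    # {0: 0.75, 1: 0.25}
```

Matching proportions in both outputs indicate that stratification worked as intended.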

### 3.7 Initialize Data Module

We instantiate the **`BrainTumorDataModule`** with the prepared training and validation datasets.  
This module handles data loading, preprocessing, and batching automatically during training and validation.

**Parameters:**
- **`train_data`** & **`val_data`** — Tuples containing image tensors and corresponding labels.
- **`batch_size`** — Number of samples per batch during training/validation.
- **`img_size`** — Target spatial size for resizing images before feeding them into the model.
- **`num_workers`** — Number of subprocesses to use for data loading to speed up I/O operations.

By using a **`LightningDataModule`**, we ensure a clean separation between the **data pipeline** and the **model logic**, improving code maintainability and reusability.


In [15]:
data_module = BrainTumorDataModule(
    train_data = (train_imgs, train_labels),
    val_data = (val_imgs, val_labels),
    batch_size = 64,
    img_size = (224, 224),
    num_workers = 4
)

## 4. Warm-Up Training Phase


### 4.1 Model Initialization for the Warm-Up Phase

In this step, we initialize a **binary classifier** model using `DenseNet121` as the backbone.  
During the **warm-up phase**, all pre-trained layers remain **frozen** to preserve the features learned from ImageNet.  
Only the final **classification head** is trained, which helps stabilize the training process before fine-tuning deeper layers.

**Configuration for this phase:**
- **`learning_rate:`** `1e-3` — relatively high for faster convergence on the new classification head.
- **`weight_decay:`** `1e-5` — small *L2 regularization* to prevent overfitting.
- **`unfreeze_layers:`** `None` — ensures that only the classification head can be trained.

> **Note:** Selected backbone layers will be unfrozen in the fine-tuning phase so that deeper features can adapt to the new task.


In [None]:
model_warmup = DenseNetClassifierBinary(
    learning_rate = 1e-3,
    weight_decay = 1e-5,
    unfreeze_layers = None
)

### 4.2 Configuring Callbacks

In this step, we set up the **callbacks** that will be used during model training.  
Callbacks in PyTorch Lightning provide a mechanism to inject custom behavior at various stages of the training loop — such as saving checkpoints, early stopping, or scheduling learning rates.

Here, we use the custom function `get_callbacks()` to create and configure the following:

- **Model Checkpointing**  
  Automatically saves the model's weights whenever the monitored metric (`val_loss`) improves.  
  - **`dirpath`**: Path to store checkpoint files.  
  - **`monitor`**: Metric used to decide if a new checkpoint should be saved (`val_loss` in this case).  
  - **`mode`**: Set to `"min"` so that lower values of `val_loss` are considered better.  

- **Early Stopping**  
  Stops training early if the monitored metric does not improve after a defined patience period (`patience=3` here), preventing overfitting and saving time.

> *By modularizing callbacks into a separate function (`get_callbacks()`), we maintain cleaner code and make it easier to reuse and adjust the configuration across multiple experiments.*


In [17]:
callbacks_warmup = get_callbacks(
    dirpath = CHECKPOINT_PATH,
    monitor = 'val_loss',
    mode = 'min',
    patience = 3
)
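
To make the effect of `patience = 3` concrete, the sketch below simulates the early-stopping rule on a made-up sequence of validation losses. It is a plain-Python illustration of the patience logic, not Lightning's own implementation, and `early_stop_epoch` is a hypothetical name.

```python
def early_stop_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement: remember the new best value and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up losses: the best value (0.50) is reached at epoch 2
losses = [0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.49]
print(early_stop_epoch(losses, patience=3))  # 5
```

In this toy run, training stops at epoch 5 and never reaches the lower loss at epoch 6, which is the trade-off a small patience value makes.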

### 4.3 Setup TensorBoard Logger for Warm-Up Phase

In this step, we initialize the **TensorBoard logger** to track and visualize training metrics during the warm-up phase.

- **`save_dir`** specifies the root directory where logs will be stored.  
- Logs are saved inside a subfolder named `"best_warmup_model"` (the logger's `name` argument), which keeps warm-up logs separate from other phases.  
- This setup enables detailed monitoring of key metrics such as loss, accuracy, and learning rate using TensorBoard’s interactive web interface.

> **Note:** The magic command `%load_ext tensorboard` must be run once per notebook session to enable TensorBoard integration.  
> After that, `%tensorboard --logdir <log directory>` can be run as often as needed to open the TensorBoard UI without reloading the extension.


In [None]:
logger_warmup = TensorBoardLogger(
    save_dir = os.path.join(SAVE_PATH, "logs"),
    name = "best_warmup_model"
)
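
`TensorBoardLogger` writes each run to `save_dir/name/version_N`, where `N` increases with every new run. The sketch below only builds that path as a string to show where to look for the event files. `tensorboard_run_dir` is a hypothetical helper, and `/path/to/save_models` is a placeholder for `SAVE_PATH`.

```python
import os


def tensorboard_run_dir(save_dir: str, name: str, version: int = 0) -> str:
    """Build the directory a TensorBoardLogger run is expected to write to."""
    return os.path.join(save_dir, name, f"version_{version}")


log_root = os.path.join("/path/to/save_models", "logs")
print(tensorboard_run_dir(log_root, "best_warmup_model"))
```

Pointing `%tensorboard --logdir` at the `logs` root rather than a single run directory shows the runs of all phases side by side.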

### 4.4 Configure Trainer for Warm-Up Phase

This cell sets up the **PyTorch Lightning Trainer** which orchestrates the training loop.

Key parameters:
- **`max_epochs`**: The maximum number of training epochs.
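- **`accelerator`**: Set to `'gpu'`, so this cell requires a GPU runtime.
- **`devices`**: Number of devices to train on (a single GPU here).
- **`precision`**: `'16-mixed'` enables 16-bit mixed-precision training, which reduces memory use and can speed up training on supported GPUs.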
- **`callbacks`**: Includes checkpointing and early stopping to optimize training.
- **`logger`**: Enables logging of metrics to TensorBoard.
- **`log_every_n_steps`**: Logs training metrics every 10 batches for timely monitoring.


In [None]:
trainer_warmup = pl.Trainer(
    max_epochs = 100,
    accelerator = 'gpu',
    precision='16-mixed',
    callbacks = callbacks_warmup,
    logger = logger_warmup,
    log_every_n_steps = 10,
    devices=1
)
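
Because `accelerator = 'gpu'` fails on a CPU-only runtime, one option is to choose the hardware-related arguments from a single flag. The sketch below is an assumption about how that could be organized, not part of the project. `trainer_kwargs` is a hypothetical helper, and the precision strings follow the Lightning 2.x naming already used in the cell above.

```python
def trainer_kwargs(has_gpu: bool, max_epochs: int = 100) -> dict:
    """Pick accelerator, device count, and precision from GPU availability."""
    return {
        "max_epochs": max_epochs,
        "accelerator": "gpu" if has_gpu else "cpu",
        "devices": 1,
        # Mixed precision mainly pays off on GPUs; use full precision on CPU
        "precision": "16-mixed" if has_gpu else "32-true",
        "log_every_n_steps": 10,
    }


print(trainer_kwargs(has_gpu=False))
```

In the notebook, this could be used as `pl.Trainer(**trainer_kwargs(torch.cuda.is_available()), callbacks=callbacks_warmup, logger=logger_warmup)`, which would also keep both training phases on the same settings.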

### 4.5 Execute Warm-Up Training

In this step, the training process for the warm-up phase is started using the configured Trainer.

- The model (**`model_warmup`**) is trained with the prepared data module (**`data_module`**).
- The training loop runs for up to **`max_epochs`** epochs or until early stopping criteria are met.
- Training progress, metrics, and checkpoints are automatically handled by the Trainer and callbacks.
- TensorBoard is not launched by `fit()` itself; start it with `%tensorboard --logdir` pointing at the log directory before training begins to monitor metrics in near real time.


In [None]:
trainer_warmup.fit(model_warmup, datamodule=data_module)

## 5. Fine-Tuning Phase

### 5.1 Model Initialization for the Fine-Tuning Phase

In this phase, we initialize the binary classifier model with selective layer unfreezing to allow fine-tuning.

- **`learning_rate:`** Lowered to **`1e-5`** for more precise updates and to avoid disrupting previously learned features.
- **`weight_decay:`** Reduced to **`1e-6`** for minimal regularization, allowing more flexibility during fine-tuning.
- **`unfreeze_layers:`** Specific layers such as **`features.denseblock4`** and **`features.norm5`** are unfrozen to enable gradient updates, while other layers remain frozen.
  
> This strategy allows the model to adapt deeper feature representations to the new task while maintaining stability in earlier layers.


In [None]:
model_finetune = DenseNetClassifierBinary(
    learning_rate = 1e-5,
    weight_decay = 1e-6,
    unfreeze_layers = ["features.denseblock4", "features.norm5"]
)
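
How `unfreeze_layers` is applied is defined in `module.py` and not shown here. A common approach is to match parameter names against the given prefixes, and the sketch below illustrates that idea in plain Python. It assumes parameter names follow torchvision's DenseNet121 layout and that the head lives under `classifier`. `is_trainable` is a hypothetical helper.

```python
def is_trainable(param_name, unfreeze_layers=None, head_prefix="classifier"):
    """Decide from its name whether a parameter should receive gradients."""
    # The head is always trainable; listed layers are trainable on request
    prefixes = [head_prefix] + list(unfreeze_layers or [])
    return any(
        param_name == prefix or param_name.startswith(prefix + ".")
        for prefix in prefixes
    )


# Parameter names in the style of torchvision's DenseNet121
param_names = [
    "features.conv0.weight",
    "features.denseblock4.denselayer1.conv1.weight",
    "features.norm5.weight",
    "classifier.weight",
]
unfreeze = ["features.denseblock4", "features.norm5"]
for name in param_names:
    print(name, is_trainable(name, unfreeze))
```

With `unfreeze_layers = None`, only `classifier.weight` would be reported as trainable, which corresponds to the warm-up configuration.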

### 5.2 Configuring Callbacks for Fine-Tuning

In this step, we set up the **callbacks** that will be used during the **fine-tuning** phase.  
Callbacks in PyTorch Lightning provide a mechanism to inject custom behavior at various stages of the training loop — such as saving checkpoints, early stopping, or scheduling learning rates.

Here, we use the custom function `get_callbacks()` to create and configure the following:

- **Model Checkpointing**  
  Automatically saves the model's weights whenever the monitored metric (`val_loss`) improves.  
  - **`dirpath`**: Path to store checkpoint files (defined by `CHECKPOINT_PATH`).  
  - **`monitor`**: Metric used to decide if a new checkpoint should be saved (`val_loss` in this case).  
  - **`mode`**: Set to `"min"` so that lower values of `val_loss` are considered better.

- **Early Stopping**  
  Stops training early if the monitored metric does not improve after a defined patience period (`patience=3` here), preventing overfitting and saving time.

> *By modularizing callbacks into a separate function (`get_callbacks()`), we maintain cleaner code and make it easier to reuse and adjust the configuration across multiple experiments.*


In [23]:
callbacks_finetune = get_callbacks(
    dirpath = CHECKPOINT_PATH,
    monitor = 'val_loss',
    mode = 'min',
    patience = 3
)

### 5.3 Setup TensorBoard Logger for Fine-Tuning Phase

This step initializes the **TensorBoard logger** to track and visualize training metrics during the fine-tuning phase.  

- **`save_dir`** specifies the root directory where logs will be stored.  
- Logs are saved inside a subfolder named `"best_finetune_model"` (the logger's `name` argument), which keeps fine-tuning logs separate from other phases.

Using TensorBoard enables easy monitoring of key metrics such as loss, accuracy, and learning rate through an interactive web interface.


In [24]:
logger_finetune = TensorBoardLogger(
    save_dir = os.path.join(SAVE_PATH, "logs"),
    name = "best_finetune_model"
)

### 5.4 Configure Trainer for Fine-Tuning Phase

This cell sets up the **PyTorch Lightning Trainer** which orchestrates the training loop.

Key parameters:
- **`max_epochs`**: The maximum number of training epochs.
- **`accelerator`**: Automatically selects the best available device (GPU/CPU).
- **`callbacks`**: Includes checkpointing and early stopping to optimize training.
- **`logger`**: Enables logging of metrics to TensorBoard.
- **`log_every_n_steps`**: Logs training metrics every 10 batches for timely monitoring.


In [None]:
trainer_finetune = pl.Trainer(
    max_epochs = 100,
    accelerator = 'auto',
    callbacks = callbacks_finetune,
    logger = logger_finetune,
    log_every_n_steps = 10
)

### 5.5 Execute Fine-Tuning Training

In this step, the training process for the fine-tuning phase is started using the configured Trainer.

- The model (**`model_finetune`**) is trained with the prepared data module (**`data_module`**).
- The training loop runs for up to **`max_epochs`** epochs or until early stopping criteria are met.
- Training progress, metrics, and checkpoints are automatically handled by the Trainer and callbacks.
- TensorBoard is not launched by `fit()` itself; start it with `%tensorboard --logdir` pointing at the log directory before training begins to monitor metrics in near real time.


In [None]:
trainer_finetune.fit(model_finetune, datamodule=data_module)

## 6. Save and Export Fine-Tuned Model


### 6.1 Load Best Checkpoint After Fine-Tuning

After completing the fine-tuning process, we retrieve the path of the **best checkpoint** automatically saved by the `ModelCheckpoint` callback.

- `callbacks_finetune[0].best_model_path` accesses the best checkpoint based on the monitored metric (`val_loss` in this case).
- Using PyTorch Lightning's `load_from_checkpoint` method, the model is reloaded with weights from this best checkpoint.
- This loaded model can then be saved as a `.pth` file for easy storage and future use.
- The saved `.pth` model file can be used later for **evaluation** and **inference** without needing to retrain or reload the entire checkpoint.

This workflow ensures a clean separation between training, model saving, and later deployment or analysis.


In [28]:
best_checkpoint_path = callbacks_finetune[0].best_model_path

In [None]:
best_model = DenseNetClassifierBinary.load_from_checkpoint(best_checkpoint_path)

### 6.2 Save Fine-Tuned Model

In this step, we save the fine-tuned model's weights:

- **Save Model Weights**  
  The fine-tuned model's parameters (weights) are saved as a `.pth` file using `torch.save()`.  
  - `best_model.state_dict()` extracts the model's state dictionary containing all learnable parameters.  
  - The file is saved at the specified `PROJECT_PATH` with the name `best_ft_braTS_binary.pth`.  
  - Saving the model weights separately allows lightweight storage and easy loading for future inference or evaluation without the full training checkpoint overhead.


In [31]:
torch.save(best_model.state_dict(), f"{PROJECT_PATH}/best_ft_braTS_binary.pth")
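
For later inference, a saved `state_dict` has to be loaded into a freshly constructed model with the same architecture. The sketch below shows that round trip with a tiny `nn.Linear` stand-in and a temporary file so that it runs on its own. In the project, the stand-in would be replaced by `DenseNetClassifierBinary` and the path by the `.pth` file saved above.

```python
import os
import tempfile

import torch
from torch import nn

# Tiny stand-in model; in the notebook this would be DenseNetClassifierBinary
model = nn.Linear(4, 1)

with tempfile.TemporaryDirectory() as tmp_dir:
    weights_path = os.path.join(tmp_dir, "demo_weights.pth")
    torch.save(model.state_dict(), weights_path)

    # Rebuild the same architecture, then load the saved parameters into it
    restored = nn.Linear(4, 1)
    state_dict = torch.load(weights_path, map_location="cpu")
    restored.load_state_dict(state_dict)
    restored.eval()  # inference mode (matters for dropout and batch norm)

print(torch.equal(model.weight, restored.weight))  # True
```

Passing `map_location="cpu"` lets weights saved on a GPU runtime load on a machine without one.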

## Conclusion

> This notebook marks an important milestone in my personal learning journey as I transition from vanilla PyTorch to PyTorch Lightning.  

Through this experience, I have gained valuable insights into:  
- How to organize deep learning projects modularly for improved clarity and maintainability.  
- The practical benefits of PyTorch Lightning in simplifying training workflows, including built-in support for checkpointing, logging, and callbacks.  
- Implementing a two-phase training strategy (warm-up and fine-tuning) to effectively adapt pre-trained models to new tasks.  
- Using TensorBoard for real-time monitoring and Google Drive for seamless model persistence in cloud environments.

> While focused on training and model saving, this notebook lays a solid foundation for future evaluation and inference stages, which will be handled separately to keep workflows clean and manageable.

This hands-on exploration not only deepens my understanding of deep learning engineering best practices but also builds a professional and reproducible pipeline that can be extended or adapted for other projects and users.
