# ODIR-5K Dataset Loader

## Objective
This notebook defines the PyTorch data pipeline used by all models.
It:
- Loads frozen `train.csv` and `val.csv`
- Implements a custom PyTorch `Dataset`
- Applies image preprocessing and augmentation
- Builds CUDA-compatible `DataLoader` objects

This layer is purely **engineering**: it reads the frozen splits as-is and does not change labels or split membership.

## Section 1 - Import Required Libraries and Paths

In [2]:
import pandas as pd
import numpy as np
from pathlib import Path

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

# Display options
pd.set_option("display.max_columns", None)

In [3]:
# Detect CUDA
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(f"Using device: {device}")
if device.type == "cuda":
    print(f"GPU: {torch.cuda.get_device_name(0)}")

Using device: cuda
GPU: NVIDIA GeForce RTX 2050


In [4]:
# Resolve project root (notebooks → project root)
PROJECT_ROOT = Path.cwd().parent

DATA_DIR = PROJECT_ROOT / "data"
PROCESSED_DIR = DATA_DIR / "processed"

TRAIN_CSV = PROCESSED_DIR / "train.csv"
VAL_CSV = PROCESSED_DIR / "val.csv"

IMAGE_ROOT = DATA_DIR / "ODIR-5K" / "ODIR-5K"

assert TRAIN_CSV.exists(), "train.csv not found!"
assert VAL_CSV.exists(), "val.csv not found!"

In [5]:
# Load CSV Files
train_df = pd.read_csv(TRAIN_CSV)
val_df = pd.read_csv(VAL_CSV)

print(f"Training samples: {len(train_df)}")
print(f"Validation samples: {len(val_df)}")

train_df.head()

Training samples: 5600
Validation samples: 1400


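| Unnamed: 0 | image_name | eye | split | N | D | G | C | A | H | M | O |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 741_left.jpg | left | train | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 3150_right.jpg | right | train | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 3419_right.jpg | right | train | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 4063_left.jpg | left | train | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 4607_right.jpg | right | train | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |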


In [6]:
# Label Columns
label_cols = ['N', 'D', 'G', 'C', 'A', 'H', 'M', 'O']
NUM_CLASSES = len(label_cols)
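
A quick integrity check on the label columns can catch a malformed CSV before training. The sketch below rebuilds three rows from the preview above in a small frame (`sample_df`, `is_binary`, `labels_per_image`, and `class_counts` are illustrative names, not part of the notebook) and checks that every label is 0 or 1. In the notebook, the same expressions would presumably be run on `train_df` and `val_df`.

```python
import pandas as pd

label_cols = ["N", "D", "G", "C", "A", "H", "M", "O"]

# Three rows copied from the train_df preview; sample_df itself is illustrative.
sample_df = pd.DataFrame(
    [
        ["741_left.jpg", 0, 0, 0, 0, 0, 0, 1, 0],
        ["3150_right.jpg", 1, 0, 0, 0, 0, 0, 0, 0],
        ["4063_left.jpg", 0, 1, 0, 0, 0, 0, 0, 0],
    ],
    columns=["image_name"] + label_cols,
)

# Every label cell should be exactly 0 or 1.
is_binary = bool(sample_df[label_cols].isin([0, 1]).all().all())

# Number of positive labels per image (a multi-label row may have more than one).
labels_per_image = sample_df[label_cols].sum(axis=1)

# Number of positive samples per class.
class_counts = sample_df[label_cols].sum(axis=0)

print("All labels binary:", is_binary)
print(class_counts)
```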

## Section 2 - Image Transforms

We use ImageNet-compatible normalization because all models use ImageNet-pretrained backbones. Augmentation is applied **only to training data**.

In [7]:
# ImageNet normalization stats
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
    transforms.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
])

val_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
])
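
`transforms.Normalize` applies `(x - mean) / std` per channel, so normalized tensors no longer look like natural images when plotted. A minimal sketch of that arithmetic and its inverse follows. `normalize` and `denormalize` are hypothetical helpers written for illustration, not torchvision functions, and the random tensor stands in for a real `ToTensor()` output.

```python
import torch

# Same ImageNet statistics as in the transforms above.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize(image: torch.Tensor) -> torch.Tensor:
    """Per-channel (x - mean) / std for a (3, H, W) tensor with values in [0, 1]."""
    mean = torch.tensor(IMAGENET_MEAN).view(3, 1, 1)
    std = torch.tensor(IMAGENET_STD).view(3, 1, 1)
    return (image - mean) / std


def denormalize(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: invert normalize() so the image can be displayed."""
    mean = torch.tensor(IMAGENET_MEAN).view(3, 1, 1)
    std = torch.tensor(IMAGENET_STD).view(3, 1, 1)
    return (image * std + mean).clamp(0.0, 1.0)


pixels = torch.rand(3, 224, 224)  # stand-in for a ToTensor() output
restored = denormalize(normalize(pixels))
```

To display a restored image with matplotlib, move the channel axis last first, for example `restored.permute(1, 2, 0)`.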

## Section 3 - Custom PyTorch Dataset

This dataset:
- Loads fundus images
- Returns `(image_tensor, multi_label_vector)`
- Is reusable across all model architectures

## Create Dataset Objects

In [8]:
# Dataset Class
class ODIRDataset(Dataset):
    def __init__(self, dataframe, image_root, transform=None):
        # Reset index to ensure clean integer indexing
        self.df = dataframe.reset_index(drop=True)

        # Root directory containing 'Training Images'
        self.image_root = image_root

        # Image transformations
        self.transform = transform

        # Cache labels for faster access (micro-optimization)
        self.labels = self.df[label_cols].values.astype(np.float32)

        # Cache image names
        self.image_names = self.df["image_name"].values

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        # Resolve image path (train and val images are both read from 'Training Images')
        image_path = self.image_root / "Training Images" / self.image_names[idx]

        # Load and convert image to RGB
        image = Image.open(image_path).convert("RGB")

        # Apply transformations if provided
        if self.transform is not None:
            image = self.transform(image)

        # Convert labels to tensor
        labels = torch.from_numpy(self.labels[idx])

        return image, labels


In [9]:
# Instantiate Datasets
train_dataset = ODIRDataset(
    dataframe=train_df,
    image_root=IMAGE_ROOT,
    transform=train_transforms
)

val_dataset = ODIRDataset(
    dataframe=val_df,
    image_root=IMAGE_ROOT,
    transform=val_transforms
)
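
`Image.open` only fails when a sample is first requested, so a missing file may not surface until well into an epoch. A minimal sketch of an up-front check follows, assuming the file names come from the `image_name` column. `find_missing_images` is a hypothetical helper, and the demo uses a temporary directory rather than the real dataset.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_missing_images(image_names: Iterable[str], image_dir: Path) -> List[str]:
    """Return the names in image_names that have no file under image_dir."""
    return [name for name in image_names if not (image_dir / name).is_file()]


# Demo on a throwaway directory containing only one of the two requested files.
with tempfile.TemporaryDirectory() as tmp:
    image_dir = Path(tmp)
    (image_dir / "741_left.jpg").touch()
    missing = find_missing_images(["741_left.jpg", "3150_right.jpg"], image_dir)

print("Missing files:", missing)
```

In this notebook the call would presumably look like `find_missing_images(train_df["image_name"], IMAGE_ROOT / "Training Images")`, repeated for `val_df`.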

## Section 4 - Create DataLoaders

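We configure:
- Shuffling for the training loader only
- Pinned memory via `pin_memory` (left at `False` in this run; setting it to `True` can speed up host-to-GPU copies)
- Configurable batch size and worker count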

In [10]:
# DataLoaders
BATCH_SIZE = 16
NUM_WORKERS = 0

train_loader = DataLoader(
    train_dataset,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=NUM_WORKERS,
    pin_memory=False
)

val_loader = DataLoader(
    val_dataset,
    batch_size=BATCH_SIZE,
    shuffle=False,
    num_workers=NUM_WORKERS,
    pin_memory=False
)
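
The loaders above pass `pin_memory=False`. A minimal sketch of enabling pinning only when CUDA is available is shown below. It uses a random `TensorDataset` as a stand-in so that it runs without the ODIR images; `use_cuda`, `dummy_dataset`, and `loader` are illustrative names, not part of the notebook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Pin host memory only when a GPU is present; it brings no benefit on CPU-only machines.
use_cuda = torch.cuda.is_available()

# Stand-in dataset (random tensors) with the same shapes as the real pipeline.
dummy_dataset = TensorDataset(
    torch.rand(32, 3, 224, 224),            # fake images
    torch.randint(0, 2, (32, 8)).float(),   # fake multi-hot labels
)

loader = DataLoader(
    dummy_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=0,
    pin_memory=use_cuda,
)

images, labels = next(iter(loader))
print(images.shape, labels.shape)
```

With pinned batches, the `non_blocking=True` argument to `.to(device)` lets the host-to-GPU copy overlap with other work. Without pinning the copy is synchronous.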

## Section 5 - Sanity Check

We verify:
- Batch shapes
- Label dimensions
- CUDA compatibility

In [11]:
for batch_idx, (images, labels) in enumerate(train_loader):
    print(f"Batch index: {batch_idx}")
    print("Image batch shape:", images.shape)
    print("Label batch shape:", labels.shape)

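    # Move batch to the selected device (GPU if available).
    # non_blocking=True only makes the copy asynchronous when the source is in pinned memory.
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)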

    print("Images device:", images.device)
    print("Labels device:", labels.device)

    # Only test ONE batch
    break

Batch index: 0
Image batch shape: torch.Size([16, 3, 224, 224])
Label batch shape: torch.Size([16, 8])
Images device: cuda:0
Labels device: cuda:0
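
The label batch has shape `[16, 8]` with `float32` entries, which is the target format `nn.BCEWithLogitsLoss` accepts. A minimal sketch with random stand-in logits follows. No model is defined in this notebook, so `logits` is an assumption standing in for a backbone's raw output.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 8
batch_size = 16

# Stand-ins: raw model outputs (logits) and multi-hot float32 targets.
logits = torch.randn(batch_size, NUM_CLASSES)
targets = torch.randint(0, 2, (batch_size, NUM_CLASSES)).float()

# BCEWithLogitsLoss applies the sigmoid internally and averages over all elements.
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)

# Per-class probabilities come from an element-wise sigmoid, not a softmax,
# because several labels can be positive for the same image.
probs = torch.sigmoid(logits)

print("Loss:", loss.item())
```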


## Key Outcomes

- PyTorch `Dataset` and `DataLoader` objects were successfully implemented
- Image preprocessing and augmentation are standardized
- Multi-label targets are `float32` vectors of length 8, the format BCE-based losses expect
- Data pipeline is CUDA-ready and reusable across models