# Baseline Classifier Performance

This notebook trains a baseline classifier on several training subsets of different sizes, evaluates each trained model on several validation sets, and repeats the process across multiple random seeds. The results (mean ± std of accuracy) serve as a reference for later comparisons with augmented models.


## 1. Imports and Setup

Import the required libraries and dataset helpers, then set up the device and the loss function.
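
The `set_seed` helper imported from `helpers` is not shown in this notebook. As a rough illustration of what such a helper typically does, the sketch below seeds Python's `random` module and, when they are installed, NumPy and PyTorch. The name `set_seed_sketch` is hypothetical, and the actual helper may differ.

```python
import random


def set_seed_sketch(seed: int) -> None:
    """Hypothetical stand-in for helpers.set_seed (whose source is not shown here)."""
    random.seed(seed)  # Python's built-in RNG
    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's legacy global RNG
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)           # PyTorch's default RNG
        torch.cuda.manual_seed_all(seed)  # safe to call even without CUDA
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Re-seeding with the same value reproduces the same random draws.
set_seed_sketch(101)
first = random.random()
set_seed_sketch(101)
second = random.random()
print(first == second)
```

Seeding alone does not guarantee bit-identical GPU results, so small run-to-run differences can remain even with a fixed seed.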


In [None]:
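import os
import sys

# Add the parent directory to the import path so the local modules
# imported below (pkldataset, helpers) can be found.
sys.path.insert(0, os.path.abspath('..'))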

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

from pkldataset import PKLDataset
from helpers import set_seed, get_model, train_model, eval_model

# Device and loss
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
criterion = nn.CrossEntropyLoss()


## 2. Configuration

- **Train paths**: directories containing pickle files for various subset sizes  
- **Validation paths**: held-out datasets for evaluation  
- **Seeds**: for estimating training stability
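
The directory names appear to encode the subset size (`train_20`, `train_50`, and so on). Assuming that convention holds, a small helper can recover the size as an integer, which is handy for sorting or for plotting accuracy against training-set size. `subset_size` is a hypothetical helper, not part of the notebook's `helpers` module.

```python
import posixpath


def subset_size(path: str) -> int:
    """Return the trailing integer of a path such as '../datasets/RPDC197/train_20'.

    Assumes the last path component ends in '_<integer>'.
    """
    name = posixpath.basename(path.rstrip("/"))
    return int(name.rsplit("_", 1)[1])


example_paths = [
    "../datasets/RPDC197/train_100",
    "../datasets/RPDC197/train_20",
    "../datasets/RPDC197/train_50",
]
# Sort numerically rather than lexicographically
# ('train_100' sorts before 'train_20' when compared as strings).
print(sorted(example_paths, key=subset_size))
```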


In [None]:
# Training subsets (directories containing .pkl files)
train_paths = [
    "../datasets/RPDC197/train_20",
    "../datasets/RPDC197/train_50",
    "../datasets/RPDC197/train_100",
    "../datasets/RPDC197/train_200",
    "../datasets/RPDC197/train_300",
    "../datasets/RPDC197/train_400",
    "../datasets/RPDC197/train_500",
    "../datasets/RPDC197/train_600",
]

# Validation sets
val_paths = [
    "../datasets/RPDC185/val_1000",
    "../datasets/RPDC188/val_1000",
    "../datasets/RPDC191/val_1000",
    "../datasets/RPDC194/val_1000",
    "../datasets/RPDC197/val_1000",
]

# Random seeds: one full set of training runs per seed, used to estimate variability
seeds = [101, 202, 303, 404, 505, 606, 707, 808, 909, 1001]

# Prepare results container: {train_path: {val_path: [accuracies]}}
results = {tp: {vp: [] for vp in val_paths} for tp in train_paths}


## 3. Multi-Seed Training & Evaluation Loop

For each seed:
1. Set the random seed  
2. For each training subset:
   - Load the data  
   - Instantiate the model, optimizer, and scheduler  
   - Train for 50 epochs  
   - Evaluate the trained model on every validation set and record the accuracy
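
The control flow of this loop can be sketched without any model code. In the sketch below, `fake_train` and `fake_eval` are hypothetical stand-ins for `train_model` and `eval_model`, and the paths and seeds are shortened placeholders. The point is only to show how the nested `results` dictionary fills up, ending with one accuracy per seed for every (train, val) pair.

```python
import random

train_paths = ["train_20", "train_50"]   # placeholder names, not the real directories
val_paths = ["val_a", "val_b", "val_c"]  # placeholder names
seeds = [101, 202, 303]

results = {tp: {vp: [] for vp in val_paths} for tp in train_paths}


def fake_train(train_path: str) -> str:
    """Stand-in for training: returns a token identifying the 'model'."""
    return f"model<{train_path}>"


def fake_eval(model: str, val_path: str) -> float:
    """Stand-in for evaluation: returns a random 'accuracy' in percent."""
    return random.uniform(0.0, 100.0)


for seed in seeds:
    random.seed(seed)              # set the seed once per outer iteration
    for tp in train_paths:         # one fresh model per training subset
        model = fake_train(tp)
        for vp in val_paths:       # evaluate that model on every validation set
            results[tp][vp].append(fake_eval(model, vp))

# Every (train, val) pair ends up with len(seeds) accuracies.
print({tp: {vp: len(accs) for vp, accs in per_val.items()}
       for tp, per_val in results.items()})
```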


In [None]:
if __name__ == "__main__":
    for seed in seeds:
        print(f"\n=== Seed {seed} ===")
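        # Seeded once per seed, so the RNG state for each subset depends on
        # the subsets trained before it in this loop.
        set_seed(seed)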

        for tp in train_paths:
            print(f"-- Training on {tp}")
            ds_real = PKLDataset(tp)
            train_loader = DataLoader(ds_real, batch_size=32, shuffle=True)

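            # Fresh model, optimizer, and scheduler for every (seed, subset) run
            model = get_model().to(device)
            optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
            # StepLR multiplies the learning rate by 0.1 every 15 scheduler steps
            scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)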

            model = train_model(
                model,
                train_loader,
                criterion,
                optimizer,
                scheduler,
                num_epochs=50,
                device=device
            )

            for vp in val_paths:
                val_loader = DataLoader(PKLDataset(vp), batch_size=64, shuffle=False)
                acc = eval_model(model, val_loader, device)
                results[tp][vp].append(acc)
                print(f"[{tp} -> {vp}] Seed {seed}: Acc = {acc:.2f}%")


## 4. Summary of Results

Compute the mean and the sample standard deviation (`ddof=1`) of accuracy across seeds for each (train → val) pair.
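
For reference, the sample standard deviation divides by n − 1 rather than n, which is what `ddof=1` selects in NumPy. The sketch below reproduces the calculation with Python's standard `statistics` module on made-up accuracies (illustrative values, not results from this notebook).

```python
import math
import statistics

# Made-up accuracies (in percent) for one (train, val) pair across three seeds.
accs = [90.0, 92.0, 94.0]

mean = statistics.mean(accs)
sample_std = statistics.stdev(accs)       # divides by n - 1, like arr.std(ddof=1)
population_std = statistics.pstdev(accs)  # divides by n, like arr.std() with ddof=0

# Manual check of the sample formula.
manual = math.sqrt(sum((a - mean) ** 2 for a in accs) / (len(accs) - 1))

print(f"Mean = {mean:.2f}%, Std = {sample_std:.2f}%")  # Mean = 92.00%, Std = 2.00%
```

With only ten seeds the two estimates differ noticeably, and the sample version is the usual choice when the runs are treated as a sample from a larger population of possible runs.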


In [None]:
print("\n=== Summary across seeds ===")
for tp in train_paths:
    for vp in val_paths:
        arr = np.array(results[tp][vp])
        mean, std = arr.mean(), arr.std(ddof=1)
        print(f"{tp} -> {vp}: Mean = {mean:.2f}%, Std = {std:.2f}%")
