# FER2013 Quickstart (Train & Evaluate)

This notebook walks you through downloading FER2013 with KaggleHub, preparing the dataset, training a ResNet-18 classifier on 48×48 grayscale images, and evaluating it with a confusion matrix and per-class metrics.

## 0. Environment Setup
Create a virtual environment (optional) and install dependencies. If using Colab or a managed environment, you can skip the venv step.

In [None]:
# Optionally create a local venv. A running kernel cannot switch into it; activate it from a terminal.
import os, subprocess, sys
use_venv = False  # set True to create a local venv
venv_dir = '.venv'
if use_venv and not os.path.exists(venv_dir):
    # Use the kernel's own interpreter rather than whichever `python` is first on PATH
    subprocess.run([sys.executable, '-m', 'venv', venv_dir], check=True)
    print(f'Created {venv_dir}. To activate in a terminal: source {venv_dir}/bin/activate '
          f'(Windows: {venv_dir}\\Scripts\\activate)')

# Install requirements into the kernel's interpreter (%pip targets the running kernel)
%pip install -q -r requirements.txt -r requirements-dev.txt
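
After installing, it can help to confirm that the kernel actually sees the packages this notebook relies on. The sketch below is only an illustration: the package list is an assumption (the contents of `requirements.txt` are not shown here), and `missing_packages` is a hypothetical helper, not part of the repository.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level package names the current interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Assumed package list; adjust it to match your requirements.txt.
required = ["torch", "torchvision", "kagglehub", "PIL"]
missing = missing_packages(required)
print("Missing packages:", missing if missing else "none")
```

If anything is reported missing, restart the kernel after installation and run the check again.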


## 1. Download and Prepare FER2013 (ImageFolder)
This step creates `data/train`, `data/val`, and `data/test` in ImageFolder layout (one subdirectory per class). The training data is split 75/25 into train and val; the official test set is kept as is.

In [None]:
!python scripts/prepare_fer2013.py --out-dir data --val-ratio 0.25

In [None]:
# Preview a few samples per class from the training split
import os, random
from IPython.display import display
from PIL import Image
root = 'data/train'
classes = sorted([d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))])
print('Classes:', classes)
samples_per_class = 3
for cls in classes:
    cls_dir = os.path.join(root, cls)
    imgs = [os.path.join(cls_dir, f) for f in os.listdir(cls_dir) if f.lower().endswith(('.png','.jpg','.jpeg'))]
    random.shuffle(imgs)
    print(f'Class: {cls} (showing {min(samples_per_class, len(imgs))})')
    for path in imgs[:samples_per_class]:
        display(Image.open(path))
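
To check the split sizes and class balance numerically rather than by eye, you could count the images in each class folder. This is a minimal sketch that assumes the ImageFolder layout described above; `count_images` is a hypothetical helper, not part of the repository.

```python
import os

IMAGE_EXTS = (".png", ".jpg", ".jpeg")

def count_images(split_dir):
    """Return {class_name: image_count} for an ImageFolder-style directory."""
    counts = {}
    for cls in sorted(os.listdir(split_dir)):
        cls_dir = os.path.join(split_dir, cls)
        if os.path.isdir(cls_dir):
            counts[cls] = sum(f.lower().endswith(IMAGE_EXTS) for f in os.listdir(cls_dir))
    return counts

# Print totals for whichever splits exist under data/
for split in ("train", "val", "test"):
    split_dir = os.path.join("data", split)
    if os.path.isdir(split_dir):
        counts = count_images(split_dir)
        print(f"{split}: {sum(counts.values())} images, per class: {counts}")
```

With a 0.25 validation ratio, the val total should be roughly one third of the train total.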


## 2. Train (ResNet-18, grayscale 1-ch, with imbalance handling)
- Uses weighted sampling and class-weighted cross-entropy (CE) to counter class imbalance
- Automatic mixed precision (AMP) is enabled by default; a GPU is recommended
- Adjust `--epochs` and `--batch-size` to your hardware
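
The weighting that `scripts/train.py` applies is not shown in this notebook. As an illustration of the general idea, the sketch below computes inverse-frequency weights, one common scheme, in plain Python. The function name and the toy labels are assumptions made for this example.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return ({class: weight}, [per-sample weight]) with weight = N / (K * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    class_w = {c: n / (k * cnt) for c, cnt in counts.items()}
    sample_w = [class_w[y] for y in labels]
    return class_w, sample_w

# Toy imbalanced label list: class 0 is common, class 2 is rare
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
class_w, sample_w = inverse_frequency_weights(labels)
for c in sorted(class_w):
    print(f"class {c}: weight {class_w[c]:.3f}")
```

In PyTorch, per-sample weights like `sample_w` are what `torch.utils.data.WeightedRandomSampler` expects, and per-class weights like `class_w` (ordered by class index) can be passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`. Enabling both at once compensates for imbalance twice, so watch per-class recall to see whether minority classes end up over-predicted.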

In [None]:
import torch
batch = 256 if torch.cuda.is_available() else 64
epochs = 10
!python scripts/train.py \
--data-dir data \
--out-dir runs/fer18 \
--epochs {epochs} \
--batch-size {batch} \
--img-size 48 \
--arch resnet18 \
--weighted-sampler \
--class-weighted-ce \
--label-smoothing 0.1
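
To make `--label-smoothing 0.1` concrete, the sketch below builds the smoothed target distribution for one sample, assuming the flag follows the usual definition (the one used by `torch.nn.CrossEntropyLoss(label_smoothing=...)`): the one-hot target is mixed with a uniform distribution over all classes. Whether `scripts/train.py` implements it exactly this way is an assumption.

```python
def smoothed_targets(true_class, num_classes, eps=0.1):
    """Targets: (1 - eps) on the true class plus eps / K spread over all K classes."""
    uniform = eps / num_classes
    return [(1.0 - eps) + uniform if k == true_class else uniform
            for k in range(num_classes)]

# FER2013 has 7 emotion classes; class index 3 is an arbitrary example
targets = smoothed_targets(true_class=3, num_classes=7, eps=0.1)
print([round(t, 4) for t in targets])
```

The true class keeps most of the probability mass while every other class receives a small share, which discourages over-confident predictions on a dataset with noisy labels.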


### Distributed Training (DDP)
Run DDP from a terminal; notebooks handle multi-process launches poorly. For example, on two GPUs:

`````
CUDA_VISIBLE_DEVICES=0,1 torchrun --standalone --nproc_per_node=2 scripts/train.py \
  --data-dir data --out-dir runs/ddp_fer --epochs 10 --batch-size 256 --img-size 48 --arch resnet18 --weighted-sampler --class-weighted-ce
`````
Only rank 0 saves artifacts, under `runs/ddp_fer/`.
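
For context on what "rank 0" means: `torchrun` exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` as environment variables to every process it launches. The sketch below shows how a script could use them to restrict file writes to one process. The helper names are hypothetical, and this is not necessarily how `scripts/train.py` does it.

```python
import os

def ddp_env():
    """Read the rank info that torchrun exports; default to a single process."""
    rank = int(os.environ.get("RANK", "0"))
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    return rank, local_rank, world_size

def is_main_process():
    """True only on global rank 0, the process that should write checkpoints and plots."""
    return ddp_env()[0] == 0

rank, local_rank, world_size = ddp_env()
if is_main_process():
    print(f"rank {rank} of {world_size}: this process would save artifacts")
```

Run outside `torchrun`, the variables are unset, so the defaults describe a single process that is also the main one.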


### Training Artifacts
The single-process run from the training cell saves its artifacts under `runs/fer18/`:
- `best.pt`, `last.pt`
- `history.json`, `history.csv`
- `loss_curve.png`, `accuracy_curve.png`
- `confusion_matrix_epoch*.png`, `confusion_matrix_best.png`

In [None]:
# Display the training curves and the best confusion matrix from the run directory
from IPython.display import Image, display
import os
run_dir = 'runs/fer18'
for fname in ['loss_curve.png', 'accuracy_curve.png', 'confusion_matrix_best.png']:
    path = os.path.join(run_dir, fname)
    if os.path.exists(path):
        display(Image(filename=path))
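
Besides the plots, `history.csv` can be inspected directly, for example to find the epoch with the best validation score. The column names in that file are not documented here, so `val_acc` below is a hypothetical name; the sketch returns `None` when the column is absent so you can check the header and pass the right one.

```python
import csv
import os

def best_epoch(csv_path, metric="val_acc"):
    """Return (row_index, row) with the highest `metric`, or None if unavailable."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows or metric not in rows[0]:
        return None
    idx = max(range(len(rows)), key=lambda i: float(rows[i][metric]))
    return idx, rows[idx]

path = os.path.join("runs", "fer18", "history.csv")
if os.path.exists(path):
    result = best_epoch(path)  # "val_acc" is an assumed column name
    print(result if result else "Column not found; inspect the CSV header and pass metric=...")
```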


## 3. Evaluate on Validation and Test Splits
Each evaluation run saves a confusion matrix, per-class precision and recall charts, and a CSV of per-class metrics.

In [None]:
!python scripts/eval_classification.py --data-dir data/val --weights runs/fer18/best.pt --arch resnet18 --img-size 48
!python scripts/eval_classification.py --data-dir data/test --weights runs/fer18/best.pt --arch resnet18 --img-size 48
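
For reference, per-class precision and recall can be derived from a confusion matrix as sketched below. The example assumes rows are true classes and columns are predicted classes; the orientation used by `scripts/eval_classification.py` is not stated here, so verify it before comparing numbers. The 2×2 matrix is made-up toy data.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: all predicted as k
        actual_k = sum(cm[k])                          # row sum: all truly k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        metrics.append((precision, recall))
    return metrics

# Toy two-class confusion matrix
cm = [[5, 1],
      [2, 4]]
for k, (p, r) in enumerate(per_class_metrics(cm)):
    print(f"class {k}: precision={p:.3f} recall={r:.3f}")
```

On an imbalanced dataset such as FER2013, these per-class numbers are more informative than overall accuracy, because a rare class can have poor recall while accuracy stays high.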

In [None]:
# Display evaluation artifacts inline if present
from IPython.display import Image, display
import os
for fname in ['confusion_matrix.png', 'precision_per_class.png', 'recall_per_class.png']:
    if os.path.exists(fname):
        display(Image(filename=fname))


Each evaluation run writes its outputs to the working directory:
- `confusion_matrix.png`
- `precision_per_class.png`
- `recall_per_class.png`
- `metrics_per_class.csv`
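
Since the validation and test runs produce the same file names, the second run will likely replace the first run's files. One way to keep both, sketched under that assumption, is to move the outputs into a per-split folder after each run. `stash_outputs` and the `eval_outputs` directory are hypothetical names, not part of the repository.

```python
import os
import shutil

EVAL_FILES = ["confusion_matrix.png", "precision_per_class.png",
              "recall_per_class.png", "metrics_per_class.csv"]

def stash_outputs(split, src_dir=".", dest_root="eval_outputs"):
    """Move evaluation files found in src_dir into dest_root/<split>/; return moved names."""
    dest = os.path.join(dest_root, split)
    os.makedirs(dest, exist_ok=True)
    moved = []
    for name in EVAL_FILES:
        src = os.path.join(src_dir, name)
        if os.path.exists(src):
            shutil.move(src, os.path.join(dest, name))
            moved.append(name)
    return moved

# Usage: call once after each evaluation command, for example
# stash_outputs("val")   # after evaluating data/val
# stash_outputs("test")  # after evaluating data/test
```

If you stash the files this way, point the display cell at `eval_outputs/<split>/` instead of the working directory.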

## 4. Notes
- On CPU, the training cell falls back to a batch size of 64; lower it further if you run out of memory.
- Training artifacts: check `runs/fer18/`.
- For DDP, prefer a terminal and ensure CUDA is available.
