## Flowers102: baseline and few-shot pipeline
This notebook contains the code for sections 2.1-2.2 and 3.3-3.4 of the report. It covers:
 - **Baseline classification**
   - ResNet-50 backbone with a cosine classifier head.
   - Supervised training on the train split with early stopping on validation loss.
   - Final evaluation on the test split with accuracy and macro/weighted metrics.
 - **Few-shot (Siamese + contrastive)**
   - Shared ResNet-50 backbone and projection head.
   - K-shot subset construction and positive/negative pair sampling.
   - Contrastive loss training and N-way K-shot episodic evaluation.
 - **Few-shot (Triplet)**
   - TripletNet with the same backbone interface and a projection head.
   - Triplet (anchor, positive, negative) sampling from the K-shot subset.
   - TripletMarginLoss training and the same episodic evaluation protocol. 

In [8]:
import os
import math
import random
from collections import defaultdict

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm
from PIL import Image, UnidentifiedImageError, ImageFile
from torchvision import transforms as T

## Baseline Flowers102 pipeline setup
 - Initialize the baseline Flowers102 classification experiment using shared utilities from `flowers_common`.
 - Import helpers for seeding, device selection, data loading, model construction, training, evaluation, and inference.
 - Fix random seeds to ensure deterministic behavior across Python, NumPy, and PyTorch.
 - Select the compute device (CPU, CUDA, or MPS) via `get_device_config`.
 - Build deterministic train/validation/test `DataLoader` objects for the Flowers102 dataset.
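
The seeding step can be sketched as follows. This is a hypothetical stand-in, not the actual `seed_all` from `flowers_common` (whose body is not shown here); the name `seed_everything` is made up for illustration. It seeds Python's `random` and, if they are installed, NumPy and PyTorch. Seeding alone may not make GPU kernels bitwise reproducible.

```python
import random

def seed_everything(seed: int) -> None:
    """Hypothetical stand-in for `seed_all`; the real helper lives in `flowers_common`."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

# Re-seeding with the same value reproduces the same random draws.
seed_everything(1029)
first = [random.random() for _ in range(3)]
seed_everything(1029)
second = [random.random() for _ in range(3)]
print(first == second)
```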

In [9]:
from flowers_common import seed_all, get_device_config, get_dataloaders
from flowers_common import build_resnet50_cosine, safe_load_backbone_and_transplant_head
from flowers_common import TrainConfig, train_model, evaluate_model, predict_image

seed_all(1029)
dc = get_device_config()
device = dc.device
train_loader, val_loader, test_loader = get_dataloaders(root="data", batch_size=32, img_size=224)

## 2.1.1 Data and Batch Inspection (Consistent with Baseline Script)
Print the size of each data split, the number of batches per epoch, and the shape of one sample batch, so the setup can be checked against the original baseline script.

In [10]:
train_set, val_set, test_set = train_loader.dataset, val_loader.dataset, test_loader.dataset
print("Dataset sizes:")
print(f"  Train: {len(train_set):,} samples")
print(f"  Val:   {len(val_set):,} samples")
print(f"  Test:  {len(test_set):,} samples")

def num_batches(loader): return len(loader)
print("\nBatches per epoch:")
print(f"  Train: {num_batches(train_loader)}")
print(f"  Val:   {num_batches(val_loader)}")
print(f"  Test:  {num_batches(test_loader)}")

imgs, labels = next(iter(train_loader))
print(f"\nSample batch: {imgs.shape} images, {labels.shape} labels")

Dataset sizes:
  Train: 1,020 samples
  Val:   1,020 samples
  Test:  6,149 samples

Batches per epoch:
  Train: 32
  Val:   16
  Test:  97

Sample batch: torch.Size([32, 3, 224, 224]) images, torch.Size([32]) labels
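
The batch counts above follow from a ceiling division of the split size by the batch size. The sketch below reproduces them. The training batch size of 32 comes from the `get_dataloaders` call; the value 64 for validation and test is only an inference from the printed counts (16 and 97), since the loader code itself is not shown.

```python
import math

def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches a loader yields for one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

train_batches = batches_per_epoch(1020, 32)   # batch_size=32 as passed to get_dataloaders
val_batches = batches_per_epoch(1020, 64)     # 64 is an assumption inferred from the output
test_batches = batches_per_epoch(6149, 64)    # 64 is an assumption inferred from the output
print(train_batches, val_batches, test_batches)
```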


## 2.1.2 Build the Model and Load Existing Weights

Build the ResNet-50 model with a cosine classifier head. If an older checkpoint exists, load its backbone weights in non-strict mode and attempt to transplant the old `fc` weights into the cosine head.
- Existing checkpoint: `ckpt/resnet50_imagenet_finetuned_v1.pth`
- New checkpoints: written by the training cells that follow.
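
For orientation, a cosine classifier head is commonly formulated as a scaled cosine similarity between the feature vector and each class weight vector. The sketch below shows that formulation in plain Python. It is an assumption about what the head built by `build_resnet50_cosine` computes, since that code is not shown; the scale of 30.0 mirrors the `cosine_scale=30.0` argument used in this notebook, and `cosine_logits` is a made-up name.

```python
import math

def cosine_logits(feature, class_weights, scale=30.0):
    """Scaled cosine similarity between one feature vector and each class weight."""
    def l2_normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # avoid division by zero
        return [x / norm for x in v]

    f = l2_normalize(feature)
    return [scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]

# Toy 2-D example with three classes; the values are illustrative only.
logits = cosine_logits([3.0, 4.0], [[1.0, 0.0], [0.0, 2.0], [-3.0, -4.0]])
print(logits)
```

Because both vectors are normalized, only the direction of the feature matters, and every logit lies in `[-scale, scale]`.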

In [11]:
LOAD_WEIGHTS = True
CKPT_EXISTING = "ckpt/resnet50_imagenet_finetuned_v1.pth"

model = build_resnet50_cosine(num_classes=102, pretrained=True, device=device)
if LOAD_WEIGHTS and os.path.exists(CKPT_EXISTING):
    safe_load_backbone_and_transplant_head(model, CKPT_EXISTING, device=device, cosine_scale=30.0)
else:
    print("[INFO] No existing checkpoint loaded or file missing.")

[INFO] No existing checkpoint loaded or file missing.


In [12]:
import os

TRAIN = True
CKPT_NEW = "ckpt/best_cosine_auto.pth"
cfg = TrainConfig(epochs=30, lr=1e-4, patience=5, ckpt_path=CKPT_NEW)

if TRAIN:
    out = train_model(model, train_loader, val_loader, cfg, device_cfg=dc, epoch_log_cb=None)
    print(out)
else:
    print("[SKIP] TRAIN=False; skipping baseline training.")




Epoch 01 | Train 4.5387/0.0451 | Val 4.1008/0.1206
Saved best model → ckpt/best_cosine_auto.pth
Epoch 02 | Train 3.5230/0.3471 | Val 3.2684/0.4059
Saved best model → ckpt/best_cosine_auto.pth
Epoch 03 | Train 2.5164/0.6716 | Val 2.4644/0.6353
Saved best model → ckpt/best_cosine_auto.pth
Epoch 04 | Train 1.6920/0.8480 | Val 1.8150/0.7569
Saved best model → ckpt/best_cosine_auto.pth
Epoch 05 | Train 1.0726/0.9167 | Val 1.3315/0.8235
Saved best model → ckpt/best_cosine_auto.pth
Epoch 06 | Train 0.6998/0.9608 | Val 1.0517/0.8490
Saved best model → ckpt/best_cosine_auto.pth
Epoch 07 | Train 0.4246/0.9863 | Val 0.8656/0.8578
Saved best model → ckpt/best_cosine_auto.pth
Epoch 08 | Train 0.2806/0.9902 | Val 0.7203/0.8824
Saved best model → ckpt/best_cosine_auto.pth
Epoch 09 | Train 0.1974/0.9902 | Val 0.6433/0.8824
Saved best model → ckpt/best_cosine_auto.pth
Epoch 10 | Train 0.1318/0.9971 | Val 0.5819/0.8833
Saved best model → ckpt/best_cosine_auto.pth
Epoch 11 | Train 0.1103/0.9980 | Val 0.5
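
The log shows a checkpoint being saved each time the validation loss improves, and `TrainConfig` sets `patience=5`. A minimal sketch of that early-stopping pattern follows. It assumes `patience` counts consecutive epochs without improvement; the real `train_model` is not shown, so this may differ in detail. The loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 1-indexed."""
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: this is where a checkpoint would be saved.
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Illustrative losses only, not taken from the run above.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74, 0.73, 0.9]
stop, best = early_stopping_epoch(losses, patience=5)
print(stop, best)
```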

## 2.2.1 Training and Evaluation Without Data Augmentation

Under the same hyperparameters as the baseline, retrain using the **training set without data augmentation**, and evaluate both models (with and without augmentation) on the **same test set**, comparing the differences in metrics.

- Rebuild the DataLoader without data augmentation (`augment=False`).

- Train using the same `TrainConfig` and workflow as the baseline, saving to `ckpt/best_cosine_noaug.pth`.

- Load `ckpt/best_cosine_auto.pth` (with augmentation) and `ckpt/best_cosine_noaug.pth` (without augmentation) for test set evaluation.

- Summarize accuracy and F1 scores and display a comparison table.
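
Macro and weighted averages can diverge when classes have different sizes, which is the case for the test split (6,149 images cannot be divided evenly among 102 classes). The sketch below shows the difference on a toy two-class example. The function names are made up for illustration and are not part of `evaluate_model`.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_and_weighted(per_class_f1, support):
    """Macro = plain mean over classes; weighted = mean weighted by class size."""
    macro = sum(per_class_f1) / len(per_class_f1)
    weighted = sum(f * n for f, n in zip(per_class_f1, support)) / sum(support)
    return macro, weighted

# Toy example: a large easy class and a small hard class.
per_class = [f1(1.0, 1.0), f1(0.5, 0.5)]
macro, weighted = macro_and_weighted(per_class, support=[90, 10])
print(macro, weighted)
```

The macro score treats every class equally, so a weak small class pulls it down more than it pulls down the weighted score.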

In [13]:
train_loader_noaug, val_loader_noaug, test_loader_noaug = get_dataloaders(
    root="data", batch_size=32, img_size=224, augment=False
)

TRAIN_NOAUG = True
CKPT_NOAUG = "ckpt/best_cosine_noaug.pth"

try:
    LOAD_WEIGHTS
except NameError:
    LOAD_WEIGHTS = True
try:
    CKPT_EXISTING
except NameError:
    CKPT_EXISTING = "ckpt/resnet50_imagenet_finetuned_v1.pth"

model_noaug = build_resnet50_cosine(num_classes=102, pretrained=True, device=device)
if LOAD_WEIGHTS and os.path.exists(CKPT_EXISTING):
    safe_load_backbone_and_transplant_head(model_noaug, CKPT_EXISTING, device=device, cosine_scale=30.0)

cfg_noaug = TrainConfig(epochs=30, lr=1e-4, patience=5, ckpt_path=CKPT_NOAUG)

if TRAIN_NOAUG:
    out_noaug = train_model(model_noaug, train_loader_noaug, val_loader_noaug, cfg_noaug, device_cfg=dc, epoch_log_cb=None)
    print(out_noaug)
else:
    print("[SKIP] TRAIN_NOAUG=False; skipping no-augmentation training.")


Epoch 01 | Train 4.5678/0.0461 | Val 3.9826/0.1853
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 02 | Train 2.5836/0.8186 | Val 3.1637/0.4863
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 03 | Train 1.2714/0.9892 | Val 2.6063/0.6461
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 04 | Train 0.5743/1.0000 | Val 2.2516/0.7176
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 05 | Train 0.2670/1.0000 | Val 2.0266/0.7598
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 06 | Train 0.1469/1.0000 | Val 1.9039/0.7686
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 07 | Train 0.0963/1.0000 | Val 1.8224/0.7676
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 08 | Train 0.0723/1.0000 | Val 1.7594/0.7725
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 09 | Train 0.0576/1.0000 | Val 1.6923/0.7775
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 10 | Train 0.0444/1.0000 | Val 1.6477/0.7843
Saved best model → ckpt/best_cosine_noaug.pth
Epoch 11 | Train 0.0374/1.0000

In [14]:
def _load_model_from_ckpt(ckpt_path: str):
    m = build_resnet50_cosine(num_classes=102, pretrained=True, device=device)
    if os.path.exists(ckpt_path):
        state = torch.load(ckpt_path, map_location=device)
        m.load_state_dict(state, strict=False)
    else:
        print(f"[WARN] Checkpoint not found: {ckpt_path}")
    return m.eval()

CKPT_AUG   = "ckpt/best_cosine_auto.pth"
CKPT_NOAUG = "ckpt/best_cosine_noaug.pth"

model_aug_best   = _load_model_from_ckpt(CKPT_AUG)
model_noaug_best = _load_model_from_ckpt(CKPT_NOAUG)

metrics_aug   = evaluate_model(model_aug_best, test_loader, device=device, amp_enabled=dc.amp_enabled)
metrics_noaug = evaluate_model(model_noaug_best, test_loader, device=device, amp_enabled=dc.amp_enabled)

import pandas as pd

def _flat(name, m):
    return {
        "Setting": name,
        "Accuracy": m["accuracy"],
        "Macro F1": m["macro_avg"]["f1"],
        "Macro Precision": m["macro_avg"]["precision"],
        "Macro Recall": m["macro_avg"]["recall"],
        "Weighted F1": m["weighted_avg"]["f1"],
        "Weighted Precision": m["weighted_avg"]["precision"],
        "Weighted Recall": m["weighted_avg"]["recall"],
    }

df = pd.DataFrame([_flat("With Aug", metrics_aug), _flat("No Aug", metrics_noaug)])
print(df.to_string(index=False))

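# Also evaluate the in-memory `model` as train_model left it; its metrics (printed below)
# differ slightly from those of the reloaded best checkpoint in the table.
metrics = evaluate_model(model, test_loader, device=device, amp_enabled=dc.amp_enabled)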
print(metrics)



 Setting  Accuracy  Macro F1  Macro Precision  Macro Recall  Weighted F1  Weighted Precision  Weighted Recall
With Aug  0.894129  0.893732         0.887430      0.912919     0.894941            0.907822         0.894129
  No Aug  0.792812  0.778482         0.770266      0.806773     0.793224            0.813344         0.792812




{'accuracy': 0.8876240039030737, 'macro_avg': {'precision': 0.8814031472960611, 'recall': 0.9070421440957819, 'f1': 0.8873378436877947}, 'weighted_avg': {'precision': 0.9009453265318921, 'recall': 0.8876240039030737, 'f1': 0.8875011026434374}}


## 3.3.1 Siamese Few-Shot (pairs + episodic eval)

This section tests how well our model can recognize flowers when given only a few examples per class, a setting known as few-shot learning.

Instead of training the model to directly predict flower labels, we train it to measure how similar two images are.
It learns to bring similar flowers closer together in its feature space, and push different flowers further apart.

We do this by:

1. Creating image pairs, some from the same class (positive) and some from different classes (negative).

2. Training a Siamese network that passes both images through the same ResNet-50 backbone and compares their feature distance.

3. Using a contrastive loss, which teaches the network to minimize distance for similar pairs and maximize distance for different pairs.

4. Evaluating with few-shot episodes after training: the model sees only a few “support” images per class and classifies new “query” images by how close they are to each class’s examples.

In short, this section helps the model learn similarity instead of strict classification, allowing it to generalize better when only a few training images are available.
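
To make the loss concrete, here is a plain-Python sketch of the per-pair contrastive term, following the same formula as the `ContrastiveLoss` class defined in the next cell (squared distance for positives, squared hinge on `margin - distance` for negatives). It ignores the small epsilon that PyTorch's `pairwise_distance` adds, and the 2-D vectors are illustrative only.

```python
import math

def contrastive_loss(z1, z2, y, margin=1.0):
    """Per-pair contrastive loss; y=1 for same class, y=0 for different classes."""
    d = math.dist(z1, z2)                 # Euclidean distance between embeddings
    if y == 1:
        return d ** 2                     # positives: penalize any distance
    return max(0.0, margin - d) ** 2      # negatives: penalize only inside the margin

a = [1.0, 0.0]
b = [0.6, 0.8]     # fairly close to a
c = [-1.0, 0.0]    # far from a

pos_loss = contrastive_loss(a, b, y=1)        # same-class pair
neg_close_loss = contrastive_loss(a, b, y=0)  # different-class pair inside the margin
neg_far_loss = contrastive_loss(a, c, y=0)    # different-class pair beyond the margin
print(pos_loss, neg_close_loss, neg_far_loss)
```

A negative pair already farther apart than the margin contributes nothing, so training effort goes to the pairs that are still confusable.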

In [15]:
# =============================================================================
# 3.3.1: Siamese Few-Shot (Contrastive) 
# =============================================================================

# ---- Config (few-shot) ----
FEWSHOT_K          = 5            # K images per class for training (K-shot)
# Negative pairs drawn per class, and the cap on positive pairs per class.
# With K=5 only C(5,2)=10 distinct positive pairs exist, so negatives outnumber positives 4:1.
PAIRS_PER_CLASS    = 40
MARGIN             = 1.0          # margin for contrastive loss
EPOCHS_SIAMESE     = 10
LR_SIAMESE         = 3e-4
BATCH_SIZE_PAIRS   = 64
EPISODES           = 200          # N-way K-shot evaluation episodes
N_WAY              = 5            # number of classes per episode
Q_PER_CLASS        = 5            # queries per class in episodic eval
SEED               = 1029

rng = random.Random(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)

# Use the device selected in the baseline setup cell (CPU, CUDA, or MPS)
print(f"[FewShot] Using device: {device}")

# ---------- Helpers: take a K-shot subset of the Flowers102 train split ----------
def build_class_index(dataset):
    by_cls = defaultdict(list)
    for idx, (_, y) in enumerate(dataset):
        by_cls[int(y)].append(idx)
    return by_cls

def kshot_indices(by_cls, k, rng):
    """Sample up to k indices per class (all of them if a class has fewer than k)."""
    subset = []
    for c, idxs in by_cls.items():
        # Flowers102 train has 10 images per class, so min() is only a guard.
        chosen = rng.sample(idxs, min(k, len(idxs)))
        subset.extend(chosen)
    return sorted(subset)

# ---------- Pair list (precomputed for determinism) ----------
def build_pairs(index_by_class, k_per_class, pairs_per_class, rng):
    # choose k per class deterministically
    chosen_per_class = {c: rng.sample(idxs, min(k_per_class, len(idxs)))
                        for c, idxs in index_by_class.items()}

    pos_pairs = []
    for c, idxs in chosen_per_class.items():
        # all unordered positive pairs
        cand = []
        for i in range(len(idxs)):
            for j in range(i+1, len(idxs)):
                cand.append( (idxs[i], idxs[j]) )
        rng.shuffle(cand)
        pos_pairs.extend([(a,b,1) for (a,b) in cand[:pairs_per_class]])

    # negatives: for each class, draw pairs_per_class anchors (with replacement)
    # and pair each with a random image from a different class
    all_classes = list(chosen_per_class.keys())
    neg_pairs = []
    for c in all_classes:
        a_list = rng.choices(chosen_per_class[c], k=pairs_per_class)
        for a in a_list:
            c2 = rng.choice([x for x in all_classes if x != c])
            b = rng.choice(chosen_per_class[c2])
            neg_pairs.append((a,b,0))

    pairs = pos_pairs + neg_pairs
    rng.shuffle(pairs)
    return pairs, chosen_per_class

class PairDataset(Dataset):
    def __init__(self, base_dataset, pairs):
        self.base = base_dataset
        self.pairs = pairs
        self.tf = base_dataset.transform  # use same transform

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, i):
        i1, i2, y = self.pairs[i]
        x1, _ = self.base[i1]
        x2, _ = self.base[i2]
        return x1, x2, torch.tensor(y, dtype=torch.float32)

# ---------- Siamese model: shared backbone + projection ----------
class ProjectionHead(nn.Module):
    def __init__(self, in_dim=2048, out_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 1024),
            nn.BatchNorm1d(1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, out_dim)
        )
    def forward(self, x):
        z = self.net(x)
        return F.normalize(z, dim=1)

class Siamese(nn.Module):
    def __init__(self, backbone, proj_out=512, freeze_backbone=True):
        super().__init__()
        self.backbone = backbone           # expects [B,3,224,224] -> [B,2048]
        if freeze_backbone:
            for p in self.backbone.parameters():
                p.requires_grad = False
        self.proj = ProjectionHead(2048, proj_out)

    def embed(self, x):
        feats = self.backbone(x)           # [B,2048]
        return self.proj(feats)            # [B,proj_out] (L2-normalized)

    def forward(self, x1, x2):
        z1 = self.embed(x1)
        z2 = self.embed(x2)
        # cosine similarity for monitoring (not used in loss directly)
        sim = (z1 * z2).sum(dim=1)
        return z1, z2, sim

class ContrastiveLoss(nn.Module):
    """
    Hadsell et al. 2006
    y=1 for positive (same class), y=0 for negative.
    """
    def __init__(self, margin=1.0):
        super().__init__()
        self.margin = margin

    def forward(self, z1, z2, y):
        d = F.pairwise_distance(z1, z2)            # [B]
        pos = y * (d ** 2)
        neg = (1 - y) * F.relu(self.margin - d) ** 2
        return (pos + neg).mean()

# ---------- Build pairs from your existing train_set ----------
train_set_fs = train_loader.dataset     # Flowers102 split="train"
val_set_fs   = val_loader.dataset

by_cls = build_class_index(train_set_fs)
pairs, chosen_per_class = build_pairs(by_cls, FEWSHOT_K, PAIRS_PER_CLASS, rng)
pair_ds = PairDataset(train_set_fs, pairs)
pair_loader = DataLoader(pair_ds, batch_size=BATCH_SIZE_PAIRS,
                         shuffle=True, num_workers=0, pin_memory=False,
                         generator=torch.Generator().manual_seed(SEED))

print(f"[FewShot] K-shot chosen per class: {FEWSHOT_K}")
print(f"[FewShot] Pair dataset size: {len(pair_ds):,} (pos≈neg)")

# ---------- Siamese model built on the baseline backbone (section 2.1) ----------
# The classifier `model` is an nn.Sequential of [resnet50_backbone, CosineClassifier]
backbone = model[0]                         # reuse trained backbone
siam = Siamese(backbone, proj_out=512, freeze_backbone=True).to(device)
criterion = ContrastiveLoss(MARGIN)
opt = torch.optim.AdamW(filter(lambda p: p.requires_grad, siam.parameters()), lr=LR_SIAMESE)

print(siam)

# ---------- Train (pairs) ----------
for epoch in range(1, EPOCHS_SIAMESE + 1):
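    siam.train()  # also puts the frozen backbone's BatchNorm layers in train mode, so their running stats keep updating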
    running, n = 0.0, 0
    for x1, x2, y in tqdm(pair_loader, desc=f"[Siamese] Epoch {epoch}/{EPOCHS_SIAMESE}"):
        x1, x2, y = x1.to(device), x2.to(device), y.to(device)
        opt.zero_grad(set_to_none=True)
        z1, z2, _ = siam(x1, x2)
        loss = criterion(z1, z2, y)
        loss.backward()
        opt.step()
        running += loss.item() * x1.size(0)
        n += x1.size(0)
    print(f"  Train contrastive loss: {running/n:.4f}")

# --------- Save siamese projection (optional) ----------
os.makedirs("ckpt", exist_ok=True)
torch.save({"proj": siam.proj.state_dict()}, "ckpt/siamese_proj.pth")
print("[FewShot] Saved projection head: ckpt/siamese_proj.pth")

# ---------- Episodic N-way K-shot evaluation (prototype nearest) ----------
@torch.no_grad()
def episodic_eval(backbone, proj, base_dataset, n_way=5, k_shot=1, q_per_class=5, episodes=200):
    backbone.eval()
    proj.eval()
    by_cls = build_class_index(base_dataset)
    classes = list(by_cls.keys())
    accs = []

    for _ in range(episodes):
        episode_classes = rng.sample(classes, n_way)

        # support / query split
        support_idx = []
        query_idx = []
        for c in episode_classes:
            idxs = by_cls[c]
            if len(idxs) < k_shot + q_per_class:
                # fallback: sample with replacement if needed
                sampled = rng.choices(idxs, k=k_shot + q_per_class)
            else:
                sampled = rng.sample(idxs, k=k_shot + q_per_class)
            support_idx.extend([(i, c) for i in sampled[:k_shot]])
            query_idx.extend([(i, c) for i in sampled[k_shot:]])

        # build tensors
        def load_batch(pairs):
            xs, ys = [], []
            for i, c in pairs:
                x, _ = base_dataset[i]
                xs.append(x)
                ys.append(c)
            xs = torch.stack(xs).to(device)
            ys = torch.tensor(ys, device=device)
            return xs, ys

        xs_s, ys_s = load_batch(support_idx)
        xs_q, ys_q = load_batch(query_idx)

        # embeddings
        z_s = F.normalize(proj(backbone(xs_s)), dim=1)   # [n_way*k, d]
        z_q = F.normalize(proj(backbone(xs_q)), dim=1)   # [n_way*q, d]

        # class prototypes
        protos = []
        for c in episode_classes:
            zc = z_s[ (ys_s == c) ]
            protos.append(zc.mean(dim=0))
        protos = torch.stack(protos)                     # [n_way, d]

        # cosine similarity to prototypes
        sims = z_q @ protos.t()                          # [n_way*q, n_way]
        preds = sims.argmax(dim=1)

        # map true labels to 0..n_way-1
        class_to_pos = {c:i for i,c in enumerate(episode_classes)}
        ys_true = torch.tensor([class_to_pos[int(c)] for c in ys_q.tolist()], device=device)

        acc = (preds == ys_true).float().mean().item()
        accs.append(acc)

    return float(np.mean(accs)), float(np.std(accs)/math.sqrt(len(accs)))

mean_acc, se = episodic_eval(backbone=siam.backbone, proj=siam.proj,
                             base_dataset=val_loader.dataset,
                             n_way=N_WAY, k_shot=1, q_per_class=Q_PER_CLASS,
                             episodes=EPISODES)
print(f"[FewShot] {N_WAY}-way 1-shot accuracy on VAL: {mean_acc*100:.2f}% ± {se*100:.2f}% (s.e.)")

# You can also test 5-way 5-shot:
mean_acc5, se5 = episodic_eval(backbone=siam.backbone, proj=siam.proj,
                               base_dataset=val_loader.dataset,
                               n_way=N_WAY, k_shot=5, q_per_class=Q_PER_CLASS,
                               episodes=EPISODES)
print(f"[FewShot] {N_WAY}-way 5-shot accuracy on VAL: {mean_acc5*100:.2f}% ± {se5*100:.2f}% (s.e.)")


import json, time

results_siamese = {
    "model": "Siamese (Contrastive)",
    "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
    "dataset_split": "val",
    "n_way": N_WAY,
    "q_per_class": Q_PER_CLASS,
    "1-shot_accuracy": round(mean_acc * 100, 2),
    "1-shot_std_error": round(se * 100, 2),
    "5-shot_accuracy": round(mean_acc5 * 100, 2),
    "5-shot_std_error": round(se5 * 100, 2),
}

os.makedirs("results", exist_ok=True)
with open("results/fewshot_siamese_results.json", "w") as f:
    json.dump(results_siamese, f, indent=4)

print(f"[Saved] Results written to results/fewshot_siamese_results.json")


[FewShot] Using device: cuda:0
[FewShot] K-shot chosen per class: 5
[FewShot] Pair dataset size: 5,100 (pos≈neg)
Siamese(
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affi

[Siamese] Epoch 1/10: 100%|██████████| 80/80 [01:59<00:00,  1.50s/it]


  Train contrastive loss: 0.0371


[Siamese] Epoch 2/10: 100%|██████████| 80/80 [02:01<00:00,  1.52s/it]


  Train contrastive loss: 0.0157


[Siamese] Epoch 3/10: 100%|██████████| 80/80 [01:40<00:00,  1.26s/it]


  Train contrastive loss: 0.0118


[Siamese] Epoch 4/10: 100%|██████████| 80/80 [01:37<00:00,  1.22s/it]


  Train contrastive loss: 0.0101


[Siamese] Epoch 5/10: 100%|██████████| 80/80 [01:27<00:00,  1.09s/it]


  Train contrastive loss: 0.0085


[Siamese] Epoch 6/10: 100%|██████████| 80/80 [01:26<00:00,  1.08s/it]


  Train contrastive loss: 0.0075


[Siamese] Epoch 7/10: 100%|██████████| 80/80 [01:26<00:00,  1.08s/it]


  Train contrastive loss: 0.0067


[Siamese] Epoch 8/10: 100%|██████████| 80/80 [01:26<00:00,  1.08s/it]


  Train contrastive loss: 0.0064


[Siamese] Epoch 9/10: 100%|██████████| 80/80 [01:26<00:00,  1.08s/it]


  Train contrastive loss: 0.0058


[Siamese] Epoch 10/10: 100%|██████████| 80/80 [01:26<00:00,  1.08s/it]


  Train contrastive loss: 0.0054
[FewShot] Saved projection head: ckpt/siamese_proj.pth
[FewShot] 5-way 1-shot accuracy on VAL: 97.00% ± 0.33% (s.e.)
[FewShot] 5-way 5-shot accuracy on VAL: 97.76% ± 0.22% (s.e.)
[Saved] Results written to results/fewshot_siamese_results.json
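
The logged pair count of 5,100 can be checked by hand. With K=5 images per class there are only C(5,2)=10 distinct unordered positive pairs, so the positive list is capped at 10 per class, while 40 negatives are always drawn. The sketch below reproduces the arithmetic. It assumes every class has at least K images, and `pair_counts` is a made-up helper, not part of the notebook.

```python
import math

def pair_counts(num_classes, k_shot, pairs_per_class):
    """Count positive and negative pairs produced by the build_pairs scheme."""
    max_pos_per_class = math.comb(k_shot, 2)          # distinct unordered pairs
    pos = num_classes * min(pairs_per_class, max_pos_per_class)
    neg = num_classes * pairs_per_class
    return pos, neg

pos, neg = pair_counts(num_classes=102, k_shot=5, pairs_per_class=40)
print(pos, neg, pos + neg)
```

The split is therefore 1,020 positives to 4,080 negatives rather than an even balance, which is worth keeping in mind when reading the contrastive loss values.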


## 3.3.2 Triplet Loss for Improved Few-Shot Learning

This section builds on the previous Siamese setup but uses triplets of images instead of pairs.
Each triplet consists of:

- Anchor – a reference image,

- Positive – another image of the same class, and

- Negative – an image from a different class.

The model learns by comparing the distances between these three embeddings:

- It pulls the anchor and positive closer together,

- and pushes the negative away until the anchor–negative distance exceeds the anchor–positive distance by at least a margin (`TRI_MARGIN`).

This is called Triplet Loss, and it encourages the network to create a more structured embedding space where images of the same class form tight clusters and different classes are clearly separated.

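Training with triplets is intended to help the model distinguish visually similar flower types when only a few examples are available. In the validation runs logged in this notebook the gain is not uniform: the triplet model reaches 97.96% on 5-way 5-shot (Siamese: 97.76%) but 94.82% on 5-way 1-shot (Siamese: 97.00%).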
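
The triplet objective can be written in a few lines of plain Python. The sketch follows the standard triplet margin formula, `max(0, d(a, p) - d(a, n) + margin)` with Euclidean distance, which is what `nn.TripletMarginLoss(p=2)` computes per triplet. The margin of 0.8 matches `TRI_MARGIN`; the 2-D vectors are illustrative only.

```python
import math

def triplet_margin_loss(anchor, positive, negative, margin=0.8):
    """Per-triplet loss: zero once the negative is farther than the positive by `margin`."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)

anchor = [1.0, 0.0]
positive = [0.6, 0.8]
easy_negative = [-1.0, 0.0]   # far from the anchor
hard_negative = [0.0, 1.0]    # not much farther than the positive

easy_loss = triplet_margin_loss(anchor, positive, easy_negative)
hard_loss = triplet_margin_loss(anchor, positive, hard_negative)
print(easy_loss, hard_loss)
```

Only triplets that violate the margin produce a gradient, which is one reason the logged training loss drops close to zero within a few epochs.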

In [None]:
ImageFile.LOAD_TRUNCATED_IMAGES = True  # Tolerate slightly truncated JPEGs

# ---- Triplet configuration ----
TRI_FEWSHOT_K        = 5
TRIPLETS_PER_CLASS   = 60
TRI_MARGIN           = 0.8
TRI_EPOCHS           = 12
TRI_LR               = 3e-4
TRI_BATCH_SIZE       = 64
TRI_EPISODES         = 200
TRI_N_WAY            = 5
TRI_Q_PER_CLASS      = 5
TRI_SEED             = 1029

rng = random.Random(TRI_SEED)
np.random.seed(TRI_SEED)
torch.manual_seed(TRI_SEED)

print(f"[Triplet] Using device: {device}")

# --- Reuse helpers from section 3.3.1: build_class_index, episodic_eval, ProjectionHead ---

def _dummy_tensor(base_dataset):
    """Create a placeholder image run through the SAME transform as the dataset."""
    pil = Image.new("RGB", (256, 256), (0, 0, 0))
    tf = getattr(base_dataset, "transform", None)
    return tf(pil) if tf is not None else T.ToTensor()(pil)

# --- Build triplets from existing Flowers102 train split ---
def build_triplets(index_by_class, k_per_class, triplets_per_class, rng):
    """Returns (anchor, positive, negative) index triplets for each class."""
    chosen_per_class = {
        c: (rng.sample(idxs, k_per_class) if len(idxs) >= k_per_class else rng.choices(idxs, k=k_per_class))
        for c, idxs in index_by_class.items()
    }

    triplets = []
    all_classes = list(index_by_class.keys())
    for c in all_classes:
        pool = chosen_per_class[c]
        for _ in range(triplets_per_class):
            a, p = rng.sample(pool, 2) if len(pool) >= 2 else (pool[0], pool[0])
            neg_c = rng.choice([x for x in all_classes if x != c])
            n = rng.choice(chosen_per_class[neg_c])
            triplets.append((a, p, n))
    rng.shuffle(triplets)
    return triplets, chosen_per_class

class TripletDataset(Dataset):
    def __init__(self, base_dataset, triplets):
        self.base = base_dataset
        self.triplets = triplets
    def __len__(self):
        return len(self.triplets)
    def _load(self, idx):
        # Fall back to a black placeholder if the image file is corrupted or unreadable.
        try:
            x, _ = self.base[idx]
        except (UnidentifiedImageError, OSError):
            x = _dummy_tensor(self.base)
        return x
    def __getitem__(self, i):
        a, p, n = self.triplets[i]
        return self._load(a), self._load(p), self._load(n)

# --- Form triplets ---
train_set_fs = train_loader.dataset
val_set_fs   = val_loader.dataset
index_by_class = build_class_index(train_set_fs)
triplets, _ = build_triplets(index_by_class, TRI_FEWSHOT_K, TRIPLETS_PER_CLASS, rng)

tri_ds = TripletDataset(train_set_fs, triplets)
tri_loader = DataLoader(
    tri_ds,
    batch_size=TRI_BATCH_SIZE,
    shuffle=True,
    num_workers=0,
    pin_memory=False,
    generator=torch.Generator().manual_seed(TRI_SEED),
)

print(f"[Triplet] K-shot per class: {TRI_FEWSHOT_K}")
print(f"[Triplet] Triplet dataset size: {len(tri_ds):,}")

# --- Model + optimizer ---
class TripletNet(nn.Module):
    def __init__(self, backbone, proj_out=512, freeze_backbone=True):
        super().__init__()
        self.backbone = backbone
        if freeze_backbone:
            for p in self.backbone.parameters():
                p.requires_grad = False
        self.proj = ProjectionHead(2048, proj_out)
    def embed(self, x):
        feats = self.backbone(x)
        return self.proj(feats)

backbone = model[0]
tri_model = TripletNet(backbone, proj_out=512, freeze_backbone=True).to(device)

criterion = nn.TripletMarginLoss(margin=TRI_MARGIN, p=2)
optimizer = torch.optim.AdamW(filter(lambda p: p.requires_grad, tri_model.parameters()), lr=TRI_LR)

print(tri_model)

# --- Training loop ---
for epoch in range(1, TRI_EPOCHS + 1):
    tri_model.train()
    total_loss, count = 0.0, 0
    for xa, xp, xn in tqdm(tri_loader, desc=f"[Triplet] Epoch {epoch}/{TRI_EPOCHS}"):
        xa, xp, xn = xa.to(device), xp.to(device), xn.to(device)
        optimizer.zero_grad(set_to_none=True)
        za = F.normalize(tri_model.embed(xa), dim=1)
        zp = F.normalize(tri_model.embed(xp), dim=1)
        zn = F.normalize(tri_model.embed(xn), dim=1)
        loss = criterion(za, zp, zn)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * xa.size(0)
        count += xa.size(0)
    print(f"  Train triplet loss: {total_loss / count:.4f}")

# --- Save projection head ---
os.makedirs("ckpt", exist_ok=True)
torch.save({"proj": tri_model.proj.state_dict()}, "ckpt/triplet_proj.pth")
print("[Triplet] Saved projection head: ckpt/triplet_proj.pth")

# --- Episodic evaluation (reuses episodic_eval from section 3.3.1) ---
mean1, se1 = episodic_eval(
    backbone=tri_model.backbone,
    proj=tri_model.proj,
    base_dataset=val_set_fs,
    n_way=TRI_N_WAY, k_shot=1, q_per_class=TRI_Q_PER_CLASS, episodes=TRI_EPISODES
)
print(f"[Triplet] {TRI_N_WAY}-way 1-shot: {mean1*100:.2f}% ± {se1*100:.2f}%")

mean5, se5 = episodic_eval(
    backbone=tri_model.backbone,
    proj=tri_model.proj,
    base_dataset=val_set_fs,
    n_way=TRI_N_WAY, k_shot=5, q_per_class=TRI_Q_PER_CLASS, episodes=TRI_EPISODES
)
print(f"[Triplet] {TRI_N_WAY}-way 5-shot: {mean5*100:.2f}% ± {se5*100:.2f}%")

import json, time

results_triplet = {
    "model": "Triplet Loss",
    "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
    "dataset_split": "val",
    "n_way": TRI_N_WAY,
    "q_per_class": TRI_Q_PER_CLASS,
    "1-shot_accuracy": round(mean1 * 100, 2),
    "1-shot_std_error": round(se1 * 100, 2),
    "5-shot_accuracy": round(mean5 * 100, 2),
    "5-shot_std_error": round(se5 * 100, 2),
}

os.makedirs("results", exist_ok=True)
with open("results/fewshot_triplet_results.json", "w") as f:
    json.dump(results_triplet, f, indent=4)

print(f"[Saved] Results written to results/fewshot_triplet_results.json")


[Triplet] Using device: cuda:0
[Triplet] K-shot per class: 5
[Triplet] Triplet dataset size: 6,120
TripletNet(
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, tr

[Triplet] Epoch 1/12: 100%|██████████| 96/96 [02:37<00:00,  1.64s/it]
  Train triplet loss: 0.0273
[Triplet] Epoch 2/12: 100%|██████████| 96/96 [02:35<00:00,  1.62s/it]
  Train triplet loss: 0.0027
[Triplet] Epoch 3/12: 100%|██████████| 96/96 [02:44<00:00,  1.71s/it]
  Train triplet loss: 0.0018
[Triplet] Epoch 4/12: 100%|██████████| 96/96 [02:43<00:00,  1.71s/it]
  Train triplet loss: 0.0011
[Triplet] Epoch 5/12: 100%|██████████| 96/96 [02:40<00:00,  1.67s/it]
  Train triplet loss: 0.0010
[Triplet] Epoch 6/12: 100%|██████████| 96/96 [02:42<00:00,  1.69s/it]
  Train triplet loss: 0.0011
[Triplet] Epoch 7/12: 100%|██████████| 96/96 [02:43<00:00,  1.70s/it]
  Train triplet loss: 0.0007
[Triplet] Epoch 8/12: 100%|██████████| 96/96 [02:39<00:00,  1.66s/it]
  Train triplet loss: 0.0005
[Triplet] Epoch 9/12: 100%|██████████| 96/96 [02:39<00:00,  1.67s/it]
  Train triplet loss: 0.0008
[Triplet] Epoch 10/12: 100%|██████████| 96/96 [02:39<00:00,  1.66s/it]
  Train triplet loss: 0.0005
[Triplet] Epoch 11/12: 100%|██████████| 96/96 [02:39<00:00,  1.66s/it]
  Train triplet loss: 0.0004
[Triplet] Epoch 12/12: 100%|██████████| 96/96 [02:39<00:00,  1.66s/it]
  Train triplet loss: 0.0005
[Triplet] Saved projection head: ckpt/triplet_proj.pth
[Triplet] 5-way 1-shot: 94.82% ± 0.46%
[Triplet] 5-way 5-shot: 97.96% ± 0.21%
[Saved] Results written to results/fewshot_triplet_results.json


## 3.4.1 Few-Shot Evaluation on TEST set (Siamese + Triplet)

**Purpose**
 - Produce final N-way K-shot results on the held-out TEST split for both metric-learning models.
 
**Method**
 - Reuse the trained Siamese and Triplet models and the shared episodic evaluation function.
 - Run N-way (here 5-way) 1-shot and 5-shot episodes on the TEST dataset for each model.
 - Log separate metrics for Siamese and Triplet configurations, including standard errors.
 - Aggregate all TEST metrics into a single dictionary and write them to a JSON file.
 - Use these outputs to compare baseline, Siamese, and Triplet few-shot performance in the report.
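
The episodic protocol listed above can be sketched in a few lines. This is not the notebook's `episodic_eval`; it is a minimal illustration, assuming each episode draws `n_way` classes with disjoint support and query images, and that the reported `±` value is the standard error of the per-episode accuracies. The helper names `sample_episode` and `mean_and_standard_error` and the toy accuracies are hypothetical.

```python
import math
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, q_per_class, rng):
    """Draw one N-way K-shot episode as (index, class) pairs.

    Hypothetical helper: `labels` is a plain list of integer class ids,
    one per image in the evaluation dataset.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Only classes with enough images for support + query can be used.
    need = k_shot + q_per_class
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= need]
    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], need)  # sampled without replacement
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


def mean_and_standard_error(episode_accs):
    """Mean accuracy and standard error of the mean over episodes."""
    n = len(episode_accs)
    mean = sum(episode_accs) / n
    if n < 2:
        return mean, 0.0
    var = sum((a - mean) ** 2 for a in episode_accs) / (n - 1)  # sample variance
    return mean, math.sqrt(var / n)


if __name__ == "__main__":
    rng = random.Random(0)
    toy_labels = [c for c in range(10) for _ in range(8)]  # 10 classes, 8 images each
    support, query = sample_episode(toy_labels, n_way=5, k_shot=1, q_per_class=3, rng=rng)
    print(len(support), "support images,", len(query), "query images")

    toy_accs = [0.9, 1.0, 0.8]  # made-up per-episode accuracies, for illustration only
    mean, se = mean_and_standard_error(toy_accs)
    print(f"{mean * 100:.2f}% ± {se * 100:.2f}%")
```

In a full evaluation, each episode would embed the support and query images with the backbone and projection head, classify every query by its nearest support embedding (or class prototype), and append the resulting accuracy to the list passed to `mean_and_standard_error`.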


In [17]:
print("[INFO] Running final few-shot evaluation on TEST split")

# --- Siamese evaluation on TEST set ---
mean1_siam, se1_siam = episodic_eval(
    backbone=siam.backbone,
    proj=siam.proj,
    base_dataset=test_loader.dataset,
    n_way=N_WAY, k_shot=1, q_per_class=Q_PER_CLASS, episodes=EPISODES
)
mean5_siam, se5_siam = episodic_eval(
    backbone=siam.backbone,
    proj=siam.proj,
    base_dataset=test_loader.dataset,
    n_way=N_WAY, k_shot=5, q_per_class=Q_PER_CLASS, episodes=EPISODES
)

print(f"[Siamese-TEST] {N_WAY}-way 1-shot: {mean1_siam*100:.2f}% ± {se1_siam*100:.2f}%")
print(f"[Siamese-TEST] {N_WAY}-way 5-shot: {mean5_siam*100:.2f}% ± {se5_siam*100:.2f}%\n")

# --- Triplet evaluation on TEST set ---
mean1_triplet, se1_triplet = episodic_eval(
    backbone=tri_model.backbone,
    proj=tri_model.proj,
    base_dataset=test_loader.dataset,
    n_way=TRI_N_WAY, k_shot=1, q_per_class=TRI_Q_PER_CLASS, episodes=TRI_EPISODES
)
mean5_triplet, se5_triplet = episodic_eval(
    backbone=tri_model.backbone,
    proj=tri_model.proj,
    base_dataset=test_loader.dataset,
    n_way=TRI_N_WAY, k_shot=5, q_per_class=TRI_Q_PER_CLASS, episodes=TRI_EPISODES
)

print(f"[Triplet-TEST] {TRI_N_WAY}-way 1-shot: {mean1_triplet*100:.2f}% ± {se1_triplet*100:.2f}%")
print(f"[Triplet-TEST] {TRI_N_WAY}-way 5-shot: {mean5_triplet*100:.2f}% ± {se5_triplet*100:.2f}%")

print("\n[INFO] Few-shot evaluation on TEST split completed.")

# --- Save combined TEST results ---
combined_results = {
    "siamese_test": {
        "1-shot_accuracy": round(mean1_siam * 100, 2),
        "1-shot_std_error": round(se1_siam * 100, 2),
        "5-shot_accuracy": round(mean5_siam * 100, 2),
        "5-shot_std_error": round(se5_siam * 100, 2),
        "episodes": EPISODES,
        "n_way": N_WAY,
        "q_per_class": Q_PER_CLASS,
    },
    "triplet_test": {
        "1-shot_accuracy": round(mean1_triplet * 100, 2),
        "1-shot_std_error": round(se1_triplet * 100, 2),
        "5-shot_accuracy": round(mean5_triplet * 100, 2),
        "5-shot_std_error": round(se5_triplet * 100, 2),
        "episodes": TRI_EPISODES,
        "n_way": TRI_N_WAY,
        "q_per_class": TRI_Q_PER_CLASS,
    },
    "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
}

os.makedirs("results", exist_ok=True)
with open("results/fewshot_test_results.json", "w") as f:
    json.dump(combined_results, f, indent=4)

print("[Saved] Combined test results written to results/fewshot_test_results.json")

[INFO] Running final few-shot evaluation on TEST split
[Siamese-TEST] 5-way 1-shot: 96.70% ± 0.33%
[Siamese-TEST] 5-way 5-shot: 98.12% ± 0.21%

[Triplet-TEST] 5-way 1-shot: 96.78% ± 0.33%
[Triplet-TEST] 5-way 5-shot: 98.38% ± 0.19%

[INFO] Few-shot evaluation on TEST split completed.
[Saved] Combined test results written to results/fewshot_test_results.json
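
One rough way to read the TEST numbers above is to express the Siamese/Triplet gap in units of its combined standard error. The sketch below copies the four reported values from the output and assumes the two models' episode sets are independent, which the notebook does not state; `z_score` is a hypothetical helper, not part of `flowers_common`.

```python
import math


def z_score(acc_a, se_a, acc_b, se_b):
    """Gap (b - a) divided by the combined standard error, assuming independence."""
    return (acc_b - acc_a) / math.sqrt(se_a ** 2 + se_b ** 2)


# (accuracy %, standard error %) copied from the TEST output above.
reported = {
    "1-shot": {"siamese": (96.70, 0.33), "triplet": (96.78, 0.33)},
    "5-shot": {"siamese": (98.12, 0.21), "triplet": (98.38, 0.19)},
}

for setting, r in reported.items():
    gap = r["triplet"][0] - r["siamese"][0]
    z = z_score(*r["siamese"], *r["triplet"])
    print(f"{setting}: triplet - siamese = {gap:+.2f} pts, z = {z:.2f}")
```

Under that independence assumption both gaps are smaller than one combined standard error, so these runs alone do not separate the two models.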
