# Phase 2: SOTA Optimization Experiments

This notebook runs the updated Phase 2 experimental roadmap, which incorporates external research feedback:

**New State-of-the-Art Experiments:**
1. **EXP-001**: Threshold Optimization (CPU, 4-6 hours)
2. **EXP-002A**: ConvNeXt V2-Tiny Baseline (GPU, ~12 hours) - *FCMAE pre-training*
3. **EXP-002B**: ConvNeXt V2-Tiny + Label Smoothing (GPU, ~12 hours)
4. **EXP-003A**: EfficientNet-B2 Baseline (GPU, ~10 hours) - *Speed-focused*
5. **EXP-003B**: EfficientNet-B2 + Label Smoothing (GPU, ~10 hours)
6. **EXP-006A**: DINOv2 ViT-S/14 Linear Probe (GPU, ~4-6 hours) - *Foundation model*

**Key Improvements:**
- **Enhanced augmentations**: Motion blur + stronger color jitter for conveyor realism
- **ConvNeXt V2**: Upgraded from V1 to V2 with FCMAE pre-training (expected to help with defect detection)
- **DINOv2**: Foundation model approach with frozen backbone + linear probe

**Current Baseline:**
- ResNet50: 94.05% accuracy, 16ms latency, 25.6M params

**Target:**
- Accuracy: ≥94.5% (stretch: 95%)
- Pit Recall: ≥99.0% (food safety)
- Latency: <30ms on CPU
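
To make the accuracy and pit-recall targets concrete, the sketch below computes both from paired label and prediction lists. The function name `accuracy_and_pit_recall` and the toy labels are illustrative and are not taken from the training repository; only the class names come from the dataset.

```python
def accuracy_and_pit_recall(y_true, y_pred, pit_label="cherry_pit"):
    """Return (accuracy, pit_recall) for paired label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    pit_total = sum(t == pit_label for t in y_true)
    pit_found = sum(t == pit_label and p == pit_label
                    for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    # Recall is undefined when there are no pit samples; report None.
    pit_recall = pit_found / pit_total if pit_total else None
    return accuracy, pit_recall


# Toy example (made-up labels): 4 pits, one of which is missed.
y_true = ["cherry_pit"] * 4 + ["cherry_clean"] * 4
y_pred = ["cherry_pit"] * 3 + ["cherry_clean"] * 5
acc, recall = accuracy_and_pit_recall(y_true, y_pred)
print(f"accuracy={acc:.3f}, pit recall={recall:.3f}")
```

For scale: the validation split used in this notebook contains 566 pit samples, so a 99.0% pit recall leaves room for at most 5 missed pits.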

**Prerequisites:**
- Google Colab Pro (GPU required for training experiments)
- Baseline model uploaded to Google Drive
- Data at `/content/cherry_classification/data` (cloned from the dataset repository in Cell 4)

## Cell 1: Experiment Configuration

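Set skip flags to control which experiments run. Use `SMOKE_TEST=True` for quick validation (1 epoch, 3 batches). The training cells in this notebook override only `epochs` and `save_every`, so any batch limit has to be applied by `train.py` or the experiment config.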

In [14]:
# === EXPERIMENT CONFIGURATION ===
# Set these flags to control execution

SMOKE_TEST = False  # Set True for quick validation (1 epoch, 3 batches)

# Skip flags - set True to skip already completed experiments
SKIP_EXP_001 = False    # Threshold optimization (CPU only)
SKIP_EXP_002A = False   # ConvNeXt V2-Tiny baseline (NEW - Phase 2)
SKIP_EXP_002B = False   # ConvNeXt V2-Tiny + label smoothing (NEW)
SKIP_EXP_003A = False   # EfficientNet-B2 baseline
SKIP_EXP_003B = False   # EfficientNet-B2 + label smoothing
SKIP_EXP_006A = False   # DINOv2 ViT-S/14 linear probe (NEW - Phase 2)

# Random seed for reproducibility
RANDOM_SEED = 42

# Paths (adjust for your Drive structure)
DRIVE_MOUNT_PATH = "/content/drive"
DRIVE_BASE_PATH = "/content/drive/MyDrive/cherry_experiments"
DATA_PATH = "/content/cherry_classification/data"
BASELINE_MODEL_PATH = "/content/drive/MyDrive/cherry_experiments/resnet50_augmented_unnormalized/model_best_fixed.pt"

# Print configuration (values shown per experiment are the SKIP flags:
# False means the experiment will run)
print("=" * 60)
print("EXPERIMENT CONFIGURATION - Phase 2 SOTA")
print("=" * 60)
print(f"SMOKE_TEST: {SMOKE_TEST}")
print(f"RANDOM_SEED: {RANDOM_SEED}")
print("\nPhase 2 NEW Experiments:")
print(f"  EXP-002A (ConvNeXt V2 baseline): {SKIP_EXP_002A}")
print(f"  EXP-002B (ConvNeXt V2 + LS): {SKIP_EXP_002B}")
print(f"  EXP-006A (DINOv2 linear probe): {SKIP_EXP_006A}")
print("\nOther Experiments:")
print(f"  EXP-001 (Threshold opt): {SKIP_EXP_001}")
print(f"  EXP-003A (EfficientNet B2): {SKIP_EXP_003A}")
print(f"  EXP-003B (EfficientNet B2 + LS): {SKIP_EXP_003B}")
print("=" * 60)

EXPERIMENT CONFIGURATION - Phase 2 SOTA
SMOKE_TEST: False
RANDOM_SEED: 42

Phase 2 NEW Experiments:
  EXP-002A (ConvNeXt V2 baseline): False
  EXP-002B (ConvNeXt V2 + LS): False
  EXP-006A (DINOv2 linear probe): False

Other Experiments:
  EXP-001 (Threshold opt): False
  EXP-003A (EfficientNet B2): False
  EXP-003B (EfficientNet B2 + LS): False


## Cell 2: Environment Setup & Dependencies

Install required packages including `timm` for ConvNeXt V2 support.

In [15]:
# Install dependencies
print("Installing dependencies...")
!pip install -q pyyaml scikit-learn matplotlib tqdm

# NEW: Install timm for ConvNeXt V2
print("Installing timm for ConvNeXt V2...")
!pip install -q timm

# Verify installations

print("\n" + "=" * 60)
print("DEPENDENCY CHECK")
print("=" * 60)

# Check timm
try:
    import timm
    print(f"✓ timm installed: {timm.__version__}")
except ImportError:
    print("✗ timm not available - ConvNeXt V2 will fail")

# Check torch
import torch
print(f"✓ PyTorch: {torch.__version__}")
print("=" * 60)

Installing dependencies...
Installing timm for ConvNeXt V2...

DEPENDENCY CHECK
✓ timm installed: 1.0.24
✓ PyTorch: 2.9.0+cu126


## Cell 3: GPU Check & Drive Mount

In [16]:
# Mount Google Drive
from google.colab import drive
drive.mount(DRIVE_MOUNT_PATH, force_remount=True)

# Check GPU availability
print("\n" + "=" * 60)
print("GPU CHECK")
print("=" * 60)
print(f"CUDA available: {torch.cuda.is_available()}")

if not torch.cuda.is_available():
    # Check if we're running training experiments
    needs_gpu = not (SKIP_EXP_002A and SKIP_EXP_002B and SKIP_EXP_003A and SKIP_EXP_003B and SKIP_EXP_006A)

    if needs_gpu:
        raise RuntimeError(
            "\n" + "!" * 60 + "\n" +
            "GPU REQUIRED FOR TRAINING EXPERIMENTS!\n" +
            "Go to: Runtime -> Change runtime type -> GPU\n" +
            "Then re-run this cell.\n" +
            "!" * 60
        )
    else:
        print("WARNING: No GPU available, but only running EXP-001 (CPU). Continuing...")
else:
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

print("=" * 60)

Mounted at /content/drive

GPU CHECK
CUDA available: True
CUDA version: 12.6
GPU: NVIDIA A100-SXM4-80GB
GPU memory: 85.17 GB


## Cell 4: Clone Repositories

In [17]:
# Clone training repository
!if [ ! -d "/content/cherries" ]; then \
    git clone https://github.com/usefulmove/cherries.git /content/cherries; \
else \
    cd /content/cherries && git pull; \
fi

# Clone dataset repository (shallow clone)
!if [ ! -d "/content/cherry_classification" ]; then \
    git clone --depth 1 https://github.com/weshavener/cherry_classification.git /content/cherry_classification; \
fi

# Add training scripts to path
import sys
sys.path.insert(0, '/content/cherries/training')

# Create output directories
import os
from pathlib import Path

output_dirs = [
    f"{DRIVE_BASE_PATH}/threshold_optimization",
    f"{DRIVE_BASE_PATH}/convnextv2_tiny_baseline_seed42",  # NEW: ConvNeXt V2
    f"{DRIVE_BASE_PATH}/convnextv2_tiny_label_smooth_seed42",
    f"{DRIVE_BASE_PATH}/efficientnet_b2_baseline_seed42",
    f"{DRIVE_BASE_PATH}/efficientnet_b2_label_smooth_seed42",
    f"{DRIVE_BASE_PATH}/dinov2_vits14_linear_probe_seed42",  # NEW: DINOv2
]

for dir_path in output_dirs:
    Path(dir_path).mkdir(parents=True, exist_ok=True)

print("\n" + "=" * 60)
print("SETUP COMPLETE")
print("=" * 60)
print("Training code: /content/cherries/training")
print(f"Data: {DATA_PATH}")
print(f"Output base: {DRIVE_BASE_PATH}")

# Verify data exists
if not os.path.exists(DATA_PATH):
    print(f"\nWARNING: Data not found at {DATA_PATH}")
else:
    print(f"\nData verified: {DATA_PATH}")
    !ls -lh {DATA_PATH}
print("=" * 60)

Already up to date.

SETUP COMPLETE
Training code: /content/cherries/training
Data: /content/cherry_classification/data
Output base: /content/drive/MyDrive/cherry_experiments

Data verified: /content/cherry_classification/data
total 8.0K
drwxr-xr-x 4 root root 4.0K Feb  7 06:11 train
drwxr-xr-x 4 root root 4.0K Feb  7 06:11 val


## Cell 5: EXP-001 - Threshold Optimization

**Type:** Analysis (no training)
**Duration:** 4-6 hours
**Requires:** Baseline model (94.05% ResNet50)

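Find decision thresholds that turn the two-class model's output (`cherry_clean` / `cherry_pit`) into a three-way decision (clean / pit / maybe), subject to the pit-recall floor passed as `--min-recall 0.99`.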
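
The idea can be illustrated with a small, dependency-free sketch. It is not the logic of `optimize_thresholds.py`, which is not shown in this notebook; the function names, the toy probabilities, and the `high` cut-off of 0.9 are all assumptions made for illustration. The lower threshold is chosen as the largest value that still keeps the required share of true pits out of the "clean" bin.

```python
def classify_three_way(pit_prob, low, high):
    """Map a pit probability to 'clean', 'maybe', or 'pit'."""
    if pit_prob < low:
        return "clean"
    if pit_prob >= high:
        return "pit"
    return "maybe"


def lowest_threshold_for_recall(pit_probs, is_pit, min_recall=0.99):
    """Largest `low` such that at least `min_recall` of the true pits
    have pit_prob >= low (i.e. are not labelled 'clean')."""
    pit_scores = sorted(p for p, pit in zip(pit_probs, is_pit) if pit)
    if not pit_scores:
        raise ValueError("need at least one pit sample")
    # At most floor((1 - min_recall) * n) pits may fall below `low`.
    # The small epsilon guards against float round-off such as 0.9999999.
    allowed_misses = int((1.0 - min_recall) * len(pit_scores) + 1e-9)
    return pit_scores[allowed_misses]


# Toy validation scores (made up): predicted pit probability and true label.
pit_probs = [0.02, 0.10, 0.40, 0.55, 0.80, 0.95, 0.97, 0.99]
is_pit = [False, False, False, True, True, True, True, True]

low = lowest_threshold_for_recall(pit_probs, is_pit, min_recall=0.99)
decisions = [classify_three_way(p, low, high=0.9) for p in pit_probs]
print(f"low threshold: {low}")
print(decisions)
```

Samples that land in the "maybe" bin are the ones a deployment would route to manual inspection or a second pass; how wide that bin should be is a throughput trade-off the sketch does not model.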

In [18]:
if SKIP_EXP_001:
    print("EXP-001: SKIPPED (set SKIP_EXP_001=False to run)")
else:
    print("\n" + "=" * 60)
    print("RUNNING EXP-001: Threshold Optimization")
    print("=" * 60)

    # Verify baseline model exists
    if not os.path.exists(BASELINE_MODEL_PATH):
        print(f"ERROR: Baseline model not found: {BASELINE_MODEL_PATH}")
        print("Please upload the 94.05% ResNet50 model to Google Drive")
        print("Expected location: MyDrive/cherry_experiments/resnet50_augmented_unnormalized/model_best_fixed.pt")
    else:
        output_dir = f"{DRIVE_BASE_PATH}/threshold_optimization"

        # Run threshold optimization
        !python /content/cherries/training/scripts/optimize_thresholds.py \
            --model-path {BASELINE_MODEL_PATH} \
            --data-root {DATA_PATH} \
            --architecture resnet50 \
            --output-dir {output_dir} \
            --min-recall 0.99 \
            --device cpu

        print("\n" + "=" * 60)
        print(f"Results saved to: {output_dir}")
        print("\nKey files to download:")
        print("  - threshold_results.json")
        print("  - optimal_thresholds.yaml")
        print("  - threshold_analysis.png")
        print("  - probability_distributions.png")
        print("=" * 60)


RUNNING EXP-001: Threshold Optimization
ERROR: Baseline model not found: /content/drive/MyDrive/cherry_experiments/resnet50_augmented_unnormalized/model_best_fixed.pt
Please upload the 94.05% ResNet50 model to Google Drive
Expected location: MyDrive/cherry_experiments/resnet50_augmented_unnormalized/model_best_fixed.pt


## Cell 6: EXP-002A - ConvNeXt V2-Tiny Baseline (NEW - Phase 2)

**Type:** Training
**Duration:** ~12 hours (30 epochs)
**Architecture:** ConvNeXt V2-Tiny with FCMAE pre-training (via timm)
**Key Feature:** Enhanced augmentations with motion blur for conveyor realism
**Hypothesis:** FCMAE pre-training will prove superior for defect detection and reach ≥94.5% accuracy
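
The augmentation pipeline itself lives in the training repository and is not shown here. As a rough illustration of what a motion-blur augmentation does to an image, the sketch below averages each pixel with its horizontal neighbours. The function name, the grayscale list-of-rows image format, and the choice of a purely horizontal blur (standing in for belt movement) are assumptions for this example.

```python
def horizontal_motion_blur(image, kernel_size=5):
    """Blur a 2-D grayscale image (list of rows) along the x axis.

    Each output pixel is the mean of `kernel_size` horizontal neighbours.
    Edge pixels are replicated so the window never leaves the row.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    blurred = []
    for row in image:
        width = len(row)
        new_row = []
        for x in range(width):
            # Clamp neighbour indices to the valid range [0, width - 1].
            window = [row[min(max(x + dx, 0), width - 1)]
                      for dx in range(-half, half + 1)]
            new_row.append(sum(window) / kernel_size)
        blurred.append(new_row)
    return blurred


# A single bright pixel smears into a horizontal streak.
image = [[0, 0, 0, 0, 0],
         [0, 0, 255, 0, 0],
         [0, 0, 0, 0, 0]]
blurred = horizontal_motion_blur(image, kernel_size=3)
for row in blurred:
    print(row)
```

A larger `kernel_size` corresponds to a faster belt or a longer exposure; in a real pipeline the kernel size would typically be sampled at random per image.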

In [19]:
if SKIP_EXP_002A:
    print("EXP-002A: SKIPPED (set SKIP_EXP_002A=False to run)")
else:
    print("\n" + "=" * 60)
    print("RUNNING EXP-002A: ConvNeXt V2-Tiny Baseline (Phase 2)")
    print("Features: FCMAE pre-training, AdamW optimizer, enhanced augmentations")
    print("=" * 60)

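    # Set random seed. Note: this seeds only the notebook kernel; train.py
    # runs as a separate process and has to seed itself.
    import random
    import numpy as np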

    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    torch.manual_seed(RANDOM_SEED)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(RANDOM_SEED)

    print(f"Random seed set: {RANDOM_SEED}")

    # Load config
    import yaml

    config_path = "/content/cherries/training/configs/experiments/convnextv2_tiny_baseline_seed42.yaml"

    with open(config_path, 'r') as f:
        config = yaml.safe_load(f)

    # Override paths for Colab
    config['checkpointing']['output_dir'] = f"{DRIVE_BASE_PATH}/convnextv2_tiny_baseline_seed42"
    config['data']['root'] = DATA_PATH

    # Apply smoke test settings
    if SMOKE_TEST:
        print("\nSMOKE TEST MODE: 1 epoch, 3 batches")
        config['training']['epochs'] = 1
        config['checkpointing']['save_every'] = 1

    # Save temp config
    temp_config = "/tmp/convnextv2_tiny_baseline_seed42.yaml"
    with open(temp_config, 'w') as f:
        yaml.dump(config, f)

    # Run training
    !python /content/cherries/training/scripts/train.py \
        --config {temp_config} \
        --data-root {DATA_PATH}

    print("\n" + "=" * 60)
    print(f"Training complete! Results: {config['checkpointing']['output_dir']}")
    print("\nKey files to download:")
    print("  - model_best.pt")
    print("  - metrics.json")
    print("=" * 60)


RUNNING EXP-002A: ConvNeXt V2-Tiny Baseline (Phase 2)
Features: FCMAE pre-training, AdamW optimizer, enhanced augmentations
Random seed set: 42
Using cache found in /root/.cache/torch/hub/facebookresearch_dinov2_main

Loading config from: /tmp/convnextv2_tiny_baseline_seed42.yaml
Output directory: /content/drive/MyDrive/cherry_experiments/convnextv2_tiny_baseline_seed42
Config saved to: /content/drive/MyDrive/cherry_experiments/convnextv2_tiny_baseline_seed42/config.yaml
Using device: cuda

LOADING DATA

Dataset loaded from: /content/cherry_classification/data
Training samples: 3676
Validation samples: 1226
Classes: ['cherry_clean', 'cherry_pit']
Class distribution (train): {'cherry_clean': 1978, 'cherry_pit': 1698}
Class distribution (val): {'cherry_clean': 660, 'cherry_pit': 566}

Class names: ['cherry_clean', 'cherry_pit']

CREATING MODEL
Architecture: convnextv2_tiny
Loaded convnextv2_tiny.fcmae_ft_in1k with FCMAE pre-training
Model moved to device: cuda
Optimizer: adamw
Learning [output truncated]

## Cell 7: EXP-002B - ConvNeXt V2-Tiny with Label Smoothing
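
EXP-002B repeats EXP-002A with label smoothing (alpha=0.1). The sketch below shows what that does to the training target and the loss for a two-class problem. It assumes the common convention of mixing the one-hot target with a uniform distribution (the one PyTorch's `CrossEntropyLoss(label_smoothing=...)` uses); whether `train.py` implements it exactly this way is not shown in this notebook, and the function names and logits are illustrative.

```python
import math


def smoothed_targets(true_index, num_classes, alpha=0.1):
    """Mix the one-hot target with a uniform distribution over classes."""
    uniform = alpha / num_classes
    return [(1.0 - alpha) + uniform if i == true_index else uniform
            for i in range(num_classes)]


def cross_entropy(logits, targets):
    """Cross-entropy between softmax(logits) and a target distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -sum(t * (z - log_norm) for z, t in zip(logits, targets))


# Two classes, true class is index 1 (cherry_pit); logits are made up.
targets = smoothed_targets(true_index=1, num_classes=2, alpha=0.1)
logits = [-2.0, 2.0]
hard = cross_entropy(logits, [0.0, 1.0])
soft = cross_entropy(logits, targets)
print("smoothed targets:", [round(t, 3) for t in targets])
print(f"hard-label loss={hard:.4f}, smoothed loss={soft:.4f}")
```

With two classes and alpha=0.1 the target becomes roughly 0.95 / 0.05 instead of 1 / 0. A confident correct prediction is therefore still penalised slightly, which discourages over-confident logits.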

In [20]:
if SKIP_EXP_002B:
    print("EXP-002B: SKIPPED (set SKIP_EXP_002B=False to run)")
else:
    print("\n" + "=" * 60)
    print("RUNNING EXP-002B: ConvNeXt V2-Tiny with Label Smoothing")
    print("=" * 60)

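    # Imports repeated so this cell also runs when EXP-002A is skipped
    import random
    import numpy as np
    import yaml

    # Set random seed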
    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    torch.manual_seed(RANDOM_SEED)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(RANDOM_SEED)

    print(f"Random seed set: {RANDOM_SEED}")
    print("Label smoothing: alpha=0.1")

    # Load config
    config_path = "/content/cherries/training/configs/experiments/convnextv2_tiny_label_smooth_seed42.yaml"

    with open(config_path, 'r') as f:
        config = yaml.safe_load(f)

    # Override paths
    config['checkpointing']['output_dir'] = f"{DRIVE_BASE_PATH}/convnextv2_tiny_label_smooth_seed42"
    config['data']['root'] = DATA_PATH

    # Apply smoke test settings
    if SMOKE_TEST:
        print("\nSMOKE TEST MODE: 1 epoch, 3 batches")
        config['training']['epochs'] = 1
        config['checkpointing']['save_every'] = 1

    # Save temp config
    temp_config = "/tmp/convnextv2_tiny_label_smooth_seed42.yaml"
    with open(temp_config, 'w') as f:
        yaml.dump(config, f)

    # Run training
    !python /content/cherries/training/scripts/train.py \
        --config {temp_config} \
        --data-root {DATA_PATH}

    print("\n" + "=" * 60)
    print(f"Training complete! Results: {config['checkpointing']['output_dir']}")
    print("=" * 60)


RUNNING EXP-002B: ConvNeXt V2-Tiny with Label Smoothing
Random seed set: 42
Label smoothing: alpha=0.1
Using cache found in /root/.cache/torch/hub/facebookresearch_dinov2_main

Loading config from: /tmp/convnextv2_tiny_label_smooth_seed42.yaml
Output directory: /content/drive/MyDrive/cherry_experiments/convnextv2_tiny_label_smooth_seed42
Config saved to: /content/drive/MyDrive/cherry_experiments/convnextv2_tiny_label_smooth_seed42/config.yaml
Using device: cuda

LOADING DATA

Dataset loaded from: /content/cherry_classification/data
Training samples: 3676
Validation samples: 1226
Classes: ['cherry_clean', 'cherry_pit']
Class distribution (train): {'cherry_clean': 1978, 'cherry_pit': 1698}
Class distribution (val): {'cherry_clean': 660, 'cherry_pit': 566}

Class names: ['cherry_clean', 'cherry_pit']

CREATING MODEL
Architecture: convnextv2_tiny
Loaded convnextv2_tiny.fcmae_ft_in1k with FCMAE pre-training
Model moved to device: cuda
Optimizer: adamw
Learning rate: 0.0001
Weight decay: 0. [output truncated]

## Cell 8: EXP-003A - EfficientNet-B2 Baseline

**Type:** Training
**Duration:** ~10 hours
**Architecture:** EfficientNet-B2 (9.2M params)
**Key Feature:** Speed-focused alternative, unnormalized training

In [21]:
if SKIP_EXP_003A:
    print("EXP-003A: SKIPPED (set SKIP_EXP_003A=False to run)")
else:
    print("\n" + "=" * 60)
    print("RUNNING EXP-003A: EfficientNet-B2 Baseline")
    print("=" * 60)

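    # Imports repeated so this cell also runs when EXP-002A is skipped
    import random
    import numpy as np
    import yaml

    # Set random seed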
    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    torch.manual_seed(RANDOM_SEED)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(RANDOM_SEED)

    print(f"Random seed set: {RANDOM_SEED}")

    # Load config
    config_path = "/content/cherries/training/configs/experiments/efficientnet_b2_baseline_seed42.yaml"

    with open(config_path, 'r') as f:
        config = yaml.safe_load(f)

    # Override paths
    config['checkpointing']['output_dir'] = f"{DRIVE_BASE_PATH}/efficientnet_b2_baseline_seed42"
    config['data']['root'] = DATA_PATH

    # Apply smoke test settings
    if SMOKE_TEST:
        print("\nSMOKE TEST MODE: 1 epoch, 3 batches")
        config['training']['epochs'] = 1
        config['checkpointing']['save_every'] = 1

    # Save temp config
    temp_config = "/tmp/efficientnet_b2_baseline_seed42.yaml"
    with open(temp_config, 'w') as f:
        yaml.dump(config, f)

    # Run training
    !python /content/cherries/training/scripts/train.py \
        --config {temp_config} \
        --data-root {DATA_PATH}

    print("\n" + "=" * 60)
    print(f"Training complete! Results: {config['checkpointing']['output_dir']}")
    print("=" * 60)


RUNNING EXP-003A: EfficientNet-B2 Baseline
Random seed set: 42
Using cache found in /root/.cache/torch/hub/facebookresearch_dinov2_main

Loading config from: /tmp/efficientnet_b2_baseline_seed42.yaml
Output directory: /content/drive/MyDrive/cherry_experiments/efficientnet_b2_baseline_seed42
Config saved to: /content/drive/MyDrive/cherry_experiments/efficientnet_b2_baseline_seed42/config.yaml
Using device: cuda

LOADING DATA

Dataset loaded from: /content/cherry_classification/data
Training samples: 3676
Validation samples: 1226
Classes: ['cherry_clean', 'cherry_pit']
Class distribution (train): {'cherry_clean': 1978, 'cherry_pit': 1698}
Class distribution (val): {'cherry_clean': 660, 'cherry_pit': 566}

Class names: ['cherry_clean', 'cherry_pit']

CREATING MODEL
Architecture: efficientnet_b2
Loaded EfficientNet-B2 with ImageNet weights
Replaced classifier head: Linear(1408, 2)
Model moved to device: cuda
Optimizer: adam
Learning rate: 0.0001
Weight decay: 1e-05
Using standard CrossEnt [output truncated]

## Cell 9: EXP-003B - EfficientNet-B2 with Label Smoothing

In [22]:
if SKIP_EXP_003B:
    print("EXP-003B: SKIPPED (set SKIP_EXP_003B=False to run)")
else:
    print("\n" + "=" * 60)
    print("RUNNING EXP-003B: EfficientNet-B2 with Label Smoothing")
    print("=" * 60)

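    # Imports repeated so this cell also runs when EXP-002A is skipped
    import random
    import numpy as np
    import yaml

    # Set random seed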
    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    torch.manual_seed(RANDOM_SEED)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(RANDOM_SEED)

    print(f"Random seed set: {RANDOM_SEED}")
    print("Label smoothing: alpha=0.1")

    # Load config
    config_path = "/content/cherries/training/configs/experiments/efficientnet_b2_label_smooth_seed42.yaml"

    with open(config_path, 'r') as f:
        config = yaml.safe_load(f)

    # Override paths
    config['checkpointing']['output_dir'] = f"{DRIVE_BASE_PATH}/efficientnet_b2_label_smooth_seed42"
    config['data']['root'] = DATA_PATH

    # Apply smoke test settings
    if SMOKE_TEST:
        print("\nSMOKE TEST MODE: 1 epoch, 3 batches")
        config['training']['epochs'] = 1
        config['checkpointing']['save_every'] = 1

    # Save temp config
    temp_config = "/tmp/efficientnet_b2_label_smooth_seed42.yaml"
    with open(temp_config, 'w') as f:
        yaml.dump(config, f)

    # Run training
    !python /content/cherries/training/scripts/train.py \
        --config {temp_config} \
        --data-root {DATA_PATH}

    print("\n" + "=" * 60)
    print(f"Training complete! Results: {config['checkpointing']['output_dir']}")
    print("=" * 60)


RUNNING EXP-003B: EfficientNet-B2 with Label Smoothing
Random seed set: 42
Label smoothing: alpha=0.1
Using cache found in /root/.cache/torch/hub/facebookresearch_dinov2_main

Loading config from: /tmp/efficientnet_b2_label_smooth_seed42.yaml
Output directory: /content/drive/MyDrive/cherry_experiments/efficientnet_b2_label_smooth_seed42
Config saved to: /content/drive/MyDrive/cherry_experiments/efficientnet_b2_label_smooth_seed42/config.yaml
Using device: cuda

LOADING DATA

Dataset loaded from: /content/cherry_classification/data
Training samples: 3676
Validation samples: 1226
Classes: ['cherry_clean', 'cherry_pit']
Class distribution (train): {'cherry_clean': 1978, 'cherry_pit': 1698}
Class distribution (val): {'cherry_clean': 660, 'cherry_pit': 566}

Class names: ['cherry_clean', 'cherry_pit']

CREATING MODEL
Architecture: efficientnet_b2
Loaded EfficientNet-B2 with ImageNet weights
Replaced classifier head: Linear(1408, 2)
Model moved to device: cuda
Optimizer: adam
Learning rate: [output truncated]

## Cell 10: EXP-006A - DINOv2 ViT-S/14 Linear Probe (NEW - Phase 2)

**Type:** Training (frozen backbone)
**Duration:** ~4-6 hours (fast training - only linear head optimized)
**Architecture:** DINOv2 ViT-S/14 foundation model
**Key Feature:** Self-supervised features, frozen backbone + trainable head
**Hypothesis:** Foundation model features will achieve ≥94.5% with minimal training
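
A linear probe trains only a small classification head on top of features from a frozen backbone. The sketch below shows that split in plain Python: `frozen_backbone` is a stand-in for the DINOv2 feature extractor (a fixed function that is never updated), and a logistic-regression head is fitted with gradient descent. All names, the toy inputs, and the hyperparameters are assumptions for illustration and do not reflect the training script.

```python
import math


def frozen_backbone(x):
    """Stand-in for a frozen feature extractor: fixed, never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression head on precomputed features with
    full-batch gradient descent. Only `weights` and `bias` change."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for f, y in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
            for i, v in enumerate(f):
                grad_w[i] += (p - y) * v / n
            grad_b += (p - y) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(weights, bias, feature):
    """Return 1 if the head's logit is positive, else 0."""
    return 1 if sum(w * v for w, v in zip(weights, feature)) + bias > 0 else 0


# Toy inputs (made up): class 1 has a larger coordinate sum.
inputs = [(0.0, 0.2), (0.3, 0.1), (0.1, 0.4), (1.0, 0.9), (0.8, 1.2), (1.1, 0.7)]
labels = [0, 0, 0, 1, 1, 1]
features = [frozen_backbone(x) for x in inputs]  # computed once, backbone frozen
weights, bias = train_linear_probe(features, labels)
accuracy = sum(predict(weights, bias, f) == y
               for f, y in zip(features, labels)) / len(labels)
print(f"train accuracy: {accuracy:.2f}")
```

Because the backbone never changes, its features can be computed once and cached, which is why a linear probe trains much faster than full fine-tuning.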

In [23]:
if SKIP_EXP_006A:
    print("EXP-006A: SKIPPED (set SKIP_EXP_006A=False to run)")
else:
    print("\n" + "=" * 60)
    print("RUNNING EXP-006A: DINOv2 ViT-S/14 Linear Probe (Phase 2)")
    print("Features: Frozen foundation backbone, fast training, normalized input")
    print("=" * 60)

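    # Imports repeated so this cell also runs when EXP-002A is skipped
    import random
    import numpy as np
    import yaml

    # Set random seed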
    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    torch.manual_seed(RANDOM_SEED)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(RANDOM_SEED)

    print(f"Random seed set: {RANDOM_SEED}")

    # Load config
    config_path = "/content/cherries/training/configs/experiments/dinov2_vits14_linear_probe_seed42.yaml"

    with open(config_path, 'r') as f:
        config = yaml.safe_load(f)

    # Override paths
    config['checkpointing']['output_dir'] = f"{DRIVE_BASE_PATH}/dinov2_vits14_linear_probe_seed42"
    config['data']['root'] = DATA_PATH

    # Apply smoke test settings
    if SMOKE_TEST:
        print("\nSMOKE TEST MODE: 1 epoch, 3 batches")
        config['training']['epochs'] = 1
        config['checkpointing']['save_every'] = 1

    # Save temp config
    temp_config = "/tmp/dinov2_vits14_linear_probe_seed42.yaml"
    with open(temp_config, 'w') as f:
        yaml.dump(config, f)

    # Run training
    !python /content/cherries/training/scripts/train.py \
        --config {temp_config} \
        --data-root {DATA_PATH}

    print("\n" + "=" * 60)
    print(f"Training complete! Results: {config['checkpointing']['output_dir']}")
    print("\nNOTE: DINOv2 inference may be slower than CNNs. If accurate but slow,")
    print("      consider using it as a teacher for distillation.")
    print("=" * 60)


RUNNING EXP-006A: DINOv2 ViT-S/14 Linear Probe (Phase 2)
Features: Frozen foundation backbone, fast training, normalized input
Random seed set: 42
Using cache found in /root/.cache/torch/hub/facebookresearch_dinov2_main

Loading config from: /tmp/dinov2_vits14_linear_probe_seed42.yaml
Output directory: /content/drive/MyDrive/cherry_experiments/dinov2_vits14_linear_probe_seed42
Config saved to: /content/drive/MyDrive/cherry_experiments/dinov2_vits14_linear_probe_seed42/config.yaml
Using device: cuda

LOADING DATA

Dataset loaded from: /content/cherry_classification/data
Training samples: 3676
Validation samples: 1226
Classes: ['cherry_clean', 'cherry_pit']
Class distribution (train): {'cherry_clean': 1978, 'cherry_pit': 1698}
Class distribution (val): {'cherry_clean': 660, 'cherry_pit': 566}

Class names: ['cherry_clean', 'cherry_pit']

CREATING MODEL
Architecture: dinov2_vits14
Using cache found in /root/.cache/torch/hub/facebookresearch_dinov2_main
Loaded DINOv2 ViT-S/14 backbone fro [output truncated]

## Cell 11: Experiment Summary

View all completed experiments and their results.
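
The summary cell treats `metrics.json` as JSON lines and inspects only the final line. A slightly more defensive variant, sketched here, scans backwards for the most recent validation record and skips blank or partially written lines. The field names `phase`, `epoch`, and `accuracy` are the ones the summary cell reads; the sample records are made up, and the real file written by `train.py` may use different keys.

```python
import json


def last_validation_entry(lines):
    """Return the last JSON-lines record whose 'phase' is 'val', or None."""
    for line in reversed(lines):
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if isinstance(entry, dict) and entry.get("phase") == "val":
            return entry
    return None


# Made-up log contents for illustration.
sample = [
    '{"epoch": 1, "phase": "train", "accuracy": 0.81}',
    '{"epoch": 1, "phase": "val", "accuracy": 0.84}',
    '{"epoch": 2, "phase": "train", "accuracy": 0.88}',
    '{"epoch": 2, "phase": "val", "accuracy": 0.90}',
    '{"epoch": 3, "phase": "train", "accu',  # truncated final line
]
entry = last_validation_entry(sample)
print(entry)
```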

In [24]:
print("\n" + "=" * 60)
print("EXPERIMENT SUMMARY - Phase 2 SOTA")
print("=" * 60)

import json
from pathlib import Path

base_path = Path(DRIVE_BASE_PATH)

print("\nCompleted Experiments:")
print("-" * 60)

results = []

# Check each experiment directory
for exp_dir in sorted(base_path.glob("*")):
    if exp_dir.is_dir():
        metrics_file = exp_dir / "metrics.json"
        model_file = exp_dir / "model_best.pt"

        if metrics_file.exists():
            # Read last line for final metrics
            with open(metrics_file) as f:
                lines = f.readlines()
                if lines:
                    try:
                        last_entry = json.loads(lines[-1])
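                        # NOTE: the recorded run printed "Acc: N/A", so 'accuracy' is
                        # evidently not the key train.py writes. The fallback names
                        # below are guesses; check metrics.json for the real key.
                        acc = next(
                            (last_entry[k]
                             for k in ('accuracy', 'acc', 'val_acc', 'val_accuracy')
                             if k in last_entry),
                            'N/A',
                        )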
                        epoch = last_entry.get('epoch', 'N/A')
                        phase = last_entry.get('phase', 'N/A')

                        if phase == 'val':
                            status = '✓' if model_file.exists() else '⚠'
                            print(f"{status} {exp_dir.name:45s} | Epoch {str(epoch):>2s} | Acc: {acc}")
                            results.append({
                                'name': exp_dir.name,
                                'accuracy': acc,
                                'epoch': epoch
                            })
                    except json.JSONDecodeError:
                        pass

print("-" * 60)

if results:
    print(f"\nTotal experiments completed: {len(results)}")

    # Find best accuracy
    valid_results = [r for r in results if isinstance(r['accuracy'], (int, float))]
    if valid_results:
        best = max(valid_results, key=lambda x: x['accuracy'])
        print(f"\nBest model: {best['name']}")
        print(f"Accuracy: {best['accuracy']:.2%}")

        # Identify Phase 2 experiments
        phase2_models = [r for r in valid_results if 'convnextv2' in r['name'] or 'dinov2' in r['name']]
        if phase2_models:
            best_p2 = max(phase2_models, key=lambda x: x['accuracy'])
            print(f"\nBest Phase 2 (SOTA) model: {best_p2['name']}")
            print(f"Accuracy: {best_p2['accuracy']:.2%}")
else:
    print("\nNo completed experiments found yet.")

print("\n" + "=" * 60)
print(f"All results saved to: {DRIVE_BASE_PATH}")
print("\nDownload these files for local analysis:")
print("  - model_best.pt (trained model weights)")
print("  - metrics.json (training history)")
print("  - config.yaml (experiment configuration)")
print("=" * 60)


EXPERIMENT SUMMARY - Phase 2 SOTA

Completed Experiments:
------------------------------------------------------------
✓ convnextv2_tiny_baseline_seed42               | Epoch 30 | Acc: N/A
✓ convnextv2_tiny_label_smooth_seed42           | Epoch 30 | Acc: N/A
✓ dinov2_vits14_linear_probe_seed42             | Epoch 30 | Acc: N/A
✓ efficientnet_b2_baseline_seed42               | Epoch 30 | Acc: N/A
✓ efficientnet_b2_label_smooth_seed42           | Epoch 30 | Acc: N/A
------------------------------------------------------------

Total experiments completed: 5

All results saved to: /content/drive/MyDrive/cherry_experiments

Download these files for local analysis:
  - model_best.pt (trained model weights)
  - metrics.json (training history)
  - config.yaml (experiment configuration)


## Cell 12: Model Comparison & Decision Matrix

Compare all trained models and make deployment decision.
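
The three deployment gates listed in the cell below can be expressed as a small check. This is a sketch only: the function name and the candidate numbers are hypothetical (none of them are measured results), while the default thresholds mirror the decision criteria printed by this notebook.

```python
def meets_deploy_criteria(accuracy, cpu_latency_ms, pit_recall,
                          min_accuracy=0.9405, max_latency_ms=30.0,
                          min_pit_recall=0.99):
    """Return (ok, failures) for the three deployment gates."""
    failures = []
    if accuracy < min_accuracy:
        failures.append(f"accuracy {accuracy:.2%} < {min_accuracy:.2%}")
    if cpu_latency_ms >= max_latency_ms:
        failures.append(f"latency {cpu_latency_ms:.1f} ms >= {max_latency_ms:.1f} ms")
    if pit_recall < min_pit_recall:
        failures.append(f"pit recall {pit_recall:.2%} < {min_pit_recall:.2%}")
    return not failures, failures


# Hypothetical candidates; none of these numbers are measured results.
candidates = {
    "model_a": dict(accuracy=0.951, cpu_latency_ms=22.0, pit_recall=0.992),
    "model_b": dict(accuracy=0.958, cpu_latency_ms=41.0, pit_recall=0.995),
    "model_c": dict(accuracy=0.948, cpu_latency_ms=12.0, pit_recall=0.981),
}
for name, metrics in candidates.items():
    ok, failures = meets_deploy_criteria(**metrics)
    print(name, "DEPLOY" if ok else "REJECT", failures)
```

Returning the list of failed gates, rather than a bare boolean, makes it easier to see whether a rejected model is a candidate for distillation (too slow) or for threshold tuning (recall too low).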

In [25]:
print("\n" + "=" * 60)
print("MODEL COMPARISON & DECISION MATRIX")
print("=" * 60)

# Find all model_best.pt files
model_files = list(Path(DRIVE_BASE_PATH).rglob("model_best.pt"))

print(f"\nFound {len(model_files)} trained model(s)")

if len(model_files) >= 1:
    print("\nModels available:")
    for i, model_path in enumerate(model_files, 1):
        exp_name = model_path.parent.name
        is_phase2 = 'convnextv2' in exp_name or 'dinov2' in exp_name
        marker = " [Phase 2 SOTA]" if is_phase2 else ""
        print(f"  {i}. {exp_name}{marker}")

    print("\n" + "=" * 60)
    print("DECISION CRITERIA:")
    print("=" * 60)
    print("Deploy if:")
    print("  ✓ Accuracy ≥ 94.05% (beats baseline)")
    print("  ✓ Latency < 30ms on CPU")
    print("  ✓ Pit recall ≥ 99.0% (food safety)")
    print("\nPhase 2 Priority:")
    print("  1. ConvNeXt V2-Tiny (best accuracy/speed tradeoff)")
    print("  2. DINOv2 (if accuracy > 95%, consider distillation)")
    print("  3. EfficientNet-B2 (if speed is critical)")

    print("\n" + "=" * 60)
    print("Next steps after training:")
    print("  1. Run local comparison: python scripts/compare_models.py")
    print("  2. Benchmark latency on production hardware")
    print("  3. Optimize thresholds for 3-class deployment")

print("=" * 60)


MODEL COMPARISON & DECISION MATRIX

Found 5 trained model(s)

Models available:
  1. convnextv2_tiny_baseline_seed42 [Phase 2 SOTA]
  2. convnextv2_tiny_label_smooth_seed42 [Phase 2 SOTA]
  3. efficientnet_b2_baseline_seed42
  4. efficientnet_b2_label_smooth_seed42
  5. dinov2_vits14_linear_probe_seed42 [Phase 2 SOTA]

DECISION CRITERIA:
Deploy if:
  ✓ Accuracy ≥ 94.05% (beats baseline)
  ✓ Latency < 30ms on CPU
  ✓ Pit recall ≥ 99.0% (food safety)

Phase 2 Priority:
  1. ConvNeXt V2-Tiny (best accuracy/speed tradeoff)
  2. DINOv2 (if accuracy > 95%, consider distillation)
  3. EfficientNet-B2 (if speed is critical)

Next steps after training:
  1. Run local comparison: python scripts/compare_models.py
  2. Benchmark latency on production hardware
  3. Optimize thresholds for 3-class deployment


## Cell 13: Download Results

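Create a download script for transferring results from Drive to a local machine. The script uses `DRIVE_BASE_PATH` (a Colab mount path) as its source, which does not exist outside Colab; point `DRIVE_SOURCE` at the local Google Drive sync folder before running it.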

In [26]:
# Create download script
# NOTE: DRIVE_SOURCE is the Colab mount path; on a local machine, change it
# to wherever Google Drive is synced before running the script.
# The shebang must be the very first line of the file, so the string
# starts immediately after the opening quotes.
download_script = f"""#!/bin/bash
# Download Phase 2 experiment results from Google Drive

# Source (Google Drive path)
DRIVE_SOURCE="{DRIVE_BASE_PATH}"

# Destination (local path)
LOCAL_DEST="./training/experiments"

echo "Downloading Phase 2 experiment results..."
echo "From: $DRIVE_SOURCE"
echo "To: $LOCAL_DEST"

# Create destination directory
mkdir -p "$LOCAL_DEST"

# Sync from Drive
rsync -av --progress "$DRIVE_SOURCE/" "$LOCAL_DEST/"

echo "Download complete!"
echo
echo "Phase 2 models to evaluate:"
ls -lh "$LOCAL_DEST"/*/model_best.pt 2>/dev/null || echo "No models found"
"""

# Save script
script_path = f"{DRIVE_BASE_PATH}/download_phase2_results.sh"
with open(script_path, 'w') as f:
    f.write(download_script)

print("\n" + "=" * 60)
print("DOWNLOAD INSTRUCTIONS")
print("=" * 60)
print(f"Download script created: {script_path}")
print("\nTo download results to your local machine:")
print("1. Open a terminal on your local machine")
print("2. Navigate to your project root")
print("3. Run the download script:")
print(f"\n   bash {script_path}")
print("\nOr manually copy from Google Drive:")
print(f"   Source: {DRIVE_BASE_PATH}")
print("   Dest: ./training/experiments/")
print("=" * 60)


DOWNLOAD INSTRUCTIONS
Download script created: /content/drive/MyDrive/cherry_experiments/download_phase2_results.sh

To download results to your local machine:
1. Open a terminal on your local machine
2. Navigate to your project root
3. Run the download script:

   bash /content/drive/MyDrive/cherry_experiments/download_phase2_results.sh

Or manually copy from Google Drive:
   Source: /content/drive/MyDrive/cherry_experiments
   Dest: ./training/experiments/


---

## Summary

This Phase 2 notebook implements state-of-the-art approaches based on external research:

### What's New
1. **ConvNeXt V2** - FCMAE pre-training, expected to improve defect detection
2. **DINOv2** - Foundation model with linear probe (fast training; high accuracy expected)
3. **Enhanced Augmentations** - Motion blur + photometric distortion for conveyor realism

### Experiment Priority
1. **EXP-001** - Run first (CPU, immediate value)
2. **EXP-006A** - DINOv2 (fast, high potential)
3. **EXP-002A/B** - ConvNeXt V2 (main contender)
4. **EXP-003A/B** - EfficientNet (speed alternative)

### Success Criteria
- **Minimum:** ≥94.05% accuracy (match or beat the baseline)
- **Target:** ≥94.5% accuracy
- **Stretch:** ≥95% accuracy or <10ms latency

### References
- [Experiment Specifications](../../docs/reference/EXPERIMENT_SPECIFICATIONS.md)
- [Phase 2 Implementation Summary](../../docs/reference/PHASE2_IMPLEMENTATION_SUMMARY.md)
- External Research: temp-external-research/claude.md, gemini.md, gpt.md