# Unnormalized Training Experiment

This notebook runs a single experiment: `resnet50_augmented_unnormalized`.
The goal is to train a model that works with **raw images (0-255)** to match the production system, while still using augmentation to achieve high accuracy.

**Target:** >93% accuracy on the normalized validation set (or the equivalent raw set).
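
To make the raw-versus-normalized distinction concrete, the sketch below contrasts a raw pixel with the same pixel after ImageNet-style normalization. It assumes that "normalized" means scaling to [0, 1] and then standardizing with the commonly used ImageNet channel statistics; the training config may define it differently. `raw_pixel` and `normalized_pixel` are illustrative names, not functions from the repository.

```python
# Widely used ImageNet channel statistics (assumed here, not read from the config)
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def raw_pixel(rgb):
    """Return the pixel as floats in the raw 0-255 range (no normalization)."""
    return tuple(float(c) for c in rgb)


def normalized_pixel(rgb):
    """Scale each channel to [0, 1], then subtract the mean and divide by the std."""
    return tuple(
        (c / 255.0 - mean) / std
        for c, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


pixel = (200, 30, 30)  # a reddish pixel, chosen only for illustration
print("raw:       ", raw_pixel(pixel))
print("normalized:", normalized_pixel(pixel))
```

The two representations differ by roughly two orders of magnitude, so a model trained on one will generally misbehave when served the other.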


## Step 1: Setup Environment

In [None]:
# 1. Mount Google Drive
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# 2. Install Dependencies
!pip install -q pyyaml scikit-learn matplotlib tqdm

# 3. Clone Repository (Training Code)
# REPLACE WITH YOUR REPO URL IF DIFFERENT
# The directory check skips the clone when this cell is re-run
![ -d cherries ] || git clone https://github.com/usefulmove/cherries.git

# 4. Clone Dataset (Shallow clone)
![ -d cherry_classification ] || git clone --depth 1 https://github.com/weshavener/cherry_classification.git

print("\nSetup complete!")

## Step 2: Run Training
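
Before launching a long run, it can help to confirm that the clones from Step 1 produced the files the training cell expects. A minimal sketch, with `missing_paths` as a hypothetical helper and the paths copied from the training cell; adjust them if your layout differs.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


# Paths copied from the training cell; relative ones resolve against the
# current working directory (/content by default on Colab).
required = [
    "cherries/training/configs/resnet50_augmented_unnormalized.yaml",
    "cherries/training/scripts/train.py",
    "/content/cherry_classification/data",
]

missing = missing_paths(required)
if missing:
    print("Missing before training:")
    for p in missing:
        print(f"  {p}")
else:
    print("All required paths found.")
```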

In [None]:
import subprocess

experiment_name = "resnet50_augmented_unnormalized"

# Relative paths assume the working directory is /content (the Colab default)
config_path = f"cherries/training/configs/{experiment_name}.yaml"
drive_output_base = "/content/drive/MyDrive/cherry_training/experiments"
data_root = "/content/cherry_classification/data"

# Outputs go to Drive so they survive a runtime reset
output_dir = f"{drive_output_base}/{experiment_name}"

print(f"\n{'='*60}")
print(f"STARTING EXPERIMENT: {experiment_name}")
print(f"Config: {config_path}")
print(f"Output: {output_dir}")
print(f"{'='*60}\n")

cmd = [
    "python", "cherries/training/scripts/train.py",
    "--config", config_path,
    "--output-dir", output_dir,
    "--data-root", data_root
]

try:
    # Run with live output streaming
    process = subprocess.Popen(
        cmd, 
        stdout=subprocess.PIPE, 
        stderr=subprocess.STDOUT, 
        text=True, 
        bufsize=1
    )
    
    # Print output line by line
    for line in process.stdout:
        print(line, end='')
    
    process.wait()
    
    if process.returncode == 0:
        print("\nSUCCESS: Experiment completed.")
    else:
        print(f"\nFAILURE: Experiment failed with code {process.returncode}.")

except Exception as e:
    # Reached when the process cannot be started or its output cannot be read
    print(f"\nERROR: Execution failed: {e}")
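
Colab sessions can disconnect mid-run, in which case the streamed output above is lost. One option is to write each line to a log file as it is printed. The sketch below uses the same `Popen` streaming pattern; `run_and_log` is a hypothetical helper, not part of the repository, and the demo command and temporary log path are placeholders for the training `cmd` and a path on Drive.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(cmd, log_path):
    """Run `cmd`, echo each output line, and append it to `log_path`.

    Returns the exit code of the process.
    """
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            text=True,
            bufsize=1,  # line-buffered in text mode
        )
        for line in process.stdout:
            print(line, end="")
            log.write(line)
            log.flush()  # keep the file current if the session drops
        return process.wait()


# Demo with a trivial command; replace with the training `cmd` and a Drive path.
demo_log = Path(tempfile.mkdtemp()) / "train_demo.log"
exit_code = run_and_log([sys.executable, "-c", "print('hello')"], demo_log)
print(f"exit code: {exit_code}")
```

The file is opened in append mode, so re-running the cell adds to the existing log rather than overwriting it.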

## Step 3: Run Benchmark Comparison
Compare the newly trained unnormalized model against the production baseline (which is also unnormalized).

In [None]:
# Path to the new model we just trained
new_model_path = f"{output_dir}/model_best.pt"

# Path to the production model. This notebook assumes it lives in the repo at
# the standard location; point this at Drive instead if you uploaded it there.
prod_model_path = "cherries/cherry_system/cherry_detection/resource/cherry_classification.pt"

print("Running Benchmark...")

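cmd_bench = [
    "python", "cherries/training/scripts/compare_models.py",
    "--data-root", data_root,
    "--prod-model", prod_model_path,
    "--new-model", new_model_path,
    "--architecture", "resnet50",
    "--device", "cuda",  # use "cpu" if no GPU runtime is attached
]

# check=False: report a failed benchmark here instead of raising
result = subprocess.run(cmd_bench, check=False)
if result.returncode != 0:
    print(f"\nFAILURE: Benchmark exited with code {result.returncode}.")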
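
The benchmark command hard-codes the device. A small sketch for choosing it at run time follows, assuming the models are PyTorch checkpoints (the `.pt` extension suggests this, but it is not confirmed here). `pick_device` is a hypothetical helper.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Benchmark device: {device}")
```

Passing `device` as the value of `--device` would let the same cell run on both GPU and CPU runtimes.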