# ML Image Grading - Kaggle Training Notebook

This notebook trains the image quality scoring model on Kaggle using the Adobe FiveK dataset.

## Setup Instructions

1. Upload this notebook to Kaggle
2. Add the Adobe FiveK dataset to the notebook
3. Enable GPU accelerator (Settings > Accelerator > GPU)
4. Run all cells
5. Download the trained model from the output section

## Dataset

- Adobe FiveK: https://data.csail.mit.edu/graphics/fivek/
- Or search for "Adobe FiveK" on Kaggle Datasets
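
Once the dataset is attached, its folder name under `/kaggle/input` depends on the uploader's slug. The sketch below lists candidate folders; the helper name `find_dataset_dirs` and the `"fivek"` keyword are assumptions for illustration, not part of this project.

```python
from pathlib import Path


def find_dataset_dirs(input_root, keyword="fivek"):
    """Return sorted top-level directories under input_root whose name contains keyword.

    The keyword is an assumption; Kaggle dataset slugs vary by uploader.
    """
    root = Path(input_root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and keyword.lower() in p.name.lower()
    )


# Prints nothing when no matching folder is attached (or when run outside Kaggle).
for path in find_dataset_dirs("/kaggle/input"):
    print(path)
```

Any path it prints is a candidate value for the dataset path configured in section 4.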

## 1. Install Dependencies

In [None]:
# Install additional dependencies if needed
# Most packages (numpy, opencv, tensorflow) are pre-installed on Kaggle
!pip install imageio -q

## 2. Upload Project Files

Upload the `src/` directory from this project, or clone from GitHub:

In [None]:
# Option 1: Clone from GitHub (replace with your repo URL)
# !git clone https://github.com/yourusername/ML-Image_grading.git
# import sys
# sys.path.append('/kaggle/working/ML-Image_grading/src')

# Option 2: Upload files to Kaggle Dataset and add it to the notebook
import sys
sys.path.append('/kaggle/input/ml-image-grading/src')  # Adjust path as needed

print("Setup complete!")
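
A wrong path here only surfaces later as an `ImportError` in the import cell. As a hedged alternative, the sketch below appends the first directory that actually exists; `add_first_existing` is a hypothetical helper, and the two candidate paths are the ones from the options above.

```python
import sys
from pathlib import Path


def add_first_existing(candidates):
    """Append the first existing directory in candidates to sys.path and return it.

    Returns None when no candidate exists, so the caller can fail early
    instead of hitting an ImportError in a later cell.
    """
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            if str(path) not in sys.path:
                sys.path.append(str(path))
            return path
    return None


src_dir = add_first_existing([
    "/kaggle/working/ML-Image_grading/src",  # Option 1: cloned repository
    "/kaggle/input/ml-image-grading/src",    # Option 2: uploaded Kaggle dataset
])
print(f"Project source: {src_dir}" if src_dir else "No src/ directory found; adjust the paths")
```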

## 3. Import Libraries

In [None]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from pathlib import Path
import os

# Import project modules
from image_loader import CR2ImageLoader
from feature_extractor import ImageFeatureExtractor
from scoring_model import ImageScoringModel
from train_model import ModelTrainer

print(f"TensorFlow version: {tf.__version__}")
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs detected: {gpus if gpus else 'none (enable one under Settings > Accelerator)'}")

## 4. Configure Dataset Path

Update this path to point to your Adobe FiveK dataset location in Kaggle:

In [None]:
# Adjust this path based on where you uploaded/added the Adobe FiveK dataset
DATASET_PATH = '/kaggle/input/adobe-fivek'  # Update this path
OUTPUT_PATH = '/kaggle/working/models'
MODEL_NAME = 'image_quality_scorer.h5'

# Create output directory
os.makedirs(OUTPUT_PATH, exist_ok=True)

print(f"Dataset path: {DATASET_PATH}")
print(f"Output path: {OUTPUT_PATH}")
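
Before training, it can help to confirm that the dataset path contains the files you expect. The following is a minimal sketch that counts files by extension; `count_by_extension` is a hypothetical helper, and the path is the default used above, which may differ in your notebook.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(dataset_path):
    """Count files under dataset_path (recursively) by lowercase extension."""
    root = Path(dataset_path)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {root}")
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


try:
    # Same default as DATASET_PATH above; adjust if your dataset lives elsewhere.
    for ext, n in count_by_extension("/kaggle/input/adobe-fivek").most_common():
        print(f"{ext or '(no extension)'}: {n}")
except FileNotFoundError as err:
    print(err)
```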

## 5. Initialize Components

In [None]:
# Initialize components
image_loader = CR2ImageLoader(target_size=(512, 512))
feature_extractor = ImageFeatureExtractor()
model = ImageScoringModel(feature_dim=30, input_shape=(512, 512, 3))

# Create trainer
trainer = ModelTrainer(model, feature_extractor, image_loader)

print("✓ Components initialized")
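
The loader's `target_size`, the model's `input_shape`, and `feature_dim` have to agree with each other. The sketch below checks that with plain Python; `check_config` is a hypothetical helper, and it assumes the feature vector supports `len()`, which this notebook does not confirm.

```python
def check_config(target_size, input_shape, feature_len, feature_dim):
    """Return a list of human-readable mismatches (empty when consistent)."""
    problems = []
    if tuple(input_shape[:2]) != tuple(target_size):
        problems.append(
            f"loader target_size {tuple(target_size)} != model input {tuple(input_shape[:2])}"
        )
    if len(input_shape) != 3 or input_shape[2] != 3:
        problems.append(f"expected a 3-channel input, got shape {tuple(input_shape)}")
    if feature_len != feature_dim:
        problems.append(f"feature vector length {feature_len} != feature_dim {feature_dim}")
    return problems


# Values from the cell above. In practice feature_len would come from
# len(feature_extractor.get_feature_vector(image)) once an image is loaded.
problems = check_config((512, 512), (512, 512, 3), feature_len=30, feature_dim=30)
print(problems or "Configuration is consistent")
```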

## 6. Train the Model

In [None]:
# Training configuration
EXPERT = 'c'  # Expert C is most commonly used
NUM_SAMPLES = 1000  # Use None for all samples (takes longer)
EPOCHS = 30
BATCH_SIZE = 16

print("Training Configuration:")
print(f"  Expert: {EXPERT}")
print(f"  Samples: {NUM_SAMPLES if NUM_SAMPLES is not None else 'All'}")
print(f"  Epochs: {EPOCHS}")
print(f"  Batch Size: {BATCH_SIZE}")
print("\nStarting training...\n")

# Train the model
history = trainer.train_on_adobe_fivek(
    dataset_path=DATASET_PATH,
    expert=EXPERT,
    num_samples=NUM_SAMPLES,
    epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    validation_split=0.2
)

print("\n✓ Training complete!")
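
For a rough sense of what the configuration above implies, the sketch below works out the train/validation sizes and batches per epoch. It assumes the validation count is `num_samples * validation_split` truncated to an integer; how `train_on_adobe_fivek` actually splits the data is not shown in this notebook, and `split_sizes` is a hypothetical helper.

```python
import math


def split_sizes(num_samples, validation_split, batch_size):
    """Estimate split sizes and batches per epoch (assumed truncating split)."""
    n_val = int(num_samples * validation_split)
    n_train = num_samples - n_val
    return {
        "train": n_train,
        "val": n_val,
        "train_steps": math.ceil(n_train / batch_size),
        "val_steps": math.ceil(n_val / batch_size),
    }


# NUM_SAMPLES=1000, validation_split=0.2, BATCH_SIZE=16 from the cell above
print(split_sizes(1000, 0.2, 16))
# -> {'train': 800, 'val': 200, 'train_steps': 50, 'val_steps': 13}
```

Under that assumption, each of the 30 epochs processes 50 training batches.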

## 7. Visualize Training History

In [None]:
# Plot training history
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

metrics = [
    ('overall_score_loss', 'Overall Score Loss'),
    ('composition_score_loss', 'Composition Score Loss'),
    ('color_score_loss', 'Color Score Loss'),
    ('technical_score_loss', 'Technical Score Loss')
]

for idx, (metric, title) in enumerate(metrics):
    ax = axes[idx // 2, idx % 2]

    if metric in history:
        ax.plot(history[metric], label='Training')
        if f'val_{metric}' in history:
            ax.plot(history[f'val_{metric}'], label='Validation')
        ax.set_title(title)
        ax.set_xlabel('Epoch')
        ax.set_ylabel('Loss')
        ax.legend()
        ax.grid(True)
    else:
        # Hide the panel rather than leave an empty axis when a metric was not logged
        ax.set_visible(False)

plt.tight_layout()
plt.savefig(os.path.join(OUTPUT_PATH, 'training_history.png'), dpi=150)
plt.show()

print("✓ Training history plotted")
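
The curves can also be summarized numerically. The sketch below finds the epoch with the lowest validation loss, assuming `history` is a dict of per-epoch lists as the plotting code implies. `best_epoch` is a hypothetical helper, and the numbers in the usage line are toy values, not real training output.

```python
def best_epoch(history, metric="val_overall_score_loss"):
    """Return (1-based epoch, value) with the lowest metric, or None if it was not logged."""
    values = history.get(metric)
    if not values:
        return None
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Toy values for illustration only
toy_history = {"val_overall_score_loss": [0.5, 0.3, 0.4]}
print(best_epoch(toy_history))  # -> (2, 0.3)
```

A best epoch well before the final one, with validation loss rising afterwards, usually suggests overfitting and may justify fewer epochs.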

## 8. Save the Model

In [None]:
# Save model
model_path = os.path.join(OUTPUT_PATH, MODEL_NAME)
model.save_model(model_path)

print(f"✓ Model saved to: {model_path}")
print(f"  File size: {os.path.getsize(model_path) / 1024 / 1024:.2f} MB")

# Save training history
import json
history_path = os.path.join(OUTPUT_PATH, 'training_history.json')
with open(history_path, 'w') as f:
    # Cast to built-in floats: history values may be NumPy scalars (e.g. float32),
    # which the json module cannot serialize
    history_serializable = {k: [float(v) for v in vals] for k, vals in history.items()}
    json.dump(history_serializable, f, indent=2)

print(f"✓ Training history saved to: {history_path}")

## 9. Model Summary

In [None]:
print("Model Architecture:")
print("=" * 70)
print(model.get_model_summary())

## 10. Download Instructions

**To use this model on your Mac:**

1. Download the trained model file from the output section (right side panel)
2. Look for: `models/image_quality_scorer.h5`
3. Place it in your local `models/` directory
4. Run the pipeline with: `python src/pipeline.py <image.CR2> --model models/image_quality_scorer.h5`

The model is now ready to use locally!
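
Large model files occasionally get truncated during download. One way to check, sketched here with a hypothetical `sha256_of` helper, is to print a SHA-256 checksum in the notebook and compare it with the checksum of the downloaded file. The path below assumes the default output path and model name from section 4.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


model_path = "/kaggle/working/models/image_quality_scorer.h5"
if os.path.exists(model_path):
    print(sha256_of(model_path))
```

On the Mac, `shasum -a 256 models/image_quality_scorer.h5` should print the same digest.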

## 11. Test the Model (Optional)

In [None]:
# Test on a sample image
from dataset_loader import AdobeFiveKLoader

# Load a test image
loader = AdobeFiveKLoader(DATASET_PATH, expert=EXPERT)
test_pairs = loader.load_image_pairs(limit=1)

if test_pairs:
    original, edited, filename = test_pairs[0]
    
    # Extract features
    features = feature_extractor.get_feature_vector(original)
    
    # Get scores
    scores = model.predict_score(original, features)
    
    # Display results
    print(f"Test Image: {filename}")
    print("\nPredicted Scores:")
    for key, value in scores.items():
        print(f"  {key}: {value * 100:.2f}/100")
    
    # Show image
    plt.figure(figsize=(12, 6))
    plt.subplot(1, 2, 1)
    plt.imshow(original)
    plt.title('Original')
    plt.axis('off')
    
    plt.subplot(1, 2, 2)
    plt.imshow(edited)
    plt.title('Expert Edit')
    plt.axis('off')
    
    plt.tight_layout()
    plt.show()
else:
    print("No test images available")
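
The `value * 100` formatting above assumes each score is a scalar in the range 0 to 1. If that is not guaranteed, a defensive variant could convert and clamp before printing. This is a sketch only: `to_percent` and `format_scores` are hypothetical helpers, and the score names and values are toy inputs rather than output from the trained model.

```python
def to_percent(value):
    """Convert a score assumed to lie in [0, 1] to a 0-100 value, clamping outliers."""
    score = float(value)
    return max(0.0, min(1.0, score)) * 100.0


def format_scores(scores):
    """Format a {name: score} dict as 'name: NN.NN/100' lines."""
    return [f"{name}: {to_percent(value):.2f}/100" for name, value in scores.items()]


# Toy values for illustration only
for line in format_scores({"overall_score": 0.5, "color_score": 1.2}):
    print(line)
```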