# Fine-tuning a YOLOv11 Model for UN Number Detection

This notebook fine-tunes a YOLOv11 model to detect UN number hazard plates on freight trains. We use the largest model in the YOLOv11 family (`yolov11x`, published by Ultralytics as `yolo11x`) and accept slower inference in exchange for the highest accuracy the family offers, since this is a safety-critical application.

## Overview

We run three experiments with different training strategies to find the best-performing configuration:

1. **Default Experiment**: Baseline training with standard parameters
2. **Early Stopping Experiment**: Training with early stopping to prevent overfitting  
3. **Early Stopping + Lower Learning Rate**: Fine-tuned approach with reduced learning rate for better convergence

Each experiment uses our custom `UNNumberYOLO` class, which provides simplified configuration management and default parameters tuned for hazard plate detection.

## Dataset

- **Source**: ProRail UN number hazard plate dataset
- **Format**: YOLO format with bounding box annotations
- **Location**: `../data/annotations/prorail/yolo/dataset.yaml`
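
As a reminder of what "YOLO format" means for the annotations: each label file holds one line per object, `class_id x_center y_center width height`, with the four box values normalized to the image size. The sketch below converts one such line to pixel corner coordinates. The names `PixelBox` and `parse_yolo_label` are hypothetical helpers for illustration and are not part of `un_detector`.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_label(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert one YOLO label line to pixel corner coordinates."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    # Normalized center x/y and width/height, each in the range [0, 1]
    cx, cy, w, h = (float(v) for v in parts[1:])
    return PixelBox(
        class_id,
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )
```

For a 640×480 image, the line `0 0.5 0.5 0.25 0.5` describes a box of class 0 spanning from (240, 120) to (400, 360).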

## Model Selection

We use **YOLOv11x** (extra-large) for the following reasons:
- Highest accuracy in the YOLOv11 family
- Better performance on small objects (hazard plates can be small in images)
- Suitable for safety-critical applications where precision is paramount
- Acceptable inference speed for our use case

## Setup and Imports

First, we import our custom `UNNumberYOLO` class and set up the paths for the dataset and model outputs. The class wraps YOLO functionality with parameters pre-configured for UN number detection.

In [None]:
import os
import sys
from pathlib import Path

# Make the project's `src` directory importable from the notebook's folder
sys.path.append(str(Path().resolve().parent / "src"))

In [None]:
# Import our custom YOLO class
from un_detector.models.yolo import UNNumberYOLO

# Define paths for dataset and model outputs
dataset_path = os.path.join("..", "data", "annotations", "prorail", "yolo", "dataset.yaml")
model_output_path = os.path.join("..", "outputs", "models", "prorail")

# Create output directory if it doesn't exist
os.makedirs(model_output_path, exist_ok=True)

print(f"📊 Dataset path: {dataset_path}")
print(f"💾 Model output path: {model_output_path}")
print("✅ Setup complete!")

## Training Experiments

We run three experiments to find the best training configuration for UN number detection. Each experiment builds on the previous one: experiment 2 adds early stopping, and experiment 3 additionally lowers the learning rate and extends the patience.

### Experiment Design

| Experiment | Key Features | Purpose |
|------------|--------------|---------|
| **Default** | Standard parameters, 10 epochs | Establish baseline performance |
| **Early Stopping** | Patience=5, up to 100 epochs | Prevent overfitting, find optimal training duration |
| **Early Stop + Lower LR** | Patience=10, reduced learning rate | Fine-tune convergence for better accuracy |

All experiments use:
- **Model**: YOLOv11x (extra-large)
- **Device**: CUDA (GPU acceleration)
- **Dataset**: ProRail UN number annotations
- **Output**: Saved models for comparison and deployment
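
Because the three runs differ only in their configuration name and a few overrides, the design can also be written down as data. This is a sketch rather than code from the project: `Experiment`, `EXPERIMENTS`, and `run_all` are hypothetical names, and the only thing assumed about the model object is the `train(data_path=..., config_path=..., **overrides)` call pattern that the training cells in this notebook use.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class Experiment:
    """One training run: a named config plus keyword overrides (hypothetical)."""
    name: str
    config_name: str
    overrides: dict[str, Any] = field(default_factory=dict)


# The three runs of this notebook, using the config names from the cells.
EXPERIMENTS = [
    Experiment("default", "default_experiment", {"epochs": 10}),
    Experiment("early_stopping", "earlystopping-experiment",
               {"epochs": 100, "patience": 5}),
    Experiment("early_stopping_lower_lr", "early-stopping-lowerlr-experiment",
               {"epochs": 100, "patience": 10}),
]


def run_all(make_model: Callable[[], Any], data_path: str,
            experiments: list[Experiment] = EXPERIMENTS) -> dict[str, Any]:
    """Train a fresh model per experiment and collect the results by name."""
    results = {}
    for exp in experiments:
        model = make_model()  # fresh weights, so runs do not influence each other
        results[exp.name] = model.train(
            data_path=data_path,
            config_path=exp.config_name,
            **exp.overrides,
        )
    return results
```

In this notebook, `make_model` could be `lambda: UNNumberYOLO(model_size="xlarge", device="cuda")`. The cells are kept separate instead so that each run can be inspected and re-run on its own.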

### Experiment 1: Default Configuration

**Objective**: Establish a baseline performance using standard training parameters.

**Configuration Details**:
- **Epochs**: 10 (short training for initial assessment)
- **Learning Rate**: Default YOLO settings
- **Augmentation**: Standard augmentation pipeline
- **Early Stopping**: Disabled (fixed epoch count)

**Purpose**: 
- Validate the training pipeline works correctly
- Get initial performance metrics
- Establish baseline for comparison with optimized experiments
- Quick iteration to identify potential issues

This experiment uses the `default_experiment` configuration, which holds general-purpose parameters intended to suit most object detection tasks.

In [None]:
# Initialize YOLOv11x model for default experiment
yolo_default_experiment = UNNumberYOLO(model_size="xlarge", device="cuda")
print("🔧 Model initialized for default experiment:")
yolo_default_experiment

In [None]:
# Load default experiment configuration
default_config_name = "default_experiment"
print(f"📋 Loading configuration: {default_config_name}")

# Display available configurations
print(f"📁 Available configurations: {UNNumberYOLO.list_configs()}")

In [None]:
# Preview the default configuration parameters
default_config = UNNumberYOLO.load_config(default_config_name)
print("⚙️ Default configuration parameters:")
for key, value in default_config.items():
    print(f"   {key}: {value}")
print(f"\n📊 Total parameters: {len(default_config)}")
default_config

In [None]:
# Start training with default configuration
# Override epochs to 10 for this baseline experiment
print("🏋️ Starting Default Experiment Training...")
print("=" * 50)

results = yolo_default_experiment.train(
    data_path=dataset_path, 
    config_path=default_config_name, 
    epochs=10
)

print("✅ Default experiment training completed!")
print(f"📈 Results type: {type(results)}")

In [None]:
# Save the trained model
model_save_path = os.path.join(model_output_path, "yolo_default_experiment.pt")
print(f"💾 Saving default experiment model to: {model_save_path}")

yolo_default_experiment.save(model_save_path)

print("✅ Default experiment model saved successfully!")
print(f"📁 File size: {os.path.getsize(model_save_path) / (1024*1024):.1f} MB")

### Experiment 2: Early Stopping Strategy

**Objective**: Prevent overfitting and find the optimal training duration automatically.

**Configuration Details**:
- **Max Epochs**: 100 (upper limit)
- **Patience**: 5 epochs (stops if no improvement for 5 consecutive epochs)
- **Early Stopping**: Enabled based on validation metrics
- **Monitoring**: Validation metrics (Ultralytics bases early stopping on a fitness score derived from validation mAP)

**Key Advantages**:
- **Prevents Overfitting**: Stops training when model performance plateaus
- **Automatic Duration**: No need to guess optimal epoch count
- **Resource Efficient**: Avoids unnecessary training time
- **Better Generalization**: Training ends `patience` epochs after validation performance last improved, before the model drifts far past its peak

**Expected Outcome**:
The model should achieve better validation performance than the default experiment because it trains until validation performance plateaus rather than for a fixed 10 epochs.
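
To make the role of `patience` concrete, here is a minimal sketch of patience-based early stopping in plain Python. It is not the Ultralytics implementation: `early_stopping_epoch` is a hypothetical helper, and `example_scores` is an invented sequence of per-epoch validation scores used only for illustration, not measured output from this notebook.

```python
def early_stopping_epoch(scores: list[float], patience: int) -> tuple[int, int]:
    """Return (last_epoch_run, best_epoch) for a sequence of validation scores.

    Training stops once `patience` epochs have passed without a new best score.
    Epochs are numbered from 0.
    """
    best_score = float("-inf")
    best_epoch = -1
    for epoch, score in enumerate(scores):
        if score > best_score:
            # New best: remember it and reset the waiting period
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here
            return epoch, best_epoch
    # Ran through every epoch without triggering early stopping
    return len(scores) - 1, best_epoch


# Illustrative scores only: a peak at epoch 2, then a late improvement at epoch 8.
example_scores = [0.1, 0.3, 0.5, 0.4, 0.45, 0.49, 0.48, 0.47, 0.6]
```

With these scores, `patience=5` stops at epoch 7 with the best epoch at 2, while `patience=10` runs to the end and picks up the late improvement at epoch 8. A short patience saves time but can miss late gains. Note also that training stops `patience` epochs after the peak, not at the peak itself, so the best checkpoint is not the final one.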

In [None]:
# Initialize fresh YOLOv11x model for early stopping experiment
yolo_early_stopping = UNNumberYOLO(model_size="xlarge", device="cuda")
print("🔧 Model initialized for early stopping experiment:")
yolo_early_stopping

In [None]:
# Load early stopping experiment configuration
early_stopping_config_name = "earlystopping-experiment"
print(f"📋 Configuration for experiment 2: {early_stopping_config_name}")
print("⏱️ This config includes early stopping parameters for optimal training duration")

In [None]:
# Preview early stopping configuration
early_stopping_config = UNNumberYOLO.load_config(early_stopping_config_name)
print("⚙️ Early stopping configuration parameters:")

# Show key differences from default config
key_params = ['epochs', 'patience', 'lr0', 'batch']
for param in key_params:
    if param in early_stopping_config:
        print(f"   {param}: {early_stopping_config[param]}")

print(f"\n📊 Total parameters: {len(early_stopping_config)}")
early_stopping_config

In [None]:
# Start training with early stopping
print("🏋️ Starting Early Stopping Experiment...")
print("=" * 50)
print("⏱️ Training will stop automatically when validation performance plateaus")
print("📊 Max epochs: 100, Patience: 5")

results_early_stop = yolo_early_stopping.train(
    data_path=dataset_path, 
    config_path=early_stopping_config_name, 
    patience=5,    # Stop if no improvement for 5 epochs
    epochs=100     # Maximum epochs allowed
)

print("✅ Early stopping experiment completed!")
print("📊 Training ended (early stopping triggers only if validation stalls for 5 epochs)")

In [None]:
# Save the early stopping experiment model
model_save_path_2 = os.path.join(model_output_path, "yolo_early_stopping_experiment.pt")
print(f"💾 Saving early stopping experiment model to: {model_save_path_2}")

yolo_early_stopping.save(model_save_path_2)

print("✅ Early stopping experiment model saved successfully!")
print(f"📁 File size: {os.path.getsize(model_save_path_2) / (1024*1024):.1f} MB")

### Experiment 3: Early Stopping + Lower Learning Rate

**Objective**: Improve on the accuracy of experiment 2 by lowering the learning rate and extending the patience.

**Configuration Details**:
- **Max Epochs**: 100 (same as experiment 2)
- **Patience**: 10 epochs (increased patience for more thorough training)
- **Learning Rate**: Reduced for finer convergence
- **Early Stopping**: Enabled with extended patience

**Key Improvements**:
- **Lower Learning Rate**: Smaller steps for more precise weight updates
- **Extended Patience**: Allows more time to find optimal weights
- **Fine-tuned Convergence**: Better final accuracy through careful parameter adjustment
- **Stability**: Reduced risk of overshooting optimal weights

**Hypothesis**:
The combination of lower learning rate and extended patience should produce:
1. More stable training curves
2. Higher final validation accuracy
3. Better model generalization
4. Optimal performance for deployment

This is the most heavily tuned of the three configurations and builds on the early stopping setup from experiment 2.
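
The effect of a lower initial learning rate can be sketched with a simple schedule. The function below assumes a linear decay from `lr0` to `lr0 * lrf` over the run, which is the shape Ultralytics uses when cosine scheduling is not enabled; warmup is ignored. `linear_lr` is a hypothetical helper, and the values in the usage note are illustrative, not the values stored in the experiment configs.

```python
def linear_lr(epoch: int, epochs: int, lr0: float, lrf: float) -> float:
    """Learning rate at `epoch` for a linear decay from lr0 to lr0 * lrf.

    lr0: initial learning rate
    lrf: final learning rate as a fraction of lr0
    """
    # Fraction of the run still remaining, clamped at 0 after the last epoch
    remaining = max(1.0 - epoch / epochs, 0.0)
    return lr0 * (remaining * (1.0 - lrf) + lrf)
```

With `lrf=0.01`, the rate halfway through a 100-epoch run is 0.505 times `lr0`. Lowering `lr0` from 0.01 to 0.001 therefore scales every step of the schedule down by a factor of ten, which is what makes the weight updates smaller and the convergence slower but steadier.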

In [None]:
# Initialize fresh YOLOv11x model for optimized experiment
yolo_early_stopping_lower_lr = UNNumberYOLO(model_size="xlarge", device="cuda")
print("🔧 Model initialized for early stopping + lower learning rate experiment:")
print("🎯 This is our most optimized configuration for maximum accuracy")
yolo_early_stopping_lower_lr

In [None]:
# Load optimized experiment configuration  
early_stopping_lower_lr_config_name = "early-stopping-lowerlr-experiment"
print(f"📋 Configuration for final experiment: {early_stopping_lower_lr_config_name}")
print("🎯 This config combines early stopping with reduced learning rate for optimal results")

In [None]:
# Preview optimized configuration parameters
early_stopping_lower_lr_config = UNNumberYOLO.load_config(early_stopping_lower_lr_config_name)
print("⚙️ Optimized configuration parameters:")

# Highlight key optimization parameters
key_params = ['epochs', 'patience', 'lr0', 'lrf', 'batch', 'warmup_epochs']
for param in key_params:
    if param in early_stopping_lower_lr_config:
        print(f"   {param}: {early_stopping_lower_lr_config[param]}")

print(f"\n📊 Total parameters: {len(early_stopping_lower_lr_config)}")
print("🔍 Key optimizations: Lower learning rates and extended patience")
early_stopping_lower_lr_config

In [None]:
# Start optimized training with early stopping + lower learning rate
print("🏋️ Starting Optimized Experiment (Early Stopping + Lower LR)...")
print("=" * 60)
print("🎯 This is our most sophisticated training approach")
print("⏱️ Extended patience (10 epochs) for thorough optimization")
print("📈 Lower learning rate for precise convergence")

results_optimized = yolo_early_stopping_lower_lr.train(
    data_path=dataset_path, 
    config_path=early_stopping_lower_lr_config_name, 
    patience=10,   # Extended patience for better optimization
    epochs=100     # Maximum epochs allowed
)

print("✅ Optimized experiment completed!")
print("🎯 This should be our best-performing model")
print(f"📊 Results: {type(results_optimized)}")

In [None]:
# Save the optimized experiment model
model_save_path_3 = os.path.join(model_output_path, "yolo_early_stopping_lower_lr_experiment.pt")
print(f"💾 Saving optimized experiment model to: {model_save_path_3}")

yolo_early_stopping_lower_lr.save(model_save_path_3)

print("✅ Optimized experiment model saved successfully!")
print(f"📁 File size: {os.path.getsize(model_save_path_3) / (1024*1024):.1f} MB")

# Summary of all experiments
print("\n" + "="*60)
print("🎉 ALL EXPERIMENTS COMPLETED!")
print("="*60)
print("📊 Three models trained:")
print(f"   1️⃣ Default: {os.path.basename(model_save_path)}")
print(f"   2️⃣ Early Stopping: {os.path.basename(model_save_path_2)}")  
print(f"   3️⃣ Optimized: {os.path.basename(model_save_path_3)}")
print("\n🔍 Next steps: Compare model performance and select best for deployment")

## Experiment Summary and Next Steps

### Results Overview

All three training experiments have now been run; their metrics still need to be compared:

| Experiment | Configuration | Key Features | Expected Outcome |
|------------|---------------|--------------|------------------|
| **Default** | `default_experiment` | Standard parameters, 10 epochs | Quick baseline performance |
| **Early Stopping** | `earlystopping-experiment` | Patience=5, max 100 epochs | Prevent overfitting |
| **Early Stop + Lower LR** | `early-stopping-lowerlr-experiment` | Patience=10, lower LR | Maximum accuracy |

### Model Comparison Framework

To determine the best model, evaluate each on:

1. **Validation Metrics**:
   - mAP@0.5 (primary metric)
   - mAP@0.5:0.95 (mean AP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05)
   - Precision and Recall
   - Training/Validation loss curves

2. **Practical Performance**:
   - Inference speed
   - Memory usage  
   - Real-world test accuracy

3. **Training Efficiency**:
   - Total training time
   - Convergence stability
   - Resource utilization
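
The thresholds in mAP@0.5 and mAP@0.5:0.95 refer to intersection over union (IoU): a predicted plate counts as a correct detection only if its IoU with a ground-truth box reaches the threshold. The sketch below computes IoU for two boxes given as `(x_min, y_min, x_max, y_max)`; the function name `iou` is a hypothetical helper for illustration.

```python
def iou(a: tuple[float, float, float, float],
        b: tuple[float, float, float, float]) -> float:
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Width and height of the overlap rectangle, zero if the boxes are disjoint
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two 10×10 boxes shifted by half their width overlap with an IoU of 1/3, which would not count as a match at the 0.5 threshold. For small objects such as hazard plates, a shift of a few pixels changes IoU noticeably, which is why the stricter mAP@0.5:0.95 is worth reporting alongside mAP@0.5.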

### Next Steps

1. **Evaluate Models**: Evaluate each saved model on the held-out test set
2. **Compare Performance**: Analyze metrics and select best model
3. **Deploy Selected Model**: Use best performer for production
4. **Document Results**: Record findings for future reference
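
The first two steps could look roughly like the sketch below. It rests on several assumptions that the notebook does not confirm: that the `ultralytics` package is installed, that checkpoints written by `UNNumberYOLO.save` can be loaded with `ultralytics.YOLO`, and that `dataset.yaml` defines a `test` split. `evaluate_checkpoints` and its `load_model` parameter are hypothetical names; `load_model` exists so that the function can be exercised without a GPU or real weights.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def evaluate_checkpoints(checkpoints: list[str], data_yaml: str,
                         split: str = "test",
                         load_model: Optional[Callable[[str], Any]] = None
                         ) -> list[dict[str, Any]]:
    """Validate each checkpoint and return one metrics row per model,
    sorted by mAP@0.5 (the primary metric), best first."""
    if load_model is None:
        # Imported lazily so the function can be defined without ultralytics
        from ultralytics import YOLO
        load_model = YOLO

    rows = []
    for path in checkpoints:
        metrics = load_model(str(path)).val(data=data_yaml, split=split)
        rows.append({
            "model": Path(path).name,
            "mAP50": float(metrics.box.map50),    # mAP at IoU 0.5
            "mAP50-95": float(metrics.box.map),   # mAP averaged over IoU 0.5 to 0.95
            "precision": float(metrics.box.mp),   # mean precision over classes
            "recall": float(metrics.box.mr),      # mean recall over classes
        })
    return sorted(rows, key=lambda row: row["mAP50"], reverse=True)
```

Called with the three paths saved earlier (`model_save_path`, `model_save_path_2`, `model_save_path_3`) and `dataset_path`, this would return a ranked list that can be printed or turned into a table for the comparison step.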

### Generated Models

All trained models are saved in: `../outputs/models/prorail/`
- `yolo_default_experiment.pt`
- `yolo_early_stopping_experiment.pt` 
- `yolo_early_stopping_lower_lr_experiment.pt`