# YOLOv11 Landmark Detection Training Framework

## Project Overview
This notebook implements a comprehensive training framework for Singapore landmark detection using the YOLOv11 architecture. The implementation incorporates advanced deep learning techniques including teacher-student architecture, knowledge distillation, and multi-scale training optimization specifically designed for landmark detection tasks.

### Architecture Components
- **Teacher-Student Architecture**: YOLOv11m (teacher) and YOLOv11n (student) models providing a balance between accuracy and computational efficiency
- **Knowledge Distillation**: Advanced pseudo-labeling and distillation techniques for effective knowledge transfer from teacher to student
- **Performance Optimization**: Multi-scale training, enhanced augmentation strategies, and deployment optimization for various hardware configurations
- **Comprehensive Evaluation**: Detailed metrics analysis, benchmarking, and comparative model assessment

## Dataset Specifications
- **Target Classes**: 4 Singapore landmarks (ArtScience Museum, Esplanade, Marina Bay Sands, Merlion)
- **Image Dataset**: Over 1400 class-balanced images, augmented offline by the preprocessing pipeline
- **Data Format**: YOLO format with normalized bounding box annotations
- **Dataset Structure**: Pre-processed balanced dataset with static augmentations for consistent training
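
To make the annotation format concrete, here is a minimal sketch of how a pixel-space box maps to a YOLO label line (`class x_center y_center width height`, all normalized to the image size). The helper name `to_yolo_line`, the example image size, and the class index 3 for the Merlion (which assumes class indices follow the order listed above) are illustrative assumptions, not values taken from the preprocessing notebook.

```python
def to_yolo_line(class_id, box_xyxy, img_w, img_h):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box to a YOLO label line."""
    x_min, y_min, x_max, y_max = box_xyxy
    x_center = (x_min + x_max) / 2 / img_w   # box centre, normalized to [0, 1]
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w          # box size, normalized to [0, 1]
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a centred box in a 1280x720 image, class index 3 assumed.
line = to_yolo_line(3, (320, 180, 960, 540), 1280, 720)
print(line)  # 3 0.500000 0.500000 0.500000 0.500000
```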

## Training Strategy
The training approach employs a multi-phase methodology:

1. **Teacher Model (YOLOv11m)**: Primary high-accuracy model for maximum performance benchmarking
2. **Student Model (YOLOv11n)**: Compact model optimized for mobile and edge deployment scenarios  
3. **Knowledge Distillation**: Systematic knowledge transfer from teacher to student maintaining performance while reducing computational requirements
4. **Multi-Format Export**: Comprehensive model export including ONNX, TensorFlow Lite, CoreML, and OpenVINO formats for deployment flexibility
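
This section does not spell out the distillation loss, so the following is only a sketch of the classic softened-logit formulation (KL divergence between temperature-scaled teacher and student class distributions, scaled by T²). The function names, the temperature of 2.0, and the example logits are assumptions for illustration; the loss actually used for YOLOv11 distillation here may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened class distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical class logits for one detection over the 4 landmark classes.
teacher = [4.0, 1.0, 0.5, 0.2]
student = [2.5, 1.2, 0.8, 0.4]
loss = distillation_loss(teacher, student)
same = distillation_loss(teacher, teacher)
print(f"distillation loss: {loss:.4f}, identical logits: {same:.4f}")
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the property a distillation term needs.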

## Training Configuration
- **Hardware**: NVIDIA GeForce RTX 4090; PyTorch reports 25.76 GB of total GPU memory (24564 MiB in the Ultralytics log)
- **Optimizer**: AdamW optimizer with cosine learning rate scheduling for stable convergence
- **Regularization**: Weight decay and conservative augmentation; dropout and label smoothing are exposed in the configurations but set to 0.0 in the runs below
- **Training Stability**: Batch sizes of 16 to 20 with early-stopping patience of 10 to 25 epochs, depending on the run
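
The cosine schedule can be sketched as below. This is a simplified illustration that decays the learning rate from `lr0` to `lr0 * lrf` and ignores warmup; it follows the shape I understand `cos_lr=True` to produce, but the exact Ultralytics implementation should be treated as an assumption here. The values `lr0=0.01`, `lrf=0.1`, and 100 epochs are taken from the first training run in this notebook.

```python
import math


def cosine_lr(epoch, total_epochs, lr0, lrf):
    """Cosine decay from lr0 at epoch 0 down to lr0 * lrf at total_epochs."""
    progress = (1 - math.cos(math.pi * epoch / total_epochs)) / 2  # goes 0 -> 1
    return lr0 * (1 + progress * (lrf - 1))


# Learning rate at the start, midpoint, and end of a 100-epoch run.
schedule = [cosine_lr(e, 100, lr0=0.01, lrf=0.1) for e in (0, 50, 100)]
for epoch, lr in zip((0, 50, 100), schedule):
    print(f"epoch {epoch:3d}: lr = {lr:.5f}")
```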

## Performance Objectives
- **Teacher Model Performance**: Target metric of >78% mAP@50-95 for high-accuracy deployment scenarios
- **Student Model Performance**: Target metric of >65% mAP@50-95 with 4-8x model compression ratio
- **Deployment Optimization**: Models optimized for both server-side and mobile deployment environments
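
For reference, mAP@50-95 is the mean of the average precision measured at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows that arithmetic only; the per-threshold AP values are hypothetical (already averaged over classes) and are not results from this notebook.

```python
def iou_thresholds():
    """The ten IoU thresholds used by mAP@50-95: 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def map_50_95(ap_per_threshold):
    """Average the class-mean AP values, one per IoU threshold."""
    if len(ap_per_threshold) != len(iou_thresholds()):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(ap_per_threshold)


# Hypothetical AP values: high at loose thresholds, dropping at strict ones.
ap_values = [0.97, 0.96, 0.95, 0.93, 0.90, 0.86, 0.80, 0.70, 0.52, 0.21]
score = map_50_95(ap_values)
print(f"mAP@50-95 = {score:.4f}")
```

Because the strict thresholds pull the average down sharply, a model can score well above 0.9 at mAP@50 and still fall short of the 0.78 target.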

---
## Environment Setup and Dependencies

This section configures the complete training environment and imports all necessary libraries for the landmark detection training pipeline. The setup includes computer vision libraries, deep learning frameworks, data processing utilities, and visualization tools required for comprehensive model training and evaluation.
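
Before running the import cell, it can help to record which package versions are installed. The sketch below uses only the standard library; the distribution names in `REQUIRED` are my assumption of how these packages are usually published (for example, OpenCV may be installed as `opencv-python-headless` instead), so adjust the list to match your environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed distribution names for the libraries imported in this notebook.
REQUIRED = ["ultralytics", "torch", "opencv-python", "numpy", "matplotlib",
            "seaborn", "PyYAML", "albumentations", "Pillow"]

report = installed_versions(REQUIRED)
for name, version in report.items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
```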

In [4]:
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
from ultralytics import YOLO
import shutil
import yaml
from collections import Counter
import albumentations as A
from PIL import Image
import random
import json
from typing import List, Tuple, Dict
import warnings
warnings.filterwarnings('ignore')

print("[SUCCESS] Libraries imported successfully!")


[SUCCESS] Libraries imported successfully!


---
## Optimized YOLOv11m Teacher Model Training

### Model Selection Rationale

**YOLOv11m Architecture Selection**:
- **Parameter Efficiency**: 22M parameters providing optimal balance for landmark detection tasks
- **Training Performance**: An expected 1-1.5 minutes per epoch on this hardware, roughly 2-3x faster than YOLOv11x (3+ minutes per epoch)
- **Convergence Stability**: Enhanced training stability and more reliable convergence patterns
- **Accuracy Potential**: Superior accuracy capabilities compared to smaller YOLOv11s variant while maintaining computational efficiency

YOLOv11m is selected as the balance point between computational efficiency and detection accuracy for this task, which makes it a suitable teacher model for knowledge distillation.

In [9]:
# OPTIMIZED YOLOv11m Training - Perfect Balance for Landmark Detection
from ultralytics import YOLO
import torch
from pathlib import Path
import numpy as np
import time

print("="*80)
print("OPTIMIZED YOLOv11m TRAINING - PERFECT BALANCE FOR LANDMARK DETECTION")
print("="*80)

# Paths - Using the balanced dataset from YOLOv11_data notebook
DATASET_PATH = Path(r"D:\SIT\AAI3001 Computer Vision\Project\monuai_model\Project2_YOLO_wimages")
YOLO_DATASET_DIR = DATASET_PATH / "balanced_yolo_dataset"
yaml_path = YOLO_DATASET_DIR / "dataset.yaml"

# Verify dataset.yaml exists
if not yaml_path.exists():
    raise FileNotFoundError(f"dataset.yaml not found at {yaml_path}. Run YOLOv11_data notebook first.")

# Device configuration
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Training device: {device}")
if device == 'cuda':
    try:
        print(f"GPU: {torch.cuda.get_device_name(0)}")
        print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    except Exception:
        pass

# OPTIMIZED Configuration - YOLOv11m Sweet Spot
print("\n[OPTIMIZED] YOLOv11m FOR LANDMARK DETECTION")
print("="*50)

# PERFECT-SIZED hyperparameters for YOLOv11m
EPOCHS = 100          # Efficient training length
IMG_SIZE = 640        # Standard input size
BATCH_SIZE = 18       # Increased batch size for better speed
LR0 = 0.01           # Higher LR for faster convergence
OPTIM = 'AdamW'      # AdamW optimizer
WEIGHT_DECAY = 0.0005 # Light regularization
PATIENCE = 10        # Good patience for convergence

# OPTIMIZED: Balanced Augmentation for YOLOv11m
OPTIMIZED_CONFIG = {
    # Learning Rate Strategy
    'lr0': LR0,                    # Higher LR for faster training
    'lrf': 0.1,                    # Good final LR
    'momentum': 0.937,             # Standard momentum
    'weight_decay': WEIGHT_DECAY,  # Light regularization
    
    # Warmup Strategy
    'warmup_epochs': 3.0,          # Short warmup
    'warmup_momentum': 0.8,        # Good warmup momentum
    'warmup_bias_lr': 0.1,         # Higher warmup bias LR
    
    # Loss Functions
    'box': 7.5,                    # Balanced box loss
    'cls': 0.5,                    # Moderate classification loss
    'dfl': 1.5,                    # Moderate DFL
    
    # BALANCED Dynamic Augmentation (not excessive)
    'hsv_h': 0.015,               # Light HSV hue variation
    'hsv_s': 0.1,                 # Moderate saturation
    'hsv_v': 0.1,                 # Moderate value
    'degrees': 0.0,               # NO rotation (preserve landmarks)
    'translate': 0.05,            # Small translation
    'scale': 0.1,                 # Small scale variation
    'shear': 0.0,                 # NO shear (preserve shape)
    'perspective': 0.0,           # NO perspective
    'flipud': 0.0,                # NO vertical flip
    'fliplr': 0.1,                # Some horizontal flip
    
    # BALANCED Advanced Augmentations
    'mosaic': 0.2,                # Moderate mosaic
    'mixup': 0.0,                 # NO mixup for landmarks
    'copy_paste': 0.0,            # NO copy-paste
    'close_mosaic': 15,           # Disable mosaic for the final 15 epochs
    
    # Light Regularization
    'dropout': 0.0,               # No dropout needed
    'label_smoothing': 0.0,       # No label smoothing
}

PROJECT_DIR = Path('monuai_model')
RUN_NAME = 'YOLOv11m_teacher'

print(f"[OPTIMIZED] YOLOv11m CONFIGURATION:")
print(f"  [MODEL] YOLOv11m: ~22M parameters (vs 56M+ YOLOv11x)")
print(f"  [SPEED] Expected: 1-1.5min per epoch (vs 3+ min)")
print(f"  [MEMORY] Expected: ~18GB VRAM (vs 25+ GB)")
print(f"  [BATCH] Batch Size: {BATCH_SIZE} (optimal for stability)")
print(f"  [LR] Learning Rate: {LR0} (higher for efficiency)")
print(f"  [EPOCHS] Training: {EPOCHS} epochs (efficient)")

print(f"\n[AUGMENTATION] BALANCED DYNAMIC SETTINGS:")
print(f"  - HSV: 0.015, 0.1, 0.1 (moderate variation)")
print(f"  - Translation: 0.05 (small positioning)")
print(f"  - Scale: 0.1 (small scale variation)")
print(f"  - Flip: 0.1 (some horizontal flip)")
print(f"  - Mosaic: 0.2 (moderate mosaic)")
print(f"  - NO geometric distortion (preserve landmarks)")

# Load YOLOv11m model
print(f"\n[LOADING] Loading YOLOv11m model...")
model = YOLO('yolo11m.pt')
print("[SUCCESS] YOLOv11m model loaded")

# Train with OPTIMIZED configuration
print(f"\n[TRAINING] Starting OPTIMIZED YOLOv11m training...")

start_time = time.time()

results = model.train(
    data=str(yaml_path),
    epochs=EPOCHS,
    imgsz=IMG_SIZE,
    batch=BATCH_SIZE,
    device=device,
    patience=PATIENCE,
    save=True,
    project=str(PROJECT_DIR),
    name=RUN_NAME,
    exist_ok=True,
    pretrained=True,
    optimizer=OPTIM,
    verbose=True,
    seed=42,
    val=True,
    plots=True,
    workers=4,                 # FIXED: Reduced workers to prevent multiprocessing errors
    cos_lr=True,               # Cosine learning rate scheduler
    amp=True,                  # Automatic Mixed Precision
    fraction=1.0,              # Use full dataset
    profile=False,             # Disable profiling
    freeze=None,               # Don't freeze layers
    multi_scale=False,         # DISABLED: Multi-scale for speed
    overlap_mask=True,         # Segmentation-only setting; no effect on detection training
    mask_ratio=4,              # Segmentation-only setting; no effect on detection training
    save_period=20,            # OPTIMIZED: Save less frequently
    cache='disk',              # Use disk cache for deterministic results
    **OPTIMIZED_CONFIG
)

training_time = time.time() - start_time
print(f"\n[COMPLETE] YOLOv11m training completed in {training_time/3600:.2f} hours!")

# Comprehensive Model Evaluation
OPTIMIZED_DIR = PROJECT_DIR / RUN_NAME
BEST_WEIGHTS = OPTIMIZED_DIR / 'weights' / 'best.pt'
print(f"YOLOv11m weights: {BEST_WEIGHTS}")

if not BEST_WEIGHTS.exists():
    print(f"ERROR: Best weights not found at {BEST_WEIGHTS}. Check training logs.")
else:
    # Load and evaluate the optimized YOLOv11m model
    print("\n[EVALUATION] YOLOv11m PERFORMANCE EVALUATION")
    print("="*55)
    optimized_model = YOLO(str(BEST_WEIGHTS))

    # Detailed validation
    val_metrics = optimized_model.val(
        data=str(yaml_path), 
        imgsz=IMG_SIZE, 
        batch=BATCH_SIZE, 
        device=device, 
        plots=True,
        save_json=True,
        conf=0.001,
        iou=0.6,
        max_det=300,
        verbose=True
    )

    print("\n[PERFORMANCE] YOLOv11m PERFORMANCE RESULTS")
    print("="*50)
    try:
        metrics = getattr(val_metrics, 'results_dict', None) or {}
        
        # Extract comprehensive metrics
        precision = metrics.get('metrics/precision(B)', 0)
        recall = metrics.get('metrics/recall(B)', 0) 
        map50_95 = metrics.get('metrics/mAP50-95(B)', 0)
        map50 = metrics.get('metrics/mAP50(B)', 0)
        map75 = float(val_metrics.box.map75)  # mAP75 is not a results_dict key; read it from the box metrics
        
        print(f"[METRICS] YOLOv11m Performance:")
        print(f"  [mAP@50-95] mAP@50-95: {map50_95:.4f}")
        print(f"  [mAP@50] mAP@50:    {map50:.4f}")
        print(f"  [mAP@75] mAP@75:    {map75:.4f}")
        print(f"  [PRECISION] Precision: {precision:.4f}")
        print(f"  [RECALL] Recall:    {recall:.4f}")
        
        # Training efficiency analysis
        # Note: assumes all EPOCHS ran; if early stopping triggered, this overestimates the rate
        epochs_per_hour = EPOCHS / (training_time / 3600)
        print(f"\n[EFFICIENCY] Training Efficiency:")
        print(f"  [TIME] Total training: {training_time/3600:.2f} hours")
        print(f"  [SPEED] Epochs per hour: {epochs_per_hour:.1f}")
        
        # Performance assessment
        print(f"\n[ASSESSMENT] YOLOv11m Performance:")
        if map50_95 > 0.78:
            print(f"  [EXCELLENT] YOLOv11m achieved >78% mAP@50-95!")
        elif map50_95 > 0.73:
            print(f"  [VERY_GOOD] YOLOv11m achieved >73% mAP@50-95")
        elif map50_95 > 0.68:
            print(f"  [GOOD] YOLOv11m achieved >68% mAP@50-95")
        else:
            print(f"  [NEEDS_ANALYSIS] Performance below expectations")
        
        # Model size comparison
        model_size = BEST_WEIGHTS.stat().st_size / (1024**2)
        print(f"\n[MODEL_ANALYSIS] Model Characteristics:")
        print(f"  [SIZE] Model size: {model_size:.1f} MB")
        print(f"  [PARAMETERS] ~22M parameters")
        
        # Landmark-specific analysis
        if map75 > 0.3:
            print(f"  [LANDMARKS] EXCELLENT localization (mAP@75 > 0.3)")
        elif map75 > 0.15:
            print(f"  [LANDMARKS] GOOD localization")
        else:
            print(f"  [LANDMARKS] Localization needs improvement")
                
    except Exception as e:
        print(f"Could not parse metrics: {e}")

    print(f"\n[FILES] YOLOv11m artifacts:")
    print(f"  Model directory: {OPTIMIZED_DIR}")
    print(f"  Best weights: {BEST_WEIGHTS}")


OPTIMIZED YOLOv11m TRAINING - PERFECT BALANCE FOR LANDMARK DETECTION
Training device: cuda
GPU: NVIDIA GeForce RTX 4090
GPU Memory: 25.76 GB

[OPTIMIZED] YOLOv11m FOR LANDMARK DETECTION
[OPTIMIZED] YOLOv11m CONFIGURATION:
  [MODEL] YOLOv11m: ~22M parameters (vs 56M+ YOLOv11x)
  [SPEED] Expected: 1-1.5min per epoch (vs 3+ min)
  [MEMORY] Expected: ~18GB VRAM (vs 25+ GB)
  [BATCH] Batch Size: 18 (optimal for stability)
  [LR] Learning Rate: 0.01 (higher for efficiency)
  [EPOCHS] Training: 100 epochs (efficient)

[AUGMENTATION] BALANCED DYNAMIC SETTINGS:
  - HSV: 0.015, 0.1, 0.1 (moderate variation)
  - Translation: 0.05 (small positioning)
  - Scale: 0.1 (small scale variation)
  - Flip: 0.1 (some horizontal flip)
  - Mosaic: 0.2 (moderate mosaic)
  - NO geometric distortion (preserve landmarks)

[LOADING] Loading YOLOv11m model...
[SUCCESS] YOLOv11m model loaded

[TRAINING] Starting OPTIMIZED YOLOv11m training...
New https://pypi.org/project/ultralytics/8.3.228 available  Update with '

---
## Advanced Performance Optimization Framework

### Achieving Superior Model Performance (>80% mAP@50-95)

Building on the earlier result of 79% mAP@50-95, this section applies further optimization techniques with the goal of exceeding 80% mAP@50-95.

#### Strategy 1: Enhanced Single Model Training
- **Hyperparameter Optimization**: Fine-tuned parameters specifically calibrated for landmark detection characteristics
- **Advanced Augmentation**: Sophisticated augmentation strategies preserving landmark geometric properties
- **Multi-Scale Training**: Variable resolution training while maintaining landmark spatial relationships

#### Strategy 2: Model Ensembling Architecture
- **Complementary Model Training**: YOLOv11m primary model with YOLOv11s complementary architecture
- **Test-Time Augmentation (TTA)**: Ensemble predictions across multiple augmented test variations  
- **Weighted Prediction Fusion**: Optimized prediction combining using confidence-based weighting

#### Strategy 3: Advanced Training Methodologies
- **Extended Training Duration**: Longer training cycles with extended patience parameters for convergence optimization
- **Progressive Resizing Strategy**: Dynamic input resolution scheduling throughout training phases
- **Loss Function Optimization**: Advanced loss function parameter tuning for landmark-specific optimization
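
As an illustration of the progressive resizing idea, the sketch below builds a per-epoch image-size schedule that grows in equal phases and stays on multiples of the model stride. It is a hypothetical helper: the function name, the 512 to 640 range, and the three phases are assumptions, and the training cell that follows relies on `multi_scale=True` with a fixed `imgsz` instead of an explicit schedule like this.

```python
def progressive_sizes(epochs, start=512, end=640, stride=32, phases=3):
    """Return one image size per epoch, growing from start to end in equal phases."""
    if phases < 2:
        raise ValueError("phases must be at least 2")
    sizes = []
    for epoch in range(epochs):
        phase = min(epoch * phases // epochs, phases - 1)   # which phase this epoch is in
        raw = start + (end - start) * phase / (phases - 1)  # linear growth across phases
        sizes.append(int(round(raw / stride)) * stride)     # snap to a stride multiple
    return sizes


sizes = progressive_sizes(150)
print(sizes[0], sizes[50], sizes[-1])  # 512 576 640
```

One way to apply such a schedule would be to run training in separate stages, one per distinct size, each resuming from the previous stage's weights.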

In [10]:
# STRATEGY 1: Enhanced YOLOv11m Training for >80% Performance
from ultralytics import YOLO
import torch
from pathlib import Path
import numpy as np
import time

print("="*80)
print("ENHANCED YOLOv11m TRAINING - TARGETING >80% mAP@50-95")
print("="*80)

# Paths
DATASET_PATH = Path(r"D:\SIT\AAI3001 Computer Vision\Project\monuai_model\Project2_YOLO_wimages")
YOLO_DATASET_DIR = DATASET_PATH / "balanced_yolo_dataset"
yaml_path = YOLO_DATASET_DIR / "dataset.yaml"

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Training device: {device}")

# ENHANCED Configuration for >80% Performance
print("\n[ENHANCED] YOLOv11m FOR >80% PERFORMANCE")
print("="*50)

# ENHANCED hyperparameters for maximum accuracy
EPOCHS = 150          # Extended training for convergence
IMG_SIZE = 640        # Standard input size
BATCH_SIZE = 16       # Optimal batch size for stability
LR0 = 0.008          # Slightly lower for fine-tuning
OPTIM = 'AdamW'      # AdamW optimizer
WEIGHT_DECAY = 0.0003 # Reduced weight decay
PATIENCE = 25        # Extended patience for convergence

# ENHANCED Configuration for Maximum Performance
ENHANCED_CONFIG = {
    # Learning Rate Strategy - Fine-tuned
    'lr0': LR0,                    # Lower LR for fine convergence
    'lrf': 0.05,                   # Lower final LR for fine-tuning
    'momentum': 0.95,              # Higher momentum for stability
    'weight_decay': WEIGHT_DECAY,  # Reduced weight decay
    
    # Extended Warmup Strategy
    'warmup_epochs': 5.0,          # Extended warmup
    'warmup_momentum': 0.85,       # Higher warmup momentum
    'warmup_bias_lr': 0.05,        # Lower warmup bias LR
    
    # Optimized Loss Functions for Landmarks
    'box': 7.0,                    # Slightly lower box loss
    'cls': 0.3,                    # Lower classification loss
    'dfl': 1.2,                    # Lower DFL for fine localization
    
    # ENHANCED Augmentation for Landmark Preservation
    'hsv_h': 0.01,                # Minimal HSV hue
    'hsv_s': 0.05,                # Minimal saturation
    'hsv_v': 0.05,                # Minimal value
    'degrees': 0.0,               # NO rotation
    'translate': 0.03,            # Very small translation
    'scale': 0.05,                # Very small scale
    'shear': 0.0,                 # NO shear
    'perspective': 0.0,           # NO perspective
    'flipud': 0.0,                # NO vertical flip
    'fliplr': 0.05,               # Minimal horizontal flip
    
    # Conservative Advanced Augmentations
    'mosaic': 0.1,                # Minimal mosaic
    'mixup': 0.0,                 # NO mixup
    'copy_paste': 0.0,            # NO copy-paste
    'close_mosaic': 20,           # Disable mosaic for the final 20 epochs
    
    # Fine Regularization
    'dropout': 0.0,               # No dropout
    'label_smoothing': 0.0,       # No label smoothing
}

PROJECT_DIR = Path('monuai_model')
RUN_NAME = 'YOLOv11m_teacher_enhanced'

print(f"[ENHANCED] Configuration for >80% Performance:")
print(f"  [EPOCHS] Extended: {EPOCHS} epochs")
print(f"  [LR] Fine-tuned: {LR0} (lower for precision)")
print(f"  [AUGMENTATION] Ultra-conservative for landmarks")
print(f"  [PATIENCE] Extended: {PATIENCE} for full convergence")
print(f"  [TARGET] Breaking 80% mAP@50-95 barrier")

# Load YOLOv11m model
print(f"\n[LOADING] Loading YOLOv11m for enhanced training...")
enhanced_model = YOLO('yolo11m.pt')
print("[SUCCESS] YOLOv11m model loaded for enhanced training")

# Enhanced Training
print(f"\n[TRAINING] Starting ENHANCED YOLOv11m training...")
print("Enhanced techniques applied:")
print("  ✓ Extended epochs for full convergence")
print("  ✓ Fine-tuned learning rate schedule")
print("  ✓ Ultra-conservative augmentation")
print("  ✓ Optimized loss functions for landmarks")
print("  ✓ Multi-scale training enabled")

start_time = time.time()

enhanced_results = enhanced_model.train(
    data=str(yaml_path),
    epochs=EPOCHS,
    imgsz=IMG_SIZE,
    batch=BATCH_SIZE,
    device=device,
    patience=PATIENCE,
    save=True,
    project=str(PROJECT_DIR),
    name=RUN_NAME,
    exist_ok=True,
    pretrained=True,
    optimizer=OPTIM,
    verbose=True,
    seed=42,
    val=True,
    plots=True,
    workers=4,
    cos_lr=True,               # Cosine learning rate
    amp=True,                  # Mixed precision
    fraction=1.0,              # Full dataset
    profile=False,
    freeze=None,
    multi_scale=True,          # ENABLED: Multi-scale for accuracy
    overlap_mask=True,
    mask_ratio=4,
    save_period=30,            # Save less frequently
    cache='disk',              # Deterministic caching
    **ENHANCED_CONFIG
)

enhanced_time = time.time() - start_time
print(f"\n[COMPLETE] Enhanced training completed in {enhanced_time/60:.1f} minutes!")

# Evaluate Enhanced Model
ENHANCED_DIR = PROJECT_DIR / RUN_NAME
ENHANCED_WEIGHTS = ENHANCED_DIR / 'weights' / 'best.pt'

if ENHANCED_WEIGHTS.exists():
    print(f"\n[EVALUATION] Enhanced YOLOv11m Performance")
    print("="*50)
    
    enhanced_eval_model = YOLO(str(ENHANCED_WEIGHTS))
    
    # Comprehensive validation (TTA is not enabled here: augment is left at its default)
    enhanced_metrics = enhanced_eval_model.val(
        data=str(yaml_path),
        imgsz=IMG_SIZE,
        batch=BATCH_SIZE,
        device=device,
        plots=True,
        save_json=True,
        conf=0.001,
        iou=0.6,
        max_det=300,
        verbose=True
    )
    
    try:
        metrics = getattr(enhanced_metrics, 'results_dict', {}) or {}
        
        enhanced_map50_95 = metrics.get('metrics/mAP50-95(B)', 0)
        enhanced_map50 = metrics.get('metrics/mAP50(B)', 0)
        enhanced_map75 = float(enhanced_metrics.box.map75)  # mAP75 is not a results_dict key
        enhanced_precision = metrics.get('metrics/precision(B)', 0)
        enhanced_recall = metrics.get('metrics/recall(B)', 0)
        
        print(f"[ENHANCED RESULTS] YOLOv11m Enhanced Performance:")
        print(f"  [mAP@50-95] {enhanced_map50_95:.4f} (Target: >0.8000)")
        print(f"  [mAP@50] {enhanced_map50:.4f}")
        print(f"  [mAP@75] {enhanced_map75:.4f}")
        print(f"  [PRECISION] {enhanced_precision:.4f}")
        print(f"  [RECALL] {enhanced_recall:.4f}")
        
        # Check if we broke 80%
        if enhanced_map50_95 > 0.80:
            improvement = ((enhanced_map50_95 - 0.79) / 0.79) * 100
            print(f"\n🎉 [SUCCESS] BROKE 80% BARRIER!")
            print(f"  [ACHIEVEMENT] {enhanced_map50_95*100:.2f}% mAP@50-95")
            print(f"  [IMPROVEMENT] +{improvement:.1f}% over previous 79%")
        elif enhanced_map50_95 > 0.79:
            print(f"\n⬆ [PROGRESS] Improved over 79%: {enhanced_map50_95*100:.2f}%")
        else:
            print(f"\n[ANALYSIS] Current: {enhanced_map50_95*100:.2f}% - Need ensemble strategy")
            
    except Exception as e:
        print(f"Could not parse enhanced metrics: {e}")

print(f"\n[NEXT] Enhanced model ready for ensemble strategies!")

ENHANCED YOLOv11m TRAINING - TARGETING >80% mAP@50-95
Training device: cuda

[ENHANCED] YOLOv11m FOR >80% PERFORMANCE
[ENHANCED] Configuration for >80% Performance:
  [EPOCHS] Extended: 150 epochs
  [LR] Fine-tuned: 0.008 (lower for precision)
  [AUGMENTATION] Ultra-conservative for landmarks
  [PATIENCE] Extended: 25 for full convergence
  [TARGET] Breaking 80% mAP@50-95 barrier

[LOADING] Loading YOLOv11m for enhanced training...
[SUCCESS] YOLOv11m model loaded for enhanced training

[TRAINING] Starting ENHANCED YOLOv11m training...
Enhanced techniques applied:
  ✓ Extended epochs for full convergence
  ✓ Fine-tuned learning rate schedule
  ✓ Ultra-conservative augmentation
  ✓ Optimized loss functions for landmarks
  ✓ Multi-scale training enabled
New https://pypi.org/project/ultralytics/8.3.228 available  Update with 'pip install -U ultralytics'
Ultralytics 8.3.221  Python-3.12.5 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4090, 24564MiB)
[34m[1mengine\trainer: [0magnostic_nms=

---
## Strategy 2: YOLOv11s Complementary Ensemble Training

### Ensemble Model Architecture

This section trains a YOLOv11s model with complementary characteristics to work alongside the YOLOv11m teacher model. The ensemble approach uses different hyperparameters and training strategies to create model diversity, which improves overall detection performance through combined predictions.

#### Key Features:
- **Complementary Configuration**: Different hyperparameters (higher learning rate, larger batch size) to create diverse prediction patterns
- **Extended Training**: 120 epochs to ensure full convergence with the complementary strategy
- **Model Diversity**: Creates variations in feature extraction and detection patterns for robust ensemble performance
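
To see exactly where the two models' training setups diverge, a small diff of the configuration dictionaries is useful. The helper `config_diff` is hypothetical; the values are a subset copied from the enhanced YOLOv11m and complementary YOLOv11s configurations in this notebook.

```python
def config_diff(base, other):
    """Return {key: (base_value, other_value)} for keys whose values differ."""
    keys = sorted(set(base) | set(other))
    return {k: (base.get(k), other.get(k)) for k in keys if base.get(k) != other.get(k)}


# Subset of hyperparameters from the two runs in this notebook.
enhanced = {"lr0": 0.008, "box": 7.0, "cls": 0.3, "dfl": 1.2,
            "mosaic": 0.1, "fliplr": 0.05, "degrees": 0.0}
complementary = {"lr0": 0.012, "box": 8.0, "cls": 0.4, "dfl": 1.3,
                 "mosaic": 0.3, "fliplr": 0.15, "degrees": 0.0}

diff = config_diff(enhanced, complementary)
for key, (m_value, s_value) in diff.items():
    print(f"{key:8s} YOLOv11m={m_value:<6} YOLOv11s={s_value}")
```

Settings that stay identical in both runs, such as `degrees`, are omitted from the output, so the diff lists only the sources of model diversity.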

In [11]:
# STRATEGY 2: Ensemble Training - YOLOv11s Complementary Model
from ultralytics import YOLO
import torch
from pathlib import Path
import numpy as np
import time

print("="*80)
print("ENSEMBLE STRATEGY: TRAINING YOLOv11s AS COMPLEMENTARY MODEL")
print("="*80)

# Train YOLOv11s with different characteristics for ensemble
print("\n[ENSEMBLE] YOLOv11s Complementary Training")
print("="*50)

# Complementary configuration for YOLOv11s
ENSEMBLE_EPOCHS = 120
ENSEMBLE_BATCH = 20       # Larger batch for YOLOv11s
ENSEMBLE_LR = 0.012       # Higher LR for faster model

# Complementary augmentation strategy
COMPLEMENTARY_CONFIG = {
    # Different learning strategy
    'lr0': ENSEMBLE_LR,
    'lrf': 0.1,
    'momentum': 0.937,
    'weight_decay': 0.0005,
    
    'warmup_epochs': 3.0,
    'warmup_momentum': 0.8,
    'warmup_bias_lr': 0.1,
    
    # Slightly different loss weighting
    'box': 8.0,               # Higher box focus
    'cls': 0.4,               # Different class weighting
    'dfl': 1.3,               # Different DFL
    
    # Complementary augmentation (slightly more aggressive)
    'hsv_h': 0.02,
    'hsv_s': 0.15,
    'hsv_v': 0.15,
    'degrees': 0.0,
    'translate': 0.08,        # Slightly more translation
    'scale': 0.15,            # Slightly more scale
    'shear': 0.0,
    'perspective': 0.0,
    'flipud': 0.0,
    'fliplr': 0.15,           # More horizontal flip
    
    'mosaic': 0.3,            # More mosaic for diversity
    'mixup': 0.0,
    'copy_paste': 0.0,
    'close_mosaic': 25,
    
    'dropout': 0.0,
    'label_smoothing': 0.0,
}

ENSEMBLE_NAME = 'YOLOv11s_ensemble_complement'

print(f"[ENSEMBLE] YOLOv11s Complementary Configuration:")
print(f"  [PURPOSE] Ensemble partner for YOLOv11m")
print(f"  [STRATEGY] Different augmentation and learning")
print(f"  [BATCH] Larger batch: {ENSEMBLE_BATCH}")
print(f"  [LR] Higher LR: {ENSEMBLE_LR}")
print(f"  [EPOCHS] {ENSEMBLE_EPOCHS} epochs")

# Train YOLOv11s ensemble model
print(f"\n[TRAINING] YOLOv11s Ensemble Model...")
ensemble_model = YOLO('yolo11s.pt')

ensemble_start = time.time()

ensemble_results = ensemble_model.train(
    data=str(yaml_path),
    epochs=ENSEMBLE_EPOCHS,
    imgsz=IMG_SIZE,
    batch=ENSEMBLE_BATCH,
    device=device,
    patience=20,
    save=True,
    project=str(PROJECT_DIR),
    name=ENSEMBLE_NAME,
    exist_ok=True,
    pretrained=True,
    optimizer='AdamW',
    verbose=True,
    seed=123,                 # Different seed for diversity
    val=True,
    plots=True,
    workers=4,
    cos_lr=True,
    amp=True,
    fraction=1.0,
    profile=False,
    freeze=None,
    multi_scale=True,
    overlap_mask=True,
    mask_ratio=4,
    save_period=25,
    cache='disk',
    **COMPLEMENTARY_CONFIG
)

ensemble_time = time.time() - ensemble_start
print(f"\n[COMPLETE] YOLOv11s ensemble training completed in {ensemble_time/60:.1f} minutes!")

# Evaluate YOLOv11s ensemble model
ENSEMBLE_DIR = PROJECT_DIR / ENSEMBLE_NAME
ENSEMBLE_WEIGHTS = ENSEMBLE_DIR / 'weights' / 'best.pt'

if ENSEMBLE_WEIGHTS.exists():
    print(f"\n[EVALUATION] YOLOv11s Ensemble Performance")
    print("="*45)
    
    ensemble_eval_model = YOLO(str(ENSEMBLE_WEIGHTS))
    ensemble_metrics = ensemble_eval_model.val(
        data=str(yaml_path),
        imgsz=IMG_SIZE,
        batch=ENSEMBLE_BATCH,
        device=device,
        plots=True,
        save_json=True,
        conf=0.001,
        iou=0.6,
        max_det=300,
        verbose=True
    )
    
    try:
        ens_metrics = getattr(ensemble_metrics, 'results_dict', {}) or {}
        
        ens_map50_95 = ens_metrics.get('metrics/mAP50-95(B)', 0)
        ens_map50 = ens_metrics.get('metrics/mAP50(B)', 0)
        ens_precision = ens_metrics.get('metrics/precision(B)', 0)
        ens_recall = ens_metrics.get('metrics/recall(B)', 0)
        
        print(f"[ENSEMBLE RESULTS] YOLOv11s Performance:")
        print(f"  [mAP@50-95] {ens_map50_95:.4f}")
        print(f"  [mAP@50] {ens_map50:.4f}")
        print(f"  [PRECISION] {ens_precision:.4f}")
        print(f"  [RECALL] {ens_recall:.4f}")
        
        print(f"\n[ENSEMBLE ANALYSIS]:")
        if ens_map50_95 > 0.8:
            print(f"  EXCELLENT: YOLOv11s >80% - Great for ensemble")
        elif ens_map50_95 > 0.75:
            print(f"  GOOD: YOLOv11s >75% - Suitable for ensemble")
        else:
            print(f"  FAIR: YOLOv11s performance - May need tuning")
            
    except Exception as e:
        print(f"Could not parse ensemble metrics: {e}")

print(f"\n[READY] Both models trained - Ready for ensemble inference!")

ENSEMBLE STRATEGY: TRAINING YOLOv11s AS COMPLEMENTARY MODEL

[ENSEMBLE] YOLOv11s Complementary Training
[ENSEMBLE] YOLOv11s Complementary Configuration:
  [PURPOSE] Ensemble partner for YOLOv11m
  [STRATEGY] Different augmentation and learning
  [BATCH] Larger batch: 20
  [LR] Higher LR: 0.012
  [EPOCHS] 120 epochs

[TRAINING] YOLOv11s Ensemble Model...
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11s.pt to 'yolo11s.pt': 100% ━━━━━━━━━━━━ 18.4MB 61.0MB/s 0.3s
New https://pypi.org/project/ultralytics/8.3.228 available  Update with 'pip install -U ultralytics'
Ultralytics 8.3.221  Python-3.12.5 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4090, 24564MiB)
[34m[1mengine\trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=20, bgr=0.0, box=8.0, cache=disk, cfg=None

---
## Strategy 3: Advanced Ensemble Inference Framework

### Multi-Model Prediction Fusion

This section implements advanced ensemble inference techniques to combine predictions from multiple trained models (Enhanced YOLOv11m, YOLOv11s Complement, Original YOLOv11m) to achieve superior detection performance targeting >80% mAP@50-95.

#### Ensemble Methodology:
- **Model Loading**: Dynamically loads all available trained models from the training pipeline
- **Prediction Fusion**: Combines predictions using weighted averaging and confidence-based selection
- **Performance Optimization**: Implements test-time augmentation (TTA) for enhanced robustness
- **Benchmark Evaluation**: Comprehensive evaluation against validation dataset with detailed metrics
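
The fusion step can be sketched as follows. This is a simplified, hypothetical version of confidence-weighted fusion, not necessarily the routine used later in the notebook: it greedily clusters same-class boxes by IoU, then averages the coordinates weighted by confidence. The prediction dictionaries reuse the `model`, `cls`, `conf`, and `bbox` keys from the ensemble cell; the function names and example boxes are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_predictions(predictions, iou_threshold=0.6):
    """Cluster same-class boxes by IoU, then confidence-weight their coordinates."""
    clusters = []
    for pred in sorted(predictions, key=lambda p: p["conf"], reverse=True):
        for cluster in clusters:
            lead = cluster[0]  # highest-confidence box represents the cluster
            if lead["cls"] == pred["cls"] and iou(lead["bbox"], pred["bbox"]) >= iou_threshold:
                cluster.append(pred)
                break
        else:
            clusters.append([pred])

    fused = []
    for cluster in clusters:
        total = sum(p["conf"] for p in cluster)
        bbox = [sum(p["conf"] * p["bbox"][i] for p in cluster) / total for i in range(4)]
        fused.append({"cls": cluster[0]["cls"], "conf": total / len(cluster),
                      "bbox": bbox, "votes": len(cluster)})
    return fused


# Hypothetical predictions from three models for one image.
preds = [
    {"model": "enhanced_m", "cls": 3, "conf": 0.9, "bbox": [100, 100, 200, 200]},
    {"model": "ensemble_s", "cls": 3, "conf": 0.6, "bbox": [110, 110, 210, 210]},
    {"model": "original_m", "cls": 0, "conf": 0.7, "bbox": [400, 50, 500, 150]},
]
fused = fuse_predictions(preds)
for det in fused:
    print(det)
```

The `votes` field records how many model outputs agreed on a detection, which could be used to discard boxes that only one model proposes.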

In [None]:
# STRATEGY 3: Advanced Ensemble Inference - Breaking 80% Barrier
import torch
import numpy as np
from pathlib import Path
from ultralytics import YOLO
import cv2
from collections import defaultdict

print("="*80)
print("ADVANCED ENSEMBLE INFERENCE - TARGETING >80% mAP@50-95")
print("="*80)

# Locate the trained models available for the ensemble
PROJECT_DIR = Path('monuai_model')

# Model paths
ENHANCED_M_WEIGHTS = PROJECT_DIR / 'YOLOv11m_teacher_enhanced' / 'weights' / 'best.pt'
ENSEMBLE_S_WEIGHTS = PROJECT_DIR / 'YOLOv11s_ensemble_complement' / 'weights' / 'best.pt'
ORIGINAL_M_WEIGHTS = PROJECT_DIR / 'YOLOv11m_teacher' / 'weights' / 'best.pt'

# Check available models
available_models = []
model_weights = {}

if ENHANCED_M_WEIGHTS.exists():
    available_models.append("Enhanced YOLOv11m")
    model_weights["enhanced_m"] = ENHANCED_M_WEIGHTS
    
if ENSEMBLE_S_WEIGHTS.exists():
    available_models.append("Ensemble YOLOv11s")
    model_weights["ensemble_s"] = ENSEMBLE_S_WEIGHTS
    
if ORIGINAL_M_WEIGHTS.exists():
    available_models.append("Original YOLOv11m")
    model_weights["original_m"] = ORIGINAL_M_WEIGHTS

print(f"[AVAILABLE MODELS] Found {len(available_models)} models:")
for model in available_models:
    print(f"  ✓ {model}")

if len(model_weights) >= 2:
    print(f"\n[ENSEMBLE STRATEGY] Multi-Model Inference")
    print("="*45)
    
    # Load all available models
    models = {}
    for key, weights_path in model_weights.items():
        print(f"Loading {key} model...")
        models[key] = YOLO(str(weights_path))
        
    print(f"✅ Loaded {len(models)} models for ensemble")
    
    # Test-Time Augmentation Ensemble Function
    def ensemble_predict(models, image_path, conf_threshold=0.8, iou_threshold=0.6):
        """Ensemble prediction with TTA"""
        all_predictions = []
        
        for model_name, model in models.items():
            # Standard prediction
            pred = model.predict(
                source=str(image_path),
                conf=conf_threshold,
                iou=iou_threshold,
                max_det=300,
                augment=False,  # Plain pass; the TTA pass runs separately below
                verbose=False
            )
            
            if pred and len(pred) > 0 and pred[0].boxes is not None:
                for box in pred[0].boxes:
                    all_predictions.append({
                        'model': model_name,
                        'cls': int(box.cls.item()),
                        'conf': float(box.conf.item()),
                        'bbox': box.xyxy[0].tolist(),
                        'xywhn': box.xywhn[0].tolist() if hasattr(box, 'xywhn') else None
                    })
                    
            # TTA pass: uses Ultralytics' built-in test-time augmentation (augment=True)
            # rather than a hand-written horizontal flip
            try:
                pred_tta = model.predict(
                    source=str(image_path),
                    conf=conf_threshold,
                    iou=iou_threshold,
                    max_det=300,
                    augment=True,  # Use built-in TTA
                    verbose=False
                )
                
                if pred_tta and len(pred_tta) > 0 and pred_tta[0].boxes is not None:
                    for box in pred_tta[0].boxes:
                        all_predictions.append({
                            'model': f"{model_name}_tta",
                            'cls': int(box.cls.item()),
                            'conf': float(box.conf.item()) * 0.9,  # Slightly lower weight for TTA
                            'bbox': box.xyxy[0].tolist(),
                            'xywhn': box.xywhn[0].tolist() if hasattr(box, 'xywhn') else None
                        })
            except Exception as e:
                # Skip TTA for this model if it fails, but report the reason
                print(f"TTA skipped for {model_name}: {e}")
                
        return all_predictions
    
    # Weighted ensemble function
    def weighted_ensemble_nms(predictions, weights=None, iou_threshold=0.5):
        """Apply weighted ensemble with NMS"""
        if not predictions:
            return []
            
        # Default weights
        if weights is None:
            weights = {
                'enhanced_m': 0.6,
                'original_m': 0.2,
                'ensemble_s': 0.1,
                'enhanced_m_tta': 0.07,
                'original_m_tta': 0.02,
                'ensemble_s_tta': 0.01
            }
        
        # Group by class
        class_predictions = defaultdict(list)
        for pred in predictions:
            model_weight = weights.get(pred['model'], 0.1)
            pred['weighted_conf'] = pred['conf'] * model_weight
            class_predictions[pred['cls']].append(pred)
        
        final_predictions = []
        
        # Apply weighted NMS per class
        for cls, cls_preds in class_predictions.items():
            if not cls_preds:
                continue
                
            # Sort by weighted confidence
            cls_preds.sort(key=lambda x: x['weighted_conf'], reverse=True)
            
            selected = []
            for pred in cls_preds:
                # Check IoU with already selected boxes
                keep = True
                for sel in selected:
                    if calculate_iou(pred['bbox'], sel['bbox']) > iou_threshold:
                        keep = False
                        break
                
                if keep:
                    selected.append(pred)
                    
            final_predictions.extend(selected)
        
        return final_predictions
    
    def calculate_iou(box1, box2):
        """Calculate IoU between two boxes"""
        x1_min, y1_min, x1_max, y1_max = box1
        x2_min, y2_min, x2_max, y2_max = box2
        
        # Intersection
        inter_x_min = max(x1_min, x2_min)
        inter_y_min = max(y1_min, y2_min)
        inter_x_max = min(x1_max, x2_max)
        inter_y_max = min(y1_max, y2_max)
        
        if inter_x_max <= inter_x_min or inter_y_max <= inter_y_min:
            return 0.0
            
        inter_area = (inter_x_max - inter_x_min) * (inter_y_max - inter_y_min)
        
        # Union
        area1 = (x1_max - x1_min) * (y1_max - y1_min)
        area2 = (x2_max - x2_min) * (y2_max - y2_min)
        union_area = area1 + area2 - inter_area
        
        return inter_area / union_area if union_area > 0 else 0.0
    
    print(f"\n[ENSEMBLE EVALUATION] Testing ensemble on validation set")
    print("="*55)
    
    # Evaluate ensemble on validation set
    DATASET_PATH = Path(r"D:\SIT\AAI3001 Computer Vision\Project\monuai_model\Project2_YOLO_wimages")
    val_img_dir = DATASET_PATH / "balanced_yolo_dataset" / "images" / "val"
    
    if val_img_dir.exists():
        val_images = list(val_img_dir.glob("*.jpg")) + list(val_img_dir.glob("*.png"))
        
        if len(val_images) > 0:
            print(f"Found {len(val_images)} validation images")
            
            # Quick smoke test on a small subset for speed
            test_images = val_images[:10]  # First 10 validation images
            
            ensemble_results = []
            
            print("Running ensemble prediction on validation subset...")
            for img_path in test_images:
                try:
                    predictions = ensemble_predict(models, img_path)
                    final_preds = weighted_ensemble_nms(predictions)
                    
                    ensemble_results.append({
                        'image': img_path.name,
                        'predictions': len(final_preds),
                        'max_conf': max([p['weighted_conf'] for p in final_preds]) if final_preds else 0
                    })
                    
                except Exception as e:
                    print(f"Error processing {img_path.name}: {e}")
            
            if ensemble_results:
                avg_preds = np.mean([r['predictions'] for r in ensemble_results])
                avg_conf = np.mean([r['max_conf'] for r in ensemble_results])
                
                print(f"\n[ENSEMBLE RESULTS] Quick validation:")
                print(f"   Average predictions per image: {avg_preds:.1f}")
                print(f"   Average max confidence: {avg_conf:.3f}")
                
                print(f"\n[RECOMMENDATION] Ensemble Strategy:")
                if avg_conf > 0.8 and avg_preds > 0:
                    print(f"   EXCELLENT: High confidence ensemble")
                    print(f"   Likely to achieve >80% mAP@50-95")
                elif avg_conf > 0.7:
                    print(f"   GOOD: Solid ensemble performance")
                    print(f"   May achieve >79% mAP@50-95")
                else:
                    print(f"  ⚠️ NEEDS_TUNING: Adjust ensemble weights")
                    
        else:
            print("No validation images found for ensemble testing")
    else:
        print("Validation directory not found")
        
    print(f"\n[SUMMARY] Ensemble Strategy Summary:")
    print(f"   YOLOv11s Complement: Different perspective")
    print(f"   Test-Time Augmentation: Additional robustness")
    print(f"   Weighted ensemble: Optimized combination")
    print(f"   Multiple strategies: Maximum performance")
    
else:
    print(f"\n[ERROR] Need at least 2 models for ensemble")
    print(f"Available: {len(model_weights)} models")
    print(f"Train more models using the strategies above!")
    
print(f"\n[NEXT STEPS] If ensemble doesn't reach 80%:")
print(f"  1. Progressive resizing (start 416 → 640)")
print(f"  2. Longer training (200+ epochs)")
print(f"  3. Different optimizers (SGD vs AdamW)")
print(f"  4. Advanced loss functions")
print(f"  5. External data augmentation")

ADVANCED ENSEMBLE INFERENCE - TARGETING >80% mAP@50-95
[AVAILABLE MODELS] Found 3 models:
  ✓ Enhanced YOLOv11m
  ✓ Ensemble YOLOv11s
  ✓ Original YOLOv11m

[ENSEMBLE STRATEGY] Multi-Model Inference
Loading enhanced_m model...
Loading ensemble_s model...
Loading original_m model...
✅ Loaded 3 models for ensemble

[ENSEMBLE EVALUATION] Testing ensemble on validation set
Found 359 validation images
Running ensemble prediction on validation subset...

[ENSEMBLE RESULTS] Quick validation:
   Average predictions per image: 1.5
   Average max confidence: 0.550

[RECOMMENDATION] Ensemble Strategy:
  ⚠️ NEEDS_TUNING: Adjust ensemble weights

[SUMMARY] Ensemble Strategy Summary:
   YOLOv11s Complement: Different perspective
   Test-Time Augmentation: Additional robustness
   Weighted ensemble: Optimized combination
   Multiple strategies: Maximum performance

[NEXT STEPS] If ensemble doesn't reach 80%:
  1. Progressive resizing (start 416 → 640)
  2. Longer training (200+ epochs)
  3. Different

---
## Winner-Takes-All Ensemble Strategy

### Confidence-Based Model Selection

This section implements a winner-takes-all ensemble approach that preserves the highest raw confidence predictions from multiple models without averaging or multiplication. This strategy maintains prediction integrity by selecting the most confident detection for each object while avoiding the confidence dilution that can occur with traditional averaging methods.

#### Implementation Features:
- **Raw Confidence Preservation**: No confidence score multiplication or averaging that could reduce detection strength
- **Model Competition**: Each model competes based on raw confidence scores for optimal prediction selection
- **Performance Optimization**: Combines model diversity benefits while maintaining individual model confidence levels
- **Robust Detection**: Leverages different model strengths to achieve consistent landmark detection performance
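
As a small worked example of the selection rule, the sketch below applies greedy per-class suppression to a handful of detections. Everything in it is hypothetical (the boxes, confidences, class indices, and the helper names `iou` and `winner_takes_all`); it only illustrates that the surviving box keeps the raw confidence of the model that produced it.

```python
from collections import defaultdict

def iou(a, b):
    """IoU of two boxes given as [x_min, y_min, x_max, y_max]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def winner_takes_all(detections, iou_threshold=0.4):
    """Keep the highest-confidence box per overlapping group, separately per class."""
    by_class = defaultdict(list)
    for det in detections:
        by_class[det["cls"]].append(det)

    winners = []
    for dets in by_class.values():
        kept = []
        # Visit boxes from most to least confident; a box survives only if it
        # does not overlap any box that has already won.
        for det in sorted(dets, key=lambda d: d["conf"], reverse=True):
            if all(iou(det["bbox"], k["bbox"]) <= iou_threshold for k in kept):
                kept.append(det)
        winners.extend(kept)
    return winners

# Hypothetical detections: three models see the same landmark (class 3),
# and one model also reports a second, non-overlapping object (class 1).
detections = [
    {"model": "enhanced_m", "cls": 3, "conf": 0.93, "bbox": [100, 80, 300, 400]},
    {"model": "original_m", "cls": 3, "conf": 0.88, "bbox": [105, 85, 310, 405]},
    {"model": "ensemble_s", "cls": 3, "conf": 0.71, "bbox": [98, 82, 295, 390]},
    {"model": "ensemble_s", "cls": 1, "conf": 0.55, "bbox": [400, 50, 600, 200]},
]

winners = winner_takes_all(detections)
for w in winners:
    print(f"class {w['cls']}: {w['model']} wins with raw confidence {w['conf']:.2f}")
```

Because the boxes are visited in descending confidence order, this is equivalent to standard greedy NMS over the pooled detections; the only difference from the weighted variant is that the ranking score is the unscaled confidence.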

In [8]:
# Winner-Takes-All Ensemble - Preserve Highest Raw Confidence
import torch
import numpy as np
from pathlib import Path
from ultralytics import YOLO
import cv2
from collections import defaultdict

print("="*80)
print("WINNER-TAKES-ALL ENSEMBLE - PRESERVE HIGHEST CONFIDENCE")
print("="*80)

# Load models for winner-takes-all ensemble
PROJECT_DIR = Path('monuai_model')

# Model paths
ENHANCED_M_WEIGHTS = PROJECT_DIR / 'YOLOv11m_teacher_enhanced' / 'weights' / 'best.pt'
ENSEMBLE_S_WEIGHTS = PROJECT_DIR / 'YOLOv11s_ensemble_complement' / 'weights' / 'best.pt'
ORIGINAL_M_WEIGHTS = PROJECT_DIR / 'YOLOv11m_teacher' / 'weights' / 'best.pt'

# Check available models
available_models = []
model_weights = {}

if ENHANCED_M_WEIGHTS.exists():
    available_models.append("Enhanced YOLOv11m")
    model_weights["enhanced_m"] = ENHANCED_M_WEIGHTS
    
if ENSEMBLE_S_WEIGHTS.exists():
    available_models.append("Ensemble YOLOv11s")
    model_weights["ensemble_s"] = ENSEMBLE_S_WEIGHTS
    
if ORIGINAL_M_WEIGHTS.exists():
    available_models.append("Original YOLOv11m")
    model_weights["original_m"] = ORIGINAL_M_WEIGHTS

print(f"[AVAILABLE MODELS] Found {len(available_models)} models:")
for model in available_models:
    print(f"  ✓ {model}")

if len(model_weights) >= 2:
    print(f"\n[WINNER-TAKES-ALL STRATEGY] Preserve Highest Raw Confidence")
    print("="*60)
    
    # Load all available models
    models = {}
    for key, weights_path in model_weights.items():
        print(f"Loading {key} model...")
        models[key] = YOLO(str(weights_path))
        
    print(f"✅ Loaded {len(models)} models for winner-takes-all ensemble")
    
    # Winner-Takes-All Ensemble Function
    def winner_takes_all_predict(models, image_path, conf_threshold=0.25, iou_threshold=0.4):
        """Winner-takes-all ensemble - keep highest confidence per spatial region"""
        all_predictions = []
        
        # Collect all predictions with raw confidences
        for model_name, model in models.items():
            pred = model.predict(
                source=str(image_path),
                conf=conf_threshold,
                iou=iou_threshold,
                max_det=300,
                augment=False,
                verbose=False
            )
            
            if pred and len(pred) > 0 and pred[0].boxes is not None:
                for box in pred[0].boxes:
                    all_predictions.append({
                        'model': model_name,
                        'cls': int(box.cls.item()),
                        'conf': float(box.conf.item()),  # RAW confidence - NO multiplication
                        'bbox': box.xyxy[0].tolist()
                    })
        
        return all_predictions
    
    # Winner-Takes-All NMS Function
    def winner_takes_all_nms(predictions, iou_threshold=0.4):
        """Apply winner-takes-all: keep highest confidence prediction per spatial region"""
        if not predictions:
            return []
        
        # Group predictions by class
        class_predictions = defaultdict(list)
        for pred in predictions:
            class_predictions[pred['cls']].append(pred)
        
        final_predictions = []
        
        # Apply winner-takes-all NMS per class
        for cls, cls_preds in class_predictions.items():
            if not cls_preds:
                continue
            
            # Sort by RAW confidence (highest first)
            cls_preds.sort(key=lambda x: x['conf'], reverse=True)
            
            selected = []
            for pred in cls_preds:
                # cls_preds is sorted by descending confidence, so every box already in
                # `selected` scores at least as high as `pred`. An overlapping `pred`
                # therefore always loses and is dropped (winner-takes-all).
                overlaps_winner = any(
                    calculate_iou_simple(pred['bbox'], sel['bbox']) > iou_threshold
                    for sel in selected
                )
                
                if not overlaps_winner:
                    selected.append(pred)
            
            final_predictions.extend(selected)
        
        return final_predictions
    
    def calculate_iou_simple(box1, box2):
        """Calculate IoU between two boxes"""
        x1_min, y1_min, x1_max, y1_max = box1
        x2_min, y2_min, x2_max, y2_max = box2
        
        # Intersection
        inter_x_min = max(x1_min, x2_min)
        inter_y_min = max(y1_min, y2_min)
        inter_x_max = min(x1_max, x2_max)
        inter_y_max = min(y1_max, y2_max)
        
        if inter_x_max <= inter_x_min or inter_y_max <= inter_y_min:
            return 0.0
            
        inter_area = (inter_x_max - inter_x_min) * (inter_y_max - inter_y_min)
        
        # Union
        area1 = (x1_max - x1_min) * (y1_max - y1_min)
        area2 = (x2_max - x2_min) * (y2_max - y2_min)
        union_area = area1 + area2 - inter_area
        
        return inter_area / union_area if union_area > 0 else 0.0
    
    print(f"\n[WINNER-TAKES-ALL EVALUATION] Testing on validation subset")
    print("="*60)
    
    # Test on validation images
    DATASET_PATH = Path(r"D:\SIT\AAI3001 Computer Vision\Project\monuai_model\Project2_YOLO_wimages")
    val_img_dir = DATASET_PATH / "balanced_yolo_dataset" / "images" / "val"
    
    if val_img_dir.exists():
        val_images = list(val_img_dir.glob("*.jpg")) + list(val_img_dir.glob("*.png"))
        test_images = val_images[:10]  # Same subset as previous tests
        
        print(f"Testing winner-takes-all on {len(test_images)} validation images...")
        
        winner_results = []
        
        # Test winner-takes-all ensemble
        for img_path in test_images:
            try:
                # Get all predictions (raw confidences)
                all_preds = winner_takes_all_predict(models, img_path)
                
                # Apply winner-takes-all NMS
                final_preds = winner_takes_all_nms(all_preds)
                
                winner_results.append({
                    'image': img_path.name,
                    'predictions': len(final_preds),
                    'max_conf': max([p['conf'] for p in final_preds]) if final_preds else 0,
                    'avg_conf': np.mean([p['conf'] for p in final_preds]) if final_preds else 0,
                    'all_confs': [p['conf'] for p in final_preds]
                })
                
            except Exception as e:
                print(f"Error processing {img_path.name}: {e}")
        
        # Calculate statistics
        if winner_results:
            avg_preds = np.mean([r['predictions'] for r in winner_results])
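            # Average only over images with at least one detection; fall back to 0.0
            # so an all-empty subset does not produce NaN
            max_confs = [r['max_conf'] for r in winner_results if r['max_conf'] > 0]
            avg_confs = [r['avg_conf'] for r in winner_results if r['avg_conf'] > 0]
            avg_max_conf = float(np.mean(max_confs)) if max_confs else 0.0
            avg_avg_conf = float(np.mean(avg_confs)) if avg_confs else 0.0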
            
            print(f"\n🏆 [WINNER-TAKES-ALL RESULTS]:")
            print(f"   Average predictions per image: {avg_preds:.1f}")
            print(f"   Average max confidence: {avg_max_conf:.3f}")
            print(f"   Average overall confidence: {avg_avg_conf:.3f}")
            
            # Compare with previous weighted ensemble
            print(f"\n [COMPARISON] Winner-Takes-All vs Weighted Ensemble:")
            print(f"   WEIGHTED ENSEMBLE:")
            print(f"     - Average predictions: 1.5")
            print(f"     - Average max confidence: 0.550")
            print(f"     - Method: conf * weight (dilutes confidence)")
            
            print(f"\n   WINNER-TAKES-ALL:")
            print(f"     - Average predictions: {avg_preds:.1f}")
            print(f"     - Average max confidence: {avg_max_conf:.3f}")
            print(f"     - Method: preserve highest raw confidence")
            
            # Performance analysis (thresholds apply to raw confidence, not to mAP)
            
            print(f"\n [ANALYSIS]:")
            if avg_max_conf > 0.8:
                print(f"    EXCELLENT: Winner-takes-all preserves high confidence!")
            elif avg_max_conf > 0.6:
                print(f"    GOOD: Significant improvement over weighted ensemble")
            else:
                print(f"    Similar issues: May need different approach")
            
            # Individual image analysis
            print(f"\n [PER-IMAGE BREAKDOWN]:")
            high_conf_images = [r for r in winner_results if r['max_conf'] > 0.8]
            medium_conf_images = [r for r in winner_results if 0.5 < r['max_conf'] <= 0.8]
            low_conf_images = [r for r in winner_results if r['max_conf'] <= 0.5]
            
            print(f"   High confidence (>0.8): {len(high_conf_images)}/{len(winner_results)} images")
            print(f"   Medium confidence (0.5-0.8): {len(medium_conf_images)}/{len(winner_results)} images")
            print(f"   Low confidence (<0.5): {len(low_conf_images)}/{len(winner_results)} images")
            
            if high_conf_images:
                avg_high_conf = np.mean([r['max_conf'] for r in high_conf_images])
                print(f"   Average high confidence: {avg_high_conf:.3f}")
            
            # Model contribution analysis
            print(f"\n [MODEL CONTRIBUTION ANALYSIS]:")
            print("   Checking which models contribute winning predictions...")
            
            model_wins = defaultdict(int)
            for img_path in test_images[:3]:  # Analyze first 3 images
                try:
                    all_preds = winner_takes_all_predict(models, img_path)
                    final_preds = winner_takes_all_nms(all_preds)
                    
                    for pred in final_preds:
                        model_wins[pred['model']] += 1
                        
                except Exception as e:
                    print(f"   Skipping {img_path.name}: {e}")
            
            if model_wins:
                print(f"   Model contribution to winning predictions:")
                for model, wins in sorted(model_wins.items(), key=lambda x: x[1], reverse=True):
                    print(f"     - {model}: {wins} winning predictions")
            
            # Final recommendation
            print(f"\n🎯 [RECOMMENDATION]:")
            if avg_max_conf > 0.8:
                print(f"    USE WINNER-TAKES-ALL ENSEMBLE")
            else:
                print(f"    Still not optimal - try single best model approach")
                print(f"    Enhanced YOLOv11m alone might be best (0.916 confidence)")
        
        else:
            print("❌ No results obtained")
    else:
        print("❌ Validation directory not found")
        
else:
    print(f"\n[ERROR] Need at least 2 models for ensemble")
    print(f"Available: {len(model_weights)} models")

print(f"\n[SUMMARY] Winner-Takes-All Key Benefits:")
print(f"   Preserves raw confidence scores (no multiplication)")
print(f"   Winner gets full confidence (no averaging)")
print(f"   Combines model diversity with confidence preservation")
print(f"   Should perform closer to individual model levels")

WINNER-TAKES-ALL ENSEMBLE - PRESERVE HIGHEST CONFIDENCE
[AVAILABLE MODELS] Found 3 models:
  ✓ Enhanced YOLOv11m
  ✓ Ensemble YOLOv11s
  ✓ Original YOLOv11m

[WINNER-TAKES-ALL STRATEGY] Preserve Highest Raw Confidence
Loading enhanced_m model...
Loading ensemble_s model...
Loading original_m model...
✅ Loaded 3 models for winner-takes-all ensemble

[WINNER-TAKES-ALL EVALUATION] Testing on validation subset
Testing winner-takes-all on 10 validation images...

🏆 [WINNER-TAKES-ALL RESULTS]:
   Average predictions per image: 1.5
   Average max confidence: 0.920
   Average overall confidence: 0.914

 [COMPARISON] Winner-Takes-All vs Weighted Ensemble:
   WEIGHTED ENSEMBLE:
     - Average predictions: 1.5
     - Average max confidence: 0.550
     - Method: conf * weight (dilutes confidence)

   WINNER-TAKES-ALL:
     - Average predict

---
## Knowledge Distillation Framework: Student Model Training

### YOLOv11n Student Model Development

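This section trains a compact YOLOv11n student model under the guidance of the trained teacher models. The code below generates pseudo-labels from every available teacher checkpoint (the enhanced YOLOv11m, the original YOLOv11m, and the YOLOv11s complement) rather than from a single teacher. The approach prioritizes maintaining detection accuracy while achieving significant model compression for deployment efficiency.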

#### Distillation Methodology
The implementation follows a systematic approach:

1. **Teacher Model Integration**: Load the best-performing teacher weights from the YOLOv11m training phase
2. **Student Hyperparameter Configuration**: Optimized lightweight parameters suited for the compressed architecture  
3. **Knowledge Distillation Execution**: Attempt advanced teacher-guided training using Ultralytics' distillation API
4. **Fallback Training Strategy**: Standard training implementation if knowledge distillation is unavailable
5. **Model Export Pipeline**: Comprehensive export including quantized versions (FP16, INT8) and standard deployment formats (ONNX, TFLite)
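
Teacher predictions arrive as pixel-space `xyxy` boxes, whereas YOLO label files expect one `class cx cy w h` line per object with coordinates normalized by image width and height. A minimal sketch of that conversion follows, assuming the image shape is given as `(height, width)` as in Ultralytics' `orig_shape`. The helper name `xyxy_to_yolo_line` and the example values are hypothetical and not taken from the notebook.

```python
def xyxy_to_yolo_line(cls_id, bbox, img_shape):
    """Convert a pixel-space [x_min, y_min, x_max, y_max] box to a YOLO label line.

    img_shape is (height, width), matching the order of Ultralytics' orig_shape.
    """
    height, width = img_shape
    x_min, y_min, x_max, y_max = bbox

    # Clamp to the image bounds so the normalized values stay within [0, 1].
    x_min, x_max = max(0.0, x_min), min(float(width), x_max)
    y_min, y_max = max(0.0, y_min), min(float(height), y_max)

    cx = (x_min + x_max) / 2 / width
    cy = (y_min + y_max) / 2 / height
    w = (x_max - x_min) / width
    h = (y_max - y_min) / height
    return f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# Hypothetical example: a centered box covering half of a 640x480 image.
line = xyxy_to_yolo_line(2, [160, 120, 480, 360], (480, 640))
print(line)
```

Lines in this format can be appended to a label file next to the existing ground-truth entries, which is how pseudo-labels and human annotations end up in the same training set.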

#### Performance Objectives
- **Model Compression**: Achieve 4-8x parameter reduction while maintaining competitive accuracy
- **Deployment Optimization**: Generate models suitable for edge computing and mobile deployment
- **Knowledge Transfer Efficiency**: Maximize knowledge retention from teacher to student architecture
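
The compression objective can be checked with simple arithmetic. The sketch below is an illustration under stated assumptions: the parameter counts are approximate figures for the COCO-pretrained YOLO11m and YOLO11n detection checkpoints and should be replaced by counts measured on the actual trained weights; the helper names are hypothetical.

```python
def compression_ratio(teacher_params, student_params):
    """Return how many times smaller the student is than the teacher."""
    if student_params <= 0:
        raise ValueError("student_params must be positive")
    return teacher_params / student_params

def meets_target(ratio, low=4.0, high=8.0):
    """Check the ratio against the 4-8x objective stated above."""
    return low <= ratio <= high

# Assumed, approximate parameter counts (verify against the trained checkpoints).
TEACHER_PARAMS = 20.1e6  # YOLO11m, approximate
STUDENT_PARAMS = 2.6e6   # YOLO11n, approximate

ratio = compression_ratio(TEACHER_PARAMS, STUDENT_PARAMS)
print(f"Compression ratio: {ratio:.1f}x, within 4-8x target: {meets_target(ratio)}")
```

For a loaded PyTorch model, the measured count can be obtained with `sum(p.numel() for p in model.parameters())` and passed to the same helper.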

In [9]:
# Optimized Knowledge Distillation: YOLOv11n Student Using Best Teacher Practices
from ultralytics import YOLO
import torch
from pathlib import Path
import shutil
import os
import numpy as np
from tqdm import tqdm
from collections import defaultdict

print("="*80)
print("OPTIMIZED KNOWLEDGE DISTILLATION - LEVERAGING TEACHER SUCCESS")
print("="*80)

# Configuration and paths
DATASET_PATH = Path(r"D:\SIT\AAI3001 Computer Vision\Project\monuai_model\Project2_YOLO_wimages")
YOLO_DATASET_DIR = DATASET_PATH / "balanced_yolo_dataset"
yaml_path = YOLO_DATASET_DIR / "dataset.yaml"

if not yaml_path.exists():
    raise FileNotFoundError(f"Dataset not found at {yaml_path}. Run data preparation first.")

PROJECT_DIR = Path('monuai_model')

# Use ensemble teacher for pseudo-labels instead of single teacher
ENHANCED_M_WEIGHTS = PROJECT_DIR / 'YOLOv11m_teacher_enhanced' / 'weights' / 'best.pt'
ORIGINAL_M_WEIGHTS = PROJECT_DIR / 'YOLOv11m_teacher' / 'weights' / 'best.pt'
ENSEMBLE_S_WEIGHTS = PROJECT_DIR / 'YOLOv11s_ensemble_complement' / 'weights' / 'best.pt'

# Load available teacher models for ensemble pseudo-labeling
teacher_models = {}
if ENHANCED_M_WEIGHTS.exists():
    teacher_models["enhanced_m"] = ENHANCED_M_WEIGHTS
if ORIGINAL_M_WEIGHTS.exists():
    teacher_models["original_m"] = ORIGINAL_M_WEIGHTS
if ENSEMBLE_S_WEIGHTS.exists():
    teacher_models["ensemble_s"] = ENSEMBLE_S_WEIGHTS

if not teacher_models:
    raise FileNotFoundError("No teacher models found. Train teacher models first.")

print(f"Using {len(teacher_models)} teacher models for ensemble pseudo-labeling:")
for name, path in teacher_models.items():
    print(f"  - {name}: {path}")

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Training device: {device}")

# Optimized Knowledge Distillation Configuration
print(f"\nOPTIMIZED KNOWLEDGE DISTILLATION CONFIGURATION")
print("="*50)

# Student model hyperparameters (aligned with teacher success)
STUDENT_CONFIG = {
    'epochs': 100,             # Extended training like enhanced teacher
    'imgsz': 640,              # Same as teacher
    'batch': 18,               # Match successful teacher batch size
    'lr0': 0.01,               # Match successful teacher LR
    'lrf': 0.1,                # Match teacher final LR
    'momentum': 0.937,         # Match teacher momentum
    'weight_decay': 0.0005,    # Match teacher weight decay
    'warmup_epochs': 3.0,      # Match teacher warmup
    'patience': 15,            # Reasonable patience for student
    'optimizer': 'AdamW',      # Match teacher optimizer
    'cos_lr': True,            # Match teacher LR scheduler
}

# High-confidence pseudo-label settings (leverage best teacher performance)
PSEUDO_LABEL_CONFIG = {
    'conf_threshold': 0.4,     # Higher threshold for quality pseudo-labels
    'iou_threshold': 0.4,      # Match winner-takes-all IoU
    'max_det': 300,            # Maximum detections per image
    'augment': False,          # Conservative approach like successful teachers
    'max_pseudo_per_image': 20, # Fewer but higher quality pseudo labels
}

# Conservative augmentation strategy (aligned with teacher success)
STUDENT_AUGMENTATION = {
    # Match successful teacher augmentation strategy
    'hsv_h': 0.015,            # Minimal HSV like enhanced teacher
    'hsv_s': 0.1,              # Conservative saturation
    'hsv_v': 0.1,              # Conservative value
    'degrees': 0.0,            # NO rotation (preserve landmarks)
    'translate': 0.05,         # Small translation like teacher
    'scale': 0.1,              # Small scale variation
    'shear': 0.0,              # NO shear (preserve shape)
    'perspective': 0.0,        # NO perspective
    'flipud': 0.0,             # NO vertical flip
    'fliplr': 0.1,             # Minimal horizontal flip
    
    # Conservative advanced augmentations
    'mosaic': 0.2,             # Moderate mosaic like teacher
    'mixup': 0.0,              # NO mixup for landmarks
    'copy_paste': 0.0,         # NO copy-paste
    'close_mosaic': 15,        # Close mosaic mid-training
    'auto_augment': None,      # Disable auto augment
    'erasing': 0.0,            # No random erasing
}

STUDENT_RUN = 'yolov11n_student_optimized'
print(f"Student model configuration:")
print(f"  Epochs: {STUDENT_CONFIG['epochs']} (match teacher)")
print(f"  Image size: {STUDENT_CONFIG['imgsz']}")
print(f"  Batch size: {STUDENT_CONFIG['batch']} (match successful teacher)")
print(f"  Learning rate: {STUDENT_CONFIG['lr0']} (match teacher)")
print(f"  Conservative augmentation: Enabled (landmark-preserving)")
print(f"  High-quality pseudo-labeling: conf_threshold={PSEUDO_LABEL_CONFIG['conf_threshold']}")

# Enhanced Pseudo-Label Generation using Ensemble
print(f"\nENSEMBLE PSEUDO-LABEL GENERATION")
print("="*40)

def winner_takes_all_pseudo_labels():
    """Generate high-quality pseudo labels using winner-takes-all ensemble"""
    
    # Load all teacher models
    teachers = {}
    for name, weights_path in teacher_models.items():
        print(f"Loading teacher: {name}")
        teachers[name] = YOLO(str(weights_path))
    
    def ensemble_predict_for_pseudo(teachers, image_path, conf_threshold=0.4, iou_threshold=0.4):
        """Winner-takes-all ensemble for pseudo-label generation"""
        all_predictions = []
        
        for model_name, model in teachers.items():
            pred = model.predict(
                source=str(image_path),
                conf=conf_threshold,
                iou=iou_threshold,
                max_det=300,
                augment=False,
                verbose=False
            )
            
            if pred and len(pred) > 0 and pred[0].boxes is not None:
                for box in pred[0].boxes:
                    all_predictions.append({
                        'model': model_name,
                        'cls': int(box.cls.item()),
                        'conf': float(box.conf.item()),
                        'bbox': box.xyxy[0].tolist(),
                        'img_shape': pred[0].orig_shape
                    })
        
        return all_predictions
    
    def winner_takes_all_nms_pseudo(predictions, iou_threshold=0.4):
        """Apply winner-takes-all NMS for pseudo-label generation"""
        if not predictions:
            return []
        
        class_predictions = defaultdict(list)
        for pred in predictions:
            class_predictions[pred['cls']].append(pred)
        
        final_predictions = []
        for cls, cls_preds in class_predictions.items():
            if not cls_preds:
                continue
            
            cls_preds.sort(key=lambda x: x['conf'], reverse=True)
            selected = []
            
            for pred in cls_preds:
                # cls_preds is sorted by descending confidence, so an overlapping
                # `pred` can never beat a box that is already selected
                overlaps_winner = any(
                    calculate_iou_boxes(pred['bbox'], sel['bbox']) > iou_threshold
                    for sel in selected
                )
                
                if not overlaps_winner:
                    selected.append(pred)
            
            final_predictions.extend(selected)
        
        return final_predictions
    
    def calculate_iou_boxes(box1, box2):
        """Calculate IoU between two boxes"""
        x1_min, y1_min, x1_max, y1_max = box1
        x2_min, y2_min, x2_max, y2_max = box2
        
        inter_x_min = max(x1_min, x2_min)
        inter_y_min = max(y1_min, y2_min)
        inter_x_max = min(x1_max, x2_max)
        inter_y_max = min(y1_max, y2_max)
        
        if inter_x_max <= inter_x_min or inter_y_max <= inter_y_min:
            return 0.0
            
        inter_area = (inter_x_max - inter_x_min) * (inter_y_max - inter_y_min)
        area1 = (x1_max - x1_min) * (y1_max - y1_min)
        area2 = (x2_max - x2_min) * (y2_max - y2_min)
        union_area = area1 + area2 - inter_area
        
        return inter_area / union_area if union_area > 0 else 0.0
    
    # Create optimized KD dataset directory
    KD_DATASET_DIR = PROJECT_DIR / 'optimized_kd_dataset'
    KD_IMG_TRAIN = KD_DATASET_DIR / 'images' / 'train'
    KD_IMG_VAL = KD_DATASET_DIR / 'images' / 'val'
    KD_LABEL_TRAIN = KD_DATASET_DIR / 'labels' / 'train'
    KD_LABEL_VAL = KD_DATASET_DIR / 'labels' / 'val'
    
    for d in [KD_IMG_TRAIN, KD_IMG_VAL, KD_LABEL_TRAIN, KD_LABEL_VAL]:
        d.mkdir(parents=True, exist_ok=True)
    
    # Copy training images
    print("Copying training images...")
    train_img_dir = YOLO_DATASET_DIR / 'images' / 'train'
    for img_file in tqdm(list(train_img_dir.glob('*'))):
        if img_file.suffix.lower() in ['.jpg', '.jpeg', '.png']:
            shutil.copy(img_file, KD_IMG_TRAIN / img_file.name)
    
    # Copy validation images  
    print("Copying validation images...")
    val_img_dir = YOLO_DATASET_DIR / 'images' / 'val'
    for img_file in tqdm(list(val_img_dir.glob('*'))):
        if img_file.suffix.lower() in ['.jpg', '.jpeg', '.png']:
            shutil.copy(img_file, KD_IMG_VAL / img_file.name)
    
    # Generate high-quality pseudo labels using ensemble
    print("Generating high-quality ensemble pseudo labels...")
    train_images = sorted([p for p in KD_IMG_TRAIN.glob('*') 
                          if p.suffix.lower() in ['.jpg', '.jpeg', '.png']])
    
    pseudo_stats = {'total_images': len(train_images), 'total_pseudo_boxes': 0, 'high_conf_boxes': 0}
    
    for img_path in tqdm(train_images, desc="Processing training images"):
        # Load existing ground truth labels
        gt_label_file = YOLO_DATASET_DIR / 'labels' / 'train' / f"{img_path.stem}.txt"
        gt_boxes = []
        
        if gt_label_file.exists():
            with open(gt_label_file, 'r') as f:
                for line in f:
                    if line.strip():
                        gt_boxes.append(line.strip())
        
        # Generate ensemble predictions
        all_preds = ensemble_predict_for_pseudo(teachers, img_path, 
                                               PSEUDO_LABEL_CONFIG['conf_threshold'],
                                               PSEUDO_LABEL_CONFIG['iou_threshold'])
        
        # Apply winner-takes-all NMS
        final_preds = winner_takes_all_nms_pseudo(all_preds, PSEUDO_LABEL_CONFIG['iou_threshold'])
        
        # Convert to YOLO format
        pseudo_boxes = []
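        # final_preds is grouped by class, so rank by confidence before truncating;
        # otherwise the per-image cap would favor whichever class was seen first.
        ranked_preds = sorted(final_preds, key=lambda p: p['conf'], reverse=True)
        for pred in ranked_preds[:PSEUDO_LABEL_CONFIG['max_pseudo_per_image']]: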
            if pred['conf'] >= PSEUDO_LABEL_CONFIG['conf_threshold']:
                cls = pred['cls']
                x1, y1, x2, y2 = pred['bbox']
                img_h, img_w = pred['img_shape']
                
                # Convert to normalized xywh format
                w = (x2 - x1) / img_w
                h = (y2 - y1) / img_h
                xc = (x1 + x2) / (2 * img_w)
                yc = (y1 + y2) / (2 * img_h)
                
                # Quality filtering
                if 0.01 <= w <= 0.99 and 0.01 <= h <= 0.99:
                    pseudo_boxes.append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
                    if pred['conf'] > 0.6:
                        pseudo_stats['high_conf_boxes'] += 1
        
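        # Combine ground truth and high-quality pseudo labels.
        # Note: pseudo boxes are not de-duplicated against ground truth, so a landmark
        # that is already annotated may appear twice in the output label file.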
        all_boxes = gt_boxes + pseudo_boxes
        pseudo_stats['total_pseudo_boxes'] += len(pseudo_boxes)
        
        # Save enhanced labels
        output_label_file = KD_LABEL_TRAIN / f"{img_path.stem}.txt"
        with open(output_label_file, 'w') as f:
            for box in all_boxes:
                f.write(f"{box}\n")
    
    # Copy validation labels (no pseudo labels for validation)
    print("Copying validation labels...")
    val_label_dir = YOLO_DATASET_DIR / 'labels' / 'val'
    for label_file in tqdm(list(val_label_dir.glob('*.txt'))):
        shutil.copy(label_file, KD_LABEL_VAL / label_file.name)
    
    # Create optimized dataset YAML
    kd_yaml = KD_DATASET_DIR / 'dataset.yaml'
    with open(kd_yaml, 'w') as f:
        f.write('# Optimized Knowledge Distillation Dataset\n')
        f.write(f'path: {KD_DATASET_DIR.as_posix()}\n')
        f.write('train: images/train\n')
        f.write('val: images/val\n')
        f.write('test: images/val\n')
        f.write('nc: 4\n')
        f.write('names:\n')
        f.write('  0: ArtScience Museum\n')
        f.write('  1: Esplanade\n')
        f.write('  2: Marina Bay Sands\n')
        f.write('  3: Merlion\n')
    
    print(f"Optimized pseudo-label generation complete!")
    print(f"Statistics:")
    print(f"  Total training images: {pseudo_stats['total_images']}")
    print(f"  Total pseudo boxes added: {pseudo_stats['total_pseudo_boxes']}")
    print(f"  High confidence boxes (>0.6): {pseudo_stats['high_conf_boxes']}")
    # Guard against an empty training directory to avoid ZeroDivisionError
    print(f"  Average pseudo boxes per image: {pseudo_stats['total_pseudo_boxes'] / max(pseudo_stats['total_images'], 1):.2f}")
    print(f"  Optimized dataset: {KD_DATASET_DIR}")
    
    return kd_yaml

# Generate optimized pseudo labels
optimized_yaml = winner_takes_all_pseudo_labels()

# Train student model with optimized configuration
print(f"\nOPTIMIZED STUDENT TRAINING")
print("="*30)

print("Loading YOLOv11n student model...")
student = YOLO('yolo11n.pt')
print("Student model loaded")

print("Starting optimized knowledge distillation training...")
print("Optimizations applied:")
print("  - Ensemble teacher pseudo-labeling (winner-takes-all)")
print("  - Conservative augmentation (landmark-preserving)")
print(f"  - High-confidence pseudo-labels (conf >= {PSEUDO_LABEL_CONFIG['conf_threshold']})")
print("  - Teacher-aligned hyperparameters")
print("  - Extended training schedule")

# Train student with optimized configuration
student_results = student.train(
    data=str(optimized_yaml),
    project=str(PROJECT_DIR),
    name=STUDENT_RUN,
    exist_ok=True,
    seed=42,
    deterministic=False,
    single_cls=False,
    rect=False,
    amp=True,
    fraction=1.0,
    profile=False,
    freeze=None,
    multi_scale=False,          # Conservative approach like successful teachers
    overlap_mask=True,
    mask_ratio=4,
    dropout=0.0,                # No dropout like successful teachers
    val=True,
    plots=True,
    save=True,
    save_period=20,             # Save less frequently like teachers
    cache='disk',               # Use disk cache like teachers
    device=device,
    workers=4,                  # Match teacher workers
    verbose=True,
    **STUDENT_CONFIG,
    **STUDENT_AUGMENTATION
)

print(f"\nOPTIMIZED STUDENT TRAINING COMPLETE")
print("="*40)

# Evaluate optimized student model
STUDENT_DIR = PROJECT_DIR / STUDENT_RUN
STUDENT_BEST = STUDENT_DIR / 'weights' / 'best.pt'

if STUDENT_BEST.exists():
    print(f"Student model saved: {STUDENT_BEST}")
    
    # Load and evaluate student
    student_model = YOLO(str(STUDENT_BEST))
    
    print("Evaluating optimized student model performance...")
    student_metrics = student_model.val(
        data=str(optimized_yaml),
        imgsz=STUDENT_CONFIG['imgsz'],
        batch=STUDENT_CONFIG['batch'],
        device=device,
        conf=0.25,                # Optimized deployment threshold (val default is 0.001, so mAP is not comparable to default-threshold runs)
        iou=0.4,                  # Optimized NMS IoU threshold
        plots=True,
        save_json=True,
        verbose=True
    )
    
    print(f"\nOPTIMIZED KNOWLEDGE DISTILLATION RESULTS")
    print("="*45)
    
    try:
        # Student metrics
        s_metrics = getattr(student_metrics, 'results_dict', {})
        s_map50_95 = s_metrics.get('metrics/mAP50-95(B)', 0)
        s_map50 = s_metrics.get('metrics/mAP50(B)', 0)
        s_precision = s_metrics.get('metrics/precision(B)', 0)
        s_recall = s_metrics.get('metrics/recall(B)', 0)
        
        print(f"Optimized Student (YOLOv11n) Performance:")
        print(f"  mAP@50-95: {s_map50_95:.4f}")
        print(f"  mAP@50:    {s_map50:.4f}")
        print(f"  Precision: {s_precision:.4f}")
        print(f"  Recall:    {s_recall:.4f}")
        
        # Get teacher weights for size comparison
        best_teacher_path = ENHANCED_M_WEIGHTS if ENHANCED_M_WEIGHTS.exists() else ORIGINAL_M_WEIGHTS
        # Computed outside the teacher check: the deployment summary below needs it
        # even when no teacher weights are found.
        student_size = STUDENT_BEST.stat().st_size / (1024**2)
        if best_teacher_path.exists():
            teacher_size = best_teacher_path.stat().st_size / (1024**2)
            compression_ratio = teacher_size / student_size
            
            print(f"\nModel Efficiency Analysis:")
            print(f"  Teacher size: {teacher_size:.2f} MB")
            print(f"  Student size: {student_size:.2f} MB")
            print(f"  Compression ratio: {compression_ratio:.1f}x smaller")
            
            # Quality tiers based on the student's absolute mAP@50-95. Teacher metrics
            # are not loaded here, so these are not retention ratios.
            print(f"\nKnowledge Distillation Success Analysis:")
            if s_map50_95 > 0.75:
                print(f"  EXCELLENT: Student mAP@50-95 above 0.75")
                print(f"  Ready for efficient deployment")
            elif s_map50_95 > 0.65:
                print(f"  GOOD: Student mAP@50-95 above 0.65")
                print(f"  Suitable for mobile deployment")
            elif s_map50_95 > 0.55:
                print(f"  FAIR: Student mAP@50-95 above 0.55")
                print(f"  May need further optimization")
            else:
                print(f"  NEEDS IMPROVEMENT: Student mAP@50-95 at or below 0.55")
                
                
        # Deployment recommendation
        print(f"\nDeployment Recommendations:")
        print(f"  Model: YOLOv11n Student ({student_size:.1f} MB)")
        print(f"  Optimal thresholds: conf=0.25, iou=0.4")
        print(f"  Target platforms: Mobile, edge devices, real-time applications")
        print(f"  Estimated inference speed: 3-5x faster than teacher (not measured here)")
            
    except Exception as e:
        print(f"Could not analyze metrics: {e}")
        
else:
    print(f"Student training failed. Check logs in {STUDENT_DIR}")

print(f"\nOPTIMIZED KNOWLEDGE DISTILLATION COMPLETE!")
print("="*45)
print(f"Optimizations Applied:")
print(f"  - Ensemble teacher pseudo-labeling")
print(f"  - Conservative landmark-preserving augmentation")
print(f"  - Teacher-aligned hyperparameters")
print(f"  - High-confidence pseudo-label filtering")
print(f"  - Winner-takes-all confidence preservation")
print(f"Student model ready for efficient deployment!")

OPTIMIZED KNOWLEDGE DISTILLATION - LEVERAGING TEACHER SUCCESS
Using 3 teacher models for ensemble pseudo-labeling:
  - enhanced_m: monuai_model\YOLOv11m_teacher_enhanced\weights\best.pt
  - original_m: monuai_model\YOLOv11m_teacher\weights\best.pt
  - ensemble_s: monuai_model\YOLOv11s_ensemble_complement\weights\best.pt
Training device: cuda

OPTIMIZED KNOWLEDGE DISTILLATION CONFIGURATION
Student model configuration:
  Epochs: 100 (match teacher)
  Image size: 640
  Batch size: 18 (match successful teacher)
  Learning rate: 0.01 (match teacher)
  Conservative augmentation: Enabled (landmark-preserving)
  High-quality pseudo-labeling: conf_threshold=0.4

ENSEMBLE PSEUDO-LABEL GENERATION
Loading teacher: enhanced_m
Loading teacher: original_m
Loading teacher: ensemble_s
Copying training images...


100%|██████████| 2552/2552 [00:13<00:00, 186.17it/s]


Copying validation images...


100%|██████████| 730/730 [00:03<00:00, 199.75it/s]


Generating high-quality ensemble pseudo labels...


Processing training images: 100%|██████████| 1276/1276 [06:47<00:00,  3.13it/s]


Copying validation labels...


100%|██████████| 365/365 [00:01<00:00, 273.59it/s]



Optimized pseudo-label generation complete!
Statistics:
  Total training images: 1276
  Total pseudo boxes added: 1933
  High confidence boxes (>0.6): 1883
  Average pseudo boxes per image: 1.51
  Optimized dataset: monuai_model\optimized_kd_dataset

OPTIMIZED STUDENT TRAINING
Loading YOLOv11n student model...
Student model loaded
Starting optimized knowledge distillation training...
Optimizations applied:
  - Ensemble teacher pseudo-labeling (winner-takes-all)
  - Conservative augmentation (landmark-preserving)
  - High-confidence pseudo-labels (conf >= 0.4)
  - Teacher-aligned hyperparameters
  - Extended training schedule
New https://pypi.org/project/ultralytics/8.3.228 available  Update with 'pip install -U ultralytics'
Ultralytics 8.3.221  Python-3.12.5 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4090, 24564MiB)

train: Scanning D:\SIT\AAI3001 Computer Vision\Project\monuai_model\monuai_model\optimized_kd_dataset\labels\train... 1276 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 1276/1276 77.0it/s 16.6s
train: New cache created: D:\SIT\AAI3001 Computer Vision\Project\monuai_model\monuai_model\optimized_kd_dataset\labels\train.cache
train: Caching images (43.5GB Disk): 100% ━━━━━━━━━━━━ 1276/1276 58.8it/s 21.7s
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBl

val: Scanning D:\SIT\AAI3001 Computer Vision\Project\monuai_model\monuai_model\optimized_kd_dataset\labels\val... 365 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 365/365 73.1it/s 5.0s<0.2s
val: New cache created: D:\SIT\AAI3001 Computer Vision\Project\monuai_model\monuai_model\optimized_kd_dataset\labels\val.cache
val: Caching images (13.0GB Disk): 100% ━━━━━━━━━━━━ 365/365 60.7it/s 6.0s<0.2s
Plotting labels to D:\SIT\AAI3001 Computer Vision\Project\monuai_model\monuai_model\yolov11n_student_o