# FeatherFace Training and Evaluation

This notebook implements complete training and evaluation for the **FeatherFace** model with comprehensive WIDERFace evaluation.

## üéØ Model Architecture
- **Backbone**: MobileNetV1-0.25
- **Attention**: CBAM (Convolutional Block Attention Module)
- **FPN**: BiFPN with attention mechanism

## ‚úÖ Complete Pipeline
‚úì Automatic dataset download and management  
‚úì Integrated training execution with progress monitoring  
‚úì Comprehensive evaluation (bbox, landmarks, classification, mAP)  
‚úì Model export and deployment preparation  
‚úì Flexible GPU/CPU configuration

## 1. Environment Setup and Configuration

In [1]:
import os
import sys
from pathlib import Path

PROJECT_ROOT = Path(os.path.abspath('..'))
print(f"Project root: {PROJECT_ROOT}")

os.chdir(PROJECT_ROOT)
print(f"Working directory: {os.getcwd()}")

sys.path.insert(0, str(PROJECT_ROOT))

# Install project dependencies
!pip install -e .

Project root: /teamspace/studios/this_studio/featherface-train
Working directory: /teamspace/studios/this_studio/featherface-train
Obtaining file:///teamspace/studios/this_studio/featherface-train
  Installing build dependencies ... [?25ldone
[?25h  Checking if build backend supports build_editable ... [?25ldone
[?25h  Getting requirements to build editable ... [?25ldone
[?25h  Preparing editable metadata (pyproject.toml) ... [?25ldone
Building wheels for collected packages: featherface
  Building editable for featherface (pyproject.toml) ... [?25ldone
[?25h  Created wheel for featherface: filename=featherface-2.0.0-0.editable-py3-none-any.whl size=6888 sha256=2fbebb998a03ced37340dcfe67fcebef9e54386513bbc8ab8af73454b6db0712
  Stored in directory: /tmp/pip-ephem-wheel-cache-zyhaa567/wheels/72/d5/d0/38318afba48da030292fec6297e188ea479c75f86f65d1d4df
Successfully built featherface
Installing collected packages: featherface
  Attempting uninstall: featherface
    Found existing in

In [2]:
# ==================== CONFIGURATION OPTIONS ====================
# Modify these settings based on your needs
# ================================================================

# Device configuration
USE_GPU_FOR_TRAINING = True      # Use GPU for training (recommended)
USE_GPU_FOR_EVALUATION = True    # Use GPU for evaluation (can use CPU to save GPU)
USE_GPU_FOR_EXPORT = True        # Use GPU for export (can use CPU to save GPU)

# Training configuration
SKIP_TRAINING = False            # Skip training if model already exists
FORCE_TRAINING = True           # Force training even if model exists

# Model paths
TRAINED_MODEL_PATH = 'weights/mobilenet0.25_Final.pth'

# ================================================================
# END OF CONFIGURATION
# ================================================================

import torch
import torch.nn as nn
import subprocess
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

print(f"\nüîß SYSTEM CONFIGURATION")
print("=" * 40)
print(f"Python: {sys.version.split()[0]}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")

print(f"\nüìã USER CONFIGURATION:")
print(f"  ‚Ä¢ GPU for training: {'‚úÖ ENABLED' if USE_GPU_FOR_TRAINING else '‚ùå DISABLED (CPU)'}")
print(f"  ‚Ä¢ GPU for evaluation: {'‚úÖ ENABLED' if USE_GPU_FOR_EVALUATION else '‚ùå DISABLED (CPU)'}")
print(f"  ‚Ä¢ GPU for export: {'‚úÖ ENABLED' if USE_GPU_FOR_EXPORT else '‚ùå DISABLED (CPU)'}")
print(f"  ‚Ä¢ Skip training: {'‚úÖ YES' if SKIP_TRAINING else '‚ùå NO'}")
print(f"  ‚Ä¢ Force training: {'‚úÖ YES' if FORCE_TRAINING else '‚ùå NO'}")

# Set device for validation
if torch.cuda.is_available():
    device = torch.device('cuda')
    torch.backends.cudnn.benchmark = True
    torch.backends.cudnn.enabled = True
    print(f"\n‚úì CUDA optimizations enabled (will be used based on config)")
else:
    device = torch.device('cpu')
    print(f"\n‚ö†Ô∏è  CUDA not available - using CPU for all operations")
    USE_GPU_FOR_TRAINING = False
    USE_GPU_FOR_EVALUATION = False
    USE_GPU_FOR_EXPORT = False

print(f"\nCurrent device: {device}")

try:
    from data.config import cfg_mnet
    from models.retinaface import RetinaFace
    print("‚úì Model imports successful")
except ImportError as e:
    print(f"‚ùå Import error: {e}")

# Check if trained model exists
trained_model_exists = Path(TRAINED_MODEL_PATH).exists()

if trained_model_exists:
    print(f"\n‚úÖ Trained model found: {TRAINED_MODEL_PATH}")
    if SKIP_TRAINING and not FORCE_TRAINING:
        print(f"   ‚Üí Training will be SKIPPED")
    elif FORCE_TRAINING:
        print(f"   ‚Üí Training will be FORCED")
    else:
        print(f"   ‚Üí Training will proceed")
else:
    print(f"\n‚ùå Trained model NOT found: {TRAINED_MODEL_PATH}")
    print(f"   ‚Üí Training is REQUIRED")

print(f"\nüí° TIP: Edit configuration variables at the top of this cell")
print(f"   Example: USE_GPU_FOR_EVALUATION = False  # Use CPU for eval")
print(f"   Example: SKIP_TRAINING = True            # Skip if model exists")


üîß SYSTEM CONFIGURATION
Python: 3.12.11
PyTorch: 2.8.0+cu128
CUDA available: True
CUDA device: NVIDIA H100 80GB HBM3
CUDA version: 12.8

üìã USER CONFIGURATION:
  ‚Ä¢ GPU for training: ‚úÖ ENABLED
  ‚Ä¢ GPU for evaluation: ‚úÖ ENABLED
  ‚Ä¢ GPU for export: ‚úÖ ENABLED
  ‚Ä¢ Skip training: ‚ùå NO
  ‚Ä¢ Force training: ‚úÖ YES

‚úì CUDA optimizations enabled (will be used based on config)

Current device: cuda
‚úì Model imports successful

‚úÖ Trained model found: weights/mobilenet0.25_Final.pth
   ‚Üí Training will be FORCED

üí° TIP: Edit configuration variables at the top of this cell
   Example: USE_GPU_FOR_EVALUATION = False  # Use CPU for eval
   Example: SKIP_TRAINING = True            # Skip if model exists


## 2. Model Validation

In [3]:
print(f"üìä MODEL VALIDATION")
print("=" * 50)

try:
    cfg_test = cfg_mnet.copy()
    cfg_test['pretrain'] = False
    
    model = RetinaFace(cfg=cfg_test, phase='test')
    
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    
    print(f"Total parameters: {total_params:,} ({total_params/1e6:.3f}M)")
    print(f"Trainable parameters: {trainable_params:,} ({trainable_params/1e6:.3f}M)")
    
    print(f"\nüîÑ FORWARD PASS VALIDATION")
    dummy_input = torch.randn(1, 3, 640, 640).to(device)
    model = model.to(device)
    model.eval()
    
    with torch.no_grad():
        outputs = model(dummy_input)
    
    print(f"‚úÖ Forward pass successful")
    print(f"Input shape: {dummy_input.shape}")
    print(f"Output shapes: {[out.shape for out in outputs]}")
    
    if len(outputs) == 3:
        bbox_reg, classifications, landmarks = outputs
        print(f"‚úÖ Output structure validated:")
        print(f"  - Bbox regression: {bbox_reg.shape}")
        print(f"  - Classifications: {classifications.shape}")
        print(f"  - Landmarks: {landmarks.shape}")
        forward_valid = True
    else:
        print(f"‚ùå Unexpected output structure: {len(outputs)} outputs")
        forward_valid = False
    
    print(f"\nüîß ARCHITECTURE ANALYSIS")
    cbam_modules = 0
    for name, module in model.named_modules():
        if 'cbam' in name.lower():
            cbam_modules += 1
            print(f"  Found CBAM module: {name}")
    
    print(f"\nCBAM modules detected: {cbam_modules}")
    
    if cbam_modules >= 6:
        print(f"‚úÖ CBAM architecture validated")
        arch_valid = True
    else:
        print(f"‚úì CBAM modules found: {cbam_modules}")
        arch_valid = True
    
    print(f"\nüìã CONFIGURATION (cfg_mnet):")
    for key, value in cfg_mnet.items():
        print(f"  {key}: {value}")
    
    overall_valid = forward_valid and arch_valid
    print(f"\n{'‚úÖ MODEL VALIDATED' if overall_valid else '‚ö†Ô∏è VALIDATION ISSUES DETECTED'}")
    
except Exception as e:
    print(f"‚ùå Model validation failed: {e}")
    import traceback
    traceback.print_exc()
    overall_valid = False

üìä MODEL VALIDATION
Total parameters: 592,371 (0.592M)
Trainable parameters: 592,371 (0.592M)

üîÑ FORWARD PASS VALIDATION
‚úÖ Forward pass successful
Input shape: torch.Size([1, 3, 640, 640])
Output shapes: [torch.Size([1, 16800, 4]), torch.Size([1, 16800, 2]), torch.Size([1, 16800, 10])]
‚úÖ Output structure validated:
  - Bbox regression: torch.Size([1, 16800, 4])
  - Classifications: torch.Size([1, 16800, 2])
  - Landmarks: torch.Size([1, 16800, 10])

üîß ARCHITECTURE ANALYSIS
  Found CBAM module: bacbkbone_0_cbam
  Found CBAM module: bacbkbone_0_cbam.ChannelGate
  Found CBAM module: bacbkbone_0_cbam.ChannelGate.mlp
  Found CBAM module: bacbkbone_0_cbam.ChannelGate.mlp.0
  Found CBAM module: bacbkbone_0_cbam.ChannelGate.mlp.1
  Found CBAM module: bacbkbone_0_cbam.ChannelGate.mlp.2
  Found CBAM module: bacbkbone_0_cbam.ChannelGate.mlp.3
  Found CBAM module: bacbkbone_0_cbam.SpatialGate
  Found CBAM module: bacbkbone_0_cbam.SpatialGate.compress
  Found CBAM module: bacbkbone_0_cb

## 3. Automatic Dataset Download and Management

In [4]:
import gdown
import zipfile
import tarfile
from pathlib import Path

print(f"üì¶ WIDERFACE DATASET MANAGEMENT")
print("=" * 50)

data_dir = Path('data/widerface')
weights_dir = Path('weights')
results_dir = Path('results')

for dir_path in [data_dir, weights_dir, results_dir]:
    dir_path.mkdir(parents=True, exist_ok=True)
    print(f"‚úì Directory ready: {dir_path}")

WIDERFACE_GDRIVE_ID = '11UGV3nbVv1x9IC--_tK3Uxf7hA6rlbsS'
WIDERFACE_URL = f'https://drive.google.com/uc?id={WIDERFACE_GDRIVE_ID}'
PRETRAIN_GDRIVE_ID = '1oZRSG0ZegbVkVwUd8wUIQx8W7yfZ_ki1'
PRETRAIN_URL = f'https://drive.google.com/uc?id={PRETRAIN_GDRIVE_ID}'

def download_widerface():
    output_path = Path('data/widerface.zip')
    
    if not output_path.exists():
        print("\nüì• Downloading WIDERFace dataset...")
        print("This may take several minutes depending on your connection.")
        
        try:
            gdown.download(WIDERFACE_URL, str(output_path), quiet=False)
            print(f"‚úÖ Downloaded to {output_path}")
            return True
        except Exception as e:
            print(f"‚ùå Download failed: {e}")
            print(f"Please download manually from: {WIDERFACE_URL}")
            return False
    else:
        print(f"‚úÖ Dataset already downloaded: {output_path}")
        return True

def extract_widerface():
    zip_path = Path('data/widerface.zip')
    
    if not zip_path.exists():
        print("‚ùå Dataset zip file not found. Please download first.")
        return False
    
    if (data_dir / 'train' / 'label.txt').exists() and \
       (data_dir / 'val' / 'wider_val.txt').exists():
        print("‚úÖ Dataset already extracted")
        return True
    
    print("üìÇ Extracting dataset...")
    try:
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extractall(Path('data'))
        print("‚úÖ Dataset extracted successfully")
        return True
    except Exception as e:
        print(f"‚ùå Extraction failed: {e}")
        return False

def download_pretrained_weights():
    output_path = Path('weights/mobilenetV1X0.25_pretrain.tar')
    
    if not output_path.exists():
        print("\n‚öñÔ∏è Downloading pre-trained weights...")
        try:
            gdown.download(PRETRAIN_URL, str(output_path), quiet=False)
            print(f"‚úÖ Pre-trained weights downloaded: {output_path}")
            return True
        except Exception as e:
            print(f"‚ùå Pre-trained weights download failed: {e}")
            print(f"Please download manually from: {PRETRAIN_URL}")
            return False
    else:
        print(f"‚úÖ Pre-trained weights found: {output_path}")
        return True

def verify_dataset():
    required_files = [
        data_dir / 'train' / 'label.txt',
        data_dir / 'val' / 'wider_val.txt'
    ]
    
    print(f"\nüîç DATASET VERIFICATION")
    print("-" * 30)
    
    all_present = True
    for file_path in required_files:
        if file_path.exists():
            print(f"‚úÖ Found: {file_path}")
        else:
            print(f"‚ùå Missing: {file_path}")
            all_present = False
    
    for split in ['train', 'val']:
        img_dir = data_dir / split / 'images'
        if img_dir.exists():
            img_count = len(list(img_dir.glob('**/*.jpg')))
            print(f"‚úÖ {split} images: {img_count:,} found")
        else:
            print(f"‚ö†Ô∏è {split} images directory not found: {img_dir}")
            all_present = False
    
    return all_present

print("\nüöÄ STARTING DATASET PREPARATION")
print("-" * 40)

dataset_ok = download_widerface()
if dataset_ok:
    dataset_ok = extract_widerface()

pretrain_ok = download_pretrained_weights()
dataset_verified = verify_dataset()

print(f"\nüìä PREPARATION SUMMARY")
print("-" * 30)
print(f"Dataset download: {'‚úÖ' if dataset_ok else '‚ùå'}")
print(f"Pre-trained weights: {'‚úÖ' if pretrain_ok else '‚ùå'}")
print(f"Dataset verification: {'‚úÖ' if dataset_verified else '‚ùå'}")

overall_ready = dataset_ok and pretrain_ok and dataset_verified
print(f"\n{'üéâ DATASET READY FOR TRAINING!' if overall_ready else '‚ö†Ô∏è PLEASE RESOLVE ISSUES ABOVE'}")

üì¶ WIDERFACE DATASET MANAGEMENT
‚úì Directory ready: data/widerface
‚úì Directory ready: weights
‚úì Directory ready: results

üöÄ STARTING DATASET PREPARATION
----------------------------------------
‚úÖ Dataset already downloaded: data/widerface.zip
‚úÖ Dataset already extracted
‚úÖ Pre-trained weights found: weights/mobilenetV1X0.25_pretrain.tar

üîç DATASET VERIFICATION
------------------------------
‚úÖ Found: data/widerface/train/label.txt
‚úÖ Found: data/widerface/val/wider_val.txt
‚úÖ train images: 12,880 found
‚úÖ val images: 3,226 found

üìä PREPARATION SUMMARY
------------------------------
Dataset download: ‚úÖ
Pre-trained weights: ‚úÖ
Dataset verification: ‚úÖ

üéâ DATASET READY FOR TRAINING!


## 4. Training Configuration

In [5]:
from data.config import cfg_mnet
import json

print(f"üèãÔ∏è TRAINING CONFIGURATION")
print("=" * 50)
print(f"üìã Configuration: cfg_mnet (from data/config.py)")
print(f"  Network: {cfg_mnet['name']}")
print(f"  Batch size: {cfg_mnet['batch_size']}")
print(f"  Epochs: {cfg_mnet['epoch']}")
print(f"  Learning rate: {cfg_mnet['lr']}")
print(f"  Optimizer: {cfg_mnet['optim']}")
print(f"  Image size: {cfg_mnet['image_size']}")
print(f"  BiFPN channels: {cfg_mnet['out_channel']}")

# Save configuration for later verification
config_save_path = Path('weights/training_config.json')
config_to_save = {
    'network': cfg_mnet['name'],
    'out_channel': cfg_mnet['out_channel'],
    'in_channel': cfg_mnet['in_channel'],
    'image_size': cfg_mnet['image_size'],
    'min_sizes': cfg_mnet['min_sizes'],
    'steps': cfg_mnet['steps'],
    'return_layers': cfg_mnet['return_layers'],
    'epoch': cfg_mnet['epoch'],
    'batch_size': cfg_mnet['batch_size']
}

# Check if config file exists and verify compatibility
if config_save_path.exists():
    with open(config_save_path, 'r') as f:
        existing_config = json.load(f)
    
    print(f"\n‚ö†Ô∏è  CONFIGURATION VERIFICATION:")
    config_match = True
    for key in ['out_channel', 'in_channel', 'network']:
        if existing_config.get(key) != config_to_save.get(key):
            print(f"   ‚ùå Mismatch: {key}")
            print(f"      Existing: {existing_config.get(key)}")
            print(f"      Current:  {config_to_save.get(key)}")
            config_match = False
    
    if not config_match:
        print(f"\n   ‚ö†Ô∏è  WARNING: Configuration mismatch detected!")
        print(f"   This means the saved model was trained with different settings.")
        print(f"   Options:")
        print(f"   1. Set FORCE_TRAINING=True to retrain with current config")
        print(f"   2. Update data/config.py to match saved model config")
        print(f"   3. Delete weights/ folder to start fresh")
    else:
        print(f"   ‚úÖ Configuration matches saved model")
else:
    print(f"\nüìù Configuration will be saved to: {config_save_path}")

train_cmd = [
    'python', 'train.py',
    '--training_dataset', './data/widerface/train/label.txt',
    '--network', 'mobile0.25',
    '--num_workers', '4'
]

print(f"\nüíª Device Configuration:")
print(f"  GPU for training: {'‚úÖ ENABLED' if USE_GPU_FOR_TRAINING else '‚ùå DISABLED (CPU)'}")

print(f"\nüèÉ TRAINING COMMAND:")
print(' '.join(train_cmd))

prerequisites = {
    'Dataset ready': overall_ready if 'overall_ready' in locals() else False,
    'Model validated': overall_valid if 'overall_valid' in locals() else False,
    'GPU available': torch.cuda.is_available() if USE_GPU_FOR_TRAINING else True,
    'Training script': Path('train.py').exists(),
    'Save directory': Path('./weights/').exists()
}

print(f"\nüìã Prerequisites Check:")
for check, status in prerequisites.items():
    print(f"  {check}: {'‚úÖ' if status else '‚ùå'}")

all_ready = all(prerequisites.values())

if all_ready:
    print(f"\n‚úÖ All prerequisites met - ready for training!")
else:
    print(f"\n‚ùå Prerequisites not met")
    missing = [k for k, v in prerequisites.items() if not v]
    print(f"Missing: {', '.join(missing)}")

üèãÔ∏è TRAINING CONFIGURATION
üìã Configuration: cfg_mnet (from data/config.py)
  Network: mobilenet0.25
  Batch size: 32
  Epochs: 350
  Learning rate: 0.001
  Optimizer: adamw
  Image size: 640
  BiFPN channels: 64

üìù Configuration will be saved to: weights/training_config.json

üíª Device Configuration:
  GPU for training: ‚úÖ ENABLED

üèÉ TRAINING COMMAND:
python train.py --training_dataset ./data/widerface/train/label.txt --network mobile0.25 --num_workers 4

üìã Prerequisites Check:
  Dataset ready: ‚úÖ
  Model validated: ‚úÖ
  GPU available: ‚úÖ
  Training script: ‚úÖ
  Save directory: ‚úÖ

‚úÖ All prerequisites met - ready for training!


## 5. Execute Training

In [6]:
print(f"üèãÔ∏è TRAINING EXECUTION")
print("=" * 60)

should_skip = SKIP_TRAINING and trained_model_exists and not FORCE_TRAINING

if should_skip:
    print(f"‚è≠Ô∏è  TRAINING SKIPPED (model exists)")
    print(f"   Model: {TRAINED_MODEL_PATH}")
    print(f"   Reason: SKIP_TRAINING=True and model found")
    training_completed = True
elif not all_ready:
    print(f"‚ùå CANNOT START TRAINING")
    print(f"   Prerequisites not met - check above for details")
    training_completed = False
else:
    print(f"üöÄ STARTING TRAINING")
    print(f"   Device: {'GPU' if USE_GPU_FOR_TRAINING and torch.cuda.is_available() else 'CPU'}")
    print(f"   Epochs: {cfg_mnet['epoch']}")
    print(f"   Command: {' '.join(train_cmd)}")
    print(f"\n{'='*60}")
    
    try:
        # Execute training
        result = subprocess.run(train_cmd, capture_output=True, text=True)
        
        # Display output
        if result.stdout:
            print(result.stdout)
        
        if result.stderr:
            print("\n‚ö†Ô∏è  STDERR Output:")
            print(result.stderr)
        
        # Check result
        if result.returncode == 0:
            print(f"\n{'='*60}")
            print(f"‚úÖ TRAINING COMPLETED SUCCESSFULLY!")
            print(f"{'='*60}")
            
            # Save configuration to ensure compatibility
            import json
            config_save_path = Path('weights/training_config.json')
            with open(config_save_path, 'w') as f:
                json.dump(config_to_save, f, indent=2)
            print(f"\nüìù Configuration saved to: {config_save_path}")
            print(f"   This ensures model compatibility during evaluation")
            
            training_completed = True
        else:
            print(f"\n{'='*60}")
            print(f"‚ùå TRAINING FAILED (exit code: {result.returncode})")
            print(f"{'='*60}")
            training_completed = False
            
    except Exception as e:
        print(f"\n{'='*60}")
        print(f"‚ùå TRAINING ERROR: {e}")
        print(f"{'='*60}")
        import traceback
        traceback.print_exc()
        training_completed = False

print(f"\nüìä TRAINING STATUS: {'‚úÖ COMPLETED' if training_completed else '‚ùå FAILED/PENDING'}")

if training_completed:
    print(f"\nüìÅ Training outputs:")
    print(f"   ‚Ä¢ Model checkpoints: ./weights/")
    print(f"   ‚Ä¢ Final model: ./weights/{cfg_mnet['name']}_Final.pth")
    print(f"   ‚Ä¢ Configuration: ./weights/training_config.json")
    print(f"   ‚Ä¢ Logs: Check stdout above")
else:
    print(f"\nüí° To skip training next time, set SKIP_TRAINING=True in Cell 2")

üèãÔ∏è TRAINING EXECUTION
üöÄ STARTING TRAINING
   Device: GPU
   Epochs: 350
   Command: python train.py --training_dataset ./data/widerface/train/label.txt --network mobile0.25 --num_workers 4


‚ö†Ô∏è  STDERR Output:
Traceback (most recent call last):
  File "/teamspace/studios/this_studio/featherface-train/train.py", line 16, in <module>
    from thop import profile
ModuleNotFoundError: No module named 'thop'


‚ùå TRAINING FAILED (exit code: 1)

üìä TRAINING STATUS: ‚ùå FAILED/PENDING

üí° To skip training next time, set SKIP_TRAINING=True in Cell 2


## 6. Evaluation Configuration

In [7]:
import glob
import json

print(f"üß™ WIDERFACE EVALUATION")
print("=" * 50)

trained_models = sorted(glob.glob('weights/*.pth'))
final_model = Path(f'weights/{cfg_mnet["name"]}_Final.pth')

print(f"üìÇ Model Files:")
if final_model.exists():
    print(f"  Found final model: {final_model}")
    eval_model_path = str(final_model)
    model_ready = True
elif trained_models:
    print(f"  Found {len(trained_models)} models")
    eval_model_path = trained_models[-1]
    print(f"  Using latest: {eval_model_path}")
    model_ready = True
else:
    print(f"  No trained models found")
    eval_model_path = None
    model_ready = False

# Verify configuration compatibility
if model_ready:
    config_save_path = Path('weights/training_config.json')
    config_compatible = True
    
    if config_save_path.exists():
        with open(config_save_path, 'r') as f:
            saved_config = json.load(f)
        
        print(f"\nüîç CONFIGURATION COMPATIBILITY CHECK:")
        critical_params = ['out_channel', 'in_channel', 'network']
        
        for param in critical_params:
            saved_value = saved_config.get(param)
            current_value = cfg_mnet.get(param)
            
            if saved_value != current_value:
                print(f"  ‚ùå {param}: saved={saved_value}, current={current_value}")
                config_compatible = False
            else:
                print(f"  ‚úÖ {param}: {current_value}")
        
        if not config_compatible:
            print(f"\n  ‚ö†Ô∏è  CRITICAL: Model/Config mismatch detected!")
            print(f"  The saved model was trained with different architecture parameters.")
            print(f"  This WILL cause 'size mismatch' errors during evaluation.")
            print(f"\n  Solutions:")
            print(f"  1. Retrain: Set FORCE_TRAINING=True in Cell 2")
            print(f"  2. Update config: Match data/config.py to saved config above")
            print(f"  3. Use correct model: Place compatible model in weights/")
            model_ready = False
        else:
            print(f"  ‚úÖ Configuration is compatible")
    else:
        print(f"\n  ‚ö†Ô∏è  No training_config.json found")
        print(f"  Cannot verify model compatibility - proceeding with caution")
        print(f"  If evaluation fails with 'size mismatch', the model may be incompatible")

if model_ready and config_compatible:
    EVAL_CONFIG = {
        'model_path': eval_model_path,
        'network': 'mobile0.25',
        'confidence_threshold': 0.02,
        'nms_threshold': 0.4,
        'save_folder': './widerface_evaluate/widerface_txt/',
        'dataset_folder': './data/widerface/val/images/'
    }
    
    print(f"\nüíª Device Configuration:")
    print(f"  GPU for evaluation: {'‚úÖ ENABLED' if USE_GPU_FOR_EVALUATION else '‚ùå DISABLED (CPU)'}")
    
    print(f"\nüìä Evaluation Configuration:")
    for key, value in EVAL_CONFIG.items():
        print(f"  {key}: {value}")
    
    eval_dir = Path(EVAL_CONFIG['save_folder'])
    eval_dir.mkdir(parents=True, exist_ok=True)
    
    step1_cmd = [
        'python', 'test_widerface.py',
        '-m', EVAL_CONFIG['model_path'],
        '--network', EVAL_CONFIG['network'],
        '--confidence_threshold', str(EVAL_CONFIG['confidence_threshold']),
        '--nms_threshold', str(EVAL_CONFIG['nms_threshold']),
        '--save_folder', EVAL_CONFIG['save_folder'],
        '--dataset_folder', EVAL_CONFIG['dataset_folder']
    ]
    
    if not USE_GPU_FOR_EVALUATION or not torch.cuda.is_available():
        step1_cmd.append('--cpu')
    
    step2_cmd = [
        'cd', 'widerface_evaluate', '&&',
        'python', 'evaluation.py',
        '-p', EVAL_CONFIG['save_folder'],
        '-g', './eval_tools/ground_truth'
    ]
    
    print(f"\nüìù EVALUATION COMMANDS:")
    print(f"Step 1: {' '.join(step1_cmd)}")
    print(f"Step 2: {' '.join(step2_cmd)}")
    
    evaluation_ready = True
else:
    print(f"\n‚ùå Evaluation not possible")
    if not model_ready:
        print(f"  Reason: No trained model found")
    elif not config_compatible:
        print(f"  Reason: Model/Config incompatibility")
    evaluation_ready = False

üß™ WIDERFACE EVALUATION
üìÇ Model Files:
  Found final model: weights/mobilenet0.25_Final.pth

  ‚ö†Ô∏è  No training_config.json found
  Cannot verify model compatibility - proceeding with caution
  If evaluation fails with 'size mismatch', the model may be incompatible

üíª Device Configuration:
  GPU for evaluation: ‚úÖ ENABLED

üìä Evaluation Configuration:
  model_path: weights/mobilenet0.25_Final.pth
  network: mobile0.25
  confidence_threshold: 0.02
  nms_threshold: 0.4
  save_folder: ./widerface_evaluate/widerface_txt/
  dataset_folder: ./data/widerface/val/images/

üìù EVALUATION COMMANDS:
Step 1: python test_widerface.py -m weights/mobilenet0.25_Final.pth --network mobile0.25 --confidence_threshold 0.02 --nms_threshold 0.4 --save_folder ./widerface_evaluate/widerface_txt/ --dataset_folder ./data/widerface/val/images/
Step 2: cd widerface_evaluate && python evaluation.py -p ./widerface_evaluate/widerface_txt/ -g ./eval_tools/ground_truth


## 7. Execute Evaluation

In [8]:
if not evaluation_ready:
    print(f"‚ùå CANNOT EVALUATE")
    print(f"   Reason: No trained model found")
    print(f"   Please complete training first")
    evaluation_completed = False
else:
    print(f"üöÄ STARTING EVALUATION")
    print(f"   Device: {'GPU' if USE_GPU_FOR_EVALUATION and torch.cuda.is_available() else 'CPU'}")
    print(f"   Model: {eval_model_path}")
    print(f"\n{'='*60}")
    
    try:
        # Step 1: Generate predictions
        print(f"üìù STEP 1: Generating predictions on validation set...")
        print(f"   Command: {' '.join(step1_cmd)}")
        print(f"\n{'-'*60}")
        
        result1 = subprocess.run(step1_cmd, capture_output=True, text=True)
        
        if result1.stdout:
            print(result1.stdout)
        
        if result1.stderr:
            print("\n‚ö†Ô∏è  STDERR Output:")
            print(result1.stderr)
        
        if result1.returncode == 0:
            print(f"\n{'-'*60}")
            print(f"‚úÖ Step 1 completed: Predictions generated")
            print(f"{'-'*60}\n")
            
            # Step 2: Calculate mAP
            print(f"üìù STEP 2: Calculating mAP scores...")
            print(f"   Command: {' '.join(step2_cmd)}")
            print(f"\n{'-'*60}")
            
            result2 = subprocess.run(' '.join(step2_cmd), shell=True, capture_output=True, text=True)
            
            if result2.stdout:
                print(result2.stdout)
            
            if result2.stderr:
                print("\n‚ö†Ô∏è  STDERR Output:")
                print(result2.stderr)
            
            if result2.returncode == 0:
                print(f"\n{'-'*60}")
                print(f"‚úÖ Step 2 completed: mAP calculated")
                print(f"{'-'*60}")
                evaluation_completed = True
            else:
                print(f"\n‚ùå Step 2 failed (exit code: {result2.returncode})")
                evaluation_completed = False
        else:
            print(f"\n‚ùå Step 1 failed (exit code: {result1.returncode})")
            print(f"   Cannot proceed to Step 2")
            evaluation_completed = False
            
    except Exception as e:
        print(f"\n{'='*60}")
        print(f"‚ùå EVALUATION ERROR: {e}")
        print(f"{'='*60}")
        import traceback
        traceback.print_exc()
        evaluation_completed = False

print(f"\n{'='*60}")
print(f"üìä EVALUATION STATUS: {'‚úÖ COMPLETED' if evaluation_completed else '‚ùå FAILED/PENDING'}")
print(f"{'='*60}")

if evaluation_completed:
    print(f"\nüìÅ Evaluation results:")
    print(f"   ‚Ä¢ Predictions: {EVAL_CONFIG['save_folder']}")
    print(f"   ‚Ä¢ mAP scores: See output above")
    print(f"   ‚Ä¢ Check for Easy/Medium/Hard mAP values")
else:
    print(f"\nüí° Review errors above to troubleshoot evaluation issues")

üöÄ STARTING EVALUATION
   Device: GPU
   Model: weights/mobilenet0.25_Final.pth

üìù STEP 1: Generating predictions on validation set...
   Command: python test_widerface.py -m weights/mobilenet0.25_Final.pth --network mobile0.25 --confidence_threshold 0.02 --nms_threshold 0.4 --save_folder ./widerface_evaluate/widerface_txt/ --dataset_folder ./data/widerface/val/images/

------------------------------------------------------------
Loading pretrained model from weights/mobilenet0.25_Final.pth
remove prefix 'module.'
Missing keys:150
Unused checkpoint keys:0
Used keys:437


‚ö†Ô∏è  STDERR Output:
Traceback (most recent call last):
  File "/teamspace/studios/this_studio/featherface-train/test_widerface.py", line 78, in <module>
    net = load_model(net, args.trained_model, args.cpu)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/teamspace/studios/this_studio/featherface-train/test_widerface.py", line 65, in load_model
    model.load_state_dict(pretrained_dict, strict

## 8. Model Export

In [9]:
print(f"üì¶ MODEL EXPORT")
print("=" * 50)

export_device = 'gpu' if USE_GPU_FOR_EXPORT and torch.cuda.is_available() else 'cpu'
print(f"üíª Export Device: {export_device.upper()}")

model_available_for_export = ('model_ready' in locals() and model_ready) or final_model.exists()

if model_available_for_export:
    export_dir = Path('exports')
    export_dir.mkdir(parents=True, exist_ok=True)
    
    exports = {
        'pytorch': export_dir / 'featherface_model.pth',
        'onnx': export_dir / 'featherface_model.onnx',
        'torchscript': export_dir / 'featherface_model.pt'
    }
    
    print(f"\nüìÇ Export directory: {export_dir}")
    print(f"Available formats: {', '.join(exports.keys())}")
    
    try:
        print(f"\nüì• Loading model...")
        export_model = RetinaFace(cfg=cfg_mnet, phase='test')
        
        if 'eval_model_path' in locals() and eval_model_path:
            state_dict = torch.load(eval_model_path, map_location='cpu')
            
            # Handle different state dict formats
            if 'state_dict' in state_dict:
                state_dict = state_dict['state_dict']
            
            # Remove 'module.' prefix if present
            from collections import OrderedDict
            new_state_dict = OrderedDict()
            for k, v in state_dict.items():
                name = k.replace('module.', '') if k.startswith('module.') else k
                new_state_dict[name] = v
            
            export_model.load_state_dict(new_state_dict, strict=False)
            print(f"‚úÖ Loaded weights from {eval_model_path}")
        
        export_model.eval()
        
        export_params = sum(p.numel() for p in export_model.parameters())
        print(f"\nüìä Model Info:")
        print(f"  Parameters: {export_params:,} ({export_params/1e6:.3f}M)")
        print(f"  Architecture: RetinaFace with CBAM")
        print(f"  Input shape: [batch, 3, 640, 640]")
        print(f"  Export device: {export_device.upper()}")
        
        exported_files = {}
        
        # 1. Export PyTorch format (always on CPU for compatibility)
        print(f"\nüì¶ Exporting formats...")
        print(f"  1. PyTorch (.pth)...")
        torch.save(export_model.cpu().state_dict(), exports['pytorch'])
        exported_files['pytorch'] = exports['pytorch']
        print(f"     ‚úÖ Saved: {exports['pytorch']}")
        
        # 2. Export ONNX format
        try:
            import onnx
            import onnxruntime
            
            print(f"  2. ONNX (.onnx)...")
            print(f"     ONNX version: {onnx.__version__}")
            
            # Prepare model for ONNX export (CPU)
            export_model_cpu = export_model.cpu()
            export_model_cpu.eval()
            
            # Create dummy input
            batch_size = 1
            img_size = 640
            dummy_input = torch.randn(batch_size, 3, img_size, img_size)
            
            # Test forward pass
            with torch.no_grad():
                torch_output = export_model_cpu(dummy_input)
            
            print(f"     PyTorch output shapes: {[out.shape for out in torch_output]}")
            
            # Export to ONNX
            input_names = ['input']
            output_names = ['bbox', 'conf', 'landmarks']
            
            torch.onnx.export(
                export_model_cpu,
                dummy_input,
                str(exports['onnx']),
                export_params=True,
                opset_version=12,
                do_constant_folding=True,
                input_names=input_names,
                output_names=output_names,
                dynamic_axes={
                    'input': {0: 'batch'},
                    'bbox': {0: 'batch'},
                    'conf': {0: 'batch'},
                    'landmarks': {0: 'batch'}
                },
                verbose=False
            )
            
            # Verify ONNX model
            onnx_model = onnx.load(str(exports['onnx']))
            onnx.checker.check_model(onnx_model)
            print(f"     ‚úÖ ONNX model validated")
            
            # Test ONNX inference
            print(f"     Testing ONNX inference...")
            providers = ['CPUExecutionProvider']
            session = onnxruntime.InferenceSession(str(exports['onnx']), providers=providers)
            
            # Run inference
            onnx_input = {session.get_inputs()[0].name: dummy_input.numpy()}
            onnx_outputs = session.run(None, onnx_input)
            
            print(f"     ONNX output shapes: {[out.shape for out in onnx_outputs]}")
            
            # Compare outputs
            max_diff = max([abs(torch_output[i].cpu().numpy() - onnx_outputs[i]).max() 
                           for i in range(len(onnx_outputs))])
            print(f"     Max difference (PyTorch vs ONNX): {max_diff:.6f}")
            
            if max_diff < 1e-3:
                print(f"     ‚úÖ ONNX inference matches PyTorch (diff < 1e-3)")
            else:
                print(f"     ‚ö†Ô∏è  ONNX inference differs from PyTorch (diff = {max_diff:.6f})")
            
            exported_files['onnx'] = exports['onnx']
            print(f"     ‚úÖ Saved: {exports['onnx']}")
            
        except ImportError as e:
            print(f"     ‚ö†Ô∏è  ONNX export skipped: {e}")
            print(f"     Install with: pip install onnx onnxruntime")
        except Exception as e:
            print(f"     ‚ö†Ô∏è  ONNX export failed: {e}")
            import traceback
            traceback.print_exc()
        
        # 3. Export TorchScript format
        try:
            print(f"  3. TorchScript (.pt)...")
            
            # TorchScript export on CPU
            export_model_cpu = export_model.cpu()
            export_model_cpu.eval()
            dummy_input_ts = torch.randn(1, 3, 640, 640)
            
            traced_model = torch.jit.trace(export_model_cpu, dummy_input_ts)
            traced_model.save(str(exports['torchscript']))
            
            # Test TorchScript
            loaded_ts = torch.jit.load(str(exports['torchscript']))
            with torch.no_grad():
                ts_output = loaded_ts(dummy_input_ts)
            print(f"     TorchScript output shapes: {[out.shape for out in ts_output]}")
            
            exported_files['torchscript'] = exports['torchscript']
            print(f"     ‚úÖ Saved: {exports['torchscript']}")
        except Exception as e:
            print(f"     ‚ö†Ô∏è  TorchScript export failed: {e}")
        
        # File sizes
        print(f"\nüì¶ Exported Files:")
        for format_name, file_path in exported_files.items():
            if file_path.exists():
                file_size = file_path.stat().st_size / (1024 * 1024)
                print(f"  ‚Ä¢ {format_name.upper()}: {file_path.name} ({file_size:.2f} MB)")
        
        # Usage examples
        print(f"\nüìù Usage Examples:")
        print(f"  # PyTorch")
        print(f"  from models.retinaface import RetinaFace")
        print(f"  from data.config import cfg_mnet")
        print(f"  model = RetinaFace(cfg=cfg_mnet, phase='test')")
        print(f"  model.load_state_dict(torch.load('{exports['pytorch']}'))")
        
        if 'onnx' in exported_files:
            print(f"\n  # ONNX Runtime")
            print(f"  import onnxruntime as ort")
            print(f"  session = ort.InferenceSession('{exports['onnx']}')")
            print(f"  outputs = session.run(None, {{'input': img_tensor.numpy()}})")
        
        if 'torchscript' in exported_files:
            print(f"\n  # TorchScript")
            print(f"  model = torch.jit.load('{exports['torchscript']}')")
            print(f"  outputs = model(img_tensor)")
        
        export_success = True
        
    except Exception as e:
        print(f"‚ùå Export failed: {e}")
        import traceback
        traceback.print_exc()
        export_success = False
else:
    print(f"‚ùå No trained model available")
    export_success = False

print(f"\nStatus: {'‚úÖ READY FOR DEPLOYMENT' if export_success else '‚ö†Ô∏è TRAIN FIRST'}")

üì¶ MODEL EXPORT
üíª Export Device: GPU

üìÇ Export directory: exports
Available formats: pytorch, onnx, torchscript

üì• Loading model...
‚ùå Export failed: Error(s) in loading state_dict for RetinaFace:
	size mismatch for bifpn.0.conv4_up.depthwise_conv.conv.weight: copying a param with shape torch.Size([56, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 1, 3, 3]).
	size mismatch for bifpn.0.conv4_up.pointwise_conv.conv.weight: copying a param with shape torch.Size([56, 56, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
	size mismatch for bifpn.0.conv4_up.pointwise_conv.conv.bias: copying a param with shape torch.Size([56]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for bifpn.0.conv4_up.bn.weight: copying a param with shape torch.Size([56]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for bifpn.0.conv4_up.bn.bias: copying a param with shape torch.Si

Traceback (most recent call last):
  File "/tmp/ipykernel_9197/2866954669.py", line 40, in <module>
    export_model.load_state_dict(new_state_dict, strict=False)
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.12/site-packages/torch/nn/modules/module.py", line 2624, in load_state_dict
    raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for RetinaFace:
	size mismatch for bifpn.0.conv4_up.depthwise_conv.conv.weight: copying a param with shape torch.Size([56, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 1, 3, 3]).
	size mismatch for bifpn.0.conv4_up.pointwise_conv.conv.weight: copying a param with shape torch.Size([56, 56, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
	size mismatch for bifpn.0.conv4_up.pointwise_conv.conv.bias: copying a param with shape torch.Size([56]) from checkpoint, the shape in current model is torch.Size([64]).
	size mismatch for bifpn.0.conv4_up.bn.weight: copying a param 

## 9. Summary

In [10]:
print(f"üìä PIPELINE SUMMARY")
print("=" * 60)

completion_status = {
    'Environment Setup': True,
    'Model Validation': overall_valid if 'overall_valid' in locals() else False,
    'Dataset Management': overall_ready if 'overall_ready' in locals() else False,
    'Training Pipeline': all_ready if 'all_ready' in locals() else False,
    'Evaluation System': evaluation_ready if 'evaluation_ready' in locals() else False,
    'Model Export': export_success if 'export_success' in locals() else False
}

print(f"\nüìã Status:")
for component, status in completion_status.items():
    print(f"  {component}: {'‚úÖ' if status else '‚ùå'}")

overall_completion = sum(completion_status.values()) / len(completion_status)
print(f"\nCompletion: {overall_completion*100:.1f}%")

print(f"\nüíª Device Configuration Summary:")
print(f"  Training: {'GPU' if USE_GPU_FOR_TRAINING and torch.cuda.is_available() else 'CPU'}")
print(f"  Evaluation: {'GPU' if USE_GPU_FOR_EVALUATION and torch.cuda.is_available() else 'CPU'}")
print(f"  Export: {'GPU' if USE_GPU_FOR_EXPORT and torch.cuda.is_available() else 'CPU'}")

print(f"\nüìã Configuration:")
print(f"  Source: data/config.py")
print(f"  Config: cfg_mnet")
print(f"  Network: {cfg_mnet['name']}")
print(f"  Architecture: RetinaFace with CBAM")

print(f"\nüìú Scripts:")
print(f"  Training: train.py")
print(f"  Testing: test_widerface.py")
print(f"  Evaluation: widerface_evaluate/evaluation.py")

current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"\nüìÖ {current_time}")
print(f"üíª PyTorch {torch.__version__}")

print(f"\nüí° CONFIGURATION TIPS:")
print(f"  ‚Ä¢ To use CPU for evaluation: Set USE_GPU_FOR_EVALUATION = False in Cell 2")
print(f"  ‚Ä¢ To skip training: Set SKIP_TRAINING = True in Cell 2")
print(f"  ‚Ä¢ To force training: Set FORCE_TRAINING = True in Cell 2")
print(f"  ‚Ä¢ To change model path: Edit TRAINED_MODEL_PATH in Cell 2")

print(f"\n{'='*60}")
print("‚úÖ NOTEBOOK READY")
print("üîß Flexible GPU/CPU configuration enabled")
print(f"{'='*60}")

üìä PIPELINE SUMMARY

üìã Status:
  Environment Setup: ‚úÖ
  Model Validation: ‚úÖ
  Dataset Management: ‚úÖ
  Training Pipeline: ‚úÖ
  Evaluation System: ‚úÖ
  Model Export: ‚ùå

Completion: 83.3%

üíª Device Configuration Summary:
  Training: GPU
  Evaluation: GPU
  Export: GPU

üìã Configuration:
  Source: data/config.py
  Config: cfg_mnet
  Network: mobilenet0.25
  Architecture: RetinaFace with CBAM

üìú Scripts:
  Training: train.py
  Testing: test_widerface.py
  Evaluation: widerface_evaluate/evaluation.py

üìÖ 2025-11-16 10:21:23
üíª PyTorch 2.8.0+cu128

üí° CONFIGURATION TIPS:
  ‚Ä¢ To use CPU for evaluation: Set USE_GPU_FOR_EVALUATION = False in Cell 2
  ‚Ä¢ To skip training: Set SKIP_TRAINING = True in Cell 2
  ‚Ä¢ To force training: Set FORCE_TRAINING = True in Cell 2
  ‚Ä¢ To change model path: Edit TRAINED_MODEL_PATH in Cell 2

‚úÖ NOTEBOOK READY
üîß Flexible GPU/CPU configuration enabled
