# ONNX Model Conversion for Mobile Deployment

This notebook converts YOLO models to ONNX format optimized for mobile/CPU inference.

## Benefits of ONNX:
- **Faster inference** on CPU
- **Smaller model size** with quantization
- **Mobile compatibility** (Android/iOS)
- **Cross-platform deployment**

In [1]:
# Install required packages
!pip install ultralytics onnx onnxruntime



In [None]:
import os
from ultralytics import YOLO
import onnx
import onnxruntime as ort
import numpy as np
import cv2
import time

# Get current directory
current_dir = os.getcwd()
print(f"Working directory: {current_dir}")

# Check available models
models_dir = os.path.join(current_dir, 'models')
if os.path.exists(models_dir):
    print(f"\nAvailable models in {models_dir}:")
    for file in os.listdir(models_dir):
        if file.endswith('.pt'):
            print(f"  - {file}")
else:
    print(f"Models directory not found: {models_dir}")

Working directory: e:\PROJECT\Computer Vision\Random\PersonTracker\MVP

Available models in e:\PROJECT\Computer Vision\Random\PersonTracker\MVP\models:
  - yolo11n.pt
  - yolov8n-face-lindevs.pt
  - yolov8n.pt


## 1. Convert Person Detection Model (YOLOv8n)

In [None]:
# Load and convert person detection model
print("Converting Person Detection Model to ONNX...")

# Try to find the model
person_model_path = None
possible_paths = [
    'yolov8n.pt',
    os.path.join('models', 'archive', 'yolov8n.pt'),
    os.path.join('models', 'yolov8n.pt')
]

for path in possible_paths:
    if os.path.exists(path):
        person_model_path = path
        break

if person_model_path:
    print(f"Found model at: {person_model_path}")
    
    # Load model
    person_model = YOLO(person_model_path)
    
    # Export to ONNX with mobile-optimized settings
    onnx_path = person_model.export(
        format='onnx',
        imgsz=416,           # Smaller input size for mobile
        half=False,          # Use FP32 for better CPU compatibility
        dynamic=False,       # Fixed input size for mobile optimization
        simplify=True,       # Simplify the model graph
        opset=11,           # ONNX opset version (compatible with most runtimes)
        verbose=True
    )
    
    print(f"✅ Person detection model exported to: {onnx_path}")
    
    # Check model size
    original_size = os.path.getsize(person_model_path) / (1024 * 1024)
    onnx_size = os.path.getsize(onnx_path) / (1024 * 1024)
    print(f"Original model size: {original_size:.2f} MB")
    print(f"ONNX model size: {onnx_size:.2f} MB")
    
else:
    print("❌ Person detection model not found!")
    print("Please ensure yolov8n.pt is in the current directory or models folder.")

Converting Person Detection Model to ONNX...
Found model at: models\yolov8n.pt
Ultralytics 8.3.85  Python-3.11.11 torch-2.2.2+cpu CPU (Intel Core(TM) i3-10100F 3.60GHz)
YOLOv8n summary (fused): 72 layers, 3,151,904 parameters, 0 gradients, 8.7 GFLOPs
YOLOv8n summary (fused): 72 layers, 3,151,904 parameters, 0 gradients, 8.7 GFLOPs

[34m[1mPyTorch:[0m starting from 'models\yolov8n.pt' with input shape (1, 3, 416, 416) BCHW and output shape(s) (1, 84, 3549) (6.2 MB)

[34m[1mONNX:[0m starting export with onnx 1.17.0 opset 11...

[34m[1mPyTorch:[0m starting from 'models\yolov8n.pt' with input shape (1, 3, 416, 416) BCHW and output shape(s) (1, 84, 3549) (6.2 MB)

[34m[1mONNX:[0m starting export with onnx 1.17.0 opset 11...
[34m[1mONNX:[0m slimming with onnxslim 0.1.48...
[34m[1mONNX:[0m slimming with onnxslim 0.1.48...
[34m[1mONNX:[0m export success  2.7s, saved as 'models\yolov8n.onnx' (12.1 MB)

Export complete (3.1s)
Results saved to [1mE:\PROJECT\Computer Vision\R

## 2. Convert Face Detection Model

In [None]:
# Convert face detection model
print("\nConverting Face Detection Model to ONNX...")

face_model_path = os.path.join('models', 'archive', 'yolov8n-face-lindevs.pt')

if os.path.exists(face_model_path):
    print(f"Found face model at: {face_model_path}")
    
    # Load face detection model
    face_model = YOLO(face_model_path)
    
    # Export to ONNX with mobile-optimized settings
    face_onnx_path = face_model.export(
        format='onnx',
        imgsz=320,           # Even smaller for face detection
        half=False,          # Use FP32 for better CPU compatibility
        dynamic=False,       # Fixed input size
        simplify=True,       # Simplify the model graph
        opset=11,           # ONNX opset version
        verbose=True
    )
    
    print(f"✅ Face detection model exported to: {face_onnx_path}")
    
    # Check model size
    original_size = os.path.getsize(face_model_path) / (1024 * 1024)
    onnx_size = os.path.getsize(face_onnx_path) / (1024 * 1024)
    print(f"Original model size: {original_size:.2f} MB")
    print(f"ONNX model size: {onnx_size:.2f} MB")
    
else:
    print(f"❌ Face detection model not found at: {face_model_path}")
    print("Please ensure the face detection model is in the models folder.")


Converting Face Detection Model to ONNX...
Found face model at: models\yolov8n-face-lindevs.pt
Ultralytics 8.3.85  Python-3.11.11 torch-2.2.2+cpu CPU (Intel Core(TM) i3-10100F 3.60GHz)
Model summary (fused): 72 layers, 3,005,843 parameters, 0 gradients, 8.1 GFLOPs
Model summary (fused): 72 layers, 3,005,843 parameters, 0 gradients, 8.1 GFLOPs

[34m[1mPyTorch:[0m starting from 'models\yolov8n-face-lindevs.pt' with input shape (1, 3, 320, 320) BCHW and output shape(s) (1, 5, 2100) (6.0 MB)

[34m[1mONNX:[0m starting export with onnx 1.17.0 opset 11...

[34m[1mPyTorch:[0m starting from 'models\yolov8n-face-lindevs.pt' with input shape (1, 3, 320, 320) BCHW and output shape(s) (1, 5, 2100) (6.0 MB)

[34m[1mONNX:[0m starting export with onnx 1.17.0 opset 11...
[34m[1mONNX:[0m slimming with onnxslim 0.1.48...
[34m[1mONNX:[0m slimming with onnxslim 0.1.48...
[34m[1mONNX:[0m export success  1.3s, saved as 'models\yolov8n-face-lindevs.onnx' (11.6 MB)

Export complete (1.5s)

## 3. Test ONNX Model Performance

In [5]:
def benchmark_model(model_path, input_size, model_type="PyTorch"):
    """Benchmark inference speed of a model"""
    print(f"\n🚀 Benchmarking {model_type} Model: {os.path.basename(model_path)}")
    
    # Create dummy input
    dummy_input = np.random.randint(0, 255, (input_size, input_size, 3), dtype=np.uint8)
    
    if model_type == "PyTorch":
        # Test PyTorch model
        model = YOLO(model_path)
        
        # Warmup
        for _ in range(3):
            _ = model(dummy_input, verbose=False)
        
        # Benchmark
        times = []
        for i in range(10):
            start_time = time.time()
            _ = model(dummy_input, verbose=False)
            times.append(time.time() - start_time)
            
    elif model_type == "ONNX":
        # Test ONNX model
        session = ort.InferenceSession(model_path)
        input_name = session.get_inputs()[0].name
        
        # Prepare input (ONNX expects NCHW format)
        input_tensor = np.transpose(dummy_input, (2, 0, 1)).astype(np.float32) / 255.0
        input_tensor = np.expand_dims(input_tensor, axis=0)
        
        # Warmup
        for _ in range(3):
            _ = session.run(None, {input_name: input_tensor})
        
        # Benchmark
        times = []
        for i in range(10):
            start_time = time.time()
            _ = session.run(None, {input_name: input_tensor})
            times.append(time.time() - start_time)
    
    avg_time = np.mean(times) * 1000  # Convert to milliseconds
    fps = 1000 / avg_time
    
    print(f"Average inference time: {avg_time:.2f} ms")
    print(f"Estimated FPS: {fps:.1f}")
    
    return avg_time, fps

In [6]:
# Benchmark PyTorch vs ONNX models if they exist
print("📊 Performance Comparison")
print("=" * 50)

# Test person detection models
if person_model_path and os.path.exists(onnx_path):
    print("\n🧑 PERSON DETECTION MODEL COMPARISON:")
    pytorch_time, pytorch_fps = benchmark_model(person_model_path, 416, "PyTorch")
    onnx_time, onnx_fps = benchmark_model(onnx_path, 416, "ONNX")
    
    speedup = pytorch_time / onnx_time
    print(f"\n🎯 ONNX Speedup: {speedup:.2f}x faster")
    print(f"🎯 FPS Improvement: {pytorch_fps:.1f} → {onnx_fps:.1f} (+{onnx_fps-pytorch_fps:.1f})")

# Test face detection models if available
if os.path.exists(face_model_path) and 'face_onnx_path' in locals():
    print("\n👤 FACE DETECTION MODEL COMPARISON:")
    pytorch_time_face, pytorch_fps_face = benchmark_model(face_model_path, 320, "PyTorch")
    onnx_time_face, onnx_fps_face = benchmark_model(face_onnx_path, 320, "ONNX")
    
    speedup_face = pytorch_time_face / onnx_time_face
    print(f"\n🎯 ONNX Speedup: {speedup_face:.2f}x faster")
    print(f"🎯 FPS Improvement: {pytorch_fps_face:.1f} → {onnx_fps_face:.1f} (+{onnx_fps_face-pytorch_fps_face:.1f})")

📊 Performance Comparison

🧑 PERSON DETECTION MODEL COMPARISON:

🚀 Benchmarking PyTorch Model: yolov8n.pt
Average inference time: 97.76 ms
Estimated FPS: 10.2

🚀 Benchmarking ONNX Model: yolov8n.onnx
Average inference time: 97.76 ms
Estimated FPS: 10.2

🚀 Benchmarking ONNX Model: yolov8n.onnx
Average inference time: 26.08 ms
Estimated FPS: 38.3

🎯 ONNX Speedup: 3.75x faster
🎯 FPS Improvement: 10.2 → 38.3 (+28.1)

👤 FACE DETECTION MODEL COMPARISON:

🚀 Benchmarking PyTorch Model: yolov8n-face-lindevs.pt
Average inference time: 26.08 ms
Estimated FPS: 38.3

🎯 ONNX Speedup: 3.75x faster
🎯 FPS Improvement: 10.2 → 38.3 (+28.1)

👤 FACE DETECTION MODEL COMPARISON:

🚀 Benchmarking PyTorch Model: yolov8n-face-lindevs.pt
Average inference time: 91.41 ms
Estimated FPS: 10.9

🚀 Benchmarking ONNX Model: yolov8n-face-lindevs.onnx
Average inference time: 91.41 ms
Estimated FPS: 10.9

🚀 Benchmarking ONNX Model: yolov8n-face-lindevs.onnx
Average inference time: 13.85 ms
Estimated FPS: 72.2

🎯 ONNX Speedu

## 4. Model Information & Mobile Deployment Tips

In [7]:
def analyze_onnx_model(model_path):
    """Analyze ONNX model structure and properties"""
    if not os.path.exists(model_path):
        print(f"Model not found: {model_path}")
        return
    
    print(f"\n📋 Analyzing: {os.path.basename(model_path)}")
    print("-" * 40)
    
    # Load model
    model = onnx.load(model_path)
    
    # Model info
    print(f"Model version: {model.model_version}")
    print(f"ONNX opset version: {model.opset_import[0].version}")
    
    # Input/Output info
    session = ort.InferenceSession(model_path)
    
    print(f"\nInput shape: {session.get_inputs()[0].shape}")
    print(f"Input type: {session.get_inputs()[0].type}")
    
    print(f"\nOutput shapes:")
    for i, output in enumerate(session.get_outputs()):
        print(f"  Output {i}: {output.shape} ({output.type})")
    
    # File size
    size_mb = os.path.getsize(model_path) / (1024 * 1024)
    print(f"\nModel size: {size_mb:.2f} MB")
    
    # Mobile compatibility check
    print(f"\n📱 Mobile Deployment Readiness:")
    print(f"  ✅ Fixed input size: {not any('unk' in str(dim) for dim in session.get_inputs()[0].shape)}")
    print(f"  ✅ ONNX opset ≤ 13: {model.opset_import[0].version <= 13}")
    print(f"  ✅ Model size < 50MB: {size_mb < 50}")
    print(f"  ✅ FP32 precision: {session.get_inputs()[0].type == 'tensor(float)'}")

# Analyze converted models
if 'onnx_path' in locals() and os.path.exists(onnx_path):
    analyze_onnx_model(onnx_path)

if 'face_onnx_path' in locals() and os.path.exists(face_onnx_path):
    analyze_onnx_model(face_onnx_path)


📋 Analyzing: yolov8n.onnx
----------------------------------------
Model version: 0
ONNX opset version: 11

Input shape: [1, 3, 416, 416]
Input type: tensor(float)

Output shapes:
  Output 0: [1, 84, 3549] (tensor(float))

Model size: 12.15 MB

📱 Mobile Deployment Readiness:
  ✅ Fixed input size: True
  ✅ ONNX opset ≤ 13: True
  ✅ Model size < 50MB: True
  ✅ FP32 precision: True

📋 Analyzing: yolov8n-face-lindevs.onnx
----------------------------------------
Model version: 0
ONNX opset version: 11

Input shape: [1, 3, 320, 320]
Input type: tensor(float)

Output shapes:
  Output 0: [1, 5, 2100] (tensor(float))

Model size: 11.56 MB

📱 Mobile Deployment Readiness:
  ✅ Fixed input size: True
  ✅ ONNX opset ≤ 13: True
  ✅ Model size < 50MB: True
  ✅ FP32 precision: True


## 5. Mobile Deployment Guide

### For Android:
1. Use **ONNX Runtime Mobile** for Android
2. Convert models to **ORT format** for even better mobile performance:
   ```bash
   python -m onnxruntime.tools.convert_onnx_models_to_ort ./
   ```

### For iOS:
1. Use **ONNX Runtime** or **Core ML**
2. For Core ML, convert ONNX to Core ML format:
   ```python
   import coremltools as ct
   model = ct.convert('model.onnx')
   model.save('model.mlmodel')
   ```

### Performance Tips:
- Use **CPU optimized builds** of ONNX Runtime
- Enable **parallel execution** for multi-core devices
- Consider **quantization** for even smaller models
- Profile on target device for optimal settings

In [8]:
# Generate summary of all converted models
print("\n🎉 ONNX Conversion Summary")
print("=" * 50)

onnx_files = [f for f in os.listdir('.') if f.endswith('.onnx')]
models_onnx_files = []
if os.path.exists('models'):
    models_onnx_files = [f for f in os.listdir('models') if f.endswith('.onnx')]

all_onnx_files = onnx_files + [os.path.join('models', f) for f in models_onnx_files]

if all_onnx_files:
    print(f"\n✅ Successfully converted {len(all_onnx_files)} model(s):")
    for file in all_onnx_files:
        size = os.path.getsize(file) / (1024 * 1024)
        print(f"  📦 {file} ({size:.2f} MB)")
    
    print("\n📱 Ready for mobile deployment!")
    print("💡 Next steps:")
    print("   1. Test models on your target mobile device")
    print("   2. Optimize inference settings for your hardware")
    print("   3. Consider further quantization if needed")
else:
    print("❌ No ONNX models found. Please run the conversion cells above.")


🎉 ONNX Conversion Summary

✅ Successfully converted 2 model(s):
  📦 models\yolov8n-face-lindevs.onnx (11.56 MB)
  📦 models\yolov8n.onnx (12.15 MB)

📱 Ready for mobile deployment!
💡 Next steps:
   1. Test models on your target mobile device
   2. Optimize inference settings for your hardware
   3. Consider further quantization if needed
