# 🚀 YOLOv11 Model Conversion to TFLite

Convert **best.pt (Epoch170)** to TFLite format for mobile deployment in Flutter.

**Why Google Colab?**
- TFLite conversion requires specific dependencies (onnx2tf, tf_keras)
- These dependencies can conflict in a Windows environment
- Colab provides a ready-made Linux environment; Step 1 installs the remaining packages

**Expected Output:**
- `best.onnx` - ONNX format (10-11 MB)
- `best_float16.tflite` - FP16 quantized (~10 MB)
- `best_int8.tflite` - INT8 quantized (~4-6 MB, fastest on mobile)

## 📦 Step 1: Install Dependencies

In [None]:
# Install Ultralytics YOLO
!pip install -q ultralytics

# Install TFLite conversion dependencies
# Quote version specifiers so the shell does not treat ">" as output redirection
!pip install -q tensorflow==2.16.2
!pip install -q "onnx>=1.12.0"
!pip install -q "onnxsim>=0.4.1"
!pip install -q "onnxruntime>=1.16.0"

print("✅ All dependencies installed!")
print("   TensorFlow: 2.16.2")
print("   ONNX: >=1.12.0")
print("   ONNX Runtime: >=1.16.0")
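The messages printed in Step 1 are fixed strings, so they do not show which versions pip actually resolved. A minimal sketch for reading the installed versions with the standard library follows; the helper name `installed_versions` is an assumption made for illustration.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if it is missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages installed in Step 1.
step1_packages = ["ultralytics", "tensorflow", "onnx", "onnxsim", "onnxruntime"]
for name, version in installed_versions(step1_packages).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Running this right after the install cell makes a failed or partial install visible before the export steps start.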

## 📤 Step 2: Upload Model File

Upload `best.pt` from your local machine to Colab.

In [None]:
from google.colab import files
import os

print("📤 Please upload best.pt file...")
uploaded = files.upload()

# Verify upload
if 'best.pt' in uploaded:
    file_size = os.path.getsize('best.pt') / (1024 * 1024)
    print(f"\n✅ best.pt uploaded successfully!")
    print(f"   Size: {file_size:.2f} MB")
else:
    print("\n❌ Error: best.pt not found!")
    print("   Please upload the correct file.")
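The check above only accepts a file named exactly `best.pt`. If the cell is run a second time, Colab may save the new upload under a different name (for example `best (1).pt`), and the check then fails even though the weights arrived. A small sketch that tolerates such names follows; `find_weights` is a hypothetical helper, and the renaming pattern is an assumption about Colab's behaviour.

```python
from pathlib import Path


def find_weights(uploaded_names, expected="best.pt"):
    """Pick the uploaded weights file, accepting renamed copies of `expected`."""
    names = list(uploaded_names)
    if expected in names:
        return expected
    stem, suffix = Path(expected).stem, Path(expected).suffix
    # Accept names such as "best (1).pt" that keep the same stem and suffix.
    candidates = [n for n in names if n.startswith(stem) and n.endswith(suffix)]
    return candidates[0] if candidates else None


# In the notebook, pass the dict returned by files.upload():
#   weights = find_weights(uploaded)
#   model = YOLO(weights)
print(find_weights(["best (1).pt", "photo.jpg"]))
```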

## 🔍 Step 3: Verify Model

In [None]:
from ultralytics import YOLO

# Load model
print("📥 Loading model...")
model = YOLO('best.pt')

# Display model info
print("\n✅ Model loaded successfully!")
print("\n📊 Model Information:")
model.info()

print("\n🎯 Expected Performance:")
print("   Precision: 81.64%")
print("   mAP50: 49.14%")
print("   Speed: 1.30ms (PyTorch on RTX 3080 Ti)")

## 🔄 Step 4: Export to ONNX

In [None]:
import time

print("🔄 Exporting to ONNX format...")
start_time = time.time()

# Export to ONNX
onnx_path = model.export(
    format='onnx',
    imgsz=640,
    simplify=True,
    opset=12,
    dynamic=False
)

export_time = time.time() - start_time
onnx_size = os.path.getsize(onnx_path) / (1024 * 1024)

print(f"\n✅ ONNX export complete!")
print(f"   File: {onnx_path}")
print(f"   Size: {onnx_size:.2f} MB")
print(f"   Export time: {export_time:.1f}s")

## 📱 Step 5: Export to TFLite (FP16 Quantization)

In [None]:
print("📱 Exporting to TFLite (FP16)...")
start_time = time.time()

# Export to TFLite with FP16 quantization
tflite_fp16_path = model.export(
    format='tflite',
    imgsz=640,
    int8=False,
    half=True  # FP16 quantization
)

export_time = time.time() - start_time
tflite_size = os.path.getsize(tflite_fp16_path) / (1024 * 1024)

print(f"\n✅ TFLite FP16 export complete!")
print(f"   File: {tflite_fp16_path}")
print(f"   Size: {tflite_size:.2f} MB")
print(f"   Export time: {export_time:.1f}s")
print(f"\n📊 Quantization: FP16 (half precision)")
print(f"   Expected performance: ~81% accuracy, 20-30ms mobile inference")

## 🚀 Step 6: Export to TFLite (INT8 Quantization - Fastest)

In [None]:
print("🚀 Exporting to TFLite (INT8)...")
print("⚠️  This will take longer (~2-5 minutes)")
start_time = time.time()

# Export to TFLite with INT8 quantization
tflite_int8_path = model.export(
    format='tflite',
    imgsz=640,
    int8=True,  # INT8 quantization
    data='coco128.yaml'  # Calibration images; a YAML describing your own dataset is more representative
)

export_time = time.time() - start_time
tflite_int8_size = os.path.getsize(tflite_int8_path) / (1024 * 1024)

print(f"\n✅ TFLite INT8 export complete!")
print(f"   File: {tflite_int8_path}")
print(f"   Size: {tflite_int8_size:.2f} MB")
print(f"   Export time: {export_time:.1f}s")
print(f"\n📊 Quantization: INT8 (integer precision)")
print(f"   Expected performance: ~80% accuracy, 10-20ms mobile inference")
print(f"   Size reduction: {((tflite_size - tflite_int8_size) / tflite_size * 100):.1f}% smaller than FP16")
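INT8 export calibrates quantization ranges on the images referenced by the `data` YAML. `coco128.yaml` points at general COCO images, which may not resemble the images the plate model sees in production. The sketch below writes a minimal Ultralytics-style dataset YAML for your own images. The dataset root `/content/plates`, the file name `plates_calib.yaml`, and the class name `plate` are all assumptions for illustration.

```python
from pathlib import Path


def write_calibration_yaml(root, class_names, out_path="plates_calib.yaml"):
    """Write a minimal dataset YAML (path / train / val / names) and return its path."""
    lines = [
        f"path: {root}",        # dataset root directory (hypothetical)
        "train: images/train",  # relative to `path`
        "val: images/val",      # relative to `path`
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    Path(out_path).write_text("\n".join(lines) + "\n")
    return out_path


yaml_path = write_calibration_yaml("/content/plates", ["plate"])
print(Path(yaml_path).read_text())

# In the notebook, pass it to the export call in place of 'coco128.yaml':
#   model.export(format='tflite', imgsz=640, int8=True, data=yaml_path)
```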

## 📊 Step 7: Export Summary

In [None]:
import pandas as pd

# Create summary table
summary_data = {
    'Format': ['PyTorch', 'ONNX', 'TFLite FP16', 'TFLite INT8'],
    'Filename': ['best.pt', onnx_path, tflite_fp16_path, tflite_int8_path],
    'Size (MB)': [
        f"{os.path.getsize('best.pt') / (1024 * 1024):.2f}",
        f"{onnx_size:.2f}",
        f"{tflite_size:.2f}",
        f"{tflite_int8_size:.2f}"
    ],
    'Platform': ['Desktop', 'Mobile/Server', 'Mobile', 'Mobile (Optimized)'],
    'Est. Speed': ['1.30ms', '20-50ms', '20-30ms', '10-20ms'],
    'Accuracy': ['81.64%', '81.64%', '~81%', '~80%']
}

summary_df = pd.DataFrame(summary_data)

print("\n" + "="*80)
print("📊 EXPORT SUMMARY")
print("="*80)
print(summary_df.to_string(index=False))
print("="*80)

print("\n🎯 Recommendations:")
print("   • Use ONNX for server/desktop deployment (best accuracy)")
print("   • Use TFLite FP16 for mobile (good balance)")
print("   • Use TFLite INT8 for mobile (fastest, smallest)")

## 📥 Step 8: Download Converted Models

In [None]:
from google.colab import files
import os

print("📥 Download converted models:")
print("\n1️⃣ ONNX format (recommended for server):")
files.download(onnx_path)
print(f"   ✅ Downloaded: {onnx_path}")

print("\n2️⃣ TFLite FP16 (good for mobile):")
files.download(tflite_fp16_path)
print(f"   ✅ Downloaded: {tflite_fp16_path}")

print("\n3️⃣ TFLite INT8 (fastest for mobile):")
files.download(tflite_int8_path)
print(f"   ✅ Downloaded: {tflite_int8_path}")

print("\n✅ All models downloaded successfully!")
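Triggering three downloads in a row can be blocked or prompted by some browsers. One alternative is to bundle the exported files into a single archive and download that once. The sketch below uses only the standard library; `bundle_models` and the archive name `converted_models.zip` are hypothetical.

```python
import zipfile
from pathlib import Path


def bundle_models(paths, archive="converted_models.zip"):
    """Zip every existing file in `paths`; return (archive path, names added)."""
    added = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in map(Path, paths):
            if p.is_file():  # skip exports that failed or were not run
                zf.write(p, arcname=p.name)
                added.append(p.name)
    return archive, added


# In the notebook:
#   archive, added = bundle_models([onnx_path, tflite_fp16_path, tflite_int8_path])
#   files.download(archive)
```

The returned list of names shows which exports made it into the archive, so a missing model is easy to spot.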

## 🧪 Step 9: Test Inference (Optional)

In [None]:
# Test with sample image
print("🧪 Testing inference on sample image...")

# Upload test image
print("\n📤 Upload a test image (optional):")
test_upload = files.upload()

if test_upload:
    test_image = list(test_upload.keys())[0]

    # Run inference with PyTorch model
    print(f"\n🔍 Running inference on {test_image}...")
    results = model.predict(test_image, conf=0.25, verbose=False)

    # Show results
    print(f"\n✅ Detection complete!")
    print(f"   Plates detected: {len(results[0].boxes)}")

    for i, box in enumerate(results[0].boxes, 1):
        conf = float(box.conf[0])
        print(f"   Plate {i}: {conf:.2%} confidence")

    # Show annotated image
    annotated = results[0].plot()
    from IPython.display import Image, display
    import cv2
    cv2.imwrite('result.jpg', annotated)
    display(Image('result.jpg'))
else:
    print("\nℹ️  No test image uploaded, skipping inference test.")

## 📱 Flutter Integration Guide

### Option 1: TFLite Flutter Plugin

```yaml
# pubspec.yaml
dependencies:
  tflite_flutter: ^0.10.0
  image: ^4.0.0
```

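```dart
// main.dart
import 'package:image/image.dart' as img;
import 'package:tflite_flutter/tflite_flutter.dart';

// Detection, preprocessImage, and parseDetections are app-specific helpers
// that you need to define yourself.
class PlateDetector {
  late Interpreter interpreter;

  Future<void> loadModel() async {
    interpreter = await Interpreter.fromAsset('assets/best_int8.tflite');
    print('Model loaded: ${interpreter.getInputTensors()}');
  }

  Future<List<Detection>> detectPlate(img.Image image) async {
    // Preprocess: resize to 640x640 and convert to a [1, 640, 640, 3] tensor
    var input = preprocessImage(image);

    // Run inference; a single-class model at 640x640 outputs [1, 5, 8400]
    var output = List.filled(1 * 5 * 8400, 0.0).reshape([1, 5, 8400]);
    interpreter.run(input, output);

    // Post-process: confidence filtering and non-maximum suppression
    return parseDetections(output);
  }
}
```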
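The Dart code above leaves `parseDetections` undefined. As a reference for porting, here is a minimal Python sketch of the usual decoding for a single-class detection output shaped `[1, 5, 8400]`. It assumes the five rows are centre x, centre y, width, height, and score, and that the 8400 columns are one candidate per grid cell at strides 8, 16, and 32. The thresholds are illustrative defaults, and the function names are hypothetical. TFLite exports may emit box coordinates normalized to 0-1 rather than in pixels, so check the value range on your own model.

```python
def num_candidates(imgsz=640, strides=(8, 16, 32)):
    """Number of candidate boxes: one per grid cell at each stride."""
    return sum((imgsz // s) ** 2 for s in strides)


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def parse_detections(output, conf_thres=0.25, iou_thres=0.45):
    """Decode a nested list shaped [1][5][N] into (x1, y1, x2, y2, score) tuples."""
    cx, cy, w, h, score = output[0]
    boxes = []
    for i in range(len(score)):
        if score[i] >= conf_thres:  # confidence filter
            boxes.append((cx[i] - w[i] / 2, cy[i] - h[i] / 2,
                          cx[i] + w[i] / 2, cy[i] + h[i] / 2, score[i]))
    # Greedy non-maximum suppression: keep the best box, drop heavy overlaps.
    boxes.sort(key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) <= iou_thres for k in kept):
            kept.append(b)
    return kept


print(num_candidates())  # 80*80 + 40*40 + 20*20 = 8400
```

The same three stages (filter by score, convert centre format to corners, suppress overlaps) carry over directly to a Dart implementation.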

### Option 2: ONNX Runtime Flutter

```yaml
# pubspec.yaml
dependencies:
  onnxruntime: ^1.15.0
```

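```dart
import 'package:flutter/services.dart' show rootBundle;
import 'package:onnxruntime/onnxruntime.dart';

// Inside an async function; `input` is an OrtValueTensor of the preprocessed image.
OrtEnv.instance.init();
final bytes = (await rootBundle.load('assets/best.onnx')).buffer.asUint8List();
final session = OrtSession.fromBuffer(bytes, OrtSessionOptions());
final output = session.run(OrtRunOptions(), {'images': input}); // 'images': Ultralytics' default input name
```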

### Performance Tips

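1. **Use the INT8 model** for the fastest inference (estimated 10-20ms)
2. **Enable the GPU delegate** if the device supports it
3. **Pre-allocate input and output buffers** and reuse them across inference calls
4. **Process images in batches** when handling multiple images (the exports above use a fixed batch size of 1, so this requires re-exporting with a larger `batch`)
5. **Apply rotation correction** before inference, using the same algorithm as the Python pipeline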

---

**✅ Conversion complete! Your models are ready for mobile deployment.**