# Week 4: Model Optimization for Deployment

**Objective:** Prepare the trained model for efficient deployment across platforms, including edge devices.

This notebook covers:
- Converting the Keras model to the TensorFlow Lite (TFLite) format.
- Applying post-training INT8 quantization to reduce model size and latency.
- Comparing the file sizes of the Keras and TFLite models, and the test accuracy of the FP32 and INT8 TFLite models.

In [None]:
import os
import sys
import tensorflow as tf
from tensorflow import keras

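# Notebooks do not define __file__, so resolve the project root relative to the
# kernel's working directory (assumed to be this notebook's folder, two levels down).
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "..", "..")))
# Import only the names this notebook uses instead of a wildcard import.
from config import (
    BATCH_SIZE, IMG_SIZE, RANDOM_SEED, TEST_DIR, TRAIN_DIR, get_model_path,
)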
from Week1_Data_and_Baseline.utils.data_utils import create_data_generators
from Week4_Deployment.utils.optimization_utils import convert_to_tflite, quantize_model, evaluate_tflite_model

## 1.1 - Load the Trained Model and Test Data

In [None]:
# Load the final trained Keras model
model = keras.models.load_model(get_model_path("mobilenetv2", "final"))

# Create a data generator for the test set
_, test_ds = create_data_generators(TRAIN_DIR, TEST_DIR, IMG_SIZE, BATCH_SIZE, RANDOM_SEED)

## 1.2 - Convert to TensorFlow Lite (FP32)

In [None]:
tflite_fp32_path = get_model_path("mobilenetv2_fp32", "tflite")
convert_to_tflite(model, tflite_fp32_path)
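
The internals of `convert_to_tflite` are not shown here, so a quick sanity check on the file it writes can be useful. The sketch below relies on the TFLite FlatBuffer schema declaring the file identifier `TFL3`, which sits at bytes 4 to 7 of a valid model file. The helper name `looks_like_tflite` is hypothetical, and the demo uses synthetic files rather than a real model.

```python
import os
import tempfile


def looks_like_tflite(path: str) -> bool:
    """Return True if the file carries the TFLite FlatBuffer identifier.

    A FlatBuffer starts with a 4-byte root offset, followed by an optional
    4-byte file identifier; TFLite models use b"TFL3".
    """
    with open(path, "rb") as f:
        header = f.read(8)
    return len(header) == 8 and header[4:8] == b"TFL3"


# Demo on synthetic files so the sketch runs without a real model.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tflite")
    bad = os.path.join(tmp, "bad.tflite")
    with open(good, "wb") as f:
        f.write(b"\x1c\x00\x00\x00TFL3" + b"\x00" * 16)  # fake header only
    with open(bad, "wb") as f:
        f.write(b"not a model")
    good_ok = looks_like_tflite(good)
    bad_ok = looks_like_tflite(bad)

print(good_ok, bad_ok)
```

In this notebook the check would be called as `looks_like_tflite(tflite_fp32_path)` after the conversion cell. It only inspects the header, so it catches empty or mis-written files but says nothing about whether the model runs.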

## 1.3 - Apply Post-Training Quantization (INT8)

In [None]:
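tflite_int8_path = get_model_path("mobilenetv2_int8", "tflite")
# test_ds is passed as the third argument, presumably as the calibration
# (representative) dataset. Calibrating on the same data used for evaluation
# can make the INT8 accuracy below look slightly optimistic.
quantize_model(model, tflite_int8_path, test_ds)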
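
To make the INT8 step less of a black box, here is a minimal sketch of the affine mapping that 8-bit quantization is based on: `real_value = scale * (int8_value - zero_point)`. It is plain Python and is not what `quantize_model` or the TFLite converter actually executes. The function names and calibration values are hypothetical, and real converters choose ranges per tensor or per channel from the calibration data.

```python
def quant_params(x_min, x_max, q_min=-128, q_max=127):
    """Derive (scale, zero_point) mapping [x_min, x_max] onto [q_min, q_max]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (q_max - q_min)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range
    zero_point = int(round(q_min - x_min / scale))
    zero_point = max(q_min, min(q_max, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, q_min=-128, q_max=127):
    """Map a float to an int8 code, clipping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(q_min, min(q_max, q))


def dequantize(q, scale, zero_point):
    """Map an int8 code back to its approximate float value."""
    return scale * (q - zero_point)


# Hypothetical calibration values standing in for observed activations.
values = [-1.0, -0.25, 0.0, 0.5, 3.0]
scale, zero_point = quant_params(min(values), max(values))
codes = [quantize(v, scale, zero_point) for v in values]
restored = [dequantize(q, scale, zero_point) for q in codes]
errors = [abs(v - r) for v, r in zip(values, restored)]

print(f"scale={scale:.6f}, zero_point={zero_point}")
print("max round-trip error:", max(errors))
```

Each weight or activation is stored in one byte instead of four, which is where the size reduction comes from. The round-trip error is what can show up as an accuracy drop.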

## 1.4 - Compare Model Sizes and Performance

In [None]:
keras_model_size = os.path.getsize(get_model_path("mobilenetv2", "final")) / (1024 * 1024)
tflite_fp32_size = os.path.getsize(tflite_fp32_path) / (1024 * 1024)
tflite_int8_size = os.path.getsize(tflite_int8_path) / (1024 * 1024)

print(f"Keras Model size: {keras_model_size:.2f} MB")
print(f"TFLite FP32 Model size: {tflite_fp32_size:.2f} MB")
print(f"TFLite INT8 Model size: {tflite_int8_size:.2f} MB")

print("\nEvaluating TFLite FP32 model...")
fp32_accuracy = evaluate_tflite_model(tflite_fp32_path, test_ds)
print(f"TFLite FP32 Accuracy: {fp32_accuracy:.2%}")

print("\nEvaluating TFLite INT8 model...")
int8_accuracy = evaluate_tflite_model(tflite_int8_path, test_ds)
print(f"TFLite INT8 Accuracy: {int8_accuracy:.2%}")
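
The sizes and accuracies printed above are easier to read as a trade-off when expressed relative to a baseline. The following sketch computes a compression ratio and an accuracy change for each candidate. The helper `summarize_tradeoffs` is hypothetical, and the numbers in the demo are made up for illustration, not results from this notebook.

```python
def summarize_tradeoffs(baseline, candidates):
    """Compare (size_mb, accuracy) pairs of candidates against a baseline."""
    base_size, base_acc = baseline
    rows = {}
    for name, (size_mb, acc) in candidates.items():
        rows[name] = {
            "compression": base_size / size_mb,  # >1 means smaller than baseline
            "accuracy_delta": acc - base_acc,    # negative means accuracy lost
        }
    return rows


# Made-up numbers for illustration only; not results from this notebook.
report = summarize_tradeoffs(
    baseline=(10.0, 0.90),
    candidates={"int8": (2.5, 0.89)},
)
for name, row in report.items():
    print(
        f"{name}: {row['compression']:.2f}x smaller, "
        f"accuracy change {row['accuracy_delta'] * 100:+.2f} pp"
    )
```

With this notebook's variables, the call would look like `summarize_tradeoffs((tflite_fp32_size, fp32_accuracy), {"TFLite INT8": (tflite_int8_size, int8_accuracy)})`.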