## Quantizing Keras models
This notebook illustrates how Keras models can be quantized using random calibration data. In a real application, actual training or validation data should be used for calibration instead. The quantized model is then compiled for software (SW) and hardware (HW) targets and run on both to verify that it runs properly. A performance estimate can also be obtained with the utility tools.
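
To make the role of calibration data concrete, the following is a minimal sketch of affine int8 quantization, the general scheme in which a float range is mapped to integer codes through a scale and a zero point. It is not the implementation behind `modelQuantizeImages`, whose internals are not shown in this notebook; the function names `affine_qparams`, `quantize`, and `dequantize` are hypothetical and exist only for illustration.

```python
def affine_qparams(calib_values, qmin=-128, qmax=127):
    """Derive a scale and zero point from observed calibration values (illustrative)."""
    # The covered range is widened to include 0.0 so that zero maps to an exact code.
    lo = min(min(calib_values), 0.0)
    hi = max(max(calib_values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate case: every calibration value is zero
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to its integer code, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical calibration samples spanning roughly [-1.0, 1.55].
scale, zero_point = affine_qparams([-1.0, 0.0, 1.0, 1.55])

inside = dequantize(quantize(0.5, scale, zero_point), scale, zero_point)
outside = dequantize(quantize(10.0, scale, zero_point), scale, zero_point)
print("0.5 is recovered as", inside)
print("10.0 is clamped to", outside)
```

A value inside the calibrated range is recovered to within half a quantization step, while a value outside it is clamped to the edge of the range. This is why calibration data that does not resemble real inputs, such as random images, can cost accuracy.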

In [None]:
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
import nnexpress.compiler as n2xc
from nnexpress_utils import compare_n2x_sw_hw, estimate_fps
from quant_utils import representative_dataset, modelQuantizeImages

In [None]:
image_size = (224, 224, 3)
float_model = ResNet50(input_shape=image_size, weights='imagenet', include_top=True)

In [None]:
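# Calibration data generator (random data in this notebook; use real images in practice).
dataset_generator = lambda: representative_dataset(image_size)
# Quantize the float model and write the resulting bytes to a .tflite file.
model_quant = modelQuantizeImages(float_model, dataset_generator)
with open('quant_model.tflite', 'wb') as f:
    f.write(model_quant)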

In [None]:
tflite_path = 'quant_model.tflite'
n2x_sw_path = 'quant_model_sw.n2x'
n2x_hw_path = 'quant_model_hw.n2x'

In [None]:
sw_model = n2xc.Compiler(tflite_path, device='SW')
hw_model = n2xc.Compiler(tflite_path, device='HW')
sw_model.save(n2x_sw_path)
hw_model.save(n2x_hw_path)

**Note**: The cells below can run only on a device equipped with ORCA.

In [None]:
match = compare_n2x_sw_hw(n2x_sw_path, n2x_hw_path)
if match:
    print('N2X SW and HW match')
else:
    print('N2X SW and HW do not match')
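
The implementation of `compare_n2x_sw_hw` is not shown in this notebook. As a rough sketch of what a check of this kind might reduce to, the snippet below compares two flattened integer output tensors element by element. The function `outputs_match` and its `tolerance` parameter are hypothetical, and the snippet is plain Python that does not need an ORCA device.

```python
def outputs_match(sw_output, hw_output, tolerance=0):
    """Return True if two flattened integer outputs agree within `tolerance`.

    Hypothetical helper for illustration; tolerance=0 demands exact agreement.
    """
    if len(sw_output) != len(hw_output):
        return False
    return all(abs(s - h) <= tolerance for s, h in zip(sw_output, hw_output))


# Made-up output values, used only to exercise the helper.
sw = [12, -3, 127, 0]
hw = [12, -3, 126, 0]
print("exact match:", outputs_match(sw, hw))
print("match within 1:", outputs_match(sw, hw, tolerance=1))
```

Whether exact agreement or a small tolerance is the right criterion depends on how the SW and HW targets implement rounding, which this notebook does not specify.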

In [None]:
fps = estimate_fps(n2x_hw_path)
print('FPS=', fps)
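
How `estimate_fps` arrives at its figure is not stated in this notebook. One generic way to obtain a frames-per-second number is to time repeated inference calls, as in the sketch below. The function `measure_fps` and the `run_inference` callable are hypothetical stand-ins, not part of `nnexpress_utils`.

```python
import time


def measure_fps(run_inference, num_runs=100, warmup=10):
    """Estimate throughput by timing `num_runs` calls of `run_inference`.

    Hypothetical helper; `run_inference` is any zero-argument callable
    that performs one inference.
    """
    # Warm-up calls are excluded from timing so one-time setup does not skew the result.
    for _ in range(warmup):
        run_inference()
    start = time.perf_counter()
    for _ in range(num_runs):
        run_inference()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return num_runs / elapsed if elapsed > 0 else float('inf')


# Stand-in workload; a real measurement would invoke the compiled model instead.
def dummy_inference():
    sum(range(1000))


print('Measured FPS:', measure_fps(dummy_inference))
```

A timing-based figure includes host-side overhead such as data transfer, so it may differ from an estimate derived from the hardware alone.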