# Converting a Keras/TensorFlow Model to a TFLite Model
- This demo shows how to convert a model into the tflite format that Nuvoton MCUs support.
- For more details, see the reference: [post_training_quantization](https://ai.google.dev/edge/litert/models/post_training_quantization?hl=zh-cn)

## 1. Load your model
- Here we use the MobileNetV2 image classification model as an example.

In [1]:
import tensorflow as tf
import numpy as np
import os
import sys
import logging

IMG_SHAPE = (160, 160, 3)
custom_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                                include_top=True,
                                                weights='imagenet',
                                                alpha=0.35)

- Another example: loading a Keras 3 model.
- This requires Keras 3 to be installed.

In [None]:
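import keras

custom_model_path = 'example_model.keras'
# Load with the standalone keras package imported above.
# Assign the result to custom_model to use it in the conversion cells below.
keras_model = keras.models.load_model(custom_model_path)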

## 2. Create the representative dataset
- This is needed if you want to convert to an int8 full integer quantization model (PTQ method).
- Below is an example random dataset. You need to create your own representative dataset that follows the same input format as the model.

In [2]:
def representative_dataset():
    # Yield 100 random samples shaped like the model input (batch of 1).
    for _ in range(100):
        data = np.random.rand(1, 160, 160, 3)
        yield [data.astype(np.float32)]
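
Random data is enough to run the converter, but the calibration ranges are only meaningful when the samples resemble real inputs. The following is a minimal sketch, assuming your calibration images are already resized to 160x160 and preprocessed the same way as for inference, and are held in a NumPy array. The helper name `make_representative_dataset` and the placeholder array `calibration_images` are hypothetical and not part of this demo.

```python
import numpy as np

def make_representative_dataset(images, num_samples=100):
    """Return a generator function that yields one float32 sample at a time."""
    def representative_dataset():
        for image in images[:num_samples]:
            # Add the batch dimension so each sample has shape (1, H, W, C).
            yield [np.expand_dims(image, axis=0).astype(np.float32)]
    return representative_dataset

# Placeholder data standing in for real, already-preprocessed images (assumption).
calibration_images = np.random.rand(8, 160, 160, 3)
representative_dataset_from_images = make_representative_dataset(calibration_images)
```

The returned function can then be assigned to `converter.representative_dataset` in place of the random generator.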

## 3. Converting to tflite
- We show 3 different tflite models: no quantization, full integer quantization, and INT8 full integer quantization.
- All models can run on a CPU device.
- NPU devices like the M55M1 support only INT8 full integer quantization and need the Vela compiler.


### No Quantization

In [15]:
MODEL_NAME = 'MNV2'
tf.get_logger().setLevel(logging.ERROR)

converter = tf.lite.TFLiteConverter.from_keras_model(custom_model)
tflite_model = converter.convert()
os.makedirs('model', exist_ok=True)  # make sure the output directory exists
output_location = os.path.join('model', (MODEL_NAME + r'.tflite'))
with open(output_location, 'wb') as f:
    f.write(tflite_model)
    print("The tflite output location: {}".format(output_location))

Saved artifact at 'C:\Users\cychen38\AppData\Local\Temp\tmplu7wb4n6'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 160, 160, 3), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 1000), dtype=tf.float32, name=None)
Captures:
  2853761443344: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761635552: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761637664: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761630976: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761633264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761645056: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761644000: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761640480: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761637312: TensorSpec(shape=(), dtype=tf.resource, name=None)
  2853761645232: TensorSpec(shape=(), dtype=tf.resource, name=None)
 

### Full integer quantization

In [6]:
MODEL_NAME = 'MNV2'
converter = tf.lite.TFLiteConverter.from_keras_model(custom_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()
output_location = os.path.join('model', (MODEL_NAME + r'_fullquant.tflite'))
with open(output_location, 'wb') as f:
    f.write(tflite_model)
    print("The tflite output location: {}".format(output_location))


The tflite output location: model\MNV2_fullquant.tflite
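
A quick sanity check after conversion is to compare file sizes, since quantized weights are typically stored in 8 bits instead of 32. The sketch below assumes the output paths used in the cells above; the helper name `report_model_sizes` is hypothetical.

```python
import os

def report_model_sizes(paths):
    """Return {path: size_in_bytes} for every path that exists on disk."""
    sizes = {}
    for path in paths:
        if os.path.isfile(path):
            sizes[path] = os.path.getsize(path)
            print("{}: {:.1f} KiB".format(path, sizes[path] / 1024))
        else:
            # Missing files are reported but left out of the result.
            print("{}: not found".format(path))
    return sizes

sizes = report_model_sizes([
    os.path.join('model', 'MNV2.tflite'),
    os.path.join('model', 'MNV2_fullquant.tflite'),
])
```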


### Int8 Full integer quantization
- The input and output tensors are also int8. Use this model for Arm NPU devices.

In [7]:
MODEL_NAME = 'MNV2'
converter = tf.lite.TFLiteConverter.from_keras_model(custom_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.TFLITE_BUILTINS]
converter.representative_dataset = representative_dataset
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_model = converter.convert()
output_location = os.path.join('model', (MODEL_NAME + r'_int8quant.tflite'))
with open(output_location, 'wb') as f:
    f.write(tflite_model)
    print("The tflite output location: {}".format(output_location))


The tflite output location: model\MNV2_int8quant.tflite
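
Because this model takes int8 input and produces int8 output, the application has to quantize its input data and dequantize the results itself. TFLite uses the affine mapping `real = scale * (q - zero_point)`. The sketch below illustrates that mapping in plain Python. The `scale` and `zero_point` values are illustrative assumptions; the real ones can be read from the `quantization` entry returned by `tf.lite.Interpreter.get_input_details()` and `get_output_details()`. Rounding at exact ties may differ from what the runtime does.

```python
def quantize_int8(values, scale, zero_point):
    """Map real values to int8 using q = round(x / scale) + zero_point."""
    quantized = []
    for x in values:
        q = round(x / scale) + zero_point
        quantized.append(max(-128, min(127, q)))  # clamp to the int8 range
    return quantized

def dequantize_int8(values, scale, zero_point):
    """Map int8 values back to real values using x = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in values]

# Illustrative parameters only (assumption); read the real ones from the model.
scale, zero_point = 1.0 / 255.0, -128
quantized_input = quantize_int8([0.0, 1.0, 2.0], scale, zero_point)
restored = dequantize_int8(quantized_input, scale, zero_point)
```

Values outside the representable range, such as `2.0` here, saturate at the int8 limits, which is one reason the representative dataset should cover the real input range.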


## 4. Convert the tflite model to a C++ source file
- First, we convert the (A.) No Quantization and (B.) Full integer quantization models for Cortex-M CPUs.
- Second, we convert the (C.) Int8 Full integer quantization model for the NPU (with the Arm Vela compiler).
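
To illustrate what such a conversion produces in general, the sketch below renders raw model bytes as a C++ byte array. It is not the `tflite_to_tflu_para.py` script used in this demo, whose output may differ; the function name `tflite_bytes_to_c_array`, the array name, and the 16-byte alignment are assumptions chosen for illustration.

```python
def tflite_bytes_to_c_array(model_bytes, array_name="g_model", bytes_per_line=12):
    """Render raw model bytes as C++ source defining a byte array and its length."""
    lines = []
    for start in range(0, len(model_bytes), bytes_per_line):
        chunk = model_bytes[start:start + bytes_per_line]
        lines.append("    " + ", ".join("0x{:02x}".format(b) for b in chunk) + ",")
    return (
        "#include <cstdint>\n"
        "#include <cstddef>\n\n"
        "alignas(16) const uint8_t {name}[] = {{\n{body}\n}};\n"
        "const size_t {name}_len = {length};\n"
    ).format(name=array_name, body="\n".join(lines), length=len(model_bytes))

# Eight placeholder bytes stand in for the contents of a real .tflite file.
source = tflite_bytes_to_c_array(b"\x1c\x00\x00\x00TFL3", array_name="g_example_model")
print(source)
```

For a real model, the bytes would come from reading the `.tflite` file in binary mode, and the resulting text would be written to a `.cc` file.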

### (A.) No Quantization

In [16]:
TFLITE_PATH = os.path.join('model', (MODEL_NAME + r'.tflite'))
OUT_FILE = os.path.join('model', (MODEL_NAME + r'_tflite.cc'))

! python tflite_to_tflu_para.py --tflite_path $TFLITE_PATH --output_path $OUT_FILE

### (B.) Full integer quantization

In [17]:
TFLITE_PATH = os.path.join('model', (MODEL_NAME + r'_fullquant.tflite'))
OUT_FILE = os.path.join('model', (MODEL_NAME + r'_fullquant_tflite.cc'))

! python tflite_to_tflu_para.py --tflite_path $TFLITE_PATH --output_path $OUT_FILE

#### 1. Move the tflite model to your C++ project
- The generated tflite model C++ file can replace the existing one in the model or generated directory of your project.
- For details, see [OpenNuvoton/ML_M460_SampleCode](https://github.com/OpenNuvoton/ML_M460_SampleCode/tree/master)

### (C.) Int8 Full integer quantization + vela compiler for NPU device

#### 1. Edit `vela/variables.bat`

- Set `MODEL_SRC_DIR` to the directory that contains the model.
  - Here it is `..\model`.
- Set `MODEL_SRC_FILE` to your int8 full integer quantization tflite model.
  - For example: `MODEL_SRC_FILE=MNV2_int8quant.tflite`

#### 2. Execute `vela/gen_model_cpp.bat`

- The output Vela tflite file and Vela tflite C++ file are written to `vela/generated`.

#### 3. Move the tflite model to your C++ project

- The Vela tflite model file can be loaded from an SD card.
- The Vela tflite model C++ file can replace the existing one in the model or generated directory of your project.
- For details, see [OpenNuvoton/ML_M55M1_SampleCode](https://github.com/OpenNuvoton/ML_M55M1_SampleCode/tree/master)