# Run TFLite Converter from Arachne

This tutorial explains how to use the TFLite Converter from Arachne, focusing on how to control the tool's behavior.

## Prepare a Model

First, we prepare a model to use in this tutorial.
We will use a ResNet-50 v2 model trained on the `tf_flowers` dataset.

In [2]:

import tensorflow as tf
import tensorflow_datasets as tfds

# Initialize a model
model = tf.keras.applications.resnet_v2.ResNet50V2(weights=None, classes=5)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])
model.summary()

# Load the tf_flowers dataset
train_dataset, val_dataset = tfds.load(
    "tf_flowers", split=["train[:90%]", "train[90%:]"], as_supervised=True
)

# Preprocess the datasets
def preprocess_dataset(is_training=True):
    def _pp(image, label):
        if is_training:
            image = tf.image.resize(image, (280, 280))
            image = tf.image.random_crop(image, (224, 224, 3))
            image = tf.image.random_flip_left_right(image)
        else:
            image = tf.image.resize(image, (224, 224))
        image = tf.keras.applications.imagenet_utils.preprocess_input(x=image, mode='tf')
        label = tf.one_hot(label, depth=5)
        return image, label
    return _pp


def prepare_dataset(dataset, is_training=True):
    dataset = dataset.map(preprocess_dataset(is_training), num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(16).prefetch(tf.data.AUTOTUNE)

train_dataset = prepare_dataset(train_dataset, True)
val_dataset = prepare_dataset(val_dataset, False)

# Training
model.fit(train_dataset, validation_data=val_dataset, epochs=20)

model.evaluate(val_dataset)

model.save("/tmp/resnet50-v2.h5")

Model: "resnet50v2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_2[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64) 0           conv1_conv[0][0]                 
_________________________________________________________________________________________



## Run TFLite Converter from Arachne

Now, let's convert the model into a TFLite model with Arachne.
To use the TFLite Converter, pass `+tools=tflite_converter` to `arachne.driver.cli`.
Add `--help` to see the available options.

In [8]:
%%bash

python -m arachne.driver.cli +tools=tflite_converter --help

cli is powered by Hydra.

== Configuration groups ==
Compose your configuration from those groups (group=option)

tools: onnx_simplifier, onnx_tf, openvino2tf, openvino_mo, tflite_converter, tftrt, torch2onnx, torch2trt, tvm
tvm_target: dgx-1, dgx-s, jetson-nano, jetson-xavier-nx, rasp4b64


== Config ==
Override anything in the config (foo.bar=value)

input: ???
input_spec: null
output: ???
tools:
  tflite_converter:
    enable_tf_ops: false
    allow_custom_ops: true
    ptq:
      method: none
      representative_dataset: null


Powered by Hydra (https://hydra.cc)
Use --hydra-help to view Hydra specific help
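
The `foo.bar=value` override syntax mentioned in the help output addresses nested configuration keys by their dotted path. The following is a minimal Python sketch of that idea, not Hydra's actual implementation: `apply_override` is a hypothetical helper, `DEFAULT_CONFIG` simply mirrors the `tools` section printed above, and override values are kept as plain strings for simplicity.

```python
import copy

# Defaults copied from the `tools` section of the --help output above.
DEFAULT_CONFIG = {
    "tools": {
        "tflite_converter": {
            "enable_tf_ops": False,
            "allow_custom_ops": True,
            "ptq": {"method": "none", "representative_dataset": None},
        }
    }
}


def apply_override(config, override):
    """Return a copy of `config` with one `dotted.key=value` override applied.

    Hypothetical helper for illustration only; Hydra itself also handles
    type conversion, config groups (`+tools=...`), and validation.
    """
    dotted_key, value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    updated = copy.deepcopy(config)
    node = updated
    for key in parents:
        node = node[key]  # raises KeyError if the path does not exist
    if leaf not in node:
        raise KeyError(dotted_key)
    node[leaf] = value
    return updated


config = apply_override(DEFAULT_CONFIG, "tools.tflite_converter.ptq.method=fp16")
print(config["tools"]["tflite_converter"]["ptq"])
```

Only the key named in the override changes; the other defaults, and `DEFAULT_CONFIG` itself, are left untouched.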




### Convert with FP32 Precision

We start with the simplest case.
The following command converts a TF model into a TFLite model without post-training quantization (PTQ).

In [11]:
%%bash

python -m arachne.driver.cli +tools=tflite_converter model_file=/tmp/resnet50-v2.h5 output_path=/tmp/output_fp32.tar

2022-03-22 04:22:19.715347: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-22 04:22:20.396518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30554 MB memory:  -> device: 0, name: NVIDIA Tesla V100-SXM2-32GB, pci bus id: 0000:89:00.0, compute capability: 7.0
2022-03-22 04:22:33.644336: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: /tmp/tmplf6w_hhu/assets
2022-03-22 04:22:48.343187: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability

To check the converted model, unpack the output TAR file and inspect the TFLite model file with a model viewer such as Netron.

In [13]:
%%bash

tar xf /tmp/output_fp32.tar -C /tmp
ls /tmp/model_0.tflite

/tmp/model_0.tflite
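
Before opening the model in a viewer, a quick scripted check can confirm that the archive really contains a TFLite file. The sketch below uses only the Python standard library and relies on the fact that TFLite models are FlatBuffers files carrying the identifier `TFL3` at byte offset 4. `find_tflite_models` is a hypothetical helper name, and the archive layout is assumed to match the listing above.

```python
import os
import tarfile

TFLITE_FILE_IDENTIFIER = b"TFL3"  # FlatBuffers file identifier used by TFLite models


def find_tflite_models(tar_path):
    """Map each *.tflite member to True if it carries the TFLite identifier."""
    results = {}
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.endswith(".tflite")):
                continue
            header = archive.extractfile(member).read(8)
            # FlatBuffers stores the 4-byte file identifier at offset 4.
            results[member.name] = header[4:8] == TFLITE_FILE_IDENTIFIER
    return results


if __name__ == "__main__":
    tar_path = "/tmp/output_fp32.tar"  # archive produced by the conversion above
    if os.path.exists(tar_path):
        for name, looks_valid in find_tflite_models(tar_path).items():
            print(name, "OK" if looks_valid else "unexpected header")
```

This only checks the file header; it does not validate the graph, so a viewer such as Netron is still the right tool for inspecting operators and tensor types.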


### Convert with Dynamic-Range or FP16 Precision

To convert with dynamic-range or FP16 precision, set the `tools.tflite_converter.ptq.method` option to `dynamic_range` or `fp16`.

In [14]:
%%bash

python -m arachne.driver.cli +tools=tflite_converter model_file=/tmp/resnet50-v2.h5 output_path=/tmp/output_dr.tar \
    tools.tflite_converter.ptq.method=dynamic_range

python -m arachne.driver.cli +tools=tflite_converter model_file=/tmp/resnet50-v2.h5 output_path=/tmp/output_fp16.tar \
    tools.tflite_converter.ptq.method=fp16

2022-03-22 04:29:32.815027: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-22 04:29:33.622712: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30554 MB memory:  -> device: 0, name: NVIDIA Tesla V100-SXM2-32GB, pci bus id: 0000:89:00.0, compute capability: 7.0
2022-03-22 04:29:47.994655: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: /tmp/tmp7d2kfrk0/assets
2022-03-22 04:30:03.780610: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability

### Convert with INT8 Precision

To convert with INT8 precision, we need to calibrate, that is, estimate the range of all floating-point tensors in the model.
Arachne provides an interface for feeding the dataset used in the calibration.
First, prepare an NPY file that contains a list of `np.ndarray` objects to serve as the calibration dataset.

In [24]:
import numpy as np

# Collect 100 preprocessed validation images, one image per array
calib_dataset = []
for image, _label in val_dataset.unbatch().batch(1).take(100):
    calib_dataset.append(image.numpy())
np.save("/tmp/calib_dataset.npy", calib_dataset)
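
It can help to sanity-check the calibration file before running the conversion. The following sketch assumes the file was written as in the cell above; `summarize_calibration_file` is a hypothetical helper, not part of Arachne. With the preprocessing used in this tutorial (`mode='tf'`), each sample should have shape `(1, 224, 224, 3)` and values in the range [-1, 1].

```python
import os

import numpy as np


def summarize_calibration_file(path):
    """Load an NPY calibration file and report its basic properties."""
    data = np.load(path)
    return {
        "num_samples": data.shape[0],
        "sample_shape": tuple(data.shape[1:]),
        "dtype": str(data.dtype),
        "min": float(data.min()),
        "max": float(data.max()),
    }


if __name__ == "__main__":
    path = "/tmp/calib_dataset.npy"  # file written by the cell above
    if os.path.exists(path):
        print(summarize_calibration_file(path))
```

A value range far outside [-1, 1] would suggest that the calibration images were not preprocessed the same way as the training data, which tends to hurt the accuracy of the quantized model.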

Next, set the `tools.tflite_converter.ptq.method` option to `int8` and pass the NPY file to the `tools.tflite_converter.ptq.representative_dataset` option.

In [25]:
%%bash

python -m arachne.driver.cli +tools=tflite_converter model_file=/tmp/resnet50-v2.h5 output_path=/tmp/output_int8.tar \
    tools.tflite_converter.ptq.method=int8 tools.tflite_converter.ptq.representative_dataset=/tmp/calib_dataset.npy


2022-03-22 04:49:56.246132: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-22 04:49:56.959197: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30554 MB memory:  -> device: 0, name: NVIDIA Tesla V100-SXM2-32GB, pci bus id: 0000:89:00.0, compute capability: 7.0
2022-03-22 04:50:11.664946: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: /tmp/tmpb3inm9cn/assets
2022-03-22 04:50:28.988691: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability
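
The three quantized conversions differ only in their overrides, so the commands can be generated from a script. The sketch below is one possible way to do that: `build_convert_command` is a hypothetical helper that uses only the options shown in this tutorial, and the requirement of a representative dataset for `int8` is enforced by the sketch itself rather than taken from Arachne's code. The output file names are illustrative.

```python
import shlex


def build_convert_command(model_file, output_path, ptq_method=None, representative_dataset=None):
    """Assemble the arachne.driver.cli invocation from this tutorial as an argument list."""
    if ptq_method == "int8" and representative_dataset is None:
        raise ValueError("int8 conversion needs a representative dataset for calibration")
    command = [
        "python", "-m", "arachne.driver.cli",
        "+tools=tflite_converter",
        f"model_file={model_file}",
        f"output_path={output_path}",
    ]
    if ptq_method is not None:
        command.append(f"tools.tflite_converter.ptq.method={ptq_method}")
    if representative_dataset is not None:
        command.append(
            f"tools.tflite_converter.ptq.representative_dataset={representative_dataset}"
        )
    return command


for method in ("dynamic_range", "fp16", "int8"):
    dataset = "/tmp/calib_dataset.npy" if method == "int8" else None
    command = build_convert_command(
        "/tmp/resnet50-v2.h5", f"/tmp/output_{method}.tar", method, dataset
    )
    # Print only; pass `command` to subprocess.run(command, check=True) to execute it.
    print(shlex.join(command))
```

Returning an argument list instead of a single string avoids shell-quoting problems when the command is later passed to `subprocess.run`.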

## Run TFLite Converter from Arachne Python Interface

The following code shows an example of using the TFLite Converter from the Arachne Python interface.

In [3]:
from arachne.utils.model_utils import init_from_file, save_model
from arachne.tools.tflite_converter import TFLiteConverter, TFLiteConverterConfig

model_file_path = "/tmp/resnet50-v2.h5"
input_model = init_from_file(model_file_path)

cfg = TFLiteConverterConfig()

# Modify the config object to control the converter behavior, e.g.:
# cfg.ptq.method = "fp16"

output = TFLiteConverter.run(input_model, cfg)

save_model(model=output, output_path="/tmp/output.tar")

INFO:tensorflow:Assets written to: /tmp/tmpne2tina6/assets
