<a href="https://colab.research.google.com/github/fkiller/NightWatch/blob/poc-sandbox/onnx_to_tflite.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1 Introduction

This notebook demonstrates the conversion process from an **.ONNX** model _(exported from MATLAB)_ to a **.tflite** model _(to be used within TensorFlow Lite, on an Android or iOS device.)_ In addition to conversion, this notebook contains cells for running inference using a set of test images to validate that predictions remain consistent across converted models.

> **Note:** TensorFlow's API is constantly evolving. This notebook was written in November of 2019, during the transition period from TF 1.X to TF 2.X, so it is likely that the relevant APIs have changed since.

### 1.1 Initializing input files

Several files are used to demonstrate the conversion process.

- `hasCircularShape_chartObjects_googlenet.onnx`, a demo model exported from MATLAB.
- `test_images.zip`, a .zip with two class directories, each containing 25 images.

The files themselves are provided in the associated GitHub repository for this notebook.

Once uploaded, please validate that the files have been stored in the notebook correctly by running the cell below. Cells can be run by first clicking on them, then using either the **"Run Cell"** button ( ▷ ), or typing **Ctrl**+**Enter**.




In [1]:
# Set path variables
onnx_path = 'hasCircularShape_chartObjects_googlenet.onnx'
img_zip_path = 'test_images.zip'

# Check that correct files have been uploaded
import os

assert os.path.exists(onnx_path)
assert os.path.exists(img_zip_path) 

print("Files uploaded successfully.")
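
A bare `assert` gives no hint about which file is missing. As a minimal sketch, the check can also report the missing paths by name. The helper `find_missing` is a hypothetical name introduced here for illustration and is not part of the original notebook.

```python
import os

def find_missing(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# File names taken from the cell above.
required = ['hasCircularShape_chartObjects_googlenet.onnx', 'test_images.zip']

missing = find_missing(required)
if missing:
    print("Missing files: {}".format(", ".join(missing)))
else:
    print("Files uploaded successfully.")
```

If the list of missing files is not empty, the named files should be uploaded again before continuing.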


# 2 ONNX (.onnx) -> TensorFlow FrozenGraph (.pb)

Now that the .onnx model file is stored within the notebook, it can be converted to a .pb model file for use within TensorFlow.

### 2.1 Background Information

**ONNX** is an open-source format for AI models created by Facebook and Microsoft [[1]](https://onnx.ai/). The goal of the ONNX format is to provide interoperability between frameworks. The ONNX project provides conversion tools between the ONNX format and formats from other frameworks [[2]](http://onnx.ai/supported-tools). 

**MATLAB** allows model exporting to a file _(serialization)_ in the ONNX format only [[3]](https://www.mathworks.com/help/deeplearning/ref/exportonnxnetwork.html), so conversion is necessary to use MATLAB models with other frameworks.

**TensorFlow** provides support for three different types of non-mobile serialized model formats, depending on the version of TensorFlow installed:
1. FrozenGraph .pb files _(TensorFlow 1.X only)_ [[4]](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)
2. SavedModel directories _(TensorFlow 1.X and 2.0)_ [[5]](https://www.tensorflow.org/guide/saved_model)
3. HDF5 .h5 files _(TensorFlow 1.X and 2.0)_ [[6]](https://www.tensorflow.org/tutorials/keras/save_and_load)

The **onnx-tf conversion tool** [[7]](https://github.com/onnx/onnx-tensorflow) was created prior to the release of TensorFlow 2.0, thus converted models are provided in the FrozenGraph .pb format only. The tool currently requires TensorFlow 1.X to be installed for conversion to work correctly [[8]](https://github.com/onnx/onnx-tensorflow/issues/521). 

Running the cell below ensures the correct TensorFlow version is imported and also installs the onnx-tf conversion tool.

In [None]:
# onnx-tf was designed for TensorFlow 1.X, so force this version.
%tensorflow_version 1.x 
import tensorflow as tf

# "!" allows command-line input. Use pip package manager to install the 
# conversion package
!pip install onnx-tf

### 2.2 Converting the Model

The conversion process is quite noisy, due to three types of warnings:
1. Deprecation warnings for functions in TensorFlow 1.x not supported in TensorFlow 2.x
2. onnx-tf "Fail to get since_version" warnings.
3. "Unknown op" warnings for two operators in the model graph: ConstantFill and ImageScaler.

Warning **type 1** and **type 2** are harmless and can be suppressed [[9]](https://github.com/tensorflow/tensorflow/issues/27023) [[10]](https://github.com/onnx/onnx-tensorflow/issues/246). 

Warning **type 3** refers to experimental operators that are no longer included in operator set "9", but are still in use by MATLAB's ONNX converter, which uses the operator set "8" by default. This warning arises for two operators: **ConstantFill** [[11]](https://github.com/onnx/onnx/pull/1434) and **ImageScaler** [[12]](https://github.com/onnx/models/issues/76). These operators are still supported in conversion for backwards compatibility purposes [[13]](https://github.com/onnx/models/issues/76#issuecomment-498327977) [[14]](https://github.com/microsoft/onnxruntime/blob/master/docs/Versioning.md#Backwards-compatibility), so this warning type is not critical as of writing this notebook. In the future, however, when operator set "8" is no longer supported by ONNX, this notebook may become out of date. Using operator set "9" in MATLAB could potentially prevent warning type 3. 

With the warnings having been made clear, the conversion process itself requires only a few commands.

In [None]:
import onnx
from onnx_tf.backend import prepare

onnx_model = onnx.load(onnx_path)
tf_rep = prepare(onnx_model)

These lines of code produce **tf_rep**, a Python object with four attributes: 
1. tf_rep.graph
2. tf_rep.inputs
3. tf_rep.outputs
4. tf_rep.tensor_dict

These attributes can be used to identify input/output nodes, run inference, and export the intermediate model to a .pb file. 

### 2.3 Exporting the Model to a .pb File

Now that a tf_rep variable has been created, the converted model can be exported to a .pb file and stored within this notebook.

In [None]:
pb_path = "hasCircularShape_chartObjects_googlenet.pb"
tf_rep.export_graph(pb_path)

assert os.path.exists(pb_path)
print(".pb model converted successfully.")

If you would like to save the converted model to your local disk, please run the cell below.

In [None]:
from google.colab import files

files.download(pb_path)

# 3 TensorFlow FrozenGraph (.pb) -> TensorFlow Lite (.tflite)

Now that a .pb model has been stored within the notebook, it can be prepared for Android/iOS deployment using the .tflite model format.

### 3.1 Background Information

**TensorFlow Lite** is an open source deep learning framework for on-device inference [[15]](https://www.tensorflow.org/lite). It consists of two main components: the TensorFlow Lite Converter and the TensorFlow Lite Interpreter.

The **TensorFlow Lite converter** is a tool accessible using a Python API that converts trained TensorFlow models into the TensorFlow Lite format (.tflite) [[16]](https://www.tensorflow.org/lite/guide/get_started#2_convert_the_model_format). TensorFlow Lite serializes model data using the open source FlatBuffer format, which has many advantages for mobile applications [[17]](https://google.github.io/flatbuffers/).

The **TensorFlow Lite interpreter** is a library that allows inference to be run using converted TensorFlow Lite models [[18]](https://www.tensorflow.org/lite/guide/get_started#3_run_inference_with_the_model). The interpreter works across multiple platforms and provides an API for running TensorFlow Lite models using Java, Swift, Objective-C, C++, and Python. Thus, a converted model can be evaluated within this notebook using the Python API, and later deployed using an API more suited for Android/iOS development.

> **Note:** The FrozenGraph format (.pb) is supported by TensorFlow 1.X versions only. TensorFlow 1.X code can still be used in TensorFlow 2.0, however. If migrating the code below to a TensorFlow 2.0 environment, backwards compatibility can be ensured by modifying any deprecated function calls [[19]](https://www.tensorflow.org/guide/migrate).

### 3.2 Converting the Model

To use the TFLite converter to convert a FrozenGraph (.pb) file, the input and output nodes of the graph must be explicitly specified. The names of these nodes can be accessed easily using the existing tf_rep object created in **Section 2**.

In [None]:
input_nodes = tf_rep.inputs
output_nodes = tf_rep.outputs
print("The names of the input nodes are: {}".format(input_nodes))
print("The names of the output nodes are: {}".format(output_nodes))

With this information, the TFLiteConverter class can now be called, producing a **tflite_rep** variable which contains converted model data serialized in the TFLite FlatBuffer format.

In [None]:
converter = tf.lite.TFLiteConverter.from_frozen_graph(pb_path,
                                                      input_arrays=input_nodes,
                                                      output_arrays=output_nodes)
tflite_rep = converter.convert()

### 3.3 Exporting the Model to a .tflite File

Now that a tflite_rep variable has been created, the converted model can be exported to a .tflite file and stored within this notebook.

In [None]:
tflite_path = "hasCircularShape_chartObjects_googlenet.tflite"
# Write the serialized model and close the file handle afterwards
with open(tflite_path, "wb") as f:
    f.write(tflite_rep)

assert os.path.exists(tflite_path)
print(".tflite model converted successfully.")

If you would like to save the converted model to your local disk, please run the cell below.

In [None]:
from google.colab import files

files.download(tflite_path)

# 4 Validation of Converted Models

Now that the conversion process has finished, and various model files are stored within this notebook, the validation process can begin.

### 4.1 Switching from TensorFlow 1.x to 2.x

While **TensorFlow 1.x** was needed for the conversion process, **TensorFlow 2.x** will be better supported as time goes on. Thus, the remaining sections will use code that is compatible with TensorFlow 2.x versions. This ensures that further steps will be more in line with current documentation, and should make reusing this code easier.

To switch versions from TensorFlow 1.x to TensorFlow 2.x within this notebook, please first restart the Colab Notebook runtime. This can be done by selecting `Runtime -> Restart runtime...` in the upper menu, as shown in the following image.

> ![Restart Runtime](https://i.imgur.com/O9uZ95H.png)

Once completed, please run the following cell to re-import the correct version of TensorFlow.

In [None]:
%tensorflow_version 2.x
import tensorflow as tf

> **Note:** Writing the following sections largely involved finding documentation, tutorials and StackOverflow questions written for TensorFlow 1.0, then creating a functional equivalent for that code in TensorFlow 2.0 using TensorFlow's migration guide [[19]](https://www.tensorflow.org/guide/migrate). This was challenging! If, in the future, the **onnx-tf** conversion tool allows for export in a non-".pb" format, then it would allow for working with TensorFlow 2.0 from the outset, which could be preferable. This need has been acknowledged in recent activity on the onnx-tensorflow GitHub page [[20]](https://github.com/onnx/onnx-tensorflow/pull/531). 

### 4.2 Initializing Files and Variables

The .zip archive uploaded in **Section 1** of this notebook must first be extracted to a directory within the notebook workspace. This can be done using the cell below.

In [None]:
from zipfile import ZipFile
from glob import glob
import os

# Extract the images to a folder within the workspace
img_zip_path = 'test_images.zip'
img_dir_path = 'img/'
if not os.path.exists(img_dir_path):
    with ZipFile(img_zip_path, 'r') as zip_ref:
        zip_ref.extractall(img_dir_path)

# Check that the .zip archive contains all 50 images
TOTAL_IMAGES = 50
assert len(glob(img_dir_path + '*/*.png')) == TOTAL_IMAGES
print("The .zip was successfully extracted, and contains the required images.")
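
The extract-and-count check can be tried on a throwaway archive before running it on the real data. The sketch below assumes the same layout as `test_images.zip` (class directories containing `.png` files). The archive contents, the class directory names, and the helper `count_extracted_pngs` are made up for illustration.

```python
import os
import tempfile
from glob import glob
from zipfile import ZipFile

def count_extracted_pngs(zip_path, out_dir):
    """Extract `zip_path` into `out_dir` and count PNGs one directory deep."""
    with ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(out_dir)
    return len(glob(os.path.join(out_dir, '*', '*.png')))

# Build a throwaway archive with two hypothetical class directories,
# each holding three empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, 'demo.zip')
    with ZipFile(demo_zip, 'w') as zf:
        for cls in ('class_a', 'class_b'):
            for i in range(3):
                zf.writestr('{}/{}.png'.format(cls, i), b'')
    demo_count = count_extracted_pngs(demo_zip, os.path.join(tmp, 'img'))

print(demo_count)  # 6
```

A count that differs from the expected total usually points to a different directory depth inside the archive or to a different file extension.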

Additionally, some constant values must be specified which define how the images are processed.


* `CLASS_NAMES`: This will be used to convert a class name string (taken from the directory an image is in) into a vector of True/False values, which represents the image's label.
* `IMG_WIDTH`, `IMG_HEIGHT`: Specifies image resizing dimensions.
* `NUM_CHANNELS`: Specifies whether input images should be considered as grayscale or RGB.

In [None]:
import numpy as np
CLASS_NAMES = np.sort(np.array([os.path.basename(path) 
                                for path in glob(img_dir_path+'/*')]))

IMG_WIDTH = 224
IMG_HEIGHT = 224
NUM_CHANNELS = 3
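
How a class name becomes a True/False vector can be seen with plain NumPy. The class names below are hypothetical stand-ins, since the real ones are read from the directory names, and `label_vector` is a helper named here only for illustration.

```python
import numpy as np

# Hypothetical class names; the notebook derives the real ones from
# the directories found inside the extracted archive.
class_names = np.sort(np.array(['not_circular', 'circular']))

def label_vector(label_name, classes):
    """Return a boolean vector that is True where `classes` equals `label_name`."""
    return label_name == classes

vec = label_vector('circular', class_names)
print(class_names)  # ['circular' 'not_circular']
print(vec)          # [ True False]
```

Sorting the class names keeps the position of each class stable between runs, so the same index always refers to the same class.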

### 4.3 Loading Images into a Dataset

Images will be loaded into an object of the TensorFlow class **Dataset** [[21]](https://www.tensorflow.org/api_docs/python/tf/data/Dataset). The Dataset class contains methods such as **batch**, **shuffle**, and **filter** which make it easier to manipulate a larger dataset. Dataset objects are used heavily within TensorFlow tutorials and guides, such as ones for building an input pipeline for training and testing [[22]](https://www.tensorflow.org/guide/data).

First, a Dataset object is created which contains the filepath of each image.


In [None]:
path_ds = tf.data.Dataset.list_files(str(img_dir_path+'*/*'))

Then, a custom function is created to load an image from its filepath and preprocess it. 

> **Note:** The function below uses the **transpose** operation. TensorFlow's image functions and Python packages such as matplotlib represent images as arrays of shape `(height, width, 3)`, where 3 is the number of channels. The model converted from ONNX, however, expects channels-first input of shape `(3, height, width)`. The transpose moves the channel axis to the front so that loaded images match the shape the model expects.

In [None]:
def process_image_path(image_path):
    # Read and preprocess the image
    img = tf.io.read_file(image_path)
    img = tf.image.decode_png(img, channels=NUM_CHANNELS)
    img = tf.image.resize(img, [IMG_HEIGHT, IMG_WIDTH])  # size is [height, width]
    img = tf.transpose(img, perm=[2, 0, 1])  # (224, 224, 3) -> (3, 224, 224)

    # Extract the class name from the directory
    label_name = tf.strings.split(image_path, os.path.sep)[-2]
    # Compare the string to the list of classes to get a True/False label vector
    label_bool = (label_name == CLASS_NAMES)
    # Change shape from (2,) array to (1, 2) array
    label_bool = tf.expand_dims(label_bool, axis=0)

    return img, label_bool
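
The effect of the transpose in the function above can be checked on a small NumPy array. This is only a sketch with a tiny made-up array standing in for an image. `np.transpose` with axes `(2, 0, 1)` mirrors `tf.transpose` with `perm=[2, 0, 1]`.

```python
import numpy as np

# A tiny stand-in image: height 2, width 4, 3 channels (channels last).
hwc = np.arange(2 * 4 * 3).reshape(2, 4, 3)

# Move the channel axis to the front, as perm=[2, 0, 1] does above.
chw = np.transpose(hwc, (2, 0, 1))

# Undo it (for example before plotting): channels back to the last axis.
restored = np.transpose(chw, (1, 2, 0))

print(hwc.shape, chw.shape, restored.shape)  # (2, 4, 3) (3, 2, 4) (2, 4, 3)
```

The value at row `r`, column `c`, channel `k` of the original array ends up at `chw[k, r, c]`, so no pixel data is changed, only the axis order.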

The Dataset method **map** is then used to apply the custom function, such that items within the filepath Dataset are mapped to an image/label Dataset. 

In [None]:
labeled_ds = path_ds.map(process_image_path)

To demonstrate that the images have been imported correctly, the following cell displays all 50 images in a 5x10 grid.

In [None]:
import matplotlib.pyplot as plt

def show_dataset(ds):
    plt.figure(figsize=(24,12))

    # Iterate through subset of images
    ds_iterator = iter(ds)
    for n in range(TOTAL_IMAGES):
        # Returns a (3, 224, 224) image tensor and a (1, 2) label tensor
        x, y = next(ds_iterator)

        # Convert (1, 2) label tensor into (2,) label array
        y = np.squeeze(y.numpy())

        # Convert (3, 224, 224) tensor into (224, 224, 3) image
        x = np.transpose(np.squeeze(x.numpy()), [1, 2, 0])

        # Scale image to convert from [0, 255] to [0, 1]
        x = x/255

        # Plot image with its label
        ax = plt.subplot(5,10,n+1)
        plt.imshow(x)
        plt.title(CLASS_NAMES[y==1][0].title())
        plt.axis('off')

show_dataset(labeled_ds)

### 4.4 Validation of FrozenGraph .pb Model

To run inference on a FrozenGraph, the TensorFlow 2.0 migration guide [[19]](https://www.tensorflow.org/guide/migrate#a_graphpb_or_graphpbtxt) recommends wrapping the entire graph in a `concrete_function`. The TensorFlow documentation is unclear on what a `concrete_function` is _(the link within the documentation results in a 404 Not Found error)_, but in practice doing this turns the graph into a **callable function for running inference**.

To do this, three steps are needed:
1. Loading the FrozenGraph .pb file into a `graph_def` variable.

In [None]:
pb_path = "hasCircularShape_chartObjects_googlenet.pb"
graph_def = tf.compat.v1.GraphDef()
# Read the serialized graph and close the file handle afterwards
with open(pb_path, 'rb') as f:
    loaded = graph_def.ParseFromString(f.read())

2. Specifying a custom function for wrapping the loaded frozen graph. *(This is taken directly from the TensorFlow 2.0 migration guide [[19]](https://www.tensorflow.org/guide/migrate#a_graphpb_or_graphpbtxt).)*

In [None]:
def wrap_frozen_graph(graph_def, inputs, outputs):
    def _imports_graph_def():
        tf.compat.v1.import_graph_def(graph_def, name="")

    wrapped_import = tf.compat.v1.wrap_function(_imports_graph_def, [])
    import_graph = wrapped_import.graph

    return wrapped_import.prune(
        tf.nest.map_structure(import_graph.as_graph_element, inputs),
        tf.nest.map_structure(import_graph.as_graph_element, outputs))

3. Finally, using this function to wrap the graph. It requires explicitly naming the input and output nodes, which were found during model conversion in **Section 2**. The ":0" appended to the name specifies which output is desired [[23]](https://stackoverflow.com/questions/40925652/in-tensorflow-whats-the-meaning-of-0-in-a-variables-name). This becomes relevant for nodes with more than one output, but is not relevant here.

In [None]:
if loaded:
    pb_func = wrap_frozen_graph(graph_def, inputs='data:0', outputs='softmax:0')
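
The `name:index` convention used for `'data:0'` and `'softmax:0'` can be illustrated with a small string helper. `split_tensor_name` is a hypothetical function written for this explanation and is not a TensorFlow API.

```python
def split_tensor_name(tensor_name):
    """Split 'op_name:index' into (op_name, index); index defaults to 0."""
    op_name, sep, index = tensor_name.rpartition(':')
    if not sep:
        # No ':' present, so treat the whole string as the operation name.
        return tensor_name, 0
    return op_name, int(index)

print(split_tensor_name('data:0'))     # ('data', 0)
print(split_tensor_name('softmax:0'))  # ('softmax', 0)
```

Both nodes used here have a single output, so the index is always 0.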

The function `pb_func` can be called while iterating through the labeled dataset created earlier. It returns the values of the output node, which in this case is a **softmax** layer. The final result, `y_likelihoods_pb`, is a list of likelihood values for each image in the dataset. This list has the same shape as `y_labels_pb`, the list of label values produced when the dataset was created.

In [None]:
y_likelihoods_pb, y_labels_pb = [], []
for x, y in labeled_ds:
    y_softmax_pb = pb_func(x)  # Run inference

    y_likelihoods_pb.extend(y_softmax_pb.numpy())  # Get arrays from tensors
    y_labels_pb.extend(y.numpy())

The predicted labels and actual labels can be determined from these lists using the **argmax** function. 

In [None]:
import numpy as np

y_predictions_pb = np.argmax(y_likelihoods_pb, axis=1)
y_actual_pb = np.argmax(y_labels_pb, axis=1)

print("Predicted labels: {}".format(y_predictions_pb))
print("Actual labels: {}".format(y_actual_pb))
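
As a small illustration of this step, the sketch below applies **argmax** to made-up likelihoods and one-hot labels for three images. The numbers are invented and do not come from the model.

```python
import numpy as np

# Hypothetical softmax outputs for three images and two classes.
likelihoods = [np.array([0.9, 0.1]),
               np.array([0.2, 0.8]),
               np.array([0.6, 0.4])]

# Matching one-hot boolean labels, one row per image.
labels = [np.array([True, False]),
          np.array([False, True]),
          np.array([False, True])]

# argmax along axis 1 picks the index of the largest entry in each row.
predicted = np.argmax(likelihoods, axis=1)
actual = np.argmax(labels, axis=1)

print(predicted)  # [0 1 0]
print(actual)     # [0 1 1]
```

In this toy case the third image is misclassified: the model favours class 0 while the label says class 1.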

Finally, evaluation metrics can be calculated from the pair of label lists.

In [None]:
TP = tf.math.count_nonzero(y_predictions_pb * y_actual_pb)
TN = tf.math.count_nonzero((y_predictions_pb - 1) * (y_actual_pb - 1))
FP = tf.math.count_nonzero(y_predictions_pb * (y_actual_pb - 1))
FN = tf.math.count_nonzero((y_predictions_pb - 1) * y_actual_pb)

precision_pb = TP / (TP + FP)
recall_pb = TP / (TP + FN)
f1_pb = 2 * precision_pb * recall_pb / (precision_pb + recall_pb)

print("Precision: {}, Recall: {}, F1 Score: {}".format(precision_pb, 
                                                       recall_pb, 
                                                       f1_pb))
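
The same counting scheme can be worked through by hand on a toy example. The sketch below uses NumPy instead of `tf.math.count_nonzero` and adds a guard against division by zero, which the cell above does not have. The function name `binary_metrics` and the label values are assumptions made for illustration.

```python
import numpy as np

def binary_metrics(predicted, actual):
    """Return (precision, recall, f1) for 0/1 label arrays."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)

    # Products are nonzero only where the named condition holds.
    tp = np.count_nonzero(predicted * actual)        # predicted 1, actual 1
    fp = np.count_nonzero(predicted * (actual - 1))  # predicted 1, actual 0
    fn = np.count_nonzero((predicted - 1) * actual)  # predicted 0, actual 1

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy labels: 2 true positives, 1 false positive, 1 false negative.
print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

With two true positives, one false positive, and one false negative, precision, recall, and F1 all come out to 2/3.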

### 4.5 Validation of TensorFlow Lite .tflite Model

Running inference on the .tflite model is straightforward, and done using the **TensorFlow Lite Interpreter** first introduced in **Section 3.1**. 

The steps shown below _(using the Python API)_ do not vary substantially from what would be used in the C++, Java, Swift, or Objective-C APIs [[24]](https://www.tensorflow.org/lite/guide/inference#load_and_run_a_model_in_c) [[25]](https://www.tensorflow.org/lite/guide/inference#load_and_run_a_model_in_java). The only significant distinction between the Python code below and a theoretical mobile implementation would be the loading and preprocessing of input images from the mobile device's local storage.

The first step for running inference using the .tflite model involves creating an interpreter instance from the model path.

In [None]:
tflite_path = "hasCircularShape_chartObjects_googlenet.tflite"
interpreter = tf.lite.Interpreter(tflite_path)

Next, memory is allocated for the input and output tensors.

In [None]:
interpreter.allocate_tensors()

Following this, the indices of the model's input and output nodes are extracted from the interpreter. These are used to feed input images into the model and to read likelihoods from the output node.

In [None]:
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

input_index = input_details[0]['index']
output_index = output_details[0]['index']

As with running inference on the previous .pb model, the same labeled dataset is iterated through. There are some distinctions in this case, however:
* The input image must be reshaped from a `(3, 224, 224)` array to a `(1, 3, 224, 224)` array. This may just be a quirk of the TensorFlow Lite Interpreter, as the .pb interpreter did not have this requirement.
* The interpreter requires explicitly setting the input, invoking the interpreter, and getting the output. For the .pb inference, these steps were bundled together into a single function call.
* The reference held by the variable `x` is deleted with `del(x)` before the interpreter is invoked. Because `set_tensor` copies the input data, this is a precaution rather than a strict requirement.


In [None]:
y_likelihoods_tflite, y_labels_tflite = [], []

for x, y in labeled_ds:
    # (3, 224, 224) -> (1, 3, 224, 224)
    x = tf.expand_dims(x, 0)

    # Explicitly set input tensor
    interpreter.set_tensor(input_index, x)
  
    # Free up numpy reference to internal tensor, then invoke
    del(x)
    interpreter.invoke()

    # Explicitly get the value stored within the output tensor
    y_softmax_tflite = interpreter.get_tensor(output_index)

    y_likelihoods_tflite.extend(y_softmax_tflite)
    y_labels_tflite.extend(y.numpy())

The remaining cells for evaluating the inference results are identical to those found within **Section 4.4**.



In [None]:
import numpy as np

y_predictions_tflite = np.argmax(y_likelihoods_tflite, axis=1)
y_actual_tflite = np.argmax(y_labels_tflite, axis=1)

print("Predicted labels: {}".format(y_predictions_tflite))
print("Actual labels: {}".format(y_actual_tflite))

In [None]:
TP = tf.math.count_nonzero(y_predictions_tflite * y_actual_tflite)
TN = tf.math.count_nonzero((y_predictions_tflite - 1) * (y_actual_tflite - 1))
FP = tf.math.count_nonzero(y_predictions_tflite * (y_actual_tflite - 1))
FN = tf.math.count_nonzero((y_predictions_tflite - 1) * y_actual_tflite)

precision_tflite = TP / (TP + FP)
recall_tflite = TP / (TP + FN)
f1_tflite = 2 * precision_tflite * recall_tflite / (precision_tflite + recall_tflite)

print("Precision: {}, Recall: {}, F1 Score: {}".format(precision_tflite, 
                                                       recall_tflite, 
                                                       f1_tflite))
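
Since the goal is to confirm that predictions stay consistent across the converted models, the two sets of likelihoods can also be compared directly. The sketch below is one possible check and assumes both lists are in the same image order. Note that `tf.data.Dataset.list_files` shuffles file names by default, so a fixed order (for example `shuffle=False`) would be needed for a per-image comparison. The function name `compare_models`, the tolerance, and the sample values are assumptions.

```python
import numpy as np

def compare_models(likelihoods_a, likelihoods_b, atol=1e-5):
    """Compare two lists of per-image likelihood rows.

    Returns (same_labels, max_abs_diff, within_tolerance).
    """
    a = np.asarray(likelihoods_a, dtype=np.float64)
    b = np.asarray(likelihoods_b, dtype=np.float64)

    # Do both models pick the same class for every image?
    same_labels = bool(np.array_equal(np.argmax(a, axis=1),
                                      np.argmax(b, axis=1)))
    # Largest absolute difference between any pair of likelihood values.
    max_abs_diff = float(np.max(np.abs(a - b)))
    return same_labels, max_abs_diff, max_abs_diff <= atol

# Hypothetical likelihoods from two models for two images.
demo_a = [[0.9, 0.1], [0.2, 0.8]]
demo_b = [[0.9000001, 0.0999999], [0.2, 0.8]]
print(compare_models(demo_a, demo_b))
```

In the notebook, the arguments would be `y_likelihoods_pb` and `y_likelihoods_tflite`, provided both were collected in the same image order.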

# 5 Future Work

Now that the .tflite file has been validated as producing the expected inference results, the model can be further optimized, or incorporated into an Android/iOS application.

### 5.1 Further Optimization

TensorFlow provides additional tools for further optimizing the model for computation requirements and disk usage [[26]](https://www.tensorflow.org/lite/guide/get_started#4_optimize_your_model_optional).

### 5.2 Integration into Mobile Application

TensorFlow provides two QuickStart guides for integrating .tflite models into Android [[27]](https://www.tensorflow.org/lite/guide/android) and iOS [[28]](https://www.tensorflow.org/lite/guide/ios).