# Converting PyTorch to TensorFlow Lite for xCORE Using ONNX

ONNX is an open format for representing machine learning models. The conversion runs in four stages: PyTorch to ONNX, ONNX to TensorFlow, TensorFlow to TensorFlow Lite, and finally xformer, which optimises the TensorFlow Lite model for xCORE.

In [19]:
!pip install torch
!pip install tensorflow
!pip install onnx
!pip install nvidia-pyindex
!pip install onnx-graphsurgeon
!pip install polygraphy
!pip install onnxruntime
!pip install onnxsim
!pip install simple_onnx_processing_tools
!pip install protobuf==3.20.3
!pip install h5py==3.7
!pip install onnx2tf
!pip install onnx-tf
!pip install tensorflow-probability



## Import PyTorch Model

This example uses the pretrained mobilenet_v2 model, loaded through PyTorch Hub.

In [4]:
import torch

model = torch.hub.load('pytorch/vision:v0.10.0', 'mobilenet_v2', pretrained=True)

Using cache found in /Users/salmankhan/.cache/torch/hub/pytorch_vision_v0.10.0


## Convert to ONNX


In [5]:
batch_size = 8
channels = 3
height = 224
width = 224

sample_input = torch.rand((batch_size, channels, height, width))

onnx_model_path = "mobilenet_v2.onnx"

torch.onnx.export(
    model,
    sample_input,
    onnx_model_path,
    input_names=['input'],
    output_names=['output']
)

## Representative Dataset

To convert a model into a TFLite flatbuffer, the converter needs a representative dataset to calibrate quantisation. Refer to [Converting a keras model into an xcore optimised tflite model](https://colab.research.google.com/github/xmos/ai_tools/blob/develop/docs/notebooks/keras_to_xcore.ipynb) for more details on this.

In [6]:
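import numpy as np

def representative_dataset():
    # Yields 100 random batches in (batch, height, width, channels) order.
    # height, width and channels come from the ONNX export cell above.
    batch_size = 8
    for _ in range(100):
        data = np.random.uniform(-0.1, 0.001, (batch_size, height, width, channels))
        yield [data.astype(np.float32)]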
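
Random data is enough to demonstrate the flow, but calibration is normally done with real samples. The sketch below is one possible way to build the generator from images stored in PyTorch's (batch, channels, height, width) order, transposing them to the (batch, height, width, channels) order used above. The helper name `make_representative_dataset` and the `calibration_images` array are hypothetical stand-ins, not part of the original notebook.

```python
import numpy as np

def make_representative_dataset(samples_nchw, batch_size=8):
    """Return a generator function that yields NHWC float32 batches.

    samples_nchw: array of shape (N, C, H, W), the layout PyTorch uses.
    """
    # Move the channel axis to the end: (N, C, H, W) -> (N, H, W, C).
    samples_nhwc = np.transpose(samples_nchw, (0, 2, 3, 1)).astype(np.float32)

    def representative_dataset():
        # Yield full batches only, so every batch has the same fixed size.
        for start in range(0, len(samples_nhwc) - batch_size + 1, batch_size):
            yield [samples_nhwc[start:start + batch_size]]

    return representative_dataset

# Hypothetical stand-in for real calibration images (20 samples, NCHW).
rng = np.random.default_rng(0)
calibration_images = rng.random((20, 3, 224, 224), dtype=np.float32)

dataset_fn = make_representative_dataset(calibration_images)
batches = list(dataset_fn())
```

The returned function can be assigned to `converter.representative_dataset` in the same way as the random generator. With 20 samples and a batch size of 8, the last 4 samples are dropped rather than yielded as a short batch.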

## Using onnx-tensorflow (no longer maintained)

onnx-tensorflow is the official ONNX package for this conversion, but it is no longer maintained: https://github.com/onnx/onnx-tensorflow

In [30]:
import onnx
from onnx_tf.backend import prepare

saved_model_path = "saved_model"
onnx_model = onnx.load(onnx_model_path)
prepare(onnx_model).export_graph(saved_model_path)



In [31]:
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_path)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 
converter.inference_output_type = tf.int8

tflite_model = converter.convert()

# Save the model.
tflite_model_path = 'mobilenet_v2.tflite'
with open(tflite_model_path, 'wb') as f:
  f.write(tflite_model)

RuntimeError: tensorflow/lite/kernels/conv.cc:351 input_channel % filter_input_channel != 0 (2 != 0)Node number 2 (CONV_2D) failed to prepare.
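
A plausible reading of this error, offered as an assumption rather than a confirmed diagnosis, is a layout mismatch: the representative dataset yields (batch, height, width, channels) tensors, while the model exported from PyTorch takes (batch, channels, height, width). If a 224-sized axis ends up where the convolution expects its 3 input channels, the divisibility check in the message fails with a remainder of 2. The sketch below reproduces only that arithmetic; `channel_remainder` is a hypothetical helper, not TFLite code.

```python
def channel_remainder(input_channels, filter_input_channels):
    """Return input_channels % filter_input_channels, the quantity the error reports."""
    return input_channels % filter_input_channels

# The first convolution of mobilenet_v2 expects 3 input channels (RGB).
filter_input_channels = 3

# Channel axis correctly sized: remainder is 0, so the check would pass.
remainder_ok = channel_remainder(3, filter_input_channels)

# A 224-sized spatial axis mistaken for channels: remainder is 2,
# matching the "(2 != 0)" in the error message.
remainder_bad = channel_remainder(224, filter_input_channels)
```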

## Using onnx2tf

This section uses the unofficial onnx2tf package: https://github.com/PINTO0309/onnx2tf

### Convert ONNX to Keras
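
The next cell uses a `keras_model` variable, but the cell that creates it is not shown here. The sketch below is one way it could be produced, assuming that `onnx2tf.convert()` accepts `input_onnx_file_path` and `output_folder_path` and returns a Keras model, as the onnx2tf README describes for in-script usage. The wrapper name `onnx_to_keras` and the output folder name are hypothetical.

```python
def onnx_to_keras(onnx_path, output_dir):
    """Convert an ONNX file with onnx2tf and return the resulting Keras model.

    Assumption: onnx2tf.convert() takes these two keyword arguments and
    returns a tf.keras.Model; check the README of the installed version.
    """
    # Imported lazily so the helper can be defined without onnx2tf installed.
    import onnx2tf

    return onnx2tf.convert(
        input_onnx_file_path=onnx_path,
        output_folder_path=output_dir,
    )
```

Under those assumptions, `keras_model = onnx_to_keras(onnx_model_path, "keras_model")` would supply the model used in the following cell.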

### Convert Keras to TFLite

In [11]:
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 
converter.inference_output_type = tf.int8

tflite_model = converter.convert()

# Save the model.
tflite_model_path = 'mobilenet_v2.tflite'
with open(tflite_model_path, 'wb') as f:
  f.write(tflite_model)

fully_quantize: 0, inference_type: 6, input_inference_type: INT8, output_inference_type: INT8
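
The converter settings above request int8 inputs, outputs and operators, which is why the representative dataset matters: the value ranges it produces determine how floats map to integers. The sketch below illustrates the general affine scheme (a scale and a zero point per tensor) using the same range as the random dataset, -0.1 to 0.001. It is a simplified illustration with hypothetical helper names; the calibration performed inside the TFLite converter may differ in detail.

```python
import numpy as np

def int8_quant_params(observed_min, observed_max):
    """Derive scale and zero point so the observed range spans int8."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo = min(observed_min, 0.0)
    hi = max(observed_max, 0.0)
    scale = (hi - lo) / 255.0
    zero_point = int(round(-128 - lo / scale))
    return scale, zero_point

def quantize(x, scale, zero_point):
    """Map float values to int8, clipping anything outside the range."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return (q.astype(np.float32) - zero_point) * scale

scale, zero_point = int8_quant_params(-0.1, 0.001)
x = np.array([-0.1, 0.0, 0.001], dtype=np.float32)
q = quantize(x, scale, zero_point)        # int8 values: -128, 124, 127
x_hat = dequantize(q, scale, zero_point)  # close to x, within half a step
```

A dataset whose range is much narrower or wider than the real inputs would give a poorly matched scale, which is the practical reason to calibrate with realistic samples.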


## Analysing Models

The functions below count the operators in a TFLite model and print the counts as a table.

In [12]:
import io
from contextlib import redirect_stdout

def get_operator_counts(model_content):
    with io.StringIO() as buf, redirect_stdout(buf):
        tf.lite.experimental.Analyzer.analyze(model_content=model_content)
        model_structure = buf.getvalue()

    operators = [op.strip().split(" ")[1].split("(")[0] for op in model_structure.split("\n") if "Op#" in op]
    op_counts = {}
    for operator in operators:
        if operator in op_counts:
            op_counts[operator] = op_counts[operator]+1
        else:
            op_counts[operator] = 1
        
    return (len(operators), op_counts)

def print_operator_counts(model_content):
    total_op_count, op_counts = get_operator_counts(model_content)
    print(f"{'Operator'.upper():<20} {'Count'.upper():>6}")
    print("-"*20 + " " + "-"*6)
    
    for operator, count in op_counts.items():
        print(f"{operator.lower():<20} {count:>6}")
        
    print("-"*20 + " " + "-"*6)
    print(f"{'Total'.upper():<20} {total_op_count:>6}")
    print("-"*20 + " " + "-"*6)

In [13]:
print_operator_counts(tflite_model)

OPERATOR              COUNT
-------------------- ------
pad                       5
conv_2d                  35
depthwise_conv_2d        17
add                      10
mean                      1
fully_connected           1
-------------------- ------
TOTAL                    69
-------------------- ------
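
The parsing step in `get_operator_counts` can be tried without a model by running it on a captured string. The sketch below is an alternative using a regular expression and `collections.Counter`. The sample text is a hand-written assumption about what the Analyzer's `Op#` lines look like, not real output, and the exact format may vary between TensorFlow versions.

```python
import re
from collections import Counter

# Hand-written sample in the assumed "Op#<n> NAME(...)" line format.
SAMPLE_ANALYZER_OUTPUT = """\
Subgraph#0 main(T#0) -> [T#9]
  Op#0 PAD(T#0, T#1) -> [T#4]
  Op#1 CONV_2D(T#4, T#2, T#3) -> [T#5]
  Op#2 DEPTHWISE_CONV_2D(T#5, T#6, T#7) -> [T#8]
  Op#3 CONV_2D(T#8, T#2, T#3) -> [T#9]
"""

# Capture the operator name that follows "Op#<n>" and precedes "(".
OP_LINE = re.compile(r"^\s*Op#\d+\s+(\w+)\(")

def count_operators(analyzer_text):
    """Count operator names found on lines that start with 'Op#<n>'."""
    names = []
    for line in analyzer_text.splitlines():
        match = OP_LINE.match(line)
        if match:
            names.append(match.group(1))
    return Counter(names)

sample_counts = count_operators(SAMPLE_ANALYZER_OUTPUT)
total_ops = sum(sample_counts.values())
```

Anchoring the pattern to the start of the line skips the `Subgraph#` header and any tensor listings, so only operator lines are counted.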
