# TensorFlow or Keras Model to TensorRT Using ONNX

This notebook shows the workflow for optimizing a TensorFlow or Keras model with ONNX and TensorRT. Please refer to [this tutorial from NVIDIA](https://developer.nvidia.com/blog/speeding-up-deep-learning-inference-using-tensorflow-onnx-and-tensorrt/) for more information.

The steps needed to optimize a TensorFlow/Keras model with ONNX and TensorRT:
1. Convert the TensorFlow/Keras model to a .pb file.
2. Convert the .pb file to the ONNX format.
3. Create a TensorRT engine. 
4. Run inference from the TensorRT engine.


## Step 1: Convert the TensorFlow/Keras model to a .pb file.
This step freezes the graph and saves it in .pb format using `keras_to_pb()`, which takes 3 arguments:

- `model`: The Keras model.
- `output_filename`: The output .pb file name.
- `output_node_names`: The output nodes of the network. If `None`, the function uses the name of the last layer as the output node.

It returns the input name and the output node names, which Step 2 passes to the converter.
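
For orientation, freezing a Keras model under TensorFlow 2 could look roughly like the sketch below. This is not the code in `keras_to_pb_tf2.py` (the log output further down points to the `tf.compat.v1` utilities); it swaps in `convert_variables_to_constants_v2` and assumes a single-input model. The function name `freeze_keras_model` is hypothetical.

```python
import os


def freeze_keras_model(model, output_filename):
    """Hypothetical helper: freeze a single-input Keras model into a .pb file."""
    # Imported lazily so this module loads even where TensorFlow is missing.
    import tensorflow as tf
    from tensorflow.python.framework.convert_to_constants import (
        convert_variables_to_constants_v2,
    )

    # Trace the model into a concrete function with a fixed input signature.
    spec = tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype)
    concrete_func = tf.function(lambda x: model(x)).get_concrete_function(spec)

    # Replace every variable with a constant holding its current value.
    frozen_func = convert_variables_to_constants_v2(concrete_func)

    # Serialize the frozen GraphDef in binary form.
    tf.io.write_graph(
        frozen_func.graph.as_graph_def(),
        logdir=os.path.dirname(output_filename) or ".",
        name=os.path.basename(output_filename),
        as_text=False,
    )

    # Collect the tensor names of the frozen graph's inputs and outputs.
    input_names = [t.name for t in frozen_func.inputs]
    output_names = [t.name for t in frozen_func.outputs]
    return input_names, output_names
```

Unlike the values returned by `keras_to_pb()`, the names returned here are tensor names that already carry the `:0` suffix.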

In [1]:
# %load_ext autoreload
# %autoreload 2

from keras_to_pb_tf2 import keras_to_pb
from keras.models import load_model

# User-defined values
# Input file path
MODEL_PATH = '/home/cps/Desktop/tf2trt_with_onnx-master/facenet_keras.h5'
# Output file paths
PB_FILE_PATH = '/home/cps/Desktop/tf2trt_with_onnx-master/keras-facenet-20230208T110222Z-001/facenet_freezed.pb'
ONNX_FILE_PATH = '/home/cps/Desktop/tf2trt_with_onnx-master/keras-facenet-20230208T110222Z-001/facenet_onnx.onnx'
TRT_ENGINE_PATH = '/home/cps/Desktop/tf2trt_with_onnx-master/keras-facenet-20230208T110222Z-001/facenet_engine.plan'
# End of user-defined values



Using TensorFlow backend.


In [2]:
model = load_model(MODEL_PATH)
input_name, output_node_names = keras_to_pb(model, PB_FILE_PATH, None)


Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`




INFO:tensorflow:Froze 490 variables.
INFO:tensorflow:Converted 490 variables to const ops.


## Step 2: Convert the .pb file to the ONNX format.

The second step converts the .pb file to the ONNX format using [tf2onnx](https://github.com/onnx/tensorflow-onnx).
Install `tf2onnx` first with `pip install -U tf2onnx`.

The conversion may take more than 10 minutes to finish.  
If the command crashes, try running it in a terminal after closing Jupyter Notebook and all other applications.  

```
python -m tf2onnx.convert --input /home/jetson-tx2/code/onnx/models/facenet.pb --inputs input_1:0[1,160,160,3] --outputs Bottleneck_BatchNorm/batchnorm_1/add_1:0 --output facenet.onnx
```
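
The same command can also be assembled from Python, which makes the input and output names easier to parameterize. The sketch below uses only the flags shown in the command above; the helper names `build_tf2onnx_command` and `convert_pb_to_onnx` are hypothetical. Using `sys.executable` runs the converter with the same interpreter that executes the notebook.

```python
import subprocess
import sys


def build_tf2onnx_command(pb_path, input_name, input_shape, output_name, onnx_path):
    """Return the argument list for `python -m tf2onnx.convert`."""
    shape = ",".join(str(dim) for dim in input_shape)
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--input", pb_path,
        # tf2onnx expects tensor names, hence the ":0" suffix on node names.
        "--inputs", f"{input_name}:0[{shape}]",
        "--outputs", f"{output_name}:0",
        "--output", onnx_path,
    ]


def convert_pb_to_onnx(*args):
    """Run the conversion; raises CalledProcessError if the command fails."""
    subprocess.run(build_tf2onnx_command(*args), check=True)
```

For example, `convert_pb_to_onnx(PB_FILE_PATH, input_name, (1, 160, 160, 3), output_node_names[0], ONNX_FILE_PATH)` would run the conversion with the values from Step 1.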

In [3]:
!pip3 install -U tf2onnx

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting protobuf<4,>=3.20.2
  Downloading protobuf-3.20.3-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 3.9 MB/s eta 0:00:00
Installing collected packages: protobuf
  Attempting uninstall: protobuf
    Found existing installation: protobuf 3.19.6
    Uninstalling protobuf-3.19.6:
      Successfully uninstalled protobuf-3.19.6
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 3.20.3 which is incompatible.
Successfully installed protobuf-3.20.3

In [4]:
!python -m tf2onnx.convert --input {PB_FILE_PATH} --inputs {input_name}:0[1,160,160,3] --outputs {output_node_names[0]}:0 --output {ONNX_FILE_PATH}

/home/cps/anaconda3/envs/test_env/bin/python: Error while finding module specification for 'tf2onnx.convert' (ModuleNotFoundError: No module named 'tf2onnx')
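
The error above comes from the interpreter at `/home/cps/anaconda3/envs/test_env/bin/python`, while the `pip3` log reports a user installation. One plausible cause (an assumption, not confirmed by the notebook) is that `pip3` installed the package for a different interpreter. A small check like the following sketch shows which interpreter the kernel uses and which of the required modules it cannot find; the helper name `missing_modules` is hypothetical.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat any lookup failure as "not available".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


print("Interpreter:", sys.executable)
print("Missing:", missing_modules(["tf2onnx", "tensorrt"]))
```

If a module is reported missing, installing it with `python -m pip install -U <package>` from the same interpreter (or with the `%pip` magic inside the notebook) targets the environment the kernel actually runs in.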


## Step 3: Create a TensorRT engine from ONNX

In [6]:
from onnx_to_trt import create_engine

create_engine(ONNX_FILE_PATH, TRT_ENGINE_PATH)

ModuleNotFoundError: No module named 'tensorrt'
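
The `ModuleNotFoundError` above means the TensorRT Python bindings are not available in this environment, so the cell did not run. For reference, building an engine from an ONNX file with the TensorRT Python API might look like the sketch below. It is not the code in `onnx_to_trt.py`: the function name `build_engine_from_onnx` is hypothetical, and the calls assume a TensorRT 8.x installation.

```python
def build_engine_from_onnx(onnx_path, engine_path):
    """Hypothetical helper: parse an ONNX file and save a serialized engine."""
    # Imported lazily so this module loads even where TensorRT is missing.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # ONNX models require an explicit-batch network definition in TensorRT 8.x.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)

    # Populate the network from the ONNX file and surface parser errors.
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parsing failed:\n" + "\n".join(errors))

    # Build with the default builder configuration and serialize the result.
    config = builder.create_builder_config()
    serialized_engine = builder.build_serialized_network(network, config)
    if serialized_engine is None:
        raise RuntimeError("TensorRT could not build an engine from the network")

    with open(engine_path, "wb") as f:
        f.write(serialized_engine)
```

An engine file is specific to the GPU and TensorRT version it was built with, so it should be built on the machine that will run inference.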

## Step 4: Run inference from the TensorRT engine

The TensorRT engine runs inference in the following workflow: 

1. Allocate buffers for inputs and outputs in the GPU.
2. Copy data from the host to the allocated input buffers in the GPU.
3. Run inference in the GPU. 
4. Copy results from the GPU to the host. 
5. Reshape the results as necessary. 
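
The five steps above can be sketched as follows. This is an illustration, not the code in `inference.py`: the function name `infer_once` is hypothetical, and it assumes `numpy`, `pycuda`, a TensorRT 8.x engine (for `execute_async_v2`), `float32` data, and an engine with exactly one input binding followed by one output binding.

```python
def infer_once(engine, batch, output_shape):
    """Hypothetical helper: run one batch through a 1-input, 1-output engine."""
    # Imported lazily so this module loads even without a GPU stack installed.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as cuda

    batch = np.asarray(batch, dtype=np.float32)

    # 1. Allocate page-locked host buffers and matching device buffers.
    h_input = cuda.pagelocked_empty(batch.size, dtype=np.float32)
    h_output = cuda.pagelocked_empty(int(np.prod(output_shape)), dtype=np.float32)
    d_input = cuda.mem_alloc(h_input.nbytes)
    d_output = cuda.mem_alloc(h_output.nbytes)
    stream = cuda.Stream()
    np.copyto(h_input, batch.ravel())

    context = engine.create_execution_context()

    # 2. Copy the input from the host to the GPU.
    cuda.memcpy_htod_async(d_input, h_input, stream)

    # 3. Run inference on the GPU (assumes binding 0 is input, 1 is output).
    context.execute_async_v2(
        bindings=[int(d_input), int(d_output)],
        stream_handle=stream.handle,
    )

    # 4. Copy the result back to the host and wait for the stream to finish.
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    stream.synchronize()

    # 5. Reshape the flat output buffer to the expected shape.
    return h_output.reshape(output_shape)
```

Page-locked host memory is used because asynchronous copies only overlap with other work when the host buffer is pinned.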

Note: this is only the code needed for inference. To test FacenetTRT with a real image, see the script `test_facenet_trt.py`.


In [None]:
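import tensorrt as trt

# Assumption: `eng` is a helper module named `engine` that provides load_engine();
# the original cell used `eng` without importing it.
import engine as eng
import inference as inf

# Only report internal errors from TensorRT.
TRT_LOGGER = trt.Logger(trt.Logger.INTERNAL_ERROR)
trt_runtime = trt.Runtime(TRT_LOGGER)

# Load the engine file created in Step 3.
engine = eng.load_engine(trt_runtime, TRT_ENGINE_PATH)
print('Engine loaded successfully...')

# `samples` is not defined in this notebook; it must hold the preprocessed input data.
h_input, d_input, h_output, d_output, stream = inf.allocate_buffers(engine, 1, trt.float32)
out = inf.do_inference(engine, samples, h_input, d_input, h_output, d_output, stream, 1, 160, 160)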

