**Converting Segmentation Models into ONNX Format**

The NVIDIA **DeepStream** SDK currently supports only the **Caffe, ONNX and UFF** model formats, so the **Keras and TensorFlow** models must be converted to **ONNX NCHW** format for deployment.

In [0]:
# Install tensorflow-gpu 1.15 
!pip install tensorflow-gpu==1.15

**Keras to ONNX: Prisma-net**

Install **keras2onnx** package for converting the **prisma-net keras model to onnx** format.

In [0]:
!pip install git+https://github.com/microsoft/onnxconverter-common
!pip install git+https://github.com/onnx/keras-onnx

In [0]:
from tensorflow.keras.models import load_model
import tensorflow as tf
from tensorflow.keras.layers import Activation, Lambda, Reshape, Permute
from tensorflow.keras.models import Model

In [0]:
import onnx, os
os.environ['TF_KERAS'] = '1' # Use the tf.keras backend
import numpy as np
import keras2onnx

In [0]:
def bilinear_resize(x, rsize):
  return tf.image.resize_bilinear(x, [rsize,rsize], align_corners=True)

Load the keras **prisma-net** model

In [0]:
prisma_model=load_model('/content/prisma-net-15-0.08.hdf5')

Add a **permute** layer to convert the **output** into **NCHW** format.

In [0]:
nchw=Permute((3,1,2))(prisma_model.output)
nchw_model=Model(inputs=prisma_model.input, outputs=nchw)
nchw_model.summary()
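
The index convention in `Permute((3,1,2))` is easy to misread, so here is a minimal NumPy sketch of the same reordering. It assumes a made-up NHWC tensor of shape (2, 4, 5, 3) and relies on the Keras convention that `Permute` indices are 1-based and leave out the batch axis, whereas a transpose permutation such as `[0,3,1,2]` is 0-based and includes it. The array names are illustrative only and are not part of the notebook.

```python
import numpy as np

# Hypothetical NHWC tensor: batch of 2, height 4, width 5, 3 channels.
nhwc = np.arange(2 * 4 * 5 * 3).reshape(2, 4, 5, 3)

# Transpose-style permutation: 0-based and includes the batch axis.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

# Keras Permute((3, 1, 2)) describes the same reordering, but its indices
# are 1-based and skip the batch axis; prepending 0 gives the full perm.
keras_dims = (3, 1, 2)
full_perm = (0,) + keras_dims
nchw_from_keras_dims = np.transpose(nhwc, full_perm)

print(nchw.shape)                                   # (2, 3, 4, 5)
print(np.array_equal(nchw, nchw_from_keras_dims))   # True
```

The element at `nchw[n, c, h, w]` is the element at `nhwc[n, h, w, c]`; only the axis order changes, not the values.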

Convert the keras model to **onnx** format

In [0]:
onnx_prisma = keras2onnx.convert_keras(nchw_model,'prisma_nchw', channel_first_inputs=['input_3'], target_opset=7)
keras2onnx.save_model(onnx_prisma, 'prisma_nchw.onnx')
onnx.checker.check_model(onnx_prisma)

**Note:** Keep the **target opset** as low as the model's operators allow, so that the exported model stays compatible with the IR version supported by the **inference engine** parser.

**Tensorflow to ONNX: DeeplabV3+**

Install **tf2onnx** package for converting the **deeplab tensorflow frozen model to onnx** format.

In [0]:
!pip install -U tf2onnx

Add a **transpose** (permute) op to get the output in **NCHW** format

In [0]:
import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile

GRAPH_PB_PATH = '/content/transform_deeplab_graph_fin4.pb'
with tf.Session() as sess:

   print("Loading graph...")
   with gfile.FastGFile(GRAPH_PB_PATH,'rb') as f:
       graph_def = tf.GraphDef()
       # Parse the frozen graph while the file is still open
       graph_def.ParseFromString(f.read())
   sess.graph.as_default()
   tf.import_graph_def(graph_def, name='')
   
   # Add transpose layer for nchw output
   output2 = tf.transpose(tf.get_default_graph().get_tensor_by_name("ResizeBilinear_2:0"), perm=[0,3,1,2])

   print("Writing graph...")
   tf.train.write_graph(tf.get_default_graph().as_graph_def(), 'deeplab_nchw_frozen','deeplab_nchw.pb',as_text=False)

**Convert** the tensorflow frozen graph to **onnx** format

In [0]:
!python -m tf2onnx.convert --graphdef /content/deeplab_nchw_frozen/deeplab_nchw.pb --output deeplab_channel.onnx --inputs MobilenetV2/MobilenetV2/input:0 --inputs-as-nchw MobilenetV2/MobilenetV2/input:0 --outputs transpose:0

**Optimize** the onnx model by **folding batch norm** layers

In [0]:
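import onnx
# Note: onnx.optimizer was removed in ONNX 1.9; newer releases need the separate onnxoptimizer package
from onnx import optimizer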

# Load the model to be optimized.
model_path = '/content/deeplab_channel.onnx'
original_model = onnx.load(model_path)

# Fuse the batchnorm layers into conv
optim_passes = ['fuse_bn_into_conv']

# Apply the optimization on the original model
optimized_deeplab = optimizer.optimize(original_model, optim_passes)
onnx.checker.check_model(optimized_deeplab)
onnx.save(optimized_deeplab, '/content/deeplab_nchw.onnx')
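
To make the effect of folding batch norm into the convolution concrete, the arithmetic can be sketched in NumPy. This is an illustration of the underlying algebra, not the code of the optimizer pass itself. It assumes a 1x1 convolution written as a matrix product, and all shapes and parameter values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 pixels, 3 input channels, 4 output channels.
x = rng.normal(size=(8, 3))
w = rng.normal(size=(3, 4))   # 1x1 conv weights written as a matrix
b = rng.normal(size=4)        # conv bias

# Hypothetical batch norm parameters, one value per output channel.
gamma = rng.normal(size=4)
beta = rng.normal(size=4)
mean = rng.normal(size=4)
var = rng.uniform(0.5, 2.0, size=4)
eps = 1e-5

def conv_then_bn(x):
    """Original two-step computation: convolution followed by batch norm."""
    y = x @ w + b
    return gamma * (y - mean) / np.sqrt(var + eps) + beta

# Fold the batch norm scale and shift into the conv weights and bias.
scale = gamma / np.sqrt(var + eps)
w_fused = w * scale                   # broadcasts over the output-channel axis
b_fused = (b - mean) * scale + beta

def fused_conv(x):
    """Single convolution that reproduces conv + batch norm."""
    return x @ w_fused + b_fused

max_diff = np.max(np.abs(conv_then_bn(x) - fused_conv(x)))
print(np.allclose(conv_then_bn(x), fused_conv(x)))
```

Because the batch norm parameters are constants at inference time, the fused convolution gives the same output up to floating-point rounding while removing one layer from the graph.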

**Note:** Ensure that the **inputs are in NCHW** format during conversion by using the appropriate command-line arguments, e.g. **inputs-as-nchw**. For the outputs, add a **permute** layer to convert them to NCHW format. These steps **prevent** the addition of **redundant transpose** layers in the converted model. The resulting ONNX models can then be run directly with the **NVIDIA DeepStream** inference engine.