<a href="https://colab.research.google.com/github/saffie91/yolov7-tflite-conversion/blob/main/Yolov7_Model_Conversion_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# YoloV7 TFlite Conversion

First, export the model to ONNX with export.py from the YoloV7 repo:

In [None]:
!python3 export.py --weights best.pt --grid --end2end --simplify --topk-all 100 --conf-thres 0.35 --img-size 320 320 --max-wh 320

Make sure that compatible versions of these libraries are installed. tf-nightly is required to convert the model to TFLite properly.

In [None]:
import onnx
import onnxruntime as ort
import time
import cv2
import numpy as np
import random
from PIL import Image
import tensorflow as tf
import coremltools
import matplotlib.pyplot as plt
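
Before going further, it can help to record which versions are actually installed. The snippet below is a minimal sketch using only the standard library. The distribution names in `PACKAGES` are assumptions about how the libraries were installed (for example, `tf-nightly` rather than `tensorflow`), so adjust them to your environment.

```python
from importlib import metadata

# Distribution names as published on PyPI. These are assumptions; adjust them
# to your environment (for example "tensorflow" instead of "tf-nightly").
PACKAGES = ["onnx", "onnxruntime", "onnx-tf", "tf-nightly", "opencv-python"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Keeping this output next to the converted model makes it easier to reproduce a conversion that worked.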

Check the ONNX model to see whether the export caused any problems. If the model passes the checker, create an inference session.

In [None]:
#check model
onnx_model = onnx.load("best.onnx")
onnx.checker.check_model(onnx_model)

In [None]:
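#make session
so = ort.SessionOptions()
# pass the session options explicitly and run on the CPU provider
session = ort.InferenceSession('best.onnx', sess_options=so, providers=['CPUExecutionProvider'])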

The input has to match what the model expects. Every image is resized to 320x320, since that is the image size we want for the TFLite model.

Put the class names for your model in the names list.

When resizing, keep the aspect ratio by padding the image (letterboxing).

Rescale the image and prepare the input for the ONNX model.

In [None]:
#prepare the input
def letterbox(im, new_shape=(320, 320), color=(114, 114, 114), auto=True, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)

    # Compute padding
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding

    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, r, (dw, dh)

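# Class names, in the same order used during training. Fill this in before
# running the checks below: with an empty list, names[cls_id] raises IndexError.
names = []
colors = {name: [random.randint(0, 255) for _ in range(3)] for name in names}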

image=cv2.imread('inference/images/bus.jpg')
img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = img.copy()
image, ratio, dwdh = letterbox(image, auto=False)
image = image.transpose((2, 0, 1))
image = np.expand_dims(image, 0)
image = np.ascontiguousarray(image)

im = image.astype(np.float32)
im /= 255
im.shape

outname = [i.name for i in session.get_outputs()]
outname

inname = [i.name for i in session.get_inputs()]
inname

inp = {inname[0]:im}
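
The letterbox arithmetic can be checked by hand. The sketch below reproduces only the scale-and-pad calculation for a square target with `auto=False`, without touching any pixels. The function name `letterbox_geometry` and the 810x1080 example size are illustrative assumptions, not part of the notebook.

```python
def letterbox_geometry(height, width, target=320):
    """Return (ratio, (new_w, new_h), (pad_w, pad_h)) for a square target.

    pad_w and pad_h are the padding added on EACH side, matching the
    (dw, dh) tuple returned by letterbox(..., auto=False).
    """
    # Scale so that the longer side fits the target exactly.
    ratio = min(target / height, target / width)
    new_w, new_h = int(round(width * ratio)), int(round(height * ratio))
    # The remaining space is split evenly between the two sides.
    pad_w, pad_h = (target - new_w) / 2, (target - new_h) / 2
    return ratio, (new_w, new_h), (pad_w, pad_h)


# Hypothetical portrait image, 810 pixels wide and 1080 pixels high.
ratio, new_size, padding = letterbox_geometry(height=1080, width=810)
print(ratio, new_size, padding)
```

For this example the image is scaled to 240x320 and padded with 40 pixels on the left and right, so the aspect ratio is preserved.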

Display the image to check that the input looks right.

In [None]:
print(im.shape)
plt.imshow(np.moveaxis(im[0], 0,2))

Run inference on the ONNX model to see if everything looks good. Pay particular attention to the prediction scores to see whether they changed during the export.

In [None]:
#time output
start=time.time()
outputs = session.run(outname, inp)[0]
end=time.time()
print(end-start)
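
A single timed call often includes one-off initialization cost, so the number printed above can overstate the steady-state latency. A minimal sketch of a fairer measurement, assuming a hypothetical helper `time_call` that runs a few warm-up calls and then averages over several runs:

```python
import time


def time_call(fn, warmup=1, runs=10):
    """Average wall-clock seconds per call of fn(), measured after warm-up."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


# Stand-in workload so the sketch runs on its own. In this notebook the
# callable would be: lambda: session.run(outname, inp)
elapsed = time_call(lambda: sum(range(10_000)))
print(f"{elapsed * 1e3:.3f} ms per call")
```

The same helper can wrap the TF and TFLite calls later on, which makes the three timings comparable.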

In [None]:
#check results
ori_images = [img.copy()]

for i,(batch_id,x0,y0,x1,y1,cls_id,score) in enumerate(outputs):
    image = ori_images[int(batch_id)]
    box = np.array([x0,y0,x1,y1])
    box -= np.array(dwdh*2)
    box /= ratio
    box = box.round().astype(np.int32).tolist()
    cls_id = int(cls_id)
    score = round(float(score),3)
    name = names[cls_id]
    color = colors[name]
    name += ' '+str(score)
    cv2.rectangle(image,box[:2],box[2:],color,2)
    cv2.putText(image,name,(box[0], box[1] - 2),cv2.FONT_HERSHEY_SIMPLEX,0.75,[225, 255, 255],thickness=2)  

Image.fromarray(ori_images[0])
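
The loop above maps each box from the 320x320 letterboxed image back to the original image by subtracting the padding and dividing by the scale ratio. The sketch below isolates that step in plain Python. The name `unletterbox_box` and the example numbers (a hypothetical 810x1080 image) are assumptions for illustration.

```python
def unletterbox_box(box, ratio, dwdh):
    """Map (x0, y0, x1, y1) from letterboxed pixels to original image pixels."""
    dw, dh = dwdh  # padding added on each side by letterbox()
    x0, y0, x1, y1 = box
    return [
        int(round((x0 - dw) / ratio)),
        int(round((y0 - dh) / ratio)),
        int(round((x1 - dw) / ratio)),
        int(round((y1 - dh) / ratio)),
    ]


# A box covering the whole unpadded region of a letterboxed 810x1080 image.
ratio = 320 / 1080
print(unletterbox_box((40.0, 0.0, 280.0, 320.0), ratio, (40.0, 0.0)))
```

The printed box spans the full original image, `[0, 0, 810, 1080]`, which confirms that padding and scaling are undone in the right order.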

If the ONNX model looks fine, it's time to convert it to a TF model (.pb). This is easily done with the onnx_tf library. Despite the .pb suffix, best_tf2.pb is loaded as a SavedModel in the next step.

In [None]:
#convert to tf model

import onnx
from onnx_tf.backend import prepare
 
onnx_model = onnx.load("best.onnx")
tf_rep = prepare(onnx_model)
tf_rep.export_graph("best_tf2.pb")

As before, check the TF model by running the same image through it and confirming that it gives the same results. New false positives or scores that seem off suggest that something went wrong during the conversion.

In [None]:
#check tf model
model=tf.saved_model.load("best_tf2.pb")
infer = model.signatures["serving_default"]
print(infer.structured_outputs)

In [None]:
#time tf model
start=time.time()
labeling = infer(tf.constant(im))['output']
end=time.time()
print(end-start)
print("Result after saving and loading:\n", labeling)

In [None]:
#check the output
ori_images = [img.copy()]

for i,(batch_id,x0,y0,x1,y1,cls_id,score) in enumerate(labeling.numpy()):  # iterate over a NumPy copy of the tensor
    image = ori_images[int(batch_id)]
    box = np.array([x0,y0,x1,y1])
    box -= np.array(dwdh*2)
    box /= ratio
    box = box.round().astype(np.int32).tolist()
    cls_id = int(cls_id)
    score = round(float(score),3)
    name = names[cls_id]
    color = colors[name]
    name += ' '+str(score)
    cv2.rectangle(image,box[:2],box[2:],color,2)
    cv2.putText(image,name,(box[0], box[1] - 2),cv2.FONT_HERSHEY_SIMPLEX,0.75,[225, 255, 255],thickness=2)  

Image.fromarray(ori_images[0])
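
Comparing the drawn boxes by eye is a useful first check, but the scores can also be compared numerically. The sketch below is one way to do that under a strong assumption: both models return the same detections in the same order, so rows can be paired by position. The function name `max_score_drift` and the sample rows are made up for illustration.

```python
def max_score_drift(reference, candidate):
    """Largest absolute score difference between positionally matched rows.

    Each row is (batch_id, x0, y0, x1, y1, cls_id, score). Rows are paired
    by position, so both inputs must list the same detections in the same order.
    """
    if len(reference) != len(candidate):
        raise ValueError(
            f"detection counts differ: {len(reference)} vs {len(candidate)}"
        )
    drift = 0.0
    for ref, cand in zip(reference, candidate):
        if int(ref[5]) != int(cand[5]):
            raise ValueError("class ids differ for a matched row")
        drift = max(drift, abs(float(ref[6]) - float(cand[6])))
    return drift


# Made-up rows standing in for the ONNX and TF outputs.
onnx_rows = [(0, 10, 20, 110, 220, 3, 0.90), (0, 50, 60, 150, 160, 7, 0.55)]
tf_rows = [(0, 10, 20, 110, 220, 3, 0.91), (0, 50, 60, 150, 160, 7, 0.55)]
print(max_score_drift(onnx_rows, tf_rows))
```

In this notebook the call would look like `max_score_drift(outputs, labeling.numpy())`. A mismatch in the number of rows or in the class ids is itself a sign that the conversion changed the model's behavior.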

Now it's time to convert the TF model to TFLite. Using fp16 is the easier option. If you want to quantize the model to int8, however, you need a representative dataset generator that feeds calibration images to the converter.

In [None]:
#need representative dataset for quantization
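def representative_dataset_gen(dataset, ncalib=100):
    # Representative dataset generator for use with converter.representative_dataset.
    # `dataset` must yield (path, img, im0s, vid_cap, string) tuples, where img is an
    # unnormalized CHW image. Yields at most ncalib single-image batches.
    for n, (path, img, im0s, vid_cap, string) in enumerate(dataset):
        if n >= ncalib:  # stop after exactly ncalib samples
            break
        # im = np.transpose(img, [1, 2, 0])  # not used: the model here takes CHW input
        im = np.expand_dims(img, axis=0).astype(np.float32)  # add the batch dimension
        im /= 255  # scale pixel values to [0, 1]
        yield [im]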

In [None]:
imgsz=320
int8=False

converter = tf.lite.TFLiteConverter.from_saved_model("best_tf2.pb")
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.target_spec.supported_types = [tf.float16]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
if int8:
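  # representative_dataset_gen expects (path, img, im0s, vid_cap, string) tuples with
  # an unnormalized CHW image. The single test image is reused here as a stand-in;
  # replace it with a set of real samples for a meaningful calibration.
  calib_dataset = [(None, im[0] * 255, None, None, '')]
  converter.representative_dataset = lambda: representative_dataset_gen(calib_dataset)
  converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
  converter.target_spec.supported_types = []
  converter.inference_input_type = tf.int8  # or tf.uint8
  converter.inference_output_type = tf.int8  # or tf.uint8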
  converter.experimental_new_quantizer = True
#adding NMS
converter.target_spec.supported_ops.append(tf.lite.OpsSet.SELECT_TF_OPS)
tflite_model = converter.convert()
# write the model with a context manager so the file is closed properly
with open('best-tflite.tflite', "wb") as f:
    f.write(tflite_model)

Check that the model converted properly. Try it without int8 quantization first.

In [None]:
#tflite model try
tflite_model='best-tflite.tflite'
interpreter = tf.lite.Interpreter(model_path=tflite_model)
interpreter.allocate_tensors()


In [None]:
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], im)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print("Inference output is {}".format(output_data))
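
The cell above feeds the float32 array `im` directly, which matches the non-quantized model. If the model was converted with int8 input and output types, the values have to be mapped to integers first. TFLite describes this mapping as `real = (q - zero_point) * scale`, and the interpreter reports the pair as `input_details[0]['quantization']`. The sketch below shows the arithmetic in plain Python. The helper names and the example `scale` and `zero_point` values are assumptions for illustration only.

```python
def quantize(value, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to a quantized integer, clamped to the int8 range."""
    q = int(round(value / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to a real value."""
    return (q - zero_point) * scale


# Made-up quantization parameters. In the notebook they would come from:
# scale, zero_point = input_details[0]['quantization']
scale, zero_point = 0.5, -128

q = quantize(1.0, scale, zero_point)
print(q, dequantize(q, scale, zero_point))
```

The same `dequantize` step applies to the output tensor, using the parameters from `output_details[0]['quantization']`, before the boxes and scores are interpreted.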

In [None]:
import random
from PIL import Image
ori_images = [img.copy()]
names = ['Glasses', 'Sunglasses', 'Beer', 'Ball', 'Pen','Piano', 'Headphones', 'Light switch', 'Footwear', 'Watch', 'Coffeemaker', 'Waste container', 'Window', 'Window blind', 'Door handle', "Door", "Stairs", 'Bicycle','Car','Motorcycle','Bus','Train','Truck','Traffic light','Fire hydrant','Bench','Bird','Cat','Dog','Backpack','Handbag','Suitcase','Bottle','Wine glass','Coffee cup','Fork','Knife','Spoon','Bowl','Chair','Couch','Plant','Bed','Table','Toilet','Television','Laptop','Computer mouse','Remote control','Computer keyboard','Mobile phone','Microwave oven','Oven','Toaster','Sink','Refrigerator','Book','Clock','Toothbrush']
colors = {name: [random.randint(0, 255) for _ in range(3)] for name in names}
for i,(batch_id,x0,y0,x1,y1,cls_id,score) in enumerate(output_data):
    image = ori_images[0]
    box = np.array([x0,y0,x1,y1])
    box -= np.array(dwdh*2)
    box /= ratio
    box = box.round().astype(np.int32).tolist()
    cls_id = int(cls_id)
    score = round(float(score),3)
    name = names[cls_id]
    color = colors[name]
    name += ' '+str(score)
    cv2.rectangle(image,box[:2],box[2:],color,2)
    cv2.putText(image,name,(box[0], box[1] - 2),cv2.FONT_HERSHEY_SIMPLEX,0.75,[225, 255, 255],thickness=2)  
Image.fromarray(ori_images[0])