### PyTorch --> ONNX

As mentioned before, to deploy our model with OpenVINO, we may first need to convert it to ONNX. The following steps are the same as in the "onnx_deploy.ipynb" tutorial.

In [1]:
import torch
from networkFT import Net

""" Loading the trained model"""
model = Net(3)
model.load_state_dict(torch.load('./outputs_FT/model.pt'))
model.eval()  # inference mode for layers such as dropout and batch norm

""" create an example input having same shape with the expected input for the trained model"""

x = torch.randn((1, 3, 240, 240))

""" convert it to onnx"""
torch.onnx.export(model,                     # model to convert
                  x,                         # example model input
                  "onnx/model.onnx",         # output file path
                  export_params=True,        # store the trained weights
                  opset_version=11,          # the ONNX opset version
                  do_constant_folding=True,  # precompute constant expressions to simplify the graph
                  input_names=['input'],     # set model input names
                  output_names=['output'],   # set model output names
)

### ONNX --> OpenVINO

Using the following terminal command, we convert our ONNX model to the OpenVINO format. The --data_type parameter sets the data type of the converted model. We trained our model in float32, which means every weight in the model is a float32 value. For now, we keep the same data type, but note that quantization, which means storing the model in a lower-precision data type, may give you a smaller model without a significant loss of accuracy. Imagine storing your parameters as uint8 (1 byte each) instead of float32 (4 bytes each): the number of parameters stays the same, but the weights take about a quarter of the space. Since the weights are rounded and lose their fractional precision, you may see a small accuracy loss, though.
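
To make the size and accuracy trade-off concrete, here is a minimal NumPy sketch of uint8 quantization on a made-up weight array. It is only an illustration of the idea: the random weights, the variable names, and the simple min/max scaling scheme are assumptions for this example and are not how OpenVINO's own tooling quantizes a model.

```python
import numpy as np

# Hypothetical stand-in for one layer's float32 weights (not from the trained model).
rng = np.random.default_rng(0)
weights_fp32 = rng.normal(0.0, 0.5, size=1000).astype(np.float32)

# Map the range [w_min, w_max] onto the 256 levels a uint8 can represent.
w_min, w_max = float(weights_fp32.min()), float(weights_fp32.max())
scale = (w_max - w_min) / 255.0
weights_uint8 = np.round((weights_fp32 - w_min) / scale).astype(np.uint8)

# Convert back to float32 to measure how much precision the rounding cost.
weights_restored = weights_uint8.astype(np.float32) * scale + w_min
max_error = float(np.abs(weights_fp32 - weights_restored).max())

print("parameters:", weights_fp32.size, "->", weights_uint8.size)
print("bytes:", weights_fp32.nbytes, "->", weights_uint8.nbytes)
print("max rounding error:", max_error, "with scale / 2 =", scale / 2)
```

The parameter count is unchanged, the storage drops from 4000 to 1000 bytes, and each restored weight differs from the original by at most about half a quantization step.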

In [2]:
!mo --input_model "onnx/model.onnx" --input_shape "[1, 3, 240, 240]" --data_type FP32 --output_dir "openvino/"

[ INFO ] The model was converted to IR v11, the latest model format that corresponds to the source DL framework input/output format. While IR v11 is backwards compatible with OpenVINO Inference Engine API v1.0, please use API v2.0 (as of 2022.1) to take advantage of the latest improvements in IR v11.
Find more information about API v2.0 and IR v11 at https://docs.openvino.ai/latest/openvino_2_0_transition_guide.html
[ SUCCESS ] Generated IR version 11 model.
[ SUCCESS ] XML file: /home/yca/educative/docker_image_jp/openvino/model.xml
[ SUCCESS ] BIN file: /home/yca/educative/docker_image_jp/openvino/model.bin


### Inference with OpenVINO

In [3]:
""" prepare input """
import cv2
import numpy as np

img1 = cv2.imread("data_flowers/daisy/100080576_f52e8ee070_n.jpg")
img1 = cv2.resize(img1, (240,240), interpolation = cv2.INTER_AREA) # (240, 240, 3)
img1 = np.transpose(img1, (2, 0, 1))[np.newaxis, ...] # (1, 3, 240, 240); reshape alone would not move the channel axis
img1 = np.ascontiguousarray(img1, dtype=np.float32) # contiguous float32 copy
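
When an OpenCV image (height, width, channel) is moved into the channel-first layout that the model input shape (1, 3, 240, 240) implies, the axes have to be reordered; changing only the shape is not enough. The sketch below shows the difference on a tiny made-up array. The 2x2 "image" and its pixel values are hypothetical and chosen only so the channels are easy to tell apart.

```python
import numpy as np

# Hypothetical 2x2 "image" in OpenCV's (height, width, channel) layout.
# Every pixel stores (10, 20, 30), one distinct value per channel.
img_hwc = np.tile(np.array([10, 20, 30], dtype=np.uint8), (2, 2, 1))  # (2, 2, 3)

# transpose moves the channel axis to the front: each plane holds one channel.
img_chw = np.transpose(img_hwc, (2, 0, 1))  # (3, 2, 2)

# reshape only reinterprets the flat buffer, so each plane mixes the channels.
img_reshaped = np.reshape(img_hwc, (3, 2, 2))  # same shape, scrambled content

print(img_chw[0])       # only the first channel's value (10)
print(img_reshaped[0])  # values from all three channels
```

Both results have the same shape, which is why this kind of mix-up does not raise an error and is easy to miss.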

In [4]:
from openvino.runtime import Core

# Load the network in OpenVINO Runtime.
ie = Core()
model_ir = ie.read_model(model="openvino/model.xml")
compiled_model_ir = ie.compile_model(model=model_ir, device_name="CPU")

# Get the output layer.
output_layer_ir = compiled_model_ir.output(0)

# Run inference on the input image.
scores = compiled_model_ir([img1])[output_layer_ir]

""" check the result"""
class_names = ['dandelion', 'rose', 'daisy'] 
print(scores)
scores = list(scores[0])
print(class_names[scores.index(max(scores))]) # class with the highest score

[[0. 0. 1.]]
daisy
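
The label lookup can also be written with np.argmax instead of converting the scores to a list. This is a small sketch of that alternative; the scores array is a stand-in copied from the output printed above, so the snippet runs without the model.

```python
import numpy as np

class_names = ['dandelion', 'rose', 'daisy']

# Stand-in for the model output, copied from the scores printed above.
scores = np.array([[0.0, 0.0, 1.0]], dtype=np.float32)

# argmax returns the index of the highest score for the single image in the batch.
predicted_index = int(np.argmax(scores[0]))
predicted_label = class_names[predicted_index]
print(predicted_label)  # daisy
```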


### Compare Inference Time

We already checked the inference time for PyTorch and for ONNX with onnxruntime for Python. There are two additional options with OpenVINO:

1. ONNX model with OpenVINO runtime for Python
2. OpenVINO model with OpenVINO runtime for Python

Yes, it is also possible to run ONNX models directly in the OpenVINO engine.

In [5]:
""" ONNX in OpenVINO runtime"""
import time

# Load the network to OpenVINO Runtime.
ie = Core()
model_onnx = ie.read_model(model="onnx/model.onnx")
compiled_model_onnx = ie.compile_model(model=model_onnx, device_name="CPU")

output_layer_onnx = compiled_model_onnx.output(0)

# Run inference on the input image.

start_inference = time.time()

res_onnx = compiled_model_onnx([img1])[output_layer_onnx]

end_inference = time.time()

print("inference took", end_inference-start_inference, "seconds for 1 image")

inference took 0.0051517486572265625 seconds for 1 image


In [6]:
""" OpenVINO in OpenVINO runtime"""

# Run inference on the input image.

start_inference = time.time()

scores = compiled_model_ir([img1])[output_layer_ir]

end_inference = time.time()

print("inference took", end_inference-start_inference, "seconds for 1 image")

inference took 0.0058476924896240234 seconds for 1 image


ONNX in onnxruntime appears to outperform both of these results, while every converted model gives faster inference than PyTorch!
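
A single timed call is sensitive to noise such as caching and other processes, so differences of a fraction of a millisecond are hard to interpret. A more stable comparison runs a few warm-up calls and then averages many timed calls. The helper below is a minimal sketch of that idea; `benchmark`, its default parameters, and the `dummy_inference` workload are hypothetical names introduced for this example.

```python
import statistics
import time

def benchmark(run_once, warmup=5, repeats=50):
    """Return (mean, stdev) of the wall-clock seconds taken by run_once()."""
    # Warm-up calls are not timed; they let caches and lazy initialisation settle.
    for _ in range(warmup):
        run_once()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)

# Stand-in workload so the sketch runs on its own, without a model.
def dummy_inference():
    return sum(i * i for i in range(10_000))

mean_s, stdev_s = benchmark(dummy_inference)
print(f"mean {mean_s * 1000:.3f} ms, stdev {stdev_s * 1000:.3f} ms over 50 runs")
```

In this notebook, the same helper could be called with `lambda: compiled_model_ir([img1])` and `lambda: compiled_model_onnx([img1])` to compare the two OpenVINO options on equal terms.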