## From PyTorch to ONNX

It's a very easy process that takes a single function call, thanks to the `torch.onnx.export` function!

In [1]:
import torch
from networkFT import Net

""" Loading the trained model"""
model = Net(3)
model.load_state_dict(torch.load('./outputs_FT/model.pt'))
model.eval()  # put dropout/batch-norm layers in inference mode before exporting

""" create an example input with the same shape the trained model expects"""

x = torch.randn((1, 3, 240, 240))

""" convert it to onnx"""
torch.onnx.export(model,                     # model to convert
                  x,                         # example input used to trace the model
                  "onnx/model.onnx",         # output file path
                  export_params=True,        # store the trained weights in the file
                  opset_version=11,          # ONNX operator set (opset) version
                  do_constant_folding=True,  # pre-compute constant expressions at export time
                  input_names=['input'],     # name of the model input
                  output_names=['output'],   # name of the model output
)

## Inference with ONNX

The conversion above produced a model in the ONNX format. For inference, we need ONNX Runtime (`onnxruntime`) to build a deployment pipeline around that model. Note that we are now done with torch Tensors: the input images go in as NumPy arrays, almost in the form we read them. Only two things are needed to get the input data into the required form:
1. OpenCV reads images in H,W,C order, so we need to transpose the axes to C,H,W (a plain reshape would scramble the pixels) and add one dimension for the batch size.
2. Convert the image pixel values from uint8 to float.
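
The axis reordering in step 1 can be sketched with NumPy alone. The tiny 2x2 array below is a made-up stand-in for an OpenCV image (not data from this project); it illustrates why `np.transpose` is the right tool here and why `np.reshape` to the same target shape would mix up the channels.

```python
import numpy as np

# Hypothetical 2x2 "image" in OpenCV layout (H, W, C): every pixel is (10, 20, 30).
hwc = np.zeros((2, 2, 3), dtype=np.uint8)
hwc[..., 0] = 10  # channel 0
hwc[..., 1] = 20  # channel 1
hwc[..., 2] = 30  # channel 2

# Transpose moves the channel axis to the front: (H, W, C) -> (C, H, W).
chw = np.transpose(hwc, (2, 0, 1))

# Add the batch dimension and convert uint8 -> float32: (1, C, H, W).
batch = chw[np.newaxis, ...].astype(np.float32)

# Reshape only reinterprets the flat buffer, so channel values get interleaved.
reshaped = np.reshape(hwc, (3, 2, 2))

print(batch.shape)           # (1, 3, 2, 2)
print(chw[0].tolist())       # [[10, 10], [10, 10]]  -> a clean channel-0 plane
print(reshaped[0].tolist())  # [[10, 20], [30, 10]]  -> channels mixed together
```

Both results have a valid shape, so the model runs without complaint in either case; only the transposed version hands it the pixels in the order it was trained on.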

In [2]:
import onnxruntime as ort
import cv2
import numpy as np

""" start onnx runtime session"""
session = ort.InferenceSession("onnx/model.onnx") #start onnx engine for our onnx model

input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

""" prepare input """
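img1 = cv2.imread("data_flowers/daisy/100080576_f52e8ee070_n.jpg") # (H, W, 3), uint8
img1 = cv2.resize(img1, (240,240), interpolation = cv2.INTER_AREA) # (240, 240, 3)
img1 = np.transpose(img1, (2, 0, 1))[np.newaxis, ...] # H,W,C -> C,H,W, then add batch dim: (1, 3, 240, 240)
img1 = img1.astype(np.float32) # uint8 -> float32 (values stay in 0..255; divide by 255.0 if the model was trained on [0, 1] inputs)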


""" run """
scores = session.run([output_name], {input_name: img1})[0]

""" check the result"""
class_names = ['dandelion', 'rose', 'daisy'] # remember our class_idx in training, output node 0: dandelion, output node 1:rose and output node 2: daisy
print(scores[0]) # prediction probabilities for each class
scores = list(scores[0])
print(class_names[scores.index(max(scores))]) # label having the maximum probability

[0. 0. 1.]
daisy
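
The label lookup above can also be written with `np.argmax`, which avoids converting the scores to a Python list. This is a minimal sketch: the `scores` array is hard-coded to the values printed above and assumes an output of shape (1, 3), one row per image in the batch.

```python
import numpy as np

class_names = ['dandelion', 'rose', 'daisy']

# Stand-in for the ONNX Runtime output: one row of scores per image in the batch.
scores = np.array([[0.0, 0.0, 1.0]], dtype=np.float32)

best_idx = int(np.argmax(scores[0]))  # index of the highest score
label = class_names[best_idx]
print(label)  # daisy
```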


## Compare Inference Time

We managed to convert our model and run inference in Python using onnxruntime. But what has actually changed?
How fast is our model now? Let's compare the inference speed of ONNX Runtime and PyTorch on a single image.

### Inference time with ONNX 

In [3]:
import time

start_inference = time.time()

scores = session.run([output_name], {input_name: img1})[0]

end_inference = time.time()

print("inference took", end_inference-start_inference, "seconds for 1 image")

inference took 0.004109859466552734 seconds for 1 image
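
A single timed call can be noisy, since the first run often includes one-off setup costs. As a rough sketch of a steadier measurement, the hypothetical helper below (`benchmark` is not part of onnxruntime or PyTorch) makes a few untimed warm-up calls and then reports the median over repeated runs using `time.perf_counter`. The `dummy_inference` function is only a placeholder workload; in this notebook you would pass something like `lambda: session.run([output_name], {input_name: img1})` instead.

```python
import statistics
import time

def benchmark(fn, warmup=3, repeats=20):
    """Return the median wall-clock time in seconds of calling fn()."""
    for _ in range(warmup):          # warm-up runs are not timed
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

def dummy_inference():
    # Placeholder workload standing in for a real inference call.
    return sum(i * i for i in range(10_000))

median_s = benchmark(dummy_inference)
print(f"median: {median_s * 1000:.3f} ms over 20 runs")
```

The median is used rather than the mean so that an occasional slow run does not dominate the reported number.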


### Inference time with PyTorch 

In [4]:
from torchvision import transforms

model = Net(3)
model.load_state_dict(torch.load('./outputs_FT/model.pt'))
model.eval()  # inference mode for dropout/batch-norm layers

inf_transforms=transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((240,240)),
    transforms.ToTensor()
])  

img1 = cv2.imread("data_flowers/daisy/100080576_f52e8ee070_n.jpg") 
img1 = inf_transforms(img1)
img1 = img1.unsqueeze(0)

start_inference = time.time()

with torch.no_grad():  # skip autograd bookkeeping so only inference is timed
    outputs = model(img1.float())

end_inference = time.time()

print("inference took", end_inference-start_inference, "seconds for 1 image")

inference took 0.0071868896484375 seconds for 1 image


Yep! As we see, ONNX inference is roughly 1.75x faster than PyTorch inference in this measurement (about 4.1 ms versus 7.2 ms). It may seem like a small change since we are talking about milliseconds, but imagine the time savings when processing 1000 images one by one. For many real-time projects, even milliseconds matter!
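
To make the comparison concrete, the arithmetic can be worked out from the two timings printed above. This is only a back-of-the-envelope sketch: it assumes every image takes as long as the single measured run, which a real workload will not match exactly.

```python
onnx_s = 0.004109859466552734   # ONNX Runtime timing printed above
torch_s = 0.0071868896484375    # PyTorch timing printed above

speedup = torch_s / onnx_s                  # how many times faster ONNX was
saved_per_1000 = (torch_s - onnx_s) * 1000  # seconds saved over 1000 images

print(f"speedup: {speedup:.2f}x")
print(f"saved over 1000 images: {saved_per_1000:.2f} s")
```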