<a href="https://colab.research.google.com/github/R-Mahmoudi/Inference_with_OpenVINO_Execution_Provider/blob/main/Inference_with_OpenVINO_Execution_Provider.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Mobilenet V2 Inference with OpenVINO™ Execution Provider for ONNX Runtime on CPU**

**Let's install the necessary packages. We will install PyTorch, onnxruntime-openvino 1.11, ONNX and pillow.**

In [1]:
! pip install torch torchvision torchaudio -f https://download.pytorch.org/whl/torch_stable.html
! pip install --upgrade onnx
!pip install  pillow==9.0.0
! pip install onnxruntime-openvino

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting onnx
  Using cached onnx-1.13.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.5 MB)
Collecting protobuf<4,>=3.20.2
  Using cached protobuf-3.20.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
Installing collected packages: protobuf, onnx
  Attempting uninstall: protobuf
    Found existing installation: protobuf 3.20.1
    Uninstalling protobuf-3.20.1:
      Successfully uninstalled protobuf-3.20.1
  Attempting uninstall: onnx
    Found existing installation: onnx 1.12.0
    Uninstalling onnx-1.12.0:
      Successfully uninstalled onnx-1.12.0
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of

**Use torchvision provided API to load mobilenet_v2 model.**

In [2]:
from torchvision import models, datasets, transforms as T
mobilenet_v2 = models.mobilenet_v2(pretrained=True)



**Pytorch onnx export API to export the model.**

In [3]:
import torch
image_height = 224
image_width = 224
x = torch.randn(1, 3, image_height, image_width, requires_grad=True)
torch_out = mobilenet_v2(x)

# Export the model
torch.onnx.export(mobilenet_v2,              # model being run
                  x,                         # model input (or a tuple for multiple inputs)
                  "mobilenet_v2_float.onnx", # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=12,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ['input'],   # the model's input names
                  output_names = ['output']) # the model's output names

**Run an sample with the FP32 ONNX model. Firstly, implement the preprocess.**

In [4]:
from PIL import Image
import numpy as np
import onnxruntime
import torch

def preprocess_image(image_path, height, width, channels=3):
    image = Image.open(image_path)
    image = image.resize((width, height), Image.ANTIALIAS)
    image_data = np.asarray(image).astype(np.float32)
    image_data = image_data.transpose([2, 0, 1]) # transpose to CHW
    mean = np.array([0.079, 0.05, 0]) + 0.406
    std = np.array([0.005, 0, 0.001]) + 0.224
    for channel in range(image_data.shape[0]):
        image_data[channel, :, :] = (image_data[channel, :, :] / 255 - mean[channel]) / std[channel]
    image_data = np.expand_dims(image_data, 0)
    return image_data

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'


**Download the imagenet labels and load it.**

In [5]:
# Download ImageNet labels
!curl -o imagenet_classes.txt https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt

# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100 10472  100 10472    0     0  83111      0 --:--:-- --:--:-- --:--:-- 83776


In [6]:
!curl -o cat.jpg https://raw.githubusercontent.com/maxogden/cat-picture/master/cat.jpg

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100 78044  100 78044    0     0   399k      0 --:--:-- --:--:-- --:--:--  399k


**Run the example with OpenVINO Execution Provider for ONNX Runtime on CPU.**

In [7]:
import time
#set the provider as OpenVINO and device as CPU
session_openvino = onnxruntime.InferenceSession("mobilenet_v2_float.onnx",providers=['OpenVINOExecutionProvider'], provider_options=[{'device_type' : 'CPU_FP32'}])

def softmax(x):
    """Compute softmax values for each sets of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def run_sample(session, image_file, categories):
    output = session.run([], {'input':preprocess_image(image_file, image_height, image_width)})[0]
    output = output.flatten()
    output = softmax(output) # this is optional
    top5_catid = np.argsort(-output)[:5]
    for catid in top5_catid:
        print(categories[catid], output[catid])
start = time.time()
run_sample(session_openvino, 'cat.jpg', categories)
elapsed = time.time() - start
print('Inference time in ms: %f' % (elapsed * 1000))

tabby 0.3298738
Persian cat 0.17306244
tiger cat 0.16069515
lynx 0.12683442
Egyptian cat 0.07306996
Inference time in ms: 50.282240
