# PaddlePaddle Image Classification with OpenVINO
This demo shows how to run the MobileNetV3 Large PaddlePaddle model natively on OpenVINO. Instead of exporting the PaddlePaddle model to ONNX and then creating the Intermediate Representation (IR) format with the OpenVINO Model Optimizer, we can now read the Paddle model directly, without any conversion.

# Download the MobileNetV3_large_x1_0 Model
Here we download the pre-trained model directly from the server. More details about the pre-trained model can be found in the PaddleClas documentation below.

Source: https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/deploy/lite/readme_en.md

In [None]:
import os.path
import urllib.request
import tarfile

mobilenet_url = "https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/MobileNetV3_large_x1_0_infer.tar"
mobilenetv3_model_path = "model/MobileNetV3_large_x1_0_infer/inference.pdmodel"
if os.path.isfile(mobilenetv3_model_path):
    print("Model MobileNetV3_large_x1_0 already exists")
else:
    # Download the model from the server, and untar it.
    print("Downloading the MobileNetV3_large_x1_0_infer model (20 MB)... May take a while...")
    # Create the directory if it does not exist yet
    os.makedirs("model", exist_ok=True)
    urllib.request.urlretrieve(mobilenet_url, "model/MobileNetV3_large_x1_0_infer.tar")
    print("Model downloaded")

    # extractall() always returns None, so verify success by checking for the model file
    with tarfile.open("model/MobileNetV3_large_x1_0_infer.tar") as file:
        file.extractall("model")
    if os.path.isfile(mobilenetv3_model_path):
        print("Model extracted to \"model/MobileNetV3_large_x1_0_infer\".")
    else:
        print("Error extracting the model. Please check the network.")
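The extraction check can be tried without network access. The following is a minimal sketch, assuming a small stand-in archive built locally; `extract_and_verify` and the `demo_infer` names are hypothetical helpers for illustration, not part of the original demo. It relies on the fact that `tarfile.extractall()` returns `None`, so success is confirmed by checking that the expected file exists afterwards.

```python
import os
import tarfile
import tempfile


def extract_and_verify(tar_path, dest_dir, expected_file):
    """Extract tar_path into dest_dir and report whether expected_file now exists."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest_dir)  # always returns None
    return os.path.isfile(os.path.join(dest_dir, expected_file))


# Build a tiny stand-in archive (hypothetical content) so the sketch runs offline.
work_dir = tempfile.mkdtemp()
src_dir = os.path.join(work_dir, "demo_infer")
os.makedirs(src_dir)
with open(os.path.join(src_dir, "inference.pdmodel"), "wb") as f:
    f.write(b"placeholder")
tar_path = os.path.join(work_dir, "demo_infer.tar")
with tarfile.open(tar_path, "w") as archive:
    archive.add(src_dir, arcname="demo_infer")

out_dir = os.path.join(work_dir, "model")
ok = extract_and_verify(tar_path, out_dir, os.path.join("demo_infer", "inference.pdmodel"))
print("extracted:", ok)
```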

# Define the callback function for postprocessing

In [None]:
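def callback(infer_request, i) -> None:
    # json, np and time are imported in later cells, before the callback is first invoked.
    # The class index is reloaded on every call, which is acceptable for a demo.
    with open("utils/imagenet_class_index.json") as f:
        imagenet_classes = json.load(f)
    predictions = next(iter(infer_request.results.values()))
    # Sort class indices by descending probability
    indices = np.argsort(-predictions[0])
    if i == 0:
        # Calculate the first inference time
        latency = time.time() - start
        print("latency:", latency)
        # Print the top-5 classes (k avoids shadowing the request id i)
        for k in range(5):
            print(
                "Class name:", "'" + imagenet_classes[str(indices[k])][1] + "'",
                ", probability:", predictions[0][indices[k]])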

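The ranking step used in the callback can be tried in isolation. This is a minimal sketch with made-up scores; `top_k` is a hypothetical helper name, and the array stands in for one row of model output rather than real predictions.

```python
import numpy as np


def top_k(probabilities, k=5):
    """Return (index, probability) pairs for the k largest entries, highest first."""
    # Negating the scores makes argsort produce a descending order.
    order = np.argsort(-probabilities)[:k]
    return [(int(idx), float(probabilities[idx])) for idx in order]


# Toy scores standing in for one row of model output (not real predictions).
scores = np.array([0.05, 0.60, 0.10, 0.25], dtype=np.float32)
for class_index, probability in top_k(scores, k=2):
    print(class_index, probability)
```

Sorting the full vector is simple and fast enough for 1000 classes; for much larger outputs, `np.argpartition` would be an alternative.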
# Read the model file 

In [None]:
import openvino.runtime as ov

core = ov.Core()
# MobileNetV3_large_x1_0
model = core.read_model("model/MobileNetV3_large_x1_0_infer/inference.pdmodel")
# Get the information of the input and output layers
input_layer = next(iter(model.inputs))
output_layer = next(iter(model.outputs))

# Integrate preprocessing steps into the execution graph with the Preprocessing API
When your input data does not perfectly fit the model's input tensor, additional operations are needed to transform the data into the format the model expects. These operations are known as “preprocessing”.
Preprocessing steps are integrated into the execution graph and run on the selected device (CPU/GPU/VPU/etc.) rather than always on the CPU, which improves utilization of the selected device.

In [None]:
import cv2
import numpy as np
from openvino.preprocess import PrePostProcessor
from openvino.runtime import Layout, Type
from openvino.preprocess import ResizeAlgorithm
from openvino.runtime import AsyncInferQueue,PartialShape

filename = "coco.jpg"
test_image = cv2.imread(filename) 
test_image = np.expand_dims(test_image, 0) / 255
_, h, w, _ = test_image.shape

# Fix the model's input shape to get better performance
model.reshape({input_layer.any_name: PartialShape([1,3,224,224])})
ppp = PrePostProcessor(model)
# Set input tensor information:
# - input() provides information about a single model input
# - layout of data is 'NHWC'
# - set static spatial dimensions to input tensor to resize from
ppp.input().tensor() \
    .set_spatial_static_shape(h,w) \
    .set_layout(Layout('NHWC')) 
# Here we assume the model expects the 'NCHW' layout for its input
ppp.input().model().set_layout(Layout('NCHW'))
# Do preprocessing:
# - Apply linear resize from tensor spatial dims to model spatial dims
# - Subtract the mean from each channel
# - Divide each channel by the appropriate scale value
ppp.input().preprocess() \
    .resize(ResizeAlgorithm.RESIZE_LINEAR,224,224) \
    .mean([0.485, 0.456,0.406]) \
    .scale([0.229, 0.224, 0.225])
# Set output tensor information:
# - precision of tensor is supposed to be 'f32'
ppp.output().tensor().set_element_type(Type.f32)
# Apply preprocessing, modifying the original 'model'
model = ppp.build()
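To make the effect of the mean, scale, and layout steps concrete, here is a rough NumPy equivalent. It is a sketch under stated assumptions: the input is already resized to 224x224 and scaled to [0, 1], the resize step is omitted, and `normalize_nhwc_to_nchw` is a hypothetical helper name. It illustrates the arithmetic only and is not how OpenVINO implements these steps internally.

```python
import numpy as np

# Per-channel mean and scale values, copied from the preprocessing cell.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
SCALE = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_nhwc_to_nchw(batch_nhwc):
    """Subtract mean and divide by scale per channel, then reorder NHWC -> NCHW."""
    # MEAN and SCALE broadcast over the last (channel) axis of the NHWC batch.
    normalized = (batch_nhwc.astype(np.float32) - MEAN) / SCALE
    return np.transpose(normalized, (0, 3, 1, 2))


# Random stand-in for an image batch (assumed already 224x224 and in [0, 1]).
dummy = np.random.rand(1, 224, 224, 3).astype(np.float32)
out = normalize_nhwc_to_nchw(dummy)
print(out.shape)
```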

# Run Inference
Use “AUTO” as the device name to delegate selection of an actual accelerator to OpenVINO. The Auto-device plugin internally recognizes and selects devices from among CPU, integrated GPU and discrete Intel GPUs (when available) depending on the device capabilities and the characteristics of CNN models (for example, precision). Then the Auto-device assigns inference requests to the selected device.
AUTO starts inferencing immediately on the CPU and then transparently shifts inferencing to the GPU (or VPU) once ready, dramatically reducing time to first inference.

In [None]:
import time
from IPython.display import Image
import json

# Check the available devices in your system
devices = core.available_devices
for device in devices:
    device_name = core.get_property(device_name=device, name="FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")

# Load the model to a device that AUTO selects from the available devices list
compiled_model = core.compile_model(model=model, device_name="AUTO")
# Create infer request queue
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(callback)
start = time.time()
# Do inference
infer_queue.start_async({input_layer.any_name:test_image},0)
infer_queue.wait_all()
Image(filename=filename) 

# Run Inference with "LATENCY" Performance Hint
A performance hint expresses the application's target use case with a single config key and lets the device configure itself for better "LATENCY"-oriented performance.

In [None]:
# AUTO sets device config based on hints
compiled_model = core.compile_model(model=model, device_name="AUTO",config={"PERFORMANCE_HINT":"LATENCY"})
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(callback)
start = time.time()
for i in range(100):
    infer_queue.start_async({input_layer.any_name:test_image},i)
infer_queue.wait_all()
end = time.time()
# Calculate the average FPS
fps = 100 / (end - start)
print("fps:", fps)

# Run Inference with "THROUGHPUT" Performance Hint
A performance hint expresses the application's target use case with a single config key and lets the device configure itself for better "THROUGHPUT"-oriented performance.

In [None]:
# AUTO sets device config based on hints
compiled_model = core.compile_model(model=model, device_name="AUTO",config={"PERFORMANCE_HINT":"THROUGHPUT"})
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(callback)
start = time.time()
for i in range(100):
    infer_queue.start_async({input_layer.any_name:test_image},i)
infer_queue.wait_all()
end = time.time()
# Calculate the average FPS
fps = 100 / (end - start)
print("fps:", fps)
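The FPS arithmetic used in the two cells above can be factored into a small helper. This is a minimal sketch, assuming a synchronous placeholder workload; `measure_fps` and `fake_inference` are hypothetical names, and unlike the cells above it does not submit asynchronous requests, so it only illustrates the timing calculation.

```python
import time


def measure_fps(run_once, iterations=100):
    """Call run_once `iterations` times and return iterations per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def fake_inference():
    # Placeholder workload standing in for one inference call.
    sum(range(10_000))


fps = measure_fps(fake_inference, iterations=100)
print(f"fps: {fps:.1f}")
```

`time.perf_counter()` is used here because it is a monotonic, high-resolution clock intended for measuring short intervals.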