# Optimize Preprocessing

When input data does not fit the model input tensor exactly, additional steps are needed to transform the data into the format the model expects. This tutorial demonstrates how to do this with the Preprocessing API. The Preprocessing API is an easy-to-use instrument that integrates preprocessing steps into the execution graph and performs them on the selected device, which can improve device utilization. For more information about the Preprocessing API, see this [overview](https://docs.openvino.ai/latest/openvino_docs_OV_UG_Preprocessing_Overview.html#) and [details](https://docs.openvino.ai/latest/openvino_docs_OV_UG_Preprocessing_Details.html).

This tutorial includes the following steps:
- Downloading the model
- Setting up preprocessing with the Preprocessing API, loading the model, and running inference on the original image
- Fitting the image to the model input manually and running inference on the prepared image
- Comparing results on one image
- Comparing inference performance with and without the Preprocessing API

## Setup

### Imports

In [1]:
import cv2
import sys
import time

import numpy as np
from pathlib import Path
import matplotlib.pyplot as plt
from openvino.runtime import Core
from IPython.display import Markdown, display

sys.path.append("../utils")
from notebook_utils import download_file

### Setup image and device

In [2]:
image_path = "../data/image/coco.jpg"
device = "CPU"
# device = "GPU"

### Downloading the model

This tutorial uses the [caffe-googlenet-bn](https://github.com/lim0606/caffe-googlenet-bn) model. The caffe-googlenet-bn model is the second of the [Inception](https://github.com/tensorflow/tpu/tree/master/models/experimental/inception) family of models designed to perform image classification. Like other Inception models, caffe-googlenet-bn has been pre-trained on the [ImageNet](https://image-net.org/) dataset. For more details about this family of models, see the [research paper](https://arxiv.org/abs/1512.00567).

The following code downloads caffe-googlenet-bn and converts it to OpenVINO IR format (`ir_model/caffe-googlenet-bn.xml`) with the Model Optimizer tool. For more information about Model Optimizer, see the [Model Optimizer Developer Guide](https://docs.openvino.ai/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html). The download and conversion steps are skipped if they have already been executed.

In [None]:
model_name = "caffe-googlenet-bn"
MODEL_DIR = Path("./model").expanduser()
ir_path = MODEL_DIR / "ir_model" / f"{model_name}.xml"

# Download the model from the original source if it does not exist yet
if Path(MODEL_DIR, f'{model_name}.prototxt').exists() and Path(MODEL_DIR, f'{model_name}.caffemodel').exists():
    print(f"Model {model_name} already downloaded to {MODEL_DIR}\n")
else:
    prototxt_url = "https://raw.githubusercontent.com/lim0606/caffe-googlenet-bn/d19caf526b7d8cad873ff91ba4cea602eadd58b3/deploy.prototxt"
    download_file(prototxt_url, filename=f"{model_name}.prototxt", directory=MODEL_DIR)
    caffemodel_url = "https://github.com/lim0606/caffe-googlenet-bn/raw/d19caf526b7d8cad873ff91ba4cea602eadd58b3/snapshots/googlenet_bn_stepsize_6400_iter_1200000.caffemodel"
    download_file(caffemodel_url, filename=f"{model_name}.caffemodel", directory=MODEL_DIR)

prototxt_file = MODEL_DIR / f'{model_name}.prototxt'
caffemodel_file = MODEL_DIR / f'{model_name}.caffemodel'

# Postprocess the prototxt: set the batch size to 1 and rename `layers` blocks to `layer`
text = prototxt_file.read_text()
text = text.replace("dim: 10", "dim: 1")
text = text.replace("layers {", "layer {")
prototxt_file.write_text(text)

# Convert the model to OpenVINO format with Model Optimizer if the IR does not exist yet
if ir_path.exists():
    print(f"Model in OpenVINO format already exists: {ir_path}")
else:
    mo_command = f"""mo
                 --input_model={caffemodel_file}
                 --input_proto={prototxt_file}
                 --output_dir="{ir_path.parent}"
                 --mean_values=data[104.0,117.0,123.0]
                 --output=prob
                 """
    mo_command = " ".join(mo_command.split())
    print("Model Optimizer command to convert the model to OpenVINO:")
    display(Markdown(f"`{mo_command}`"))
    ! $mo_command
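
The two text substitutions applied to the prototxt before conversion can be tried on a small string first. The following is a minimal sketch: the helper name `patch_prototxt` and the sample fragment are illustrative assumptions, not part of the downloaded files.

```python
def patch_prototxt(text: str) -> str:
    """Apply the same two substitutions the tutorial applies to the prototxt."""
    # Switch the declared batch size from 10 to 1.
    text = text.replace("dim: 10", "dim: 1")
    # Rename the "layers {" block keyword to "layer {".
    text = text.replace("layers {", "layer {")
    return text


# Hypothetical fragment, made up for illustration only.
sample = 'input_dim: 10\ninput_dim: 3\nlayers {\n  name: "conv1"\n}\n'
patched = patch_prototxt(sample)
print(patched)
```

Because this is a plain substring replacement, it would also match longer numbers such as `dim: 100`, so it is worth checking the patched file when adapting the code to another model.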

## Setup preprocessing steps with Preprocessing API and perform inference

Intuitively, the Preprocessing API consists of the following parts:
- Tensor - declares the user data format, such as shape, layout, precision, and color format of the actual user data.
- Steps - describes the sequence of preprocessing steps to apply to the user data.
- Model - specifies the model data format. Usually, precision and shape are already known for the model, so only additional information, such as layout, can be specified.

Graph modifications of a model must be performed after the model is read from a drive and before it is loaded on the actual device.

The Preprocessing API supports the following operations (for more details, see [here](https://docs.openvino.ai/latest/classov_1_1preprocess_1_1PreProcessSteps.html#doxid-classov-1-1preprocess-1-1-pre-process-steps-1aeacaf406d72a238e31a359798ebdb3b7)):
- Mean/Scale Normalization
- Converting Precision
- Converting layout (transposing)
- Resizing Image
- Color Conversion
- Custom Operations
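
As a rough picture of what mean/scale normalization computes, the sketch below applies `(x - mean) / scale` per channel with NumPy. It is only an illustration of the arithmetic, not the Preprocessing API itself. The mean values are the ones used in this tutorial, while the scale values are made up (the tutorial applies no scaling).

```python
import numpy as np

# A tiny 1x2x2x3 "image" in NHWC layout with u8 precision.
image = np.full((1, 2, 2, 3), 128, dtype=np.uint8)

mean = np.array([104.0, 117.0, 123.0], dtype=np.float32)  # per-channel means from this tutorial
scale = np.array([2.0, 2.0, 2.0], dtype=np.float32)       # hypothetical per-channel scale

# Convert precision first, then normalize. Broadcasting applies the
# three values along the last (channel) axis.
normalized = (image.astype(np.float32) - mean) / scale
print(normalized[0, 0, 0])
```

Converting to floating point before subtracting matters: unsigned 8-bit arithmetic would wrap around for pixel values below the mean.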

### Load the model

In [63]:
core = Core()
ppp_model = core.read_model(model=ir_path)

### Create PrePostProcessor Object

The [PrePostProcessor()](https://docs.openvino.ai/latest/classov_1_1preprocess_1_1PrePostProcessor.html#doxid-classov-1-1preprocess-1-1-pre-post-processor) class allows specifying preprocessing and postprocessing steps for a model.

In [64]:
from openvino.preprocess import PrePostProcessor

ppp = PrePostProcessor(ppp_model)

### Declare User’s Data Format

To address a particular input of a model/preprocessor, use the PrePostProcessor.input(input_name) method. If the model has only one input, a plain PrePostProcessor.input() returns a reference to the preprocessing builder for this input (a tensor, the steps, a model). In general, when a model has multiple inputs/outputs, each one can be addressed by a tensor name or by its index.

The following input information is specified:
- Precision is U8 (unsigned 8-bit integer).
- Data is a tensor with the {1,577,800,3} shape.
- Layout is “NHWC”, which means: batch=1, height=577, width=800, channels=3.
- Color format is BGR.

The height/width information is necessary for resizing, and the number of channels is needed for mean/scale normalization.

In [None]:
from openvino.runtime import Type, Layout
from openvino.preprocess import ColorFormat

# Read image to check the image format
image = cv2.imread(image_path)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB));
print(f"The shape of the image is {image.shape}")
print(f"The data type of the image is {image.dtype}")

# Set up the format of the user data
ppp.input().tensor().set_element_type(Type.u8)\
                    .set_shape([1, 577, 800, 3])\
                    .set_layout(Layout('NHWC')) \
                    .set_color_format(ColorFormat.BGR)
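
The declared tensor shape has to match the array that is later passed to inference. The sketch below shows where `[1, 577, 800, 3]` comes from. A zero-filled array stands in for the result of `cv2.imread` (an assumption, so that the example runs without the image file).

```python
import numpy as np

# Stand-in for cv2.imread(image_path): OpenCV returns an HWC, BGR, uint8 array.
image = np.zeros((577, 800, 3), dtype=np.uint8)

# Add the batch dimension in front to get NHWC.
batched = np.expand_dims(image, 0)

tensor_shape = list(batched.shape)
print(tensor_shape, batched.dtype)
```

Deriving the shape from the loaded array instead of hard-coding it is one way to avoid a mismatch when a different image is used.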

### Declaring Model Layout

Model input already has information about precision and shape. Preprocessing API is not intended to modify this. The only thing that may be specified is input data [layout](https://docs.openvino.ai/latest/openvino_docs_OV_UG_Layout_Overview.html#doxid-openvino-docs-o-v-u-g-layout-overview).

In [None]:
input_layer_ir = next(iter(ppp_model.inputs))
print(f"The input shape of the model is {input_layer_ir.shape}")

ppp.input().model().set_layout(Layout('NCHW'))
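
To see what the difference between the tensor layout and the model layout implies, the sketch below derives the axis permutation between two layout strings. The helper `layout_permutation` is hypothetical and only illustrates the idea. It is not how the PrePostProcessor is implemented.

```python
def layout_permutation(src: str, dst: str) -> tuple:
    """Return the axis order that rearranges a tensor from layout `src` to `dst`."""
    if sorted(src) != sorted(dst):
        raise ValueError(f"Layouts {src!r} and {dst!r} use different dimensions")
    return tuple(src.index(axis) for axis in dst)


perm = layout_permutation("NHWC", "NCHW")
nhwc_shape = (1, 577, 800, 3)
nchw_shape = tuple(nhwc_shape[i] for i in perm)
print(perm, nchw_shape)
```

The resulting tuple has the form expected by `numpy.transpose`, so the same permutation could be applied to an array by hand.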

### Preprocessing Steps

Now, the sequence of preprocessing steps can be defined.

Perform the following:
- Convert U8 to FP32 precision.
- Resize to height/width of a model. Be aware that if a model accepts dynamic size e.g., {?, 3, ?, ?}, resize will not know how to resize the picture. Therefore, in this case, target height / width should be specified as PreProcessSteps.resize( ResizeAlgorithm, destination_height, destination_width). For more details, see also the [PreProcessSteps.resize()](https://docs.openvino.ai/latest/classov_1_1preprocess_1_1PreProcessSteps.html#doxid-classov-1-1preprocess-1-1-pre-process-steps-1a40dab78be1222fee505ed6a13400efe6).
- Subtract mean from each channel.

The color format can also be converted with PreProcessSteps.convert_color(), and the data can be scaled with PreProcessSteps.scale().

An explicit layout conversion step is not needed: the PrePostProcessor adds it implicitly, because the declared tensor layout (NHWC) differs from the model layout (NCHW).

In [None]:
from openvino.preprocess import ResizeAlgorithm

ppp.input().preprocess().convert_element_type(Type.f32) \
                        .resize(ResizeAlgorithm.RESIZE_LINEAR) \
                        .mean([104.0, 117.0, 123.0])

### Integrating Steps into a Model

Once the preprocessing steps have been defined, the model can finally be built. The PrePostProcessor configuration can be printed for debugging purposes.

In [None]:
print(f'Dump preprocessor: {ppp}')
model_with_preprocess = ppp.build()

## Load model and perform inference

In [73]:
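# Compile the model returned by ppp.build(), which has the preprocessing steps embedded
compiled_model_with_preprocess = core.compile_model(model=model_with_preprocess, device_name=device)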

ppp_output_layer = compiled_model_with_preprocess.output(0)

image = cv2.imread(image_path)
ppp_input_tensor = np.expand_dims(image, 0)

ppp_result = compiled_model_with_preprocess(ppp_input_tensor)[ppp_output_layer][0]

## Fit image manually and perform inference

### Load the model

In [74]:
model = core.read_model(model=ir_path)
compiled_model = core.compile_model(model=model, device_name=device)

### Load image and fit it to model input

In [None]:
def manual_image_preprocessing(path_to_image, compiled_model):
    input_layer_ir = next(iter(compiled_model.inputs))

    # Read image in BGR format.
    image = cv2.imread(path_to_image)

    # N, C, H, W = batch size, number of channels, height, width.
    N, C, H, W = input_layer_ir.shape

    # Resize image to the input size expected by the model.
    resized_image = cv2.resize(image, (W, H))

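    # Change the data type.
    dtype_changed_image = np.float32(resized_image)

    # Perform mean normalization. Keep the mean values in float32 so that
    # the result is not promoted to float64.
    mean_values = np.array([104, 117, 123], dtype=np.float32)
    preprocessed_image = dtype_changed_image - mean_values

    # Convert HWC to CHW and add the batch dimension.
    input_tensor = np.expand_dims(preprocessed_image.transpose(2, 0, 1), 0)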

    return input_tensor


input_tensor = manual_image_preprocessing(image_path, compiled_model)
print(f"The shape of the image is {input_tensor.shape}")
print(f"The data type of the image is {input_tensor.dtype}")
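
When preparing the image by hand, the data type of the mean values affects the data type of the result. The sketch below shows NumPy's promotion behaviour on a small zero-filled array (a stand-in for the resized image).

```python
import numpy as np

image_f32 = np.zeros((2, 2, 3), dtype=np.float32)

# Integer mean values: NumPy promotes float32 minus a default integer array to float64.
mean_int = np.array([104, 117, 123])
promoted = image_f32 - mean_int

# float32 mean values keep the result in float32.
mean_f32 = np.array([104, 117, 123], dtype=np.float32)
kept = image_f32 - mean_f32

print(promoted.dtype, kept.dtype)
```

A float64 tensor takes twice the memory of a float32 one, so checking the printed data type is a cheap sanity test.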

### Perform inference 

In [85]:
output_layer = compiled_model.output(0)

result = compiled_model(input_tensor)[output_layer]

## Compare results

### Compare results on one image

In [None]:
def results(input_tensor, compiled_model):
    output_layer = compiled_model.output(0)

    # Run inference and take the scores for the single image in the batch.
    scores = compiled_model(input_tensor)[output_layer][0]

    # Indices of the five highest scores, best first.
    top_indices = np.argsort(scores)[-5:][::-1]
    top_softmax = scores[top_indices]

    return top_indices, top_softmax


# Convert the inference result to a class name.
imagenet_classes = Path("../data/datasets/imagenet/imagenet_2012.txt").read_text().splitlines()
imagenet_classes = ['background'] + imagenet_classes

# get result for inference with preprocessing api
top_indices, top_softmax = results(ppp_input_tensor, compiled_model_with_preprocess)

print("Result of inference with preprocessing api:")
for index, softmax_probability in zip(top_indices, top_softmax):
    print(f"{imagenet_classes[index]}, {softmax_probability:.5f}")

print("\n")
# get result for inference with the manual preparing of the image
top_indices, top_softmax = results(input_tensor, compiled_model)

print("Result of inference with manual image setup:")
for index, softmax_probability in zip(top_indices, top_softmax):
    print(f"{imagenet_classes[index]}, {softmax_probability:.5f}")
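
The top-5 selection relies on the `np.argsort(...)[-k:][::-1]` idiom. The sketch below runs it on a short, made-up score vector so that the indexing is easier to follow.

```python
import numpy as np

# Made-up scores for six classes, for illustration only.
scores = np.array([0.05, 0.40, 0.10, 0.25, 0.15, 0.05])

k = 3
# argsort sorts in ascending order: keep the last k indices,
# then reverse them to get the best score first.
top_indices = np.argsort(scores)[-k:][::-1]
top_scores = scores[top_indices]
print(top_indices, top_scores)
```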

### Compare performance

In [None]:
# check performance for inference with preprocessing api
num_images = 1000

start = time.perf_counter()

for _ in range(num_images):
    compiled_model_with_preprocess(ppp_input_tensor)

end = time.perf_counter()
time_ir = end - start

print(
    f"IR model in OpenVINO Runtime/{device} with preprocessing API: {time_ir/num_images:.4f} "
    f"seconds per image, FPS: {num_images/time_ir:.2f}"
)

# Check performance for inference with manual image preparation.
# Note: manual_image_preprocessing also re-reads the image from disk on every iteration.
start = time.perf_counter()

for _ in range(num_images):
    input_tensor = manual_image_preprocessing(image_path, compiled_model)
    compiled_model([input_tensor])

end = time.perf_counter()
time_ir = end - start

print(
    f"IR model in OpenVINO Runtime/{device}: {time_ir/num_images:.4f} "
    f"seconds per image, FPS: {num_images/time_ir:.2f}"
)
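
The two timing loops follow the same pattern, which could be factored into a small helper. The following is a minimal sketch: the `benchmark` function and its warm-up parameter are assumptions added for illustration, and a trivial workload stands in for the inference call.

```python
import time


def benchmark(fn, num_iterations=100, warmup=5):
    """Return (seconds per call, calls per second) for a zero-argument callable."""
    # Warm-up calls are excluded from the measurement.
    for _ in range(warmup):
        fn()

    start = time.perf_counter()
    for _ in range(num_iterations):
        fn()
    elapsed = time.perf_counter() - start

    return elapsed / num_iterations, num_iterations / elapsed


# Stand-in workload. In the tutorial this would wrap an inference call.
seconds_per_call, fps = benchmark(lambda: sum(range(10_000)), num_iterations=50)
print(f"{seconds_per_call:.6f} seconds per call, FPS: {fps:.2f}")
```

In this notebook the helper could be called as `benchmark(lambda: compiled_model_with_preprocess(ppp_input_tensor))`, which would measure both variants in exactly the same way.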