<a id="top"></a>
# Object Detection Tutorial

## Introduction

This tutorial examines a sample application built with the [Intel® Distribution of Open Visual Inference & Neural Network Optimization (OpenVINO™) toolkit](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html). It walks step by step through object detection on an image: a pretrained network is loaded and run with the OpenVINO™ Runtime.

In computer vision, object detection is the task of finding objects in an image and locating them.

The tutorial guides you through the following steps:

1. [Obtain Required Modules](#1.-Obtain-Required-Modules) 
2. [_Optional_. Download and convert a pretrained model from the Open Model Zoo](#2.-Optional.-Download-and-Convert-a-Pretrained-Model-from-the-Open-Model-Zoo)
3. [Configure inference: path to a model and other data](#3.-Configure-an-Inference)
4. [Initialize the OpenVINO™ runtime](#4.-Initialize-the-OpenVINO™-Runtime)
5. [Read the model](#5.-Read-the-Model)
6. [Make the model executable](#6.-Make-the-Model-Executable)
7. [Prepare an image for model inference](#7.-Prepare-an-Image-for-Model-Inference)
8. [Infer the model](#8.-Infer-the-Model)
9. [Show predictions](#9.-Show-Predictions)

### 1. Obtain Required Modules
Install the required modules on your system:

In [None]:
%%bash 
python3 -m pip install -r requirements.txt

Import the Python* modules that you will use in the sample code:
- [os](https://docs.python.org/3/library/os.html#module-os) is a standard Python module used for filename parsing.
- [cv2](https://docs.opencv.org/trunk/) is an OpenCV module used to work with images.
- [time](https://docs.python.org/3/library/time.html#module-time) is a standard Python module used to measure execution time.
- [NumPy](http://www.numpy.org/) is an array manipulation module used to process images as arrays.
- [Deep Learning OpenVINO™ Runtime](https://docs.openvino.ai/latest/openvino_docs_OV_Runtime_User_Guide.html) is an OpenVINO™ Python API module used for inference.
- [Matplotlib](https://matplotlib.org/) is a visualization module used to display output images.

Run the cell below to import the modules. 

In [None]:
import os
import cv2
import time
from openvino.runtime import Core
from matplotlib import pyplot as plt

%matplotlib inline

### 2. _Optional_. Download and Convert a Pretrained Model from the Open Model Zoo

> **NOTE**: If you already imported a model in the DL Workbench, skip this step and proceed to [configuring an inference](#3.-Configure-an-Inference).

OpenVINO™ toolkit includes the [Model Optimizer](https://docs.openvino.ai/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) used to convert and optimize trained models into Intermediate Representation (IR) model files, and the  [OpenVINO Runtime](https://docs.openvino.ai/latest/openvino_docs_OV_Runtime_User_Guide.html), which uses the IR model files to run an inference on hardware devices. The IR model files are created from models trained in popular frameworks, like Caffe\*, TensorFlow\*, and others. 

OpenVINO™ [Model Downloader](https://docs.openvino.ai/latest/omz_tools_downloader.html) downloads common inference models from the [Intel® Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo). 

Before downloading a model, configure a Python* environment for converting models from the TensorFlow* framework. To do this, create a new virtual environment and install the required packages.

In [None]:
%%bash
python3 -m virtualenv /tmp/virtualenvs/tutorial_object_detection
source /tmp/virtualenvs/tutorial_object_detection/bin/activate

python -m pip install --upgrade pip
pip uninstall openvino openvino_dev -y
pip install --upgrade openvino-dev[tensorflow]==2022.3.0

Let's download the `ssd_mobilenet_v1_coco` model first.

In [None]:
%%bash 
source /tmp/virtualenvs/tutorial_object_detection/bin/activate

omz_downloader --name ssd_mobilenet_v1_coco -o raw_model

The next step is to convert the model into the OpenVINO™ IR format.

In [None]:
%%bash 
source /tmp/virtualenvs/tutorial_object_detection/bin/activate

omz_converter \
    --name ssd_mobilenet_v1_coco \
    -d raw_model \
    -o model \
    --precision FP32

### 3. Configure an Inference

Once you have the OpenVINO™ IR of your model, you can start experimenting with it by running inference and inspecting the output.

> **NOTE**: If you have the model imported in DL Workbench, copy the paths to the `.xml` and `.bin` files from the DL Workbench UI and paste them below.

#### Required Parameters

Parameter| Explanation
---|---
**model_xml**| Path to the `.xml` file of OpenVINO™ IR of your model
**model_bin**| Path to the `.bin` file of OpenVINO™ IR of your model

In [None]:
# Model IR files
model_xml = "model/public/ssd_mobilenet_v1_coco/FP32/ssd_mobilenet_v1_coco.xml"
model_bin = "model/public/ssd_mobilenet_v1_coco/FP32/ssd_mobilenet_v1_coco.bin"
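
Before moving on, it can help to confirm that both IR files exist at the configured paths. The following is a minimal sketch that uses only the standard library. The helper name `find_missing_files` is hypothetical and is not part of the OpenVINO™ API, and the paths are assumed to be relative to the notebook directory.

```python
from pathlib import Path


def find_missing_files(*paths):
    """Return the subset of the given paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Paths as configured in this tutorial (assumed to be relative to the notebook)
model_xml = "model/public/ssd_mobilenet_v1_coco/FP32/ssd_mobilenet_v1_coco.xml"
model_bin = "model/public/ssd_mobilenet_v1_coco/FP32/ssd_mobilenet_v1_coco.bin"

missing = find_missing_files(model_xml, model_bin)
if missing:
    print("Missing IR files:", ", ".join(missing))
else:
    print("Both IR files were found.")
```

If files are reported as missing, re-run the download and conversion step or correct the paths before reading the model.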

#### Optional Parameters

Experiment with the optional parameters after you go through the full workflow of the tutorial.

Parameter| Explanation
---|---
**input_image_path**| Path to an input image. Use the `car.bmp` image placed in the directory of the notebook or, if you have imported a dataset in the DL Workbench, copy the path to an image in the dataset.
**device**| Specify the [target device](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Select_Environment.html) to infer on: CPU, GPU, or MYRIAD. The device must be present on your system. For this tutorial, use `CPU`, which is known to be present.
**labels_path**| Path to the annotations file that maps the integers predicted by the model to strings. For example: `7: car`
**prob_threshold**| Probability threshold to filter detection results

In [None]:
# Input image file. 
# Copy the path to one of the images from the dataset imported in DL Workbench
# or use the default image "./car.bmp".
input_image_path = "car.bmp"

# Device to use
device = "CPU"

# Output labels file path or an empty string
labels_path = "labels.txt"

# Minimum probability threshold to detect an object
prob_threshold = 0.5

print(
    "Configuration parameters settings:"
    f"\n\tmodel_xml={model_xml}",
    f"\n\tmodel_bin={model_bin}",
    f"\n\tinput_image_path={input_image_path}",
    f"\n\tdevice={device}", 
    f"\n\tlabels_path={labels_path}", 
    f"\n\tprob_threshold={prob_threshold}",
)
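
Because the selected device must be present, a small guard can fall back to `CPU` when the requested device is not available. The sketch below is an illustration under stated assumptions: `pick_device` is a hypothetical helper, and the list of available devices is hard-coded so that the example runs without OpenVINO™ installed. In a real session, the list could be taken from the `available_devices` property of the `Core` object.

```python
def pick_device(requested, available, fallback="CPU"):
    """Return `requested` if it is in `available`, otherwise `fallback`."""
    if requested in available:
        return requested
    print(f"Device {requested} is not available, falling back to {fallback}.")
    return fallback


# Hard-coded for illustration; an assumption, not a real query result
available_devices = ["CPU"]

device = pick_device("GPU", available_devices)
print(f"Selected device: {device}")
```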

### 4. Initialize the OpenVINO™ Runtime

Once the parameters are defined, initialize the `Core` object, which provides access to the OpenVINO™ Runtime capabilities.

In [None]:
# Create an OpenVINO™ Runtime instance
core = Core()

### 5. Read the Model

Load the IR of your model into memory.

In [None]:
# Read the network from IR files
model = core.read_model(model=model_xml, weights=model_bin)

### 6. Make the Model Executable

Reading a network is not enough to start inference. The model must first be loaded to an abstraction that represents a particular accelerator. In OpenVINO™, this abstraction is called a *plugin*. A network loaded to a plugin becomes executable and is inferred in one of the next steps.

After loading, keep the necessary model information, such as `input_name`, and record the input dimensions of your model. The code below reads them in `n, h, w, c` order, so the channel dimension comes last:
- `n` - input batch size
- `h` - input image height
- `w` - input image width
- `c` - number of input channels. Often, it is `1` or `3`, which means that the model expects either a grayscale or a color image.

In [None]:
compiled_model = core.compile_model(model=model, device_name=device)

# Store the input name
input_name = model.input().any_name

# Read the input dimensions in NHWC order: n=batch size, h=height, w=width, c=number of channels
n, h, w, c = model.input().get_shape()
print(f"Loaded the model into the OpenVINO Runtime for the {device} device.", 
      f"\nModel input dimensions: n={n}, c={c}, h={h}, w={w}")
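
The order of the values in an input shape depends on the model layout. The sketch below names the dimensions for two common layouts, `NHWC` (the order unpacked in the code above) and `NCHW`. The helper `named_dims` is hypothetical, and the example shapes are illustrative values rather than output read from a model.

```python
def named_dims(shape, layout="NHWC"):
    """Map a 4-element input shape to a dict keyed by dimension name."""
    if len(shape) != len(layout):
        raise ValueError(f"Shape {shape} does not match layout {layout}")
    return {name.lower(): int(size) for name, size in zip(layout, shape)}


# Illustrative shapes (assumed values, not read from a model)
nhwc = named_dims([1, 300, 300, 3], layout="NHWC")
nchw = named_dims([1, 3, 300, 300], layout="NCHW")

print(nhwc)
print(nchw)
```

Looking up dimensions by name instead of by position makes it harder to confuse height, width, and channels when switching between models with different layouts.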

### 7. Prepare an Image for Model Inference

Now let's read the input image and prepare it by resizing it to the input dimensions of the model.

In [None]:
# Define the function to load the input image
def load_input_image(input_path):
    # Globals to store input width and height
    global input_w, input_h
    
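    # Use OpenCV to load the input image; cv2.imread returns None on failure
    img = cv2.imread(input_path)
    if img is None:
        raise FileNotFoundError(f"Could not read the input image: {input_path}")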
    
    input_h, input_w, *_ = img.shape
    print(f"Loaded the input image {input_path}. \nInput image resolution: {input_w}x{input_h}")
    
    return img

# Define the function to resize the input image
def resize_input_image(image):
    # Resize the image to the model input size (w x h)
    in_frame = cv2.resize(image, (w, h))
    # Reshape to input dimensions
    in_frame = in_frame.reshape((n, h, w, c))
    print(f"Resized the input image to {w}x{h}.")
    return in_frame

# Load the image
image = load_input_image(input_image_path)

# Resize the input image
in_frame = resize_input_image(image)

# Display the input image
print("Input image:")
plt.axis("off")
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
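
The `reshape` call above works because OpenCV returns images in height, width, channels (HWC) order and the batch size is 1. As a minimal sketch, with a synthetic array standing in for a resized image, the batch dimension can also be added with `np.expand_dims`, and a model that expects `NCHW` input would additionally need a transpose. NumPy is assumed to be installed, and the 300x300 size is an illustrative assumption.

```python
import numpy as np

# Synthetic stand-in for a resized BGR image in HWC order (assumed 300x300x3)
frame_hwc = np.zeros((300, 300, 3), dtype=np.uint8)

# Add the batch dimension: HWC -> NHWC
frame_nhwc = np.expand_dims(frame_hwc, axis=0)

# For a model that expects NCHW input, move channels before height and width
frame_nchw = frame_nhwc.transpose(0, 3, 1, 2)

print(frame_nhwc.shape)
print(frame_nchw.shape)
```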

### 8. Infer the Model

Now that the input image is in the BGR format and has the right size, you can run inference on the model.

In [None]:
# Save the starting time
inf_start = time.time()

# Run the inference
res = compiled_model.infer_new_request({input_name: in_frame})   

# Calculate the time from the start until now
inf_time = time.time() - inf_start
print(f"Inference is complete. Run time: {inf_time * 1000:.3f} ms.")
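
A single measurement can include one-time costs and may vary between runs. The sketch below averages several runs with `time.perf_counter` after an untimed warm-up call. `average_run_time_ms` is a hypothetical helper, and `fake_infer` is a placeholder workload that stands in for the inference call so that the example runs on its own.

```python
import time


def average_run_time_ms(fn, runs=10, warmup=1):
    """Call fn() `warmup` times untimed, then return the mean time of `runs` calls in ms."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000 / runs


def fake_infer():
    # Placeholder workload (assumption); replace with the real inference call
    return sum(i * i for i in range(10_000))


print(f"Average run time: {average_run_time_ms(fake_infer):.3f} ms")
```

To time the model from this tutorial, the placeholder could be swapped for a function that wraps the inference call shown above.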

### 9. Show Predictions

The next step is to parse the inference results and draw boxes over the objects detected in the image.

The result of model inference (`res`) is an array of predictions. Each prediction `obj` has the following structure:

- `obj[0]`: ID of the image in the batch
- `obj[1]`: class ID, or the type of the detected object
- `obj[2]`: confidence level that the detected object is an instance of the predicted class
- `obj[3]`: minimum x coordinate of the bounding box, normalized to the image width
- `obj[4]`: minimum y coordinate of the bounding box, normalized to the image height
- `obj[5]`: maximum x coordinate of the bounding box, normalized to the image width
- `obj[6]`: maximum y coordinate of the bounding box, normalized to the image height
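
To make the indices above concrete, the following sketch converts one prediction row into named fields and scales the box to pixel coordinates. The `Detection` type, the `parse_detection` helper, and the sample row are hypothetical. The coordinates are assumed to be normalized to the image size, as in the postprocessing code of this tutorial.

```python
from typing import NamedTuple, Sequence


class Detection(NamedTuple):
    class_id: int
    confidence: float
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def parse_detection(obj: Sequence[float], image_w: int, image_h: int) -> Detection:
    """Convert one prediction row into named fields with pixel coordinates."""
    # Index 0 of the row is not used by this sketch
    return Detection(
        class_id=int(obj[1]),
        confidence=float(obj[2]),
        xmin=int(obj[3] * image_w),
        ymin=int(obj[4] * image_h),
        xmax=int(obj[5] * image_w),
        ymax=int(obj[6] * image_h),
    )


# Made-up prediction row for illustration only
sample = [0, 3, 0.9, 0.25, 0.5, 0.75, 1.0]
det = parse_detection(sample, image_w=640, image_h=480)
print(det)
```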

For each detected object, the output from the model includes an integer that indicates which type of object, such as a car or a human, has been detected. To translate the integer into a more readable text string, use a label mapping file. The label mapping file is a text file of the format `n: string` (for example, `3: car`) that is loaded into a lookup table to be used later when labeling detected objects.
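
As a hedged illustration of the `n: string` format, the sketch below parses such lines into a dictionary keyed by class ID. It is an alternative to the list-based loader used in this tutorial, not a replacement for it. `parse_label_lines` and the sample lines are hypothetical.

```python
def parse_label_lines(lines):
    """Parse lines of the form 'n: string' into a {class_id: name} dict."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip empty lines
        class_id, _, name = line.partition(":")
        labels[int(class_id)] = name.strip()
    return labels


# Sample lines for illustration only (not the contents of labels.txt)
sample_lines = ["1: person", "2: bicycle", "3: car"]
labels = parse_label_lines(sample_lines)

# Fall back to the number when a class ID has no name
print(labels.get(3, "3"))
print(labels.get(42, "42"))
```

A dictionary lookup does not depend on the line order in the file, which can be convenient when class IDs are not contiguous.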

After postprocessing, every detected object in the image is bounded by a box labeled with its class ID and confidence level. To show class names instead of class IDs, you need a label mapping file. A sample label mapping file named `labels.txt` is located in the current directory.

<b>Note: Postprocessing is implemented only for the example `ssd_mobilenet_v1_coco` object detection model. If you use another model for this tutorial, rewrite the `process_results` function and, optionally, the `load_labels_map` function. You can find object detection model postprocessing examples in the OpenVINO samples</b>:

- [Object Detection Python* Demo](https://docs.openvino.ai/latest/omz_demos_object_detection_demo_python.html)
- [Object Detection SSD Python* Sample](https://docs.openvino.ai/latest/openvino_inference_engine_ie_bridges_python_sample_object_detection_sample_ssd_README.html)


In [None]:
def load_labels_map():
    labels_map = None
    # If there is a path to a label mapping file, load the file into labels_map
    if os.path.isfile(labels_path):
        with open(labels_path, 'r') as f:
            labels_map = [x.split(sep=' ', maxsplit=1)[-1].strip() for x in f]
        print(f"Loaded label mapping file [{labels_path}]")
    else:
        print("No label mapping file has been loaded, only numbers will be used",
              "for detected object labels.")
    return labels_map

# Create a function to process inference results
def process_results(result):
    # Get output results
    res = result[compiled_model.output()]
    
    # Load the names of the classes from the labels_path file if possible
    labels_map = load_labels_map()
    
    # Loop through all possible results
    for obj in res[0][0]:
        # If probability is more than the specified threshold, draw and label the box 
        if obj[2] > prob_threshold:
            # Get coordinates of the box containing the detected object
            xmin = int(obj[3] * input_w)
            ymin = int(obj[4] * input_h)
            xmax = int(obj[5] * input_w)
            ymax = int(obj[6] * input_h)
            
            # Get the type of the object detected
            class_id = int(obj[1])
            
            # Draw the box and label for the detected object
            color = (min(class_id * 12.5, 255), 255, 255)
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 4)
            # Fall back to the numeric class ID if no label is available for it
            det_label = labels_map[class_id] if labels_map and class_id < len(labels_map) else str(class_id)
            cv2.putText(image, det_label + ' ' + str(round(obj[2] * 100, 1)) + ' %', (xmin, ymin - 7),
                        cv2.FONT_HERSHEY_COMPLEX, 1, color, 2)

process_results(res)

# Convert colors from BGR to RGB
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Disable axis display, then display the image
plt.axis("off")
plt.imshow(image)
print("Processed the image and displayed the inference output result.")
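
Predicted coordinates may fall slightly outside the image, and a label drawn at `ymin - 7` can end up above the top edge. As a minimal sketch, the helpers below clamp a box to the image bounds and keep the label baseline inside the image. `clamp_box` and `label_origin` are hypothetical helpers, and the 20-pixel minimum is an assumed value.

```python
def clamp_box(xmin, ymin, xmax, ymax, image_w, image_h):
    """Clamp box corners to the valid pixel range of the image."""
    xmin = max(0, min(xmin, image_w - 1))
    xmax = max(0, min(xmax, image_w - 1))
    ymin = max(0, min(ymin, image_h - 1))
    ymax = max(0, min(ymax, image_h - 1))
    return xmin, ymin, xmax, ymax


def label_origin(xmin, ymin, offset=7, min_y=20):
    """Return a text origin above the box, kept at least min_y pixels from the top."""
    return xmin, max(ymin - offset, min_y)


# Made-up box that extends past the image borders
box = clamp_box(-5, -3, 700, 470, image_w=640, image_h=480)
print(box)
print(label_origin(box[0], box[1]))
```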

Congratulations! You can now import the model into the DL Workbench or, if you have already done that, start exploring its features, such as:

* [Analyze how the model works and its quality](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Visualize_Accuracy.html)
* [Perform a baseline inference and analyze model performance](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Single_Inference.html)
* [Boost the model by calibrating it to the INT8 precision](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Int_8_Quantization.html)
* [Tune the performance of the model by selecting optimal inference parameters](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Range_of_Inferences.html)
* [Prepare the model for deployment](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Deploy_and_Integrate_Performance_Criteria_into_Application.html)