# Vehicle Detection And Recognition with OpenVINO™

This tutorial demonstrates how to use two pre-trained models from [Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo): [vehicle-detection-0200](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/vehicle-detection-0200) for object detection and [vehicle-attributes-recognition-barrier-0039](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/vehicle-attributes-recognition-barrier-0039) for image classification. Using these models, you will detect vehicles from raw images and recognize attributes of detected vehicles.
![flowchart](https://user-images.githubusercontent.com/47499836/157867076-9e997781-f9ef-45f6-9a51-b515bbf41048.png)

As a result, you can get:

![result](https://user-images.githubusercontent.com/47499836/157867020-99738b30-62ca-44e2-8d9e-caf13fb724ed.png)


#### Table of contents:

- [Imports](#Imports)
- [Download Models](#Download-Models)
- [Load Models](#Load-Models)
    - [Get attributes from model](#Get-attributes-from-model)
    - [Helper function](#Helper-function)
    - [Read and display a test image](#Read-and-display-a-test-image)
- [Use the Detection Model to Detect Vehicles](#Use-the-Detection-Model-to-Detect-Vehicles)
    - [Detection Processing](#Detection-Processing)
    - [Recognize vehicle attributes](#Recognize-vehicle-attributes)
        - [Recognition processing](#Recognition-processing)
    - [Combine two models](#Combine-two-models)



## Imports
[back to top ⬆️](#Table-of-contents:)

Import the required modules.

In [None]:
import platform

%pip install -q "openvino>=2023.1.0"

if platform.system() != "Windows":
    %pip install -q "matplotlib>=3.4"
else:
    %pip install -q "matplotlib>=3.4,<3.7"

In [None]:
import os
from pathlib import Path
from typing import Tuple

import cv2
import numpy as np
import matplotlib.pyplot as plt
import openvino as ov

# Fetch `notebook_utils` module
import urllib.request
urllib.request.urlretrieve(
    url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py",
    filename="notebook_utils.py",
)

import notebook_utils as utils

## Download Models
[back to top ⬆️](#Table-of-contents:)

Download pretrained models from https://storage.openvinotoolkit.org/repositories/open_model_zoo. If the model is already downloaded, this step is skipped.

> **Note**: To change the model, replace the name of the model in the code below, for example with `"vehicle-detection-0201"` or `"vehicle-detection-0202"`. Keep in mind that these detection models expect different input image sizes. You can also change the recognition model to `"vehicle-attributes-recognition-barrier-0042"`. The two recognition models were trained with different deep learning frameworks. To change the precision, set the precision value to `"FP32"`, `"FP16"`, or `"FP16-INT8"`. Each precision has a different model size and numerical accuracy.

In [None]:
# A directory where the model will be downloaded.
base_model_dir = Path("model")
# The name of the model from Open Model Zoo.
detection_model_name = "vehicle-detection-0200"
recognition_model_name = "vehicle-attributes-recognition-barrier-0039"
# Selected precision (FP32, FP16, FP16-INT8)
precision = "FP32"

base_model_url = "https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1"

# Build the download URLs and the local paths of the models.
detection_model_url = (
    f"{base_model_url}/{detection_model_name}/{precision}/{detection_model_name}.xml"
)
recognition_model_url = (
    f"{base_model_url}/{recognition_model_name}/{precision}/{recognition_model_name}.xml"
)
detection_model_path = (base_model_dir / detection_model_name).with_suffix('.xml')
recognition_model_path = (base_model_dir / recognition_model_name).with_suffix('.xml')

# Download the detection model.
if not detection_model_path.exists():
    utils.download_file(detection_model_url, detection_model_name + '.xml', base_model_dir)
    utils.download_file(detection_model_url.replace('.xml', '.bin'), detection_model_name + '.bin', base_model_dir)
# Download the recognition model.
if not recognition_model_path.exists():
    utils.download_file(recognition_model_url, recognition_model_name + '.xml', base_model_dir)
    utils.download_file(recognition_model_url.replace('.xml', '.bin'), recognition_model_name + '.bin', base_model_dir)

## Load Models
[back to top ⬆️](#Table-of-contents:)

This tutorial requires a detection model and a recognition model. After downloading the models, initialize OpenVINO Runtime and use `read_model()` to read the network architecture and weights from the `*.xml` and `*.bin` files. Then, compile each model for the selected device with `compile_model()`.

In [None]:
import ipywidgets as widgets

core = ov.Core()

device = widgets.Dropdown(
    options=core.available_devices + ["AUTO"],
    value='AUTO',
    description='Device:',
    disabled=False,
)

device

In [None]:
# Initialize OpenVINO Runtime.
core = ov.Core()


def model_init(model_path: str) -> Tuple:
    """
    Read the network and weights from a file, compile the model
    for the selected device, and get its first input and output nodes.

    :param: model_path: path to the model architecture file (*.xml)
    :returns:
            input_keys: first input node of the compiled model
            output_keys: first output node of the compiled model
            compiled_model: compiled model
    """

    # Read the network and corresponding weights from a file.
    model = core.read_model(model=model_path)
    compiled_model = core.compile_model(model=model, device_name=device.value)
    # Get the first input and output nodes.
    input_keys = compiled_model.input(0)
    output_keys = compiled_model.output(0)
    return input_keys, output_keys, compiled_model

### Get attributes from model
[back to top ⬆️](#Table-of-contents:)

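Use the `shape` attribute of each model's input node to get the expected input size. The code below reads the last two values of each shape as the height and width.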
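
As a small illustration of how such a shape is used, the sketch below unpacks a shape in `[N, C, H, W]` order and builds the `(width, height)` tuple that `cv2.resize` expects. The helper name `resize_target` is hypothetical, and the shape is written out by hand rather than read from a model.

```python
from typing import Sequence, Tuple


def resize_target(nchw_shape: Sequence[int]) -> Tuple[int, int]:
    """Return (width, height) for cv2.resize from an NCHW input shape."""
    _, _, height, width = nchw_shape
    # cv2.resize takes the target size as (width, height), not (height, width).
    return width, height


# Hand-written example shape; in the notebook it would come from the input node.
detection_shape = [1, 3, 256, 256]
print(resize_target(detection_shape))
```

For a square input the order does not change the result, but it matters as soon as the height and width differ.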

In [None]:
# de -> detection
# re -> recognition
# Detection model initialization.
input_key_de, output_keys_de, compiled_model_de = model_init(detection_model_path)
# Recognition model initialization.
input_key_re, output_keys_re, compiled_model_re = model_init(recognition_model_path)

# Get input size - Detection.
height_de, width_de = list(input_key_de.shape)[2:]
# Get input size - Recognition.
height_re, width_re = list(input_key_re.shape)[2:]

### Helper function
[back to top ⬆️](#Table-of-contents:)

The `plt_show()` function displays an image inline.

In [None]:
def plt_show(raw_image):
    """
    Use Matplotlib to show an image inline.

    :param: raw_image: image array in RGB channel order
    """
    plt.figure(figsize=(10, 6))
    plt.axis("off")
    plt.imshow(raw_image)

### Read and display a test image
[back to top ⬆️](#Table-of-contents:)

The input shape of the detection model is `[1, 3, 256, 256]`. Therefore, you need to resize the image to `256 x 256`, move the channels to the first axis, and add a batch dimension with the `expand_dims` function.
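
The layout change can be checked in isolation before touching a real image. The sketch below uses a dummy array in place of a resized image, so the values are placeholders and only the shapes are meaningful.

```python
import numpy as np

# Dummy stand-in for a resized BGR image in OpenCV's HWC layout.
hwc_image = np.zeros((256, 256, 3), dtype=np.uint8)

# Move channels first (HWC -> CHW), then add a leading batch axis (CHW -> NCHW).
chw_image = hwc_image.transpose(2, 0, 1)
nchw_image = np.expand_dims(chw_image, 0)

print(hwc_image.shape, chw_image.shape, nchw_image.shape)
```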

In [None]:
# Load an image.
url = "https://storage.openvinotoolkit.org/data/test_data/images/person-bicycle-car-detection.bmp"
filename = "cars.jpg"
directory = "data"
image_file = utils.download_file(
    url, filename=filename, directory=directory, show_progress=False, silent=True, timeout=30
)
assert Path(image_file).exists()

# Read the image.
image_de = cv2.imread("data/cars.jpg")
# Resize it to [256, 256, 3] (height, width, channels).
resized_image_de = cv2.resize(image_de, (width_de, height_de))
# Move the channels first and add the batch dimension to get [1, 3, 256, 256].
input_image_de = np.expand_dims(resized_image_de.transpose(2, 0, 1), 0)
# Show the image.
plt_show(cv2.cvtColor(image_de, cv2.COLOR_BGR2RGB))

## Use the Detection Model to Detect Vehicles
[back to top ⬆️](#Table-of-contents:)

![pipeline](https://user-images.githubusercontent.com/47499836/157867076-9e997781-f9ef-45f6-9a51-b515bbf41048.png)

As shown in the flowchart, the detection model runs first, and images of the individual vehicles it finds are then sent to the recognition model. Start by running inference with the compiled detection model.

The detection model output has the format `[image_id, label, conf, x_min, y_min, x_max, y_max]`, where:

- `image_id` - ID of the image in the batch
- `label` - predicted class ID (0 - vehicle)
- `conf` - confidence for the predicted class
- `(x_min, y_min)` - coordinates of the top left bounding box corner
- `(x_max, y_max)` - coordinates of the bottom right bounding box corner

Remove the unused dimensions and filter out rows that contain only zeros.
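
The same clean-up can be tried on synthetic data. The sketch below assumes an output of shape `[1, 1, N, 7]`, which is what squeezing axes 0 and 1 in the next cell implies. All values are made up.

```python
import numpy as np

# Synthetic detection output with shape [1, 1, N, 7]; the values are made up.
raw_output = np.zeros((1, 1, 4, 7), dtype=np.float32)
raw_output[0, 0, 0] = [0, 0, 0.9, 0.1, 0.2, 0.4, 0.6]
raw_output[0, 0, 1] = [0, 0, 0.3, 0.5, 0.5, 0.7, 0.9]

# Drop the two leading singleton axes: [1, 1, N, 7] -> [N, 7].
rows = np.squeeze(raw_output, (0, 1))
# Keep only rows that contain at least one non-zero value.
rows = rows[~np.all(rows == 0, axis=1)]

print(rows.shape)
```

Two of the four synthetic rows are all zeros, so only the two filled rows remain.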

In [None]:
# Run inference once and keep the raw output for the comparison with PySDK later.
openvino_detection_result = compiled_model_de([input_image_de])[output_keys_de]
# Remove the singleton dimensions 0 and 1.
boxes = np.squeeze(openvino_detection_result, (0, 1))
# Remove zero only boxes.
boxes = boxes[~np.all(boxes == 0, axis=1)]

### Detection Processing
[back to top ⬆️](#Table-of-contents:)

With the function below, you convert the relative box coordinates to absolute pixel positions in the original image and filter out low-confidence results.
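
A minimal sketch of that conversion, assuming the box corners are normalized to the `[0, 1]` range (which is what multiplying by the image size in `crop_images` implies). The helper names `to_pixels` and `keep_confident` are hypothetical, the detections are made up, and the sketch leaves out the adjustment that `crop_images` applies to boxes at the top of the image.

```python
from typing import List, Sequence


def to_pixels(box: Sequence[float], image_width: int, image_height: int) -> List[int]:
    """Scale normalized (x_min, y_min, x_max, y_max) to integer pixel coordinates."""
    x_min, y_min, x_max, y_max = box
    return [
        int(x_min * image_width),
        int(y_min * image_height),
        int(x_max * image_width),
        int(y_max * image_height),
    ]


def keep_confident(detections: Sequence[Sequence[float]], threshold: float = 0.6) -> list:
    """Keep rows of (conf, x_min, y_min, x_max, y_max) whose confidence exceeds the threshold."""
    return [row for row in detections if row[0] > threshold]


# Made-up detections for a 768 x 432 image.
detections = [(0.9, 0.25, 0.5, 0.5, 0.75), (0.3, 0.1, 0.1, 0.2, 0.2)]
for conf, *corners in keep_confident(detections):
    print(conf, to_pixels(corners, image_width=768, image_height=432))
```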

In [None]:
def crop_images(bgr_image, resized_image, boxes, threshold=0.6) -> list:
    """
    Use bounding boxes from the detection model to find the absolute car positions.

    :param: bgr_image: raw image
    :param: resized_image: resized image
    :param: boxes: detection results, one row per box
    :param: threshold: confidence threshold
    :returns: car_position: list of [x_min, y_min, x_max, y_max] in pixels of the raw image
    """
    # Fetch image shapes to calculate ratio
    (real_y, real_x), (resized_y, resized_x) = bgr_image.shape[:2], resized_image.shape[:2]
    ratio_x, ratio_y = real_x / resized_x, real_y / resized_y

    # Drop image_id and label; keep [conf, x_min, y_min, x_max, y_max]
    boxes = boxes[:, 2:]
    # Store the vehicle's position
    car_position = []
    # Iterate through non-zero boxes
    for box in boxes:
        # Pick the confidence from the first place in the array
        conf = box[0]
        if conf > threshold:
            # Multiply each corner position by the x or y ratio and convert the result to int.
            # If a bounding box is found at the top of the image, move its upper edge
            # a little lower so that it stays visible on the image.
            (x_min, y_min, x_max, y_max) = [
                int(max(corner_position * ratio_y * resized_y, 10)) if idx % 2
                else int(corner_position * ratio_x * resized_x)
                for idx, corner_position in enumerate(box[1:])
            ]

            car_position.append([x_min, y_min, x_max, y_max])
            
    return car_position

In [None]:
# Find the position of a car.
car_position = crop_images(image_de, resized_image_de, boxes)

### Recognize vehicle attributes
[back to top ⬆️](#Table-of-contents:)

Select one of the detected boxes. Then, crop to an area containing a vehicle to test with the recognition model. Again, you need to resize the input image and run inference.
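
The crop relies on NumPy indexing rows (y) before columns (x), which is easy to get backwards. The sketch below shows the order on a dummy image. The image size and the box are arbitrary values chosen for illustration.

```python
import numpy as np

# Dummy 432 x 768 BGR image (height x width x channels); the size is arbitrary.
image = np.zeros((432, 768, 3), dtype=np.uint8)

# A made-up box in pixel coordinates: [x_min, y_min, x_max, y_max].
x_min, y_min, x_max, y_max = 192, 216, 384, 324

# NumPy indexes rows (y) first and columns (x) second.
crop = image[y_min:y_max, x_min:x_max]

print(crop.shape)
```

The crop height is `y_max - y_min` and its width is `x_max - x_min`.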

In [None]:
# Select a vehicle to recognize.
pos = car_position[0]
# Crop the image with [y_min:y_max, x_min:x_max].
test_car = image_de[pos[1]:pos[3], pos[0]:pos[2]]
# Resize the image to input_size.
resized_image_re = cv2.resize(test_car, (width_re, height_re))
input_image_re = np.expand_dims(resized_image_re.transpose(2, 0, 1), 0)
plt_show(cv2.cvtColor(resized_image_re, cv2.COLOR_BGR2RGB))

#### Recognition processing
[back to top ⬆️](#Table-of-contents:)

The recognition model returns one output for the vehicle color (white, gray, yellow, red, green, blue, black) and one for the vehicle type (car, bus, truck, van). Each output holds one probability per attribute value, and the value with the highest probability is taken as the result.
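
Picking the most probable value is an argmax over each output. The sketch below does this in plain Python with made-up probabilities. The helper name `pick_label` is hypothetical, and the label order is taken from the recognition function in this tutorial.

```python
from typing import Sequence

COLORS = ["White", "Gray", "Yellow", "Red", "Green", "Blue", "Black"]
TYPES = ["Car", "Bus", "Truck", "Van"]


def pick_label(probabilities: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label whose probability is the largest."""
    if len(probabilities) != len(labels):
        raise ValueError("Expected one probability per label.")
    best_index = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best_index]


# Made-up probabilities, one value per label.
color_probs = [0.05, 0.10, 0.02, 0.70, 0.03, 0.05, 0.05]
type_probs = [0.80, 0.05, 0.10, 0.05]
print(pick_label(color_probs, COLORS), pick_label(type_probs, TYPES))
```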

In [None]:
def vehicle_recognition(compiled_model_re, input_size, raw_image):
    """
    Vehicle attributes recognition: take a single vehicle image and return its attributes.

    :param: compiled_model_re: compiled recognition model
    :param: input_size: recognition input size as (width, height)
    :param: raw_image: single vehicle image
    :returns: attr_color: predicted color
              attr_type: predicted type
              openvino_recognition_result_color: raw color output
              openvino_recognition_result_type: raw type output
    """
    # Attributes of a vehicle.
    colors = ['White', 'Gray', 'Yellow', 'Red', 'Green', 'Blue', 'Black']
    types = ['Car', 'Bus', 'Truck', 'Van']

    # Resize the image to input size.
    resized_image_re = cv2.resize(raw_image, input_size)
    input_image_re = np.expand_dims(resized_image_re.transpose(2, 0, 1), 0)

    # Run inference once and read both outputs from the same result.
    results = compiled_model_re([input_image_re])
    openvino_recognition_result_color = results[compiled_model_re.output(1)]
    openvino_recognition_result_type = results[compiled_model_re.output(0)]
    # Remove the singleton dimensions 2 and 3.
    predict_colors = np.squeeze(openvino_recognition_result_color, (2, 3))
    predict_types = np.squeeze(openvino_recognition_result_type, (2, 3))

    attr_color, attr_type = (colors[np.argmax(predict_colors)],
                             types[np.argmax(predict_types)])
    return attr_color, attr_type, openvino_recognition_result_color, openvino_recognition_result_type

In [None]:
attr_color, attr_type, openvino_recognition_result_color, openvino_recognition_result_type = vehicle_recognition(
    compiled_model_re, (width_re, height_re), test_car
)
print(f"Attributes:{attr_color, attr_type}")

## PySDK Version

In [None]:
# hw_location: where you want to run inference
#     "@cloud" to use DeGirum cloud
#     "@local" to run on local machine
#     IP address for AI server inference
# model_zoo_url: url/path for model zoo
#     cloud_zoo_url: valid for @cloud, @local, and ai server inference options
#     '': ai server serving models from local folder
#     path to json file: single model zoo in case of @local inference
# model_name: name of the model for running AI inference
# image_source: image source for inference
#     path to image file
#     URL of image
#     PIL image object
#     numpy array
hw_location = "@cloud"
model_zoo_url = "https://cs.degirum.com/degirum/timm_gender_model_test"
vehicle_det_model_name = "vehicle_detection--256x256_float_openvino_cpu_1"
vehicle_rec_model_name = "vehicle_recognition--72x72_float_openvino_cpu_1"
# The test image downloaded earlier in this notebook.
image_source = "data/cars.jpg"

In [None]:
import degirum as dg, degirum_tools

vehicle_det_zoo = dg.connect(hw_location, model_zoo_url, degirum_tools.get_token())
vehicle_rec_zoo = dg.connect(hw_location, model_zoo_url, degirum_tools.get_token())
# Load the vehicle detection AI model and the vehicle recognition AI model.
vehicle_det_model = vehicle_det_zoo.load_model(vehicle_det_model_name, input_image_format="RAW")
vehicle_rec_model = vehicle_rec_zoo.load_model(vehicle_rec_model_name)

In [None]:
pysdk_detection_result = vehicle_det_model(image_source).results[0]["data"]

### Comparing Detection results

In [None]:
np.allclose(openvino_detection_result, pysdk_detection_result)

In [None]:
pysdk_boxes = np.squeeze(pysdk_detection_result, (0, 1))
# Remove zero only boxes.
pysdk_boxes = pysdk_boxes[~np.all(pysdk_boxes == 0, axis=1)]
car_position = crop_images(image_de, resized_image_de, pysdk_boxes)

In [None]:
pos = car_position[0]
# Crop the image with [y_min:y_max, x_min:x_max].
test_car = image_de[pos[1]:pos[3], pos[0]:pos[2]]
# Resize the image to input_size.
rec_H, rec_W = vehicle_rec_model.model_info.InputW[0],vehicle_rec_model.model_info.InputC[0]
resized_image_re = cv2.resize(test_car, (rec_H, rec_W))
input_image_re = np.expand_dims(resized_image_re.transpose(2, 0, 1), 0)
input_image_re = input_image_re.astype(np.float32)  

pysdk_recognition_result = vehicle_rec_model(input_image_re)
pysdk_recognition_result_color = pysdk_recognition_result.results[1]["data"]
pysdk_recognition_result_type = pysdk_recognition_result.results[0]["data"]

### Comparing Recognition results

In [None]:
np.allclose(pysdk_recognition_result_color, openvino_recognition_result_color)

In [None]:
np.allclose(pysdk_recognition_result_type, openvino_recognition_result_type)
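
When one of the `np.allclose` checks above returns `False`, the size of the mismatch is often more useful than the boolean. The sketch below reports the largest absolute difference alongside the match flag. The helper name `compare_outputs` is hypothetical, the tolerances are the NumPy defaults, and the sample values are made up.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-05, atol=1e-08):
    """Return (match, max_abs_diff) for two arrays of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"Shape mismatch: {reference.shape} vs {candidate.shape}")
    # Largest element-wise deviation between the two arrays.
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    match = bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))
    return match, max_abs_diff


# Made-up values that differ only by a tiny rounding error.
match, diff = compare_outputs([0.1, 0.7, 0.2], [0.1, 0.7000001, 0.2])
print(match, diff)
```

Checking the shapes first avoids silent broadcasting, which could otherwise make two differently shaped outputs compare as equal.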