# Person attributes recognition with OpenVINO

This tutorial demonstrates person attributes recognition with the `person-attributes-recognition-crossroad-0230` model in OpenVINO. Model information can be found [here](https://docs.openvino.ai/latest/omz_models_model_person_attributes_recognition_crossroad_0230.html).

  ![ceo](./data/ceo.png)

### Description

This model classifies the attributes of a person in a cropped image. It outputs the probability that each attribute is present, along with the positions of two points on the image that can be used to sample colors (similar to a color picker in a graphical editor).

In [1]:
model_name = "person-attributes-recognition-crossroad-0230"

## Preparation
### Imports

In [2]:
import sys
from pathlib import Path

import cv2
import numpy as np
from IPython.display import HTML, FileLink, Video, clear_output, display
from openvino.runtime import Core

sys.path.append("../utils")

### Settings

Set the model directories and the model precision, which can be either "FP16" or "FP32". The cell also creates the OpenVINO `Core` object and checks whether a "GPU" device is available.
`cv2.VideoWriter_fourcc()` selects the video codec; here it is set to "vp09".

In [3]:
base_model_dir = Path("./model/open_model_zoo_models")
omz_cache_dir = Path("./model/open_model_zoo_cache")
model_dir = Path("./model")
precision = "FP16"
FOURCC = cv2.VideoWriter_fourcc(*"vp09")

# Check whether a GPU device is available on this system
ie = Core()
gpu_available = "GPU" in ie.available_devices
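
The `gpu_available` flag is computed here, but the rest of the notebook compiles the model for the CPU. One way such a check could feed into device selection is sketched below. `pick_device` is a hypothetical helper, not part of OpenVINO, and the device lists are hard-coded so that the snippet runs without the runtime installed.

```python
def pick_device(available_devices, preferred=("GPU", "CPU")):
    """Return the first preferred device that appears in available_devices.

    Indexed names such as "GPU.0" are matched by their prefix.
    """
    for wanted in preferred:
        for device in available_devices:
            if device == wanted or device.startswith(wanted + "."):
                return device
    raise RuntimeError(f"None of {preferred} found in {list(available_devices)}")


# In the notebook, this list would come from Core().available_devices.
print(pick_device(["CPU", "GPU.0"]))
print(pick_device(["CPU"]))
```

The returned name could then be passed as `device_name` when compiling the model, instead of the fixed `"CPU"` string.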

### Download models

Use the `omz_downloader` tool to download the model. The model is called "person-attributes-recognition-crossroad-0230" and comes from Intel's Open Model Zoo for OpenVINO.

In [4]:
# Intel models are distributed in OpenVINO IR format, so no conversion is needed
path_to_model_weights = Path(f'{base_model_dir}/intel/{model_name}/{precision}/{model_name}.bin')

if not path_to_model_weights.is_file():
    download_command = (f"omz_downloader --name {model_name} --output_dir {base_model_dir} --cache_dir {omz_cache_dir}")
    print(download_command)
    ! $download_command
else:
    print("Model has been downloaded")

Model has been downloaded
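
The check above relies on the directory layout that the download produces for Intel models: `<base>/intel/<name>/<precision>/<name>.bin`, with the `.xml` file next to it. A small sketch of building both paths is shown below. `model_files` is a hypothetical helper introduced only for illustration, and the layout is taken from the path used in the cell above.

```python
from pathlib import Path


def model_files(base_dir, model_name, precision):
    """Build the expected .xml and .bin paths for a downloaded Intel model.

    The layout <base>/intel/<name>/<precision>/<name>.* mirrors the path
    used in the download cell.
    """
    model_dir = Path(base_dir) / "intel" / model_name / precision
    return model_dir / f"{model_name}.xml", model_dir / f"{model_name}.bin"


xml_path, bin_path = model_files(
    "./model/open_model_zoo_models",
    "person-attributes-recognition-crossroad-0230",
    "FP16",
)
print(xml_path.as_posix())
print(bin_path.as_posix())
```

Building the file names with an f-string rather than `with_suffix()` keeps the helper correct for model names that contain a dot.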


### Load the Model

Read the downloaded model and compile it for the CPU. The model predicts eight different attributes.
1. Define the attribute names
2. Get the model's input and output layers and print their shapes

In [5]:
ie = Core()
path_to_model = path_to_model_weights.with_suffix(".xml")

# Attribute names, indexed in the same order as the model output values
attrs = [
    "is_male",
    "has_bag",
    "has_backpack",
    "has_hat",
    "has_longsleeves",
    "has_longpants",
    "has_longhair",
    "has_coat_jacket",
]

model = ie.read_model(model=path_to_model)
compiled_model = ie.compile_model(model=model, device_name="CPU")
recognition_output_layer = next(iter(compiled_model.outputs))
recognition_input_layer = next(iter(compiled_model.inputs))

print(f"{recognition_output_layer.shape} is output layer's shape")
print(f"{recognition_input_layer.shape} is input layer's shape")

{1, 8, 1, 1} is output layer's shape
{1, 3, 160, 80} is input layer's shape
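
The output shape shows one score per attribute. A minimal sketch of turning such a `1 x 8 x 1 x 1` array into named flags follows, using the same 0.5 threshold as the processing function in this notebook. `decode_attributes` is a hypothetical helper, and the scores are made up for illustration; real values come from the compiled model.

```python
import numpy as np

ATTRS = [
    "is_male",
    "has_bag",
    "has_backpack",
    "has_hat",
    "has_longsleeves",
    "has_longpants",
    "has_longhair",
    "has_coat_jacket",
]


def decode_attributes(raw_output, names=ATTRS, threshold=0.5):
    """Map a 1 x N x 1 x 1 score array to a {name: bool} dictionary."""
    scores = np.asarray(raw_output, dtype=np.float32).reshape(-1)
    if scores.size != len(names):
        raise ValueError(f"Expected {len(names)} scores, got {scores.size}")
    return {name: bool(score > threshold) for name, score in zip(names, scores)}


# Made-up scores, shaped like the model output.
fake_output = np.array(
    [0.9, 0.1, 0.2, 0.7, 0.95, 0.8, 0.3, 0.6], dtype=np.float32
).reshape(1, 8, 1, 1)
flags = decode_attributes(fake_output)
print(flags)
```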


## Functions

The function below performs image processing and model inference.
It is applied to each frame of the video and returns a new, annotated image.

1. Resize the image and change its channel order
2. Mark attributes that are present and attributes that are absent in different colors
3. Draw the attribute text on the image

In [6]:
def process_image(
    image,
    recognition_output_layer=recognition_output_layer,
    recognition_input_layer=recognition_input_layer,
    attrs=attrs,
):
    N, C, H, W = recognition_input_layer.shape

    # Convert the OpenCV BGR frame to RGB for the network input
    rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize image to meet network expected input sizes
    resized_image = cv2.resize(rgb_image, (W, H))
    # Reshape to network input shape
    input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)
    # Draw on a copy of the original BGR frame, which is what cv2.VideoWriter expects
    output_image = image.copy()
    result = compiled_model([input_image])[recognition_output_layer]

    # Use different colors to indicate whether the target has the attribute
    has_attr = (0, 255, 255)
    no_attr = (255, 0, 255)
    # attribute text height
    text_height = 20

    # Draw each of the 8 attribute names, colored by whether the attribute is present
    for index in range(8):
        if result[0][index] > 0.5:
            color = has_attr
        else:
            color = no_attr
        cv2.putText(
            output_image,
            attrs[index],
            (35, text_height),
            cv2.FONT_HERSHEY_COMPLEX,
            1,
            color,
            2,
        )
        text_height += 40
    return output_image
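
The layout change and the label placement in `process_image` can be checked in isolation with NumPy alone. The sketch below uses a blank array in place of a resized frame; `to_network_layout` and `label_positions` are hypothetical helpers that mirror the transpose and the `text_height` arithmetic used above.

```python
import numpy as np


def to_network_layout(image_hwc):
    """Reorder an H x W x C image to the 1 x C x H x W layout the model expects."""
    return np.expand_dims(image_hwc.transpose(2, 0, 1), 0)


def label_positions(num_labels, x=35, first_y=20, step=40):
    """Return the (x, y) text origin of each label, spaced `step` pixels apart."""
    return [(x, first_y + i * step) for i in range(num_labels)]


# A blank stand-in for a frame already resized to the network's 160 x 80 input.
frame = np.zeros((160, 80, 3), dtype=np.uint8)
blob = to_network_layout(frame)
positions = label_positions(8)

print(blob.shape)
print(positions[0], positions[-1])
```

With these values the last label is drawn at y = 300, so frames shorter than that would lose some of the labels.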

## Load video and play

Use `cv2.VideoCapture()` to read the video frame by frame and run attribute recognition on each frame.


In [7]:
# To use a webcam instead of the video file, pass the device index 0:
# cap = cv2.VideoCapture(0)
# Load the video file
mp4dir = Path("./data/ceo.mp4")
result_video_path = Path("./data/transfer.mp4")
cap = cv2.VideoCapture(str(mp4dir))
top_frame = 0

if not cap.isOpened():
    raise ValueError(f"The video at {mp4dir} cannot be opened.")

ret, image = cap.read()
if not ret:
    raise ValueError(f"The video at {mp4dir} cannot be read.")
input_fps = cap.get(cv2.CAP_PROP_FPS)
target_video_frame_height, target_video_frame_width = image.shape[:2]

# Create result video
out_video = cv2.VideoWriter(
    str(result_video_path),
    FOURCC,
    input_fps,
    (target_video_frame_width, target_video_frame_height),
)

# num_frames = int(4 * input_fps)
# total_frames = cap.get(cv2.CAP_PROP_FRAME_COUNT) if num_frames == 0 else num_frames
# progress_bar = ProgressBar(total=total_frames)
# progress_bar.display()

In [8]:
try:
    while True:
        # Read the next frame
        ret, frame = cap.read()
        # ret is False when no frame could be read
        if not ret:
            print("Can't receive frame (stream end?). Exiting ...")
            break
        top_frame = top_frame + 1
        image = process_image(frame)
        # Uncomment to display the result frame in a window
        # cv2.imshow('frame', image)
        out_video.write(image)
        # Stop after about 200 frames; raise the limit to process more
        if top_frame > 200:
            break
        if cv2.waitKey(1) == ord("q"):
            break
except KeyboardInterrupt:
    print("Processing interrupted.")

finally:
    clear_output()
    out_video.release()
    cap.release()

# The finally block releases the video writer and the capture,
# even if processing is interrupted
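
The frame limit in the loop above can also be expressed as a small generator, which keeps the counting separate from the processing. This is only a sketch: `read_frames` and `FakeCapture` are hypothetical names, and `FakeCapture` stands in for `cv2.VideoCapture` so that the snippet runs without a video file.

```python
def read_frames(capture, max_frames):
    """Yield up to max_frames frames from an object with an OpenCV-style read()."""
    count = 0
    while count < max_frames:
        ok, frame = capture.read()
        if not ok:
            # The stream ended before the limit was reached.
            break
        count += 1
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture that serves a fixed list of frames."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


# Integers play the role of frames here.
frames = list(read_frames(FakeCapture(range(10)), max_frames=4))
short = list(read_frames(FakeCapture(range(2)), max_frames=5))
print(frames, short)
```

With a real capture, the loop body would reduce to `for frame in read_frames(cap, 200): out_video.write(process_image(frame))`.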

## Show the video

In [9]:
if not result_video_path.exists():
    raise ValueError("OpenCV was unable to write the video file.")
else:
    print(f"Showing video saved at\n{result_video_path.resolve()}")
    print(
        "If you cannot see the video in your browser, please click on the "
        "following link to download the video "
    )
    # Create the Video object only after confirming that the file exists
    video = Video(result_video_path, embed=True)
    video_link = FileLink(result_video_path)
    video_link.html_link_str = "<a href='%s' download>%s</a>"
    display(HTML(video_link._repr_html_()))
    display(video)

Showing video saved at
D:\develop\Cpp_project\cnm\openvino_notebooks\notebooks\222-person-attributes-recognition-crossroad\data\transfer.mp4
If you cannot see the video in your browser, please click on the following link to download the video 


## Delete the downloaded model

This block deletes the downloaded Intel model and the generated video.

When you are done with the code above and no longer need these files, you can uncomment and run the code below.

In [10]:
# import os
# import shutil

# # remove the result video and the model directory
# os.remove(result_video_path)
# if os.path.exists(model_dir):
#     shutil.rmtree(model_dir)
# else:
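
A cleanup along these lines could also be written with `pathlib` and `shutil` so that it does not fail when the files are already gone. The sketch below is an assumption about how that might look, not the notebook's own code. `clean_up` is a hypothetical helper, and it is demonstrated on throwaway paths so that nothing from the notebook is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def clean_up(video_path, model_dir):
    """Delete the result video and the model directory if they exist."""
    Path(video_path).unlink(missing_ok=True)  # requires Python 3.8+
    shutil.rmtree(model_dir, ignore_errors=True)


# Build a temporary workspace that imitates the notebook's files.
workspace = Path(tempfile.mkdtemp())
demo_model_dir = workspace / "model"
demo_model_dir.mkdir()
demo_video = workspace / "transfer.mp4"
demo_video.write_bytes(b"")

clean_up(demo_video, demo_model_dir)
clean_up(demo_video, demo_model_dir)  # a second call is a harmless no-op

remaining = sorted(p.name for p in workspace.iterdir())
shutil.rmtree(workspace)
print(remaining)
```

In the notebook, the call would be `clean_up(result_video_path, model_dir)`.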