# Open Model Zoo Object Detection Demo

This demo showcases Object Detection with Sync and Async API.

The Async API can improve the overall frame rate of the application: rather than waiting for inference to complete,
the app can continue working on the host while the accelerator is busy.
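
The effect can be illustrated without any OpenVINO code. The sketch below is a toy model, not the demo's implementation: `fake_infer` and `host_work` are hypothetical stand-ins that simply sleep, and a single worker thread plays the role of the accelerator.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_infer(frame_id, delay=0.05):
    """Hypothetical stand-in for inference on an accelerator."""
    time.sleep(delay)
    return frame_id * 2


def host_work(delay=0.05):
    """Hypothetical stand-in for host-side work, such as decoding a frame."""
    time.sleep(delay)


def run_sync(num_frames):
    # Host work and inference alternate: the host idles during inference
    results = []
    for frame_id in range(num_frames):
        host_work()
        results.append(fake_infer(frame_id))
    return results


def run_async(num_frames):
    # Inference runs in a worker thread while the host keeps working
    results = []
    with ThreadPoolExecutor(max_workers=1) as accelerator:
        for frame_id in range(num_frames):
            pending = accelerator.submit(fake_infer, frame_id)
            host_work()  # overlaps with the pending inference
            results.append(pending.result())
    return results


def timed(func, num_frames):
    """Run func(num_frames) and return its results and the elapsed time in seconds."""
    start = time.perf_counter()
    results = func(num_frames)
    return results, time.perf_counter() - start


sync_results, sync_time = timed(run_sync, 5)
async_results, async_time = timed(run_async, 5)
print(f"sync: {sync_time:.2f} s, async: {async_time:.2f} s")
```

Both variants return the same results. In this toy setup the asynchronous variant should need roughly half the time, because host work and inference overlap.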


Other demo objectives are:

* Video as input support via OpenCV\*
* Visualization of the resulting bounding boxes

See the [Python demo](../python/) for more details about the Async API, and the [Optimization Guide](https://docs.openvinotoolkit.org/latest/_docs_optimization_guide_dldt_optimization_guide.html) for more information on optimizing models.


## Prerequisites

...




<div class="alert alert-warning" style="color:black"><i>
<b>Note: </b>Binder has limited resources. If you run this notebook in Binder, it may suddenly stop. You can reload the page and try a different model, or run the notebook on your own computer.</i></div>

To run this notebook on your own computer:
    
* clone the Open Model Zoo repository to your computer with <span style="font-family: monospace;font-style: normal">git clone https://github.com/helena-intel/open_model_zoo.git</span>
* go to the directory that contains this notebook (demos/object_detection_demo/jupyter-python) and install the requirements in that directory with <span style="font-family: monospace;font-style: normal">pip install -r requirements.txt</span>
* run <span style="font-family: monospace;font-style: normal">jupyter lab</span>

## Imports

In [1]:
import glob
import json
import os.path
import random
import re
import subprocess
import sys
from pathlib import Path
from time import perf_counter

import cv2
import ipywidgets as widgets
import matplotlib.pyplot as plt
from IPython.display import clear_output
from ipywidgets import Layout, fixed, interact, interact_manual
from openvino.inference_engine import IECore

from detection_utils import ColorPalette, download_video, draw_detections, get_model, put_highlighted_text

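# The Open Model Zoo root is three directory levels above the directory of this notebook
open_model_zoo_path = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(os.curdir))))

# Make the shared demo modules (such as `pipelines`) importable
sys.path.append(os.path.join(open_model_zoo_path, "demos", "common", "python"))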

from pipelines import AsyncPipeline

## Settings

Set the file and directory paths. The default settings expect that the models are located in `open_model_zoo_models` in your `$HOME` directory, typically `c:\users\username` or `/home/username`. You can change this by setting the `base_model_dir` variable to another directory.
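
As a sketch of how these settings are combined later in the notebook, the path to a model's `.xml` file is built from the base directory, the model's subdirectory as reported by the Info Dumper, the precision, and the model name. `model_xml_path` is a hypothetical helper written for this illustration. It is not part of the demo.

```python
from pathlib import Path


def model_xml_path(base_model_dir, subdirectory, precision, model_name):
    """Compose <base_model_dir>/<subdirectory>/<precision>/<model_name>.xml"""
    return Path(base_model_dir).expanduser() / subdirectory / precision / f"{model_name}.xml"


path = model_xml_path(
    "~/open_model_zoo_models", "public/ctdet_coco_dlav0_384", "FP16", "ctdet_coco_dlav0_384"
)
print(path)
```

If this path does not exist for a model, the model was probably not downloaded or not converted for the selected precision.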

In [49]:
base_model_dir = os.path.expanduser("~/open_model_zoo_models")
precision = "FP16"
num_infer_requests = 3
loop = False
prob_threshold = 0.5
utilization_monitors = ""
device = "CPU"

palette = ColorPalette(100)
font_scale = 1
thickness = 2

DOWNLOAD_MODELS = True
CONVERT_MODELS = True

omz_cache_dir = os.path.expanduser('~')

# The settings below are only required if you want to use the Model Converter to convert models to OpenVINO IR format.
# You can use this demo with models that are already downloaded in IR format, so use of the Model Optimizer is optional.

# The path to the Model Optimizer is required if models need to be converted to IR. The paths below should work for default installations of 
# the Intel Distribution of OpenVINO Toolkit https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit/download.html
# Adjust them if you installed OpenVINO in a different location.
# Note that you also need to install the Model Optimizer prerequisites. See the documentation for your OS at 
# https://docs.openvinotoolkit.org/latest/installation_guides.html
if CONVERT_MODELS:
    if sys.platform.startswith('win'):
        model_optimizer_path = r"C:\Program Files (x86)\intel\openvino_2021\deployment_tools\model_optimizer\mo.py"  # Windows
    else:
        model_optimizer_path = "/opt/intel/openvino_2021/deployment_tools/model_optimizer/mo.py"  # Linux/MacOS

## Download Models and convert them to IR format

The [Model Downloader](https://github.com/openvinotoolkit/open_model_zoo/blob/master/tools/downloader/README.md) downloads models from the Open Model Zoo. Models that are not in OpenVINO IR format are converted to this format by the Model Converter. 

The [Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo/) models that are compatible with this demo are listed in the file *models.lst* in the same folder as this notebook. By default, all these models are downloaded, with the `--list=models.lst` argument for the Model Downloader. You can choose to download a specific model by using `--name=model_name` instead of `--list=models.lst`. If you have already downloaded Open Model Zoo models, you can set the `base_model_dir` variable in the *Settings* cell to the folder that contains your models (this should be a folder with subfolders `intel` and `public`) and set `DOWNLOAD_MODELS` to `False`.
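
The choice between `--list` and `--name` can be sketched as a small function that builds the argument list for `subprocess.run`. `downloader_args` is a hypothetical helper, and using `sys.executable` instead of the literal `"python"` is an assumption made here so that the same interpreter as the notebook kernel is used.

```python
import sys


def downloader_args(downloader_script, output_dir, precision, model_name=None, model_list="models.lst"):
    """Build the Model Downloader command line (hypothetical helper)."""
    args = [sys.executable, downloader_script, "--output_dir", output_dir, "--precision", precision]
    if model_name is not None:
        # Download a single model
        args += ["--name", model_name]
    else:
        # Download every model listed in the file
        args += ["--list", model_list]
    return args


print(downloader_args("downloader.py", "models", "FP16", model_name="ssd300"))
```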

<div class="alert alert-info" style="color:black"><i>
<b>Note: </b>It will take a while to download and convert all the models.</i></div>

In [50]:
if DOWNLOAD_MODELS:
    downloader_command = os.path.join(open_model_zoo_path, "tools", "downloader", "downloader.py")
    download_result = subprocess.run(
        [
            "python",
            downloader_command,
            "--output_dir",
            base_model_dir,
            "--jobs",
            "4",
            "--cache_dir",
            omz_cache_dir,
            "--precision",
            precision,
            "--list",
            "models.lst",
        ],
        shell=False,
        check=False,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
#     if download_result.returncode == 0:
#         print(
#             "Downloading models succeeded. You can set `DOWNLOAD_MODELS=False` to save some time when you run this notebook again."
#         )
#     else:
#         print(f"Downloading models failed. The error message is: {download_result.stderr}")

In [51]:
# Convert the models that are not in IR format to IR
if CONVERT_MODELS:
    converter_command = os.path.join(open_model_zoo_path, "tools", "downloader", "converter.py")
    converter_result = subprocess.run(
        [
            "python",
            converter_command,
            "--download_dir",
            base_model_dir,
            "--list",
            "models.lst",
            "--precisions",
            precision,
            "--mo",
            model_optimizer_path
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        shell=False,
        text=True,  # decode stdout/stderr to str, so that `converter_result.stderr` is readable
    )
    if converter_result.returncode == 0:
        print("Converting models succeeded.")
    else:
        print(
            "There were some error messages while converting the models. Check `converter_result.stderr` for more details."
        )

There were some error messages while converting the models. Check `converter_result.stderr` for more details.
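
Converter output can be long, so it often helps to look at the last few lines of `stderr` only. The sketch below assumes nothing beyond the standard `subprocess.CompletedProcess` object. `stderr_tail` is a hypothetical helper, and the demonstration uses a hand-made result object instead of a real converter run.

```python
import subprocess


def stderr_tail(result, num_lines=5):
    """Return the last `num_lines` lines of a CompletedProcess's stderr as text."""
    stderr = result.stderr or ""
    if isinstance(stderr, bytes):
        # stderr is bytes when subprocess.run was called without text=True
        stderr = stderr.decode(errors="replace")
    return "\n".join(stderr.splitlines()[-num_lines:])


# Hand-made result object, for illustration only
demo_result = subprocess.CompletedProcess(
    args=["converter"], returncode=1, stdout=b"", stderr=b"line 1\nline 2\nline 3"
)
print(stderr_tail(demo_result, num_lines=2))
```

In the notebook, the same function could be called as `stderr_tail(converter_result)`.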


### Get model info

The Info Dumper returns information for the Open Model Zoo models. It returns a list of dictionaries with the model name, description, framework, license url, precisions, task type, and the subdirectory for the downloaded model.

In [52]:
info_command = os.path.join(open_model_zoo_path, "tools", "downloader", "info_dumper.py")
info_result = subprocess.run(
    [
        "python",
        info_command,
        "--list",
        "models.lst",
    ],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=False,
    text=True,
)
info = json.loads(info_result.stdout)
# Use all models from models.lst. To restrict the list to Intel models, filter on the subdirectory instead:
# model_names = [model["name"] for model in info if "intel" in model["subdirectory"]]
model_names = [model["name"] for model in info]

In [53]:
# Show an example of the information that the Info Dumper returns
info[0]

{'name': 'ctdet_coco_dlav0_384',
 'description': 'CenterNet object detection model "ctdet_coco_dlav0_384" originally trained with PyTorch*. CenterNet models an object as a single point - the center point of its bounding box and uses keypoint estimation to find center points and regresses to object size. For details see paper <https://arxiv.org/abs/1904.07850>, repository <https://github.com/xingyizhou/CenterNet/>.',
 'framework': 'pytorch',
 'license_url': 'https://raw.githubusercontent.com/xingyizhou/CenterNet/master/LICENSE',
 'precisions': ['FP16', 'FP32'],
 'subdirectory': 'public/ctdet_coco_dlav0_384',
 'task_type': 'detection'}
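
Because the Info Dumper output is a plain list of dictionaries, selecting models is a matter of filtering on the keys shown above. The following sketch uses a hypothetical helper, `select_models`, and a shortened sample list. The second sample entry is illustrative and not copied from real Info Dumper output.

```python
def select_models(info, precision=None, source=None):
    """Return the names of models that offer `precision` and are stored under `source`.

    `source` is the top-level subdirectory, "intel" or "public".
    """
    names = []
    for model in info:
        if precision is not None and precision not in model["precisions"]:
            continue
        if source is not None and not model["subdirectory"].startswith(source + "/"):
            continue
        names.append(model["name"])
    return names


# Shortened sample entries with only the keys that the helper uses
sample_info = [
    {"name": "ctdet_coco_dlav0_384", "precisions": ["FP16", "FP32"], "subdirectory": "public/ctdet_coco_dlav0_384"},
    {"name": "face-detection-0200", "precisions": ["FP16", "FP32"], "subdirectory": "intel/face-detection-0200"},
]

print(select_models(sample_info, source="intel"))
```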

The `models.lst` file lists the models that are supported by this demo, sorted by architecture. The model names can contain wildcards. For example, `face-detection-????` means that the demo supports all models with a name that starts with `face-detection-` followed by four digits.

We create a `model_architectures` dictionary that maps the model names given by the Info Dumper to an architecture given by `models.lst`.
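
The wildcard matching can be tried on a small in-memory list first. This is a sketch under assumptions: the sample text imitates the layout that the parsing code expects (a `# For ...=<architecture>` comment followed by model names), and `map_architectures` is a hypothetical function. Unlike the notebook cell, it escapes the pattern and uses `re.fullmatch`, so a pattern only matches complete model names.

```python
import re

# Hypothetical excerpt in the layout of models.lst
SAMPLE_LST = """\
# Sample list for illustration
# For --architecture_type=centernet
ctdet_coco_dlav0_???
# For --architecture_type=ssd
face-detection-????
"""

sample_names = ["ctdet_coco_dlav0_384", "face-detection-0200", "face-detection-adas-0001"]


def map_architectures(lst_text, model_names):
    """Map every model name that matches a (wildcard) line to its architecture."""
    mapping = {}
    architecture = None
    for line in lst_text.splitlines():
        line = line.strip()
        if line.startswith("# For"):
            # The architecture name follows the "=" sign
            architecture = line.split("=", 1)[1]
        elif line and not line.startswith("#") and architecture is not None:
            # Escape the name, then let every "?" stand for exactly one digit
            pattern = re.escape(line).replace(r"\?", "[0-9]")
            for name in model_names:
                if re.fullmatch(pattern, name):
                    mapping[name] = architecture
    return mapping


print(map_architectures(SAMPLE_LST, sample_names))
```

`face-detection-adas-0001` is not mapped here, because `adas-0001` is not four digits.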

In [54]:
model_architectures = {}
with open("models.lst") as f:
    modellist = f.read().splitlines()

# Skip the first line; lines that start with "# For" name the architecture of the models that follow
for line in modellist[1:]:
    if line.startswith("# For"):
        _, architecture = line.split("=")
    else:
        model_architectures[line] = architecture
        for modelname in model_names:
            # Every "?" in models.lst stands for one digit
            modelpattern = re.search(line.replace("?", "[0-9]"), modelname)
            if modelpattern:
                model_architectures[modelpattern.group(0)] = architecture

## Create inference functions

The `do_inference_on_video` function performs inference with a model on a specific video. The helper function `process_results` adds the end time and the overall start time to the result from the pipeline, so that the inference speed can be computed. The function opens the video file given by `input_filename` with OpenCV's `VideoCapture` and steps through it `jump_frames` frames at a time: with `jump_frames = 1` every frame is read, and with the default `jump_frames = 10` every tenth frame is read. While there are new frames, the code:

* Checks if there are results from the pipeline. If there are, it records the time, and adds the result to the list of results
* Checks if the pipeline is ready. If it is, it sees if there is a new frame. 
  * If there is a new frame (we have not reached the end of the video), the frame is read, and sent to the detector pipeline for inference. 
  * If there are no more frames, the video is closed

At the end of the function, we wait until the detector is finished, and add the final results to the list of results.
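
The loop described above can be sketched with a toy pipeline. `ToyPipeline` is a hypothetical stand-in built on a thread pool: its "inference" only squares a number, and its method names merely mirror the ones used in this notebook. It is not the `AsyncPipeline` implementation.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class ToyPipeline:
    """Hypothetical stand-in for the detector pipeline: 'inference' squares a number."""

    def __init__(self, max_num_requests=2):
        self.max_num_requests = max_num_requests
        self.pool = ThreadPoolExecutor(max_workers=max_num_requests)
        self.requests = {}  # frame id -> Future

    def is_ready(self):
        running = sum(not future.done() for future in self.requests.values())
        return running < self.max_num_requests

    def submit_data(self, frame, frame_id):
        self.requests[frame_id] = self.pool.submit(lambda x: x * x, frame)

    def get_result(self, frame_id):
        future = self.requests.get(frame_id)
        if future is None or not future.done():
            return None
        del self.requests[frame_id]
        return frame_id, future.result()

    def await_any(self):
        wait(list(self.requests.values()), return_when=FIRST_COMPLETED)

    def await_all(self):
        wait(list(self.requests.values()))


def run_on_frames(frames, jump_frames=1, max_num_requests=2):
    pipeline = ToyPipeline(max_num_requests)
    results = []
    next_frame_id = 0
    next_frame_id_to_show = 0

    while next_frame_id < len(frames):
        # Collect the next result in display order, if it is available
        result = pipeline.get_result(next_frame_id_to_show)
        if result is not None:
            results.append(result)
            next_frame_id_to_show += jump_frames

        if pipeline.is_ready():
            # A request slot is free: submit the next frame
            pipeline.submit_data(frames[next_frame_id], next_frame_id)
            next_frame_id += jump_frames
        else:
            # All request slots are busy: wait until one of them finishes
            pipeline.await_any()

    # No more frames: wait for the outstanding requests and collect them in order
    pipeline.await_all()
    while next_frame_id_to_show < next_frame_id:
        results.append(pipeline.get_result(next_frame_id_to_show))
        next_frame_id_to_show += jump_frames

    pipeline.pool.shutdown()
    return results


print(run_on_frames(list(range(10)), jump_frames=3))
```

With ten "frames" and `jump_frames=3`, only frames 0, 3, 6 and 9 are processed, and the results come back in frame order: `[(0, 0), (3, 9), (6, 36), (9, 81)]`.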

In [55]:
jump_frames = 10


def do_inference_on_video(detector_pipeline, input_filename):
    resultlist = []
    next_frame_id = 0
    next_frame_id_to_show = 0
    overall_start_time = perf_counter()

    def process_results(results):
        """Helper function to add inference time to results"""
        outputs, meta = results
        meta["end_time"] = perf_counter()
        meta["overall_start_time"] = overall_start_time
        return outputs, meta

    cap = cv2.VideoCapture(input_filename)

    while cap.isOpened():
        cap.set(cv2.CAP_PROP_POS_FRAMES, next_frame_id)
        if detector_pipeline.callback_exceptions:
            raise detector_pipeline.callback_exceptions[0]

        # Process all completed requests
        results = detector_pipeline.get_result(next_frame_id_to_show)
        if results:
            resultlist.append(process_results(results))
            next_frame_id_to_show += jump_frames

        if detector_pipeline.is_ready():
            # Get new image/frame
            start_time = perf_counter()
            ret, frame = cap.read()
            if not ret:
                cap.release()
                continue

            # Submit for inference
            detector_pipeline.submit_data(frame, next_frame_id, {"frame": frame, "start_time": start_time})
            next_frame_id += jump_frames

        else:
            # Wait for empty request
            detector_pipeline.await_any()
        # Process completed requests

    detector_pipeline.await_all()

    while detector_pipeline.has_completed_request():
        results = detector_pipeline.get_result(next_frame_id_to_show)
        if results:
            resultlist.append(process_results(results))
            next_frame_id_to_show += jump_frames

    return resultlist

In [56]:
[item for item in info if item['name'].startswith('effic')]

[{'name': 'efficientdet-d0-tf',
  'description': 'The "efficientdet-d0-tf" model is one of the EfficientDet <https://arxiv.org/abs/1911.09070> models  designed to perform object detection. This model was pretrained in TensorFlow*. All the EfficientDet models have been pretrained on the MSCOCO* image database. For details about this family of models, check out the Google AutoML repository <https://github.com/google/automl/tree/master/efficientdet>.',
  'framework': 'tf',
  'license_url': 'https://raw.githubusercontent.com/google/automl/master/LICENSE',
  'precisions': ['FP16', 'FP32'],
  'subdirectory': 'public/efficientdet-d0-tf',
  'task_type': 'detection'},
 {'name': 'efficientdet-d1-tf',
  'description': 'The "efficientdet-d1-tf" model is one of the EfficientDet <https://arxiv.org/abs/1911.09070> models  designed to perform object detection. This model was pretrained in TensorFlow*. All the EfficientDet models have been pretrained on the MSCOCO* image database. For details about thi

In [44]:
def get_results_for_model(modelname, num_threads, num_streams, num_requests):
    input_filename = get_input_filename()
    model_info = [item for item in info if item["name"] == modelname][0]
    model_xml = os.path.join(base_model_dir, model_info["subdirectory"], precision, modelname + ".xml")
    resultvideos = []
    architecture_type = model_architectures[modelname]
    ie = IECore()

    model = get_model(ie, model=Path(model_xml), architecture_type=architecture_type, labels=None)
    plugin_config = {
        "CPU_THREADS_NUM": f"{num_threads}",
        "CPU_THROUGHPUT_STREAMS": f"{num_streams}",
    }
    detector_pipeline = AsyncPipeline(ie, model, plugin_config, device="CPU", max_num_requests=num_requests)
#     print(
#         f"Starting inference. Model: {modelname}, video: {input_filename},  threads: {num_threads}, streams: {num_streams}, max_num_requests: {num_requests}"
#     )
    start_time = perf_counter()
    result = do_inference_on_video(detector_pipeline, input_filename)
    end_time = perf_counter()

    has_landmarks = architecture_type == "retina"

    resultvideo = make_result_videos(result, has_landmarks)
    fps = len(resultvideo) / (end_time - start_time)

    return resultvideo, fps

The `make_result_videos` function takes the output of the `do_inference_on_video` function and returns a list of video frames. The detection boxes, the latency, and the FPS are drawn on each frame.
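
The two numbers drawn on each frame follow directly from the timestamps stored in the frame's metadata. The sketch below isolates that arithmetic. `latency_and_fps` is a hypothetical helper, and the timestamps are made up for illustration.

```python
def latency_and_fps(metas):
    """Return a (latency in ms, running FPS) pair for every frame, in order."""
    stats = []
    for i, meta in enumerate(metas):
        # Latency: time between reading the frame and receiving its result
        latency_ms = (meta["end_time"] - meta["start_time"]) * 1000
        # Running FPS: number of frames finished so far, divided by the elapsed time
        fps = (i + 1) / (meta["end_time"] - meta["overall_start_time"])
        stats.append((latency_ms, fps))
    return stats


# Made-up timestamps in seconds
sample_metas = [
    {"overall_start_time": 0.0, "start_time": 0.0, "end_time": 0.5},
    {"overall_start_time": 0.0, "start_time": 0.25, "end_time": 1.0},
]
print(latency_and_fps(sample_metas))
```

Note that the latency of a single frame can grow while the FPS stays the same, because several requests are in flight at once.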

In [35]:
def make_result_videos(resultlist, has_landmarks):
    framelist = list()

    for i, (objects, meta) in enumerate(resultlist):
        start_time = meta["start_time"]
        overall_start_time = meta["overall_start_time"]
        end_time = meta["end_time"]
        latency = (end_time - start_time) * 1000
        fps = (i + 1) / (end_time - overall_start_time)

        frame = meta["frame"]
        frame = draw_detections(
            frame=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB),
            detections=objects,
            palette=palette,
            labels=None,
            threshold=prob_threshold,
            draw_landmarks=has_landmarks,
        )
        put_highlighted_text(
            frame,
            "Latency: {:.1f} ms".format(latency),
            (20, 30),
            cv2.FONT_HERSHEY_COMPLEX,
            font_scale,
            palette[0],
            thickness,
        )
        put_highlighted_text(
            frame,
            "FPS: {:.1f}".format(fps),
            (20, 60),
            cv2.FONT_HERSHEY_COMPLEX,
            font_scale,
            palette[0],
            thickness,
        )

        framelist.append(frame)
    return framelist

## Create widgets

This demo works with a variety of [Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo/) models and allows you to use your own video. We create widgets with [ipywidgets](https://github.com/jupyter-widgets/ipywidgets) to easily select a model and choose a video from your PC.

## Download or upload a video

### Option 1: Download a sample video

In [11]:
sample_video_base_url = "https://github.com/intel-iot-devkit/sample-videos/raw/master"
with open("sample_videos.lst") as f:
    sample_video_filenames = f.read().splitlines()
# Build (label, URL) pairs. URLs always use "/", so they are not joined with os.path.join
sample_video_list = [(os.path.splitext(fn)[0], f"{sample_video_base_url}/{fn}") for fn in sample_video_filenames]

In [12]:
sample_video = widgets.Dropdown(options=sample_video_list, index=12)
sample_video

Dropdown(index=12, options=(('bolt-detection', 'https://github.com/intel-iot-devkit/sample-videos/raw/master/b…

### Option 2: Upload your own video

In [13]:
uploader = widgets.FileUpload(multiple=False)
uploader

FileUpload(value={}, description='Upload')

`get_input_filename` checks if a video was uploaded. If so, it returns the filename of that video. If not, it returns the selected sample video. 

In [36]:
def get_input_filename():
    """If a video is uploaded, returns the filename of that video. If not, returns the selected sample video."""
    if len(uploader.value) > 0:
        uploaded_filename = next(iter(uploader.value))
        content = uploader.value[uploaded_filename]["content"]
        with open(uploaded_filename, "wb") as f:
            f.write(content)
        input_filename = uploaded_filename
    else:
        input_filename = os.path.basename(sample_video.value)
        if not os.path.exists(input_filename):
            download_video(sample_video.value)

    return input_filename

---

## Detection results of one model, drawn on video

In [37]:
interact_inference = interact_manual.options(manual_name="Do inference")


@interact_inference
def show_results_on_model(model=model_names, num_threads=(0, 8), num_streams=(0, 8), num_requests=(0, 10)):
    resultvideo, fps = get_results_for_model(model, num_threads, num_streams, num_requests)
    for item in resultvideo:
        clear_output(wait=True)
        plt.imshow(item)
        plt.axis("off")
        plt.show()
    print(
        f"Finished inference. Model: {model},  threads: {num_threads}, streams: {num_streams}, max_num_requests: {num_requests}. FPS: {fps:.2f}"
    )

interactive(children=(Dropdown(description='model', options=('ctdet_coco_dlav0_384', 'ctdet_coco_dlav0_512', '…

---

## Detection results of multiple models

Perform inference on up to four selected models. Show results on three random frames by clicking on the *Show frames* button after inference is complete. Click the button again to show different frames.
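
The frame selection can be sketched in isolation. The `show_random_frames` cell in this notebook uses `random.choices`, which samples with replacement and can therefore show the same frame more than once. The hypothetical helper below uses `random.sample` instead, as one possible way to pick distinct frames.

```python
import random


def pick_frame_indices(num_frames, k=3, seed=None):
    """Pick up to `k` distinct frame indices, sorted in playback order."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no index appears twice
    return sorted(rng.sample(range(num_frames), min(k, num_frames)))


print(pick_frame_indices(50))
```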

In [38]:
select_model_widget = widgets.SelectMultiple(
    description="Models",
    options=model_names,
    index=[2, 5, 8],
    rows=32,
    layout=Layout(display="flex", flex_flow="column"),
    disabled=False,
)

In [39]:
@interact_inference(modelnames=select_model_widget)
def multiple(modelnames, num_threads=(0, 8), num_streams=(0, 8), num_requests=(0, 10)):
    global resultvideos
    resultvideos = []
    for i, modelname in enumerate(modelnames):
        resultvideo, fps = get_results_for_model(modelname, num_threads, num_streams, num_requests)
        resultvideos.append(resultvideo)
        print(f"--- Finished: FPS: {fps:.2f}")

interactive(children=(SelectMultiple(description='Models', index=(2, 5, 8), layout=Layout(display='flex', flex…

In [45]:

def multiple(modelnames, num_threads=(0, 8), num_streams=(0, 8), num_requests=(0, 10)):
    global resultvideos
    resultvideos = []
    print("Model name, FPS")
    for i, modelname in enumerate(modelnames):
        resultvideo, fps = get_results_for_model(modelname, num_threads, num_streams, num_requests)
        #resultvideos.append(resultvideo)
        print(f"{modelname}, {fps:.2f}")

multiple(model_names, 4, 4, 5)

Model name, FPS
ctdet_coco_dlav0_384, 10.26
ctdet_coco_dlav0_512, 5.77
faceboxes-pytorch, 24.58
efficientdet-d0-tf, 22.45
efficientdet-d1-tf, 11.15
face-detection-0200, 107.70
face-detection-0202, 90.25
face-detection-0204, 67.31
face-detection-0205, 77.97
face-detection-0206, 1.16
face-detection-adas-0001, 72.89
face-detection-retail-0004, 103.41
face-detection-retail-0005, 83.56
face-detection-retail-0044, 98.04
faster-rcnn-resnet101-coco-sparse-60-0001, 0.47
pedestrian-and-vehicle-detector-adas-0001, 67.27
pedestrian-detection-adas-0002, 73.73
pelee-coco, 61.69
person-detection-0106, 0.79
person-detection-0200, 105.26
person-detection-0201, 84.81
person-detection-0202, 70.07
person-detection-0203, 33.52
person-detection-retail-0013, 82.85
person-vehicle-bike-detection-2000, 103.00
person-vehicle-bike-detection-2001, 87.28
person-vehicle-bike-detection-2002, 69.92
product-detection-0001, 60.34
retinanet-tf, 0.55
rfcn-resnet101-coco-tf, 2.57
ssd300, 6.99
ssd512, 2.44
ssd_mobilenet_v1_

Exception: Path to the model /home/lena/open_model_zoo_models/public/retinaface-anti-cov/FP16/retinaface-anti-cov.xml doesn't exist or it's a directory

In [61]:
new_model_names = [
    'yolo-v1-tiny-tf',
    'yolo-v2-ava-0001',
    'yolo-v2-ava-sparse-35-0001',
    'yolo-v2-ava-sparse-70-0001',
    'yolo-v2-tf',
    'yolo-v2-tiny-ava-0001',
    'yolo-v2-tiny-ava-sparse-30-0001',
    'yolo-v2-tiny-ava-sparse-60-0001',
    'yolo-v2-tiny-tf',
    'yolo-v2-tiny-vehicle-detection-0001',
    'yolo-v3-tf',
    'yolo-v3-tiny-tf',
]

In [62]:
# Reuse the `multiple` function defined above to benchmark the YOLO models
multiple(new_model_names, 4, 4, 5)

Model name, FPS
yolo-v1-tiny-tf, 44.52
yolo-v2-ava-0001, 13.48
yolo-v2-ava-sparse-35-0001, 13.46
yolo-v2-ava-sparse-70-0001, 13.43
yolo-v2-tf, 6.75
yolo-v2-tiny-ava-0001, 43.49
yolo-v2-tiny-ava-sparse-30-0001, 43.79
yolo-v2-tiny-ava-sparse-60-0001, 44.25
yolo-v2-tiny-tf, 53.74
yolo-v2-tiny-vehicle-detection-0001, 53.70
yolo-v3-tf, 6.35
yolo-v3-tiny-tf, 40.07

In [18]:
@interact_manual.options(manual_name="Show frames")
def show_random_frames():
    global resultvideos
    try:
        fig, ax = plt.subplots(3, len(select_model_widget.value), figsize=(25, 15), squeeze=False)

        indices = random.choices(range(len(resultvideos[0])), k=3)
        for i in range(len(resultvideos)):
            modelname = select_model_widget.value[i]
            resultvideo = resultvideos[i]

            for j, framenr in enumerate(indices):
                ax[j, i].imshow(resultvideo[framenr])
                ax[0, i].set_title(modelname)
        for a in ax.ravel():
            a.axis("off")
    except NameError:
        pass

interactive(children=(Button(description='Show frames', style=ButtonStyle()), Output()), _dom_classes=('widget…