# Neural Networks with OpenVINO: make them fly
![](pictures/openvino_start.png)

## What is inference and why do we need separate tools for it?

![](pictures/training_vs_inference.png)

## What is OpenVINO Toolkit anyway?
![](pictures/about_vino.png)

## Full-featured pipeline under the hood
![](pictures/openvino_toolkit.png)

The first step of this workshop is to initialize the OpenVINO environment in this Jupyter notebook. 
The OpenVINO 2020.1 package has already been installed to `intel/openvino/`.
To initialize the OpenVINO environment, run the script `intel/openvino/bin/setupvars.sh`:

In [None]:
!bash ~/intel/openvino/bin/setupvars.sh
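
Note that a `!bash …` command runs in a child shell, so variables exported by the script do not reach the notebook kernel's own environment. If later cells cannot see `INTEL_OPENVINO_DIR`, one possible workaround is to source the script in a subprocess and copy the resulting variables into `os.environ`. The sketch below assumes that `bash` is available and that all variable values fit on one line; the helper names `parse_env_output` and `source_into_environ` are hypothetical and not part of OpenVINO.

```python
import os
import subprocess


def parse_env_output(text: str) -> dict:
    """Parse the KEY=VALUE lines printed by `env` into a dictionary."""
    variables = {}
    for line in text.splitlines():
        key, sep, value = line.partition('=')
        # Skip lines that are not simple assignments
        if sep and key.isidentifier():
            variables[key] = value
    return variables


def source_into_environ(script_path: str) -> dict:
    """Source a shell script in bash and copy the resulting variables into os.environ."""
    completed = subprocess.run(
        ['bash', '-c', 'source "$1" > /dev/null && env', 'bash', script_path],
        check=True, stdout=subprocess.PIPE, universal_newlines=True)
    variables = parse_env_output(completed.stdout)
    os.environ.update(variables)
    return variables
```

A call such as `source_into_environ(os.path.expanduser('~/intel/openvino/bin/setupvars.sh'))` would then make the variables visible to subsequent `!` commands, which inherit the kernel's environment.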

## Open Model Zoo
![](pictures/models.png)

The OpenVINO package contains tools that make it easy to download a model from [OpenModelZoo](https://github.com/opencv/open_model_zoo) 
and convert it to the Intermediate Representation (IR) format that OpenVINO supports.

To see all available models, both public open-source models from the original frameworks (TensorFlow, Caffe, MXNet, PyTorch, etc.)
and models made by Intel, run the downloader with `--print_all`:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py --print_all

To download any of these models, use the downloader. Its help output lists the available options:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py -h
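
The downloader can also be driven from Python instead of a `!` shell command. The following is a minimal sketch that only uses the `--name` and `--output_dir` options shown in this notebook; the helper names `build_downloader_command` and `download_model` are hypothetical, and the script location is assumed to match the path used in the cells here.

```python
import os
import subprocess
import sys


def build_downloader_command(openvino_dir: str, model_name: str, output_dir: str) -> list:
    """Build the argument list for the Open Model Zoo downloader script."""
    downloader = os.path.join(openvino_dir, 'deployment_tools', 'open_model_zoo',
                              'tools', 'downloader', 'downloader.py')
    return [sys.executable, downloader, '--name', model_name, '--output_dir', output_dir]


def download_model(model_name: str, output_dir: str = './data') -> None:
    """Run the downloader; raises KeyError if the OpenVINO environment is not initialized."""
    openvino_dir = os.environ['INTEL_OPENVINO_DIR']
    subprocess.run(build_downloader_command(openvino_dir, model_name, output_dir), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` turns a failed download into a Python exception.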

## SSD-MobileNet

![](pictures/mobileNet-SSD-network-architecture.png)

## How SSD works

![](pictures/ssd_boxes.png)

![](pictures/ssd_loc.png)

## MobileNet_v2 review

![](pictures/mobilenetv2.png)

Let's download the object detection model `ssd_mobilenet_v2_coco`:

In [None]:
!python3  ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py \
--name ssd_mobilenet_v2_coco \
--output_dir ./data

The Model Downloader saved the model to `data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29`:

In [None]:
!ls data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29

![](pictures/openvino_support.png)

However, the Model Downloader downloaded the model in TensorFlow format.
You need to convert this model to IR format. 
To do this, run the converter script.
The converter script runs the Model Optimizer with the right parameters to convert the model to IR.
Of course, we can run the Model Optimizer directly, but then we need to pass the right arguments to it ourselves.
All information about converting

![](pictures/model_optimizer.png)

- Easy to use, Python*-based workflow does not require rebuilding frameworks.
- Import models from many supported frameworks: Caffe*, TensorFlow*, MXNet*, Kaldi*, and exchange formats like ONNX* (PyTorch*, Caffe2* and others through ONNX).
- 100+ models for Caffe, MXNet, TensorFlow validated. Supports all ONNX* model zoo public models.
- Extends inferencing for non-vision networks with support of LSTM, BERT, GNMT, TDNN-LSTM, ESPNet and more.
- IR files for models using standard layers or user-provided custom layers do not require Caffe.
- Fallback to original framework is possible in cases of unsupported layers, but requires original framework.

![](pictures/mo_result1.png)

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/converter.py \
--name ssd_mobilenet_v2_coco \
--download_dir ./data \
--output_dir ./data \
--precisions FP32

In [None]:
!ls data/public/ssd_mobilenet_v2_coco/FP32/

You can find the command that runs the OpenVINO Model Optimizer in the output of the `converter.py` script.
You can try this command yourself:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/mo.py \
--output_dir=data/public/ssd_mobilenet_v2_coco/FP32 \
--reverse_input_channels \
--model_name=ssd_mobilenet_v2_coco \
--transformations_config=${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
--tensorflow_object_detection_api_pipeline_config=data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/pipeline.config \
--output=detection_classes,detection_scores,detection_boxes,num_detections \
--input_model=data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb
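
An IR model consists of two files: an `.xml` file with the topology and a `.bin` file with the weights. Before moving on, it can be useful to check that both were produced. The sketch below assumes the directory layout used in this notebook (`<models_dir>/public/<model_name>/<precision>/`); the helper names `ir_files` and `ir_is_complete` are hypothetical.

```python
from pathlib import Path


def ir_files(models_dir: str, model_name: str, precision: str = 'FP32'):
    """Return the expected paths of the .xml and .bin files of a converted model."""
    base = Path(models_dir) / 'public' / model_name / precision
    return base / (model_name + '.xml'), base / (model_name + '.bin')


def ir_is_complete(models_dir: str, model_name: str, precision: str = 'FP32') -> bool:
    """Check that both IR files exist on disk."""
    return all(path.is_file() for path in ir_files(models_dir, model_name, precision))
```

For example, `ir_is_complete('data', 'ssd_mobilenet_v2_coco')` should return `True` once the conversion above has finished successfully.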

Required Python imports:

In [None]:
# Import OpenCV for image processing
import cv2

# Import some functions from matplotlib to show images
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

# Import needed functions from TensorFlow
import tensorflow as tf
from tensorflow.python.framework import graph_io

# Import OpenVINO Inference Engine classes
from openvino.inference_engine import IENetwork, IEPlugin, IECore

# Import other needed functions
import numpy as np

import logging as log
import time
import os
import sys
import platform

Some helper functions are needed for the next part of the workshop:

In [None]:
def read_resize_image(path_to_image: str, width: int, height: int) -> np.ndarray:
    """
    Takes an image and resizes it to the given dimensions
    """
    # Load the image
    raw_image = cv2.imread(path_to_image)
    # Return the image resized to (width, height)
    return cv2.resize(raw_image, (width, height), interpolation=cv2.INTER_NEAREST)

In [None]:
def show_performance(performance_data: dict):
    """
    Takes a dictionary with configuration names as keys and their FPS as values.
    Plots a bar chart of the data.
    """
    l = np.arange(len(performance_data))
    
    performance = [fps for _, fps in performance_data.items()]
    configurations = list(performance_data.keys())
    figsize=(3*len(performance_data),10)
    print(figsize)
    fig, ax = plt.subplots(figsize=figsize)
    
    bars = ax.bar(x=l, height=performance, tick_label=configurations)

    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_visible(False)
    ax.spines['bottom'].set_color('#DDDDDD')
    
    ax.tick_params(bottom=False, left=False)
    ax.set_axisbelow(True)
    ax.yaxis.grid(True, color='#EEEEEE')
    ax.xaxis.grid(False)
   
    bar_color = bars[0].get_facecolor()

    for bar in bars:
      ax.text(
          bar.get_x() + bar.get_width() / 2,
          bar.get_height() + 5,
          round(bar.get_height(), 1),
          horizontalalignment='center',
          color=bar_color,
          weight='bold',
          fontsize=17
      )
    ax.set_xlabel('Configurations', labelpad=15, color='#333333')
    ax.set_ylabel('Frames per second', labelpad=15, color='#333333')
    ax.set_title('Performance measurements', pad=15, color='#333333', weight='bold')
    plt.ylim(0, max(performance)+20)
    fig.tight_layout()

In [None]:
def draw_image(original_image: str,
               res: tuple,
               path_to_image: str,
               prob_threshold: float=0.8,
               color: tuple=(0, 255, 0)):
    """
    Takes a path to the original image and the detections. Draws the bounding boxes on the image and saves it to path_to_image
    """
    raw_image = cv2.imread(original_image)
    initial_w = raw_image.shape[1]
    initial_h = raw_image.shape[0]
    # COCO class IDs: 17 is 'cat' and 18 is 'dog' (21 is 'cow')
    labels_map = {
        17: 'cat',
        18: 'dog'
    }
    for obj in res[0][0]:
        # Draw only objects whose probability exceeds the specified threshold
        if obj[2] > prob_threshold:
            xmin = int(obj[3] * initial_w)
            ymin = int(obj[4] * initial_h)
            xmax = int(obj[5] * initial_w)
            ymax = int(obj[6] * initial_h)
            class_id = int(obj[1])
            confidence = round(obj[2] * 100, 1)
            cv2.rectangle(raw_image, (xmin, ymin), (xmax, ymax), color, 2)
            # Fall back to the numeric class ID when the class has no label in the map
            det_label = labels_map.get(class_id, str(class_id))
            box_title = '{} {}%'.format(det_label, confidence)
            cv2.putText(raw_image,
                        box_title,
                        (xmin, ymin - 7),
                        cv2.FONT_HERSHEY_COMPLEX, 5, color, cv2.LINE_AA)
    cv2.imwrite(path_to_image, raw_image)
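
Each detection row handled by `draw_image` has the form `[image_id, class_id, confidence, xmin, ymin, xmax, ymax]`, with the box coordinates normalized to the range 0..1. The conversion to pixel coordinates can be looked at in isolation, without OpenCV. This is a small sketch of that step only; the helper name `to_pixel_box` is hypothetical.

```python
def to_pixel_box(detection, image_width, image_height, threshold=0.8):
    """
    Convert one detection row to (class_id, confidence_percent, (xmin, ymin, xmax, ymax)).
    Returns None when the confidence does not exceed the threshold.
    """
    _, class_id, score, xmin, ymin, xmax, ymax = detection
    if score <= threshold:
        return None
    # Scale the normalized coordinates to the size of the original image
    box = (int(xmin * image_width), int(ymin * image_height),
           int(xmax * image_width), int(ymax * image_height))
    return int(class_id), round(score * 100, 1), box


print(to_pixel_box([0, 18, 0.9, 0.25, 0.5, 0.75, 1.0], 400, 200))
```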

In [None]:
def show_results_interactively(tf_image: str, ie_image: str, combination_image: str, ie_fps:float, tf_fps:float):
    """
    Takes paths to three images and shows them with matplotlib on one screen
    """
    _ = plt.figure(figsize=(30, 10))
    gs1 = gridspec.GridSpec(1, 3)
    gs1.update(wspace=0.25, hspace=0.05)

    titles = [
        '(a) TensorFlow',
        '(b) Inference Engine',
        '(c) TensorFlow and Inference Engine\n predictions are identical'
    ]

    for i, path in enumerate([tf_image, ie_image, combination_image]):
        img_resized = cv2.imread(path)
        ax_plot = plt.subplot(gs1[i])
        ax_plot.axis("off")
        addon = ' '
        if i == 1:
            addon += '{:4.3f}'.format(ie_fps) + '(FPS)'
        elif i == 0:
            addon += '{:4.3f}'.format(tf_fps) + '(FPS)'

        ax_plot.text(0.5, -0.5, titles[i] + addon,
                     size=28, ha="center",
                     transform=ax_plot.transAxes)
        ax_plot.imshow(cv2.cvtColor(img_resized, cv2.COLOR_BGR2RGB))

    plt.show()

In [None]:
def load_graph(path_to_model: str):
    """
    Creates an in-memory TensorFlow graph from a frozen model file
    """
    tf.reset_default_graph()
    graph = tf.Graph()
    graph_def = tf.GraphDef()

    with open(path_to_model, "rb") as model_file:
        graph_def.ParseFromString(model_file.read())

    nodes_to_clear_device = graph_def.node if isinstance(
        graph_def, tf.GraphDef) else graph_def.graph_def.node
    for node in nodes_to_clear_device:
        node.device = ""

    with graph.as_default():
        tf.import_graph_def(graph_def, name='')

    log.info("tf graph was created")
    return graph

In [None]:
def children(op_name: str, graph: tf.Graph):
    """
    Get the operations that consume the outputs of the given operation
    """
    op = graph.get_operation_by_name(op_name)
    return set(consumer for out in op.outputs for consumer in out.consumers())

In [None]:
def summarize_graph(graph_def) -> dict:
    unlikely_output_types = [
        'Const', 'Assign',
        'NoOp', 'Placeholder',
        'Assert', 'switch_t', 'switch_f'
    ]
    placeholders = dict()
    outputs = list()
    graph = tf.Graph()
    with graph.as_default():  # pylint: disable=not-context-manager
        tf.import_graph_def(graph_def, name='')
    for node in graph.as_graph_def().node:  # pylint: disable=no-member
        if node.op == 'Placeholder':
            node_dict = dict()
            node_dict['type'] = tf.DType(node.attr['dtype'].type).name
            new_shape = tf.TensorShape(node.attr['shape'].shape)
            node_dict['shape'] = str(new_shape).replace(' ', '').replace('?', '-1')
            placeholders[node.name] = node_dict
        if len(children(node.name, graph)) == 0:
            if node.op not in unlikely_output_types and \
                node.name.split('/')[-1] not in unlikely_output_types:
                outputs.append(node.name)
    result = dict()
    result['inputs'] = placeholders
    result['outputs'] = outputs
    return result

In [None]:
def get_refs(graph: tf.Graph, input_data: dict):
    """
    Return TensorFlow model reference results.
    """
    log.info("Running inference with tensorflow ...")
    feed_dict = {}
    summary_info = summarize_graph(graph.as_graph_def())
    input_layers, output_layers = list(summary_info['inputs'].keys()), summary_info['outputs']

    data_keys = [key for key in input_data.keys()]
    if sorted(input_layers) != sorted(data_keys):
        raise ValueError('input data keys: {0} do not match input '
                         'layers of network: {1}'.format(data_keys, input_layers))

    for input_layer_name in input_layers:
        tensor = graph.get_tensor_by_name(input_layer_name + ':0')
        feed_dict[tensor] = input_data[input_layer_name]
    output_tensors = []
    for name in output_layers:
        tensor = graph.get_tensor_by_name(name + ':0')
        output_tensors.append(tensor)

    log.info("Running tf.Session")
    os.environ['CUDA_VISIBLE_DEVICES'] = '-1' # force inference on CPU
    with graph.as_default():
        with tf.Session(graph=graph) as session:
            inference_start = time.time()
            outputs = session.run(output_tensors, feed_dict=feed_dict)
            inference_end = time.time()
    res = dict(zip(output_layers, outputs))
    log.info("TensorFlow reference collected successfully\n")
    return res, inference_end - inference_start

In [None]:
def parse_od_output(data: dict):
    predictions = []
    num_batches = len(data['detection_boxes'])
    target_layers = ['num_detections', 'detection_classes',
                     'detection_scores', 'detection_boxes']

    for b in range(num_batches):
        predictions.append([])
        num_detections = int(data['num_detections'][b])
        detection_classes = data['detection_classes'][b]
        detection_scores = data['detection_scores'][b]
        detection_boxes = data['detection_boxes'][b]
        for i in range(num_detections):
            obj = [
                b, detection_classes[i], detection_scores[i],
                detection_boxes[i][1], detection_boxes[i][0],
                detection_boxes[i][3], detection_boxes[i][2]
            ]
            predictions[b].append(obj)
    predictions = np.asarray(predictions)
    new_shape = (1, 1, predictions.shape[0] * predictions.shape[1], predictions.shape[2])
    predictions = np.reshape(predictions, newshape=new_shape)
    parsed_data = {'tf_detections': predictions}
    for layer, blob in data.items():
        if layer not in target_layers:
            parsed_data.update({layer: blob})
    return parsed_data
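
The index shuffling in `parse_od_output` exists because the TensorFlow Object Detection API returns each box as `[ymin, xmin, ymax, xmax]`, while the rest of this notebook expects rows of the form `[image_id, class_id, score, xmin, ymin, xmax, ymax]`. A minimal sketch of that reordering for a single detection, with the hypothetical helper name `tf_box_to_ie_row`:

```python
def tf_box_to_ie_row(image_id, class_id, score, tf_box):
    """Reorder a TensorFlow box [ymin, xmin, ymax, xmax] into an SSD-style detection row."""
    ymin, xmin, ymax, xmax = tf_box
    return [image_id, class_id, score, xmin, ymin, xmax, ymax]


print(tf_box_to_ie_row(0, 18.0, 0.9, [0.1, 0.2, 0.3, 0.4]))
```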

In [None]:
def tf_main(path_to_model: str, path_to_original_image: str, batch: int = 1):
    """
    Entrypoint for inferencing with TensorFlow
    """
    log.info('COMMON: image preprocessing')
    width = 300
    resized_image = read_resize_image(path_to_original_image, width, width)
    reshaped_image = np.reshape(resized_image, (width, width, 3))
    batched_image = np.array([reshaped_image for _ in range(batch)])
    
    log.info('Current shape: {}'.format(batched_image.shape))

    log.info('TENSORFLOW SPECIFIC: Loading a model with TensorFlow')
    graph = load_graph(path_to_model)

    input_data = {
        'image_tensor': batched_image,
    }

    raw_results, delta = get_refs(graph, input_data)
    log.info('TENSORFLOW SPECIFIC: Plain inference finished')

    log.info('TENSORFLOW SPECIFIC: Post processing started')
    processed_results = parse_od_output(raw_results)
    log.info('TENSORFLOW SPECIFIC: Post processing finished')

    return processed_results['tf_detections'], delta

In [None]:
def ie_main(path_to_model_xml: str, path_to_model_bin: str, path_to_original_image: str, device='CPU', batch=1):
    # First, create the network (note that the model must already be converted to IR with the Model Optimizer)
    log.info("Reading IR...")
    net = IENetwork(model=path_to_model_xml, weights=path_to_model_bin)

    # Now let's create IECore() entity 
    log.info("Creating Inference Engine Core")   
    ie = IECore()


    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))

    n, c, h, w = net.inputs[input_blob].shape
    net.reshape({input_blob: (batch, c, h, w)})
    n, c, h, w = net.inputs[input_blob].shape
    
    log.info('COMMON: image preprocessing')
    # read_resize_image expects (width, height), in that order
    image = read_resize_image(path_to_original_image, w, h)
    # Now we load Network to plugin
    log.info("Loading IR to the plugin...")
    exec_net = ie.load_network(network=net, device_name=device, num_requests=2)

    del net

    labels_map = None
    
    # Read and pre-process input image
    image = image[..., ::-1]
    in_frame = image.transpose((2, 0, 1))  # Change data layout from HWC to CHW
    batched_frame = np.array([in_frame for _ in range(batch)])
    log.info('Current shape: {}'.format(batched_frame.shape))

    # Now we run inference on target device
    inference_start = time.time()
    res = exec_net.infer(inputs={input_blob: batched_frame})
    inference_end = time.time()

    log.info('INFERENCE ENGINE SPECIFIC: no post processing')

    return res[out_blob], inference_end - inference_start

In [None]:
log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)

NUM_RUNS = 1
BATCH = 1

DATA = os.path.join('.', 'data')

IMAGE = os.path.join(DATA, 'images', 'input', 'dog.jpg')

SSD_ASSETS = os.path.join(DATA, 'public', 'ssd_mobilenet_v2_coco')

TF_MODEL = os.path.join(SSD_ASSETS, 'ssd_mobilenet_v2_coco_2018_03_29', 'frozen_inference_graph.pb')
TF_RESULT_IMAGE = os.path.join(DATA, 'images', 'output', 'tensorflow_output.png')

IE_MODEL_FP32_XML = os.path.join(SSD_ASSETS, 'FP32', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_FP32_BIN = os.path.join(SSD_ASSETS, 'FP32', 'ssd_mobilenet_v2_coco.bin')

IE_MODEL_DEFAULT_INT8_XML = os.path.join(SSD_ASSETS, 'INT8', 'default', 'optimized', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_DEFAULT_INT8_BIN = os.path.join(SSD_ASSETS, 'INT8', 'default', 'optimized', 'ssd_mobilenet_v2_coco.bin')

IE_MODEL_AA_INT8_XML = os.path.join(SSD_ASSETS, 'INT8', 'acuracy_aware', 'optimized', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_AA_INT8_BIN = os.path.join(SSD_ASSETS, 'INT8', 'acuracy_aware', 'optimized', 'ssd_mobilenet_v2_coco.bin')


IE_RESULT_IMAGE = os.path.join(DATA, 'images', 'output', 'inference_engine_output.png')

OPENVINO = os.getenv('INTEL_OPENVINO_DIR')
if not OPENVINO:
    print('Please, install OpenVINO and initialize the environment')
    sys.exit(1)

COMBO_RESULT_IMAGE = os.path.join(DATA, 'images', 'output', 'combo_output.png')

PERFORMANCE = {}

In [None]:
def ie_inference(xml:str, bin:str, device:str, postfix: str):
    name = '{f} {p} on {d}'.format(f='IE', p=postfix, d=device)

    ie_fps_collected = []

    for i in range(NUM_RUNS):
        predictions, inf_time = ie_main(xml, bin,
                                        IMAGE,
                                        device,
                                        batch=BATCH)
        ie_fps = 1 / inf_time
        ie_fps_collected.append(ie_fps)

    ie_avg_fps = (sum(ie_fps_collected) * BATCH) / (NUM_RUNS)

    PERFORMANCE[name] = ie_avg_fps

    log.info('{} FPS: {}'.format(name, ie_avg_fps))

    draw_image(IMAGE, predictions, IE_RESULT_IMAGE, color=(0, 0, 255))
    
    return ie_avg_fps, predictions
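
The FPS value computed in `ie_inference` is the mean of the per-run rates `1 / inf_time`, multiplied by the batch size. An alternative is to divide the total number of processed frames by the total time; the two agree when all runs take the same time and differ otherwise. The sketch below shows both calculations side by side; the function names are hypothetical and the latencies are made-up numbers.

```python
def mean_of_rates_fps(latencies, batch=1):
    """FPS as computed in this notebook: average of 1/latency over runs, times batch size."""
    return batch * sum(1.0 / t for t in latencies) / len(latencies)


def overall_fps(latencies, batch=1):
    """FPS as total frames divided by total inference time."""
    return batch * len(latencies) / sum(latencies)


latencies = [0.5, 0.25]  # seconds per inference call, illustrative values only
print(mean_of_rates_fps(latencies), overall_fps(latencies))
```

With only `NUM_RUNS = 1`, both formulas give the same number; the difference matters once several runs are averaged.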

In [None]:
framework = 'TF'
device = 'CPU'
name = '{f} on {d}'.format(f=framework, d=device)

tf_fps_collected = []

for i in range(NUM_RUNS):
    predictions, inf_time = tf_main(TF_MODEL, 
                                    IMAGE,
                                    batch=BATCH)
    tf_fps = 1 / inf_time
    tf_fps_collected.append(tf_fps)

tf_avg_fps = (sum(tf_fps_collected) * BATCH) / (NUM_RUNS)

PERFORMANCE[name] = tf_avg_fps

log.info('{} FPS: {}'.format(name, tf_avg_fps))
draw_image(IMAGE, predictions, TF_RESULT_IMAGE, color=(255, 0, 0))

## Accuracy checker

![](pictures/accuracy_check.png)

In [None]:
models:
    - name:  ssd_mobilenet_v2_coco
      launchers:
        - framework: dlsdk
          tags:
            - FP32
          model: data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.xml
          weights: data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.bin
          adapter: ssd
          device: CPU
  
      datasets:
        - name: ms_coco_detection_91_classes
          data_source: data/datasets/COCO200
          annotation_conversion: 
              annotation_file: data/datasets/COCO200/instances_val2017_200pictures.json
              has_background: True
              use_full_label_map: True
              converter: mscoco_detection
          preprocessing:
            - type: resize
              size: 300
          postprocessing:
            - type: resize_prediction_boxes
          metrics:
            - type: coco_precision

In [None]:
!accuracy_check -c data/configs/accuracy_checker_config_tf.yml

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_FP32_XML, IE_MODEL_FP32_BIN, device, '')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)

In [None]:
!accuracy_check -c data/configs/accuracy_checker_config.yml

## Quantization

![](pictures/quantization.png)
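
As a rough numerical illustration of what quantization means, the sketch below maps floating-point values to 8-bit integers using a single symmetric scale and then maps them back. This is a generic textbook scheme chosen for illustration only; it is not the algorithm used by the Post-training Optimization Toolkit, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one symmetric scale factor."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [-1.0, 0.0, 0.3, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The round-trip error of each value is at most half of the scale step
print(q, [abs(a - b) for a, b in zip(weights, restored)])
```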

## Post-training optimization toolkit
![](pictures/pot.png)

## Two types of optimization

![](pictures/pot_algos.png)

## Intermediate representation after quantizing

![](pictures/mo_result.png)

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/tools/post_training_optimization_toolkit/main.py \
-c data/configs/default/quantization_config.json \
--output-dir data/public/ssd_mobilenet_v2_coco/INT8/default \
--direct-dump

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_DEFAULT_INT8_XML, IE_MODEL_DEFAULT_INT8_BIN, device, 'INT8 D')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)

In [None]:
!accuracy_check -c data/configs/default/accuracy_checker_config.yml

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/tools/post_training_optimization_toolkit/main.py \
-c data/configs/accuracy_aware/quantization_config.json \
--output-dir data/public/ssd_mobilenet_v2_coco/INT8/acuracy_aware \
--direct-dump

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_AA_INT8_XML, IE_MODEL_AA_INT8_BIN, device, 'INT8 AA')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)

In [None]:
!accuracy_check -c data/configs/accuracy_aware/accuracy_checker_config.yml

In [None]:
!${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build/intel64/Release/benchmark_app -h

In [None]:
!${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build/intel64/Release/benchmark_app -m workshop/data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.xml

# Practice

In [None]:
!pip install ipywebrtc

In [None]:
import os

# Read/write video, work with images
import cv2

# Inference
from openvino.inference_engine import IENetwork, IECore

# Show videos in the notebook
from ipywidgets import Video

In [None]:
MODEL_PATH_XML = 'data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.xml'
MODEL_PATH_BIN = 'data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.bin'
DEVICE = 'CPU'
INPUT_VIDEO = 'practice/data/artyom.MP4'
OUTPUT_VIDEO = 'practice/data/out_artyom.MP4'
LABELS_PATH = 'practice/data/coco_labels.txt'

In [None]:
Video.from_file(INPUT_VIDEO)

In [None]:
def prapare_out_video_stream(input_video_stream):
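    # Use the named OpenCV properties instead of the magic numbers 3 and 4
    width  = int(input_video_stream.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(input_video_stream.get(cv2.CAP_PROP_FRAME_HEIGHT))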
    return cv2.VideoWriter(OUTPUT_VIDEO, cv2.VideoWriter_fourcc(*'X264'), 20, (width, height))

In [None]:
# Create an IECore object.
# This class represents an Inference Engine entity
# and lets you work with plugins through a unified interface
ie = IECore()

# Load the network from its Intermediate Representation.
# The IENetwork class holds the information about the network model read from the Intermediate Representation
# and lets you change some model parameters, such as layer affinity and output layers
net = IENetwork(model=MODEL_PATH_XML, weights=MODEL_PATH_BIN)

# Get the name of the first input layer of the network
input_blob = next(iter(net.inputs))

print('Input layer of the network is {}'.format(input_blob))

# Get the shape (dimensions) of the input layer of the network
# n - batch size
# c - number of input image channels (usually 3: R, G and B)
# h - height
# w - width
n, c, h, w = net.inputs[input_blob].shape

print('Input shape of the network is [{}, {}, {}, {}]'.format(n, c, h, w))

# Get the name of the first output layer of the network
out_blob = next(iter(net.outputs))

print('Output layer of the network is {}'.format(out_blob))

# Load names of COCO classes from the file 
with open(LABELS_PATH, 'r') as f:
    labels_map = [x.strip() for x in f]

# Load the network to the device
# The load_network function returns an object of ExecutableNetwork
# This class represents a network instance loaded to plugin and ready for inference
exec_net = ie.load_network(network=net, num_requests=2, device_name=DEVICE)


# Open an input video
input_video_stream = cv2.VideoCapture(INPUT_VIDEO)

# Create an output video stream
out = prapare_out_video_stream(input_video_stream)



feed_dict = {}

cur_request_id = 0
next_request_id = 1

# Loop over the input video
while input_video_stream.isOpened():
    
    # Read the next frame from the input video
    ret, frame = input_video_stream.read()
    # Check if video is over
    if not ret:
        # Exit from the loop if video is over
        break 
    # Get height and width of the frame
    frame_h, frame_w = frame.shape[:2]
    
    # Resize the frame to network's input 
    in_frame = cv2.resize(frame, (w, h))
    
    # Change data layout from HWC to CHW
    in_frame = in_frame.transpose((2, 0, 1))  
    
    # Reshape the frame to network's input 
    in_frame = in_frame.reshape((n, c, h, w))
    
    # Prepare data for network.
    # This must be a dictionary: 
    #   key - name of the input layer
    #   value - input data (the prepared frame)  
    feed_dict[input_blob] = in_frame
    
    # Start asynchronous inference.
    # We must pass request_id (the number, or identifier, of the inference request)
    # and the input data (the dictionary)
    exec_net.start_async(request_id=cur_request_id, inputs=feed_dict)
    
    # Wait until the Inference Engine has finished processing the request
    if exec_net.requests[cur_request_id].wait(-1) == 0:
        # Read result of the inference from the out layer of the execution network 
        inference_request_result = exec_net.requests[cur_request_id].outputs[out_blob]
        
        # Iterate over all detected objects
        for obj in inference_request_result[0][0]:
            # Draw only objects whose probability exceeds the specified threshold
            if obj[2] > 0.5:
                # Get coordinates of the found object
                # and scale it to the original size of the frame
                xmin = int(obj[3] * frame_w)
                ymin = int(obj[4] * frame_h)
                xmax = int(obj[5] * frame_w)
                ymax = int(obj[6] * frame_h)
                
                # Get class ID of the found object
                class_id = int(obj[1])
                
                # Get confidence for the found object.
                confidence = round(obj[2] * 100, 1)
                
                # Draw box and label
                color = (min(class_id * 12.5, 255), min(class_id * 7, 255), min(class_id * 5, 255))
                cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)
                
                # Get label of the class
                label = labels_map[class_id]
                
                # Create the title of the object
                text = '{}: {}% '.format(label, confidence)
                
                # Put the title on the frame
                cv2.putText(frame, text, (xmin, ymin - 7), cv2.FONT_HERSHEY_COMPLEX, 2, color, 2)
        
    # Write the result frame to the out stream
    out.write(frame)

# Save result video
out.release()
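
The lookup `labels_map[class_id]` in the loop above assumes that the labels file has an entry for every class ID the network can return. If that is not guaranteed, a guarded lookup avoids an `IndexError` in the middle of the video. This is a small sketch with the hypothetical helper name `label_for`; the two-item label list is made up for the example.

```python
def label_for(class_id: int, labels: list) -> str:
    """Return the label for class_id, or the ID itself as text when no label exists."""
    if 0 <= class_id < len(labels):
        return labels[class_id]
    return str(class_id)


example_labels = ['background', 'person']  # illustrative, not the real COCO list
print(label_for(1, example_labels), label_for(5, example_labels))
```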

In [None]:
Video.from_file(OUTPUT_VIDEO)