![](pictures/openvino_start.png)

# Section 0: Register and Set Up Environment

The first step of the workshop is initializing the OpenVINO™ environment in this Jupyter notebook. 
The OpenVINO™ 2020.1 package has already been installed to `intel/openvino/`.

To initialize the OpenVINO™ environment, run the `intel/openvino/bin/setupvars.sh` script.
If the prerequisite steps were completed correctly, you will see the following output:

```
[setupvars.sh] OpenVINO environment initialized
OpenVINO Inference Engine version is: 2.1.37988
```

In [None]:
!bash ~/intel/openvino/bin/setupvars.sh

from openvino import inference_engine as ie
print('OpenVINO Inference Engine version: {}'.format(ie.__version__))

# Agenda

##  1. Introduction

##  2. What is SSD MobileNet V2?

# Section 1: Introduction

![](pictures/training_vs_inference.png)

![](pictures/about_vino.png)

In [None]:
# mostly for working with paths: os.path
import os

# working with arrays
import numpy as np 

# path to data for the workshop
WORKSHOP_DATA_PATH = os.path.join('.', 'data')

## Inference in 4 Lines

Once [OpenVINO™](https://docs.openvinotoolkit.org/) is installed, you can run an inference:

In [None]:
from openvino import inference_engine as ie

# Create an instance of the OpenVINO Inference Engine Core 
# This is the key module of the OpenVINO Inference Engine
ie_core = ie.IECore()

# Read a network from the Intermediate Representation (IR)
network = ie.IENetwork(os.path.join(WORKSHOP_DATA_PATH, 'model.xml'), 
                       os.path.join(WORKSHOP_DATA_PATH, 'model.bin'))

# Load the network that was read from the Intermediate Representation (IR) 
# to the CPU device 
network_loaded_on_device = ie_core.load_network(network=network, device_name='CPU')

# Start an inference of the loaded network and return output data
network_loaded_on_device.infer(inputs={'data': np.random.rand(1, 3, 227, 227)})

For more information, see the [OpenVINO Inference Engine Python API](https://docs.openvinotoolkit.org/latest/ie_python_api/annotated.html) reference.

# Section 2: What is SSD MobileNet V2?

![](pictures/mobileNet-SSD-network-architecture.png)

The `ssd_mobilenet_v2_coco` model is a [Single-Shot multibox Detection (SSD)](https://arxiv.org/pdf/1801.04381.pdf) network for object detection. The model has been trained on the Common Objects in Context (COCO) image dataset.

The model input is a blob that consists of a single image of `1x3x300x300` in the `RGB` order.

The model output describes the detected objects. Note that the `class_id` value is significant for this model and should be used to determine the class of each detected object.

Model outputs:

1. Classifier, name - `detection_classes`, contains the predicted classes of the bounding boxes in the range `[1, 91]`. The model was trained on the version of the Microsoft\* COCO dataset with 90 object categories.
2. Probability, name - `detection_scores`, contains the probability of each detected bounding box.
3. Detection box, name - `detection_boxes`, contains the coordinates of the detection boxes in the format `[y_min, x_min, y_max, x_max]`, where (`x_min`, `y_min`) are the coordinates of the top left corner and (`x_max`, `y_max`) are the coordinates of the bottom right corner. Coordinates are rescaled to the input image size.
4. Detections number, name - `num_detections`, contains the number of predicted detection boxes.
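
To make the four outputs easier to read together, here is a minimal sketch that combines them into one list of detections. The helper name `decode_tf_detections`, the threshold of 0.5, and the sample values are illustrative and not part of the workshop code. The sketch also assumes that the box coordinates are fractions of the image size in the range `[0, 1]`; check the values your model returns before relying on this.

```python
def decode_tf_detections(outputs, image_width, image_height, threshold=0.5):
    """Combine the four output arrays for the first image into a list of detections."""
    count = int(outputs['num_detections'][0])
    detections = []
    for i in range(count):
        score = float(outputs['detection_scores'][0][i])
        if score < threshold:
            continue
        # Boxes are stored as [y_min, x_min, y_max, x_max]
        y_min, x_min, y_max, x_max = outputs['detection_boxes'][0][i]
        detections.append({
            'class_id': int(outputs['detection_classes'][0][i]),
            'score': score,
            # Assumption: coordinates are fractions of the image size
            'box': (int(x_min * image_width), int(y_min * image_height),
                    int(x_max * image_width), int(y_max * image_height)),
        })
    return detections


# Hand-written sample data that imitates the shape of the model outputs
sample_outputs = {
    'num_detections': [2.0],
    'detection_classes': [[17.0, 18.0, 1.0]],
    'detection_scores': [[0.75, 0.25, 0.0]],
    'detection_boxes': [[[0.25, 0.5, 0.75, 1.0],
                         [0.0, 0.0, 1.0, 1.0],
                         [0.0, 0.0, 0.0, 0.0]]],
}

detections = decode_tf_detections(sample_outputs, image_width=300, image_height=300)
print(detections)
```

Only the first `num_detections` entries are read, so the padding rows at the end of each array are ignored.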

# Section 3: Where Can I Find the Model?

With the OpenVINO™ toolkit, you can easily download models from the [Intel&reg; Open Model Zoo](https://github.com/opencv/open_model_zoo).


To see all available models, both public open-source models from different frameworks (TensorFlow\*, Caffe\*, MXNet\*, PyTorch\*, and others) and Intel&reg; models, run the `downloader.py` script with the `--print_all` parameter.

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py --print_all

Let's try to download an object-detection model called `ssd_mobilenet_v2_coco` using the [Model Downloader](https://github.com/opencv/open_model_zoo/tree/master/tools/downloader).

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py \
--name ssd_mobilenet_v2_coco \
--output_dir ./data

As we can see, the Model Downloader provides not only well-known public models, but also various models created at Intel for a range of tasks.

![](pictures/models.png)

Model Downloader downloaded the model to the following directory: `data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29`.

In [None]:
!ls data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29

# Section 4: Infer SSD MobileNet V2 on TensorFlow 

In [None]:
import os
import tensorflow as tf
from tensorflow.gfile import GFile

# Path to the TensorFlow model
model = os.path.join('data', 'public', 'ssd_mobilenet_v2_coco',
                     'ssd_mobilenet_v2_coco_2018_03_29', 'frozen_inference_graph.pb')
# SSD MobileNet V2 has the following output nodes
output_names = ['detection_classes:0','detection_scores:0', 'detection_boxes:0', 'num_detections:0']

# Create a graph
graph = tf.Graph()
# Create graph definitions
graph_def = tf.GraphDef()

# Read model to the graph definitions
with open(model, "rb") as model_file:
    graph_def.ParseFromString(model_file.read())


with graph.as_default():
    # Import the graph definitions to TensorFlow
    tf.import_graph_def(graph_def, name='')
    # Get tensors for output nodes
    output_tensors = [graph.get_tensor_by_name(layer_name) for layer_name in output_names] 

    with tf.Session(graph=graph) as session:
        # Run inference on random data
        print(session.run(output_tensors, feed_dict = {'image_tensor:0' : np.random.rand(1, 300, 300, 3)}))

# Section 5: Infer on Real Data on TensorFlow

To run the TensorFlow `ssd_mobilenet_v2_coco` model, we need some utility functions and constant values:

In [None]:
import logging as log
import os
import sys

# Initialize logging
log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)

# Number of inference runs to average over for a more stable performance estimate
NUM_RUNS = 1 
# Number of images for one inference
BATCH = 1

# Contains all data for the workshop
WORKSHOP_DATA_PATH = os.path.join('.', 'data')

# Path to a test image
IMAGE = os.path.join(WORKSHOP_DATA_PATH, 'images', 'input', 'cats.jpg')

# Path to the directory with the downloaded TensorFlow model
SSD_ASSETS = os.path.join(WORKSHOP_DATA_PATH, 'public', 'ssd_mobilenet_v2_coco')

# Path to the downloaded frozen TensorFlow model
TF_MODEL = os.path.join(SSD_ASSETS, 'ssd_mobilenet_v2_coco_2018_03_29', 'frozen_inference_graph.pb')

# Path to the resulting TensorFlow image
TF_RESULT_IMAGE = os.path.join(WORKSHOP_DATA_PATH, 'images', 'output', 'tensorflow_output.png')

# Path to the Inference Engine FP32 model
IE_MODEL_FP32_XML = os.path.join(SSD_ASSETS, 'FP32', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_FP32_BIN = os.path.join(SSD_ASSETS, 'FP32', 'ssd_mobilenet_v2_coco.bin')

# Path to the Inference Engine INT8 model optimized with the Default algorithm
IE_MODEL_DEFAULT_INT8_XML = os.path.join(SSD_ASSETS, 'INT8', 'default', 'optimized', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_DEFAULT_INT8_BIN = os.path.join(SSD_ASSETS, 'INT8', 'default', 'optimized', 'ssd_mobilenet_v2_coco.bin')

# Path to the Inference Engine INT8 model optimized  with the AccuracyAware algorithm
IE_MODEL_AA_INT8_XML = os.path.join(SSD_ASSETS, 'INT8', 'acuracy_aware', 'optimized', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_AA_INT8_BIN = os.path.join(SSD_ASSETS, 'INT8', 'acuracy_aware', 'optimized', 'ssd_mobilenet_v2_coco.bin')

# Path to the resulting Inference Engine image
IE_RESULT_IMAGE = os.path.join(WORKSHOP_DATA_PATH, 'images', 'output', 'inference_engine_output.png')

# Path to the combination of the resulting TensorFlow and Inference Engine images
COMBO_RESULT_IMAGE = os.path.join(WORKSHOP_DATA_PATH, 'images', 'output', 'combo_output.png')

PERFORMANCE = {}

In [None]:
# Import OpenCV for image processing
import cv2

def read_resize_image(path_to_image: str, width: int, height: int):
    """
    Takes an image and resizes it to the given dimensions.
    """
    # Load the image 
    raw_image = cv2.imread(path_to_image)
    # Return the image resized to the (width, height) format
    return cv2.resize(raw_image, (width, height), interpolation=cv2.INTER_NEAREST)

In [None]:
# Import required functions from TensorFlow
import tensorflow as tf

import time

def tf_inference(graph: tf.Graph, input_data, input_name: str, outputs_names: list) -> tuple:
    """
    Returns TensorFlow model inference results.
    """
    
    log.info("Running inference with TensorFlow ...")
  
    # Get the input tensor by name
    input_tensor =  graph.get_tensor_by_name('{}:0'.format(input_name))
    
    # Fill input data
    feed_dict = {
        input_tensor: [input_data, ] 
    }

    # Collect output tensors
    output_tensors = []
    
    for output_name in outputs_names:
        tensor = graph.get_tensor_by_name('{}:0'.format(output_name))
        output_tensors.append(tensor)
    
    # Run inference and get performance
    log.info("Running tf.Session")
    with graph.as_default():
        with tf.Session(graph=graph) as session:
            inference_start = time.time()
            outputs = session.run(output_tensors, feed_dict=feed_dict)
            inference_end = time.time()
    
    # Collect inference results
    res = dict(zip(outputs_names, outputs))
    
    log.info("TensorFlow reference collected successfully")
    
    return res, inference_end - inference_start

In [None]:
import tensorflow as tf
import numpy as np

def tf_main(path_to_pb_model: str, 
            path_to_original_image: str, 
            number_inference: int = 1):
    """
    Entrypoint to infer with TensorFlow.
    """
    log.info('COMMON: image preprocessing')
    
    # Size of the image is 300x300 pixels, 3 channels in the RGB format
    width = 300
    
    image_shape = (300, 300, 3)
    
    resized_image = read_resize_image(path_to_original_image, width, width)
    
    reshaped_image = np.reshape(resized_image, image_shape)
    
    log.info('Current shape: {}'.format(reshaped_image.shape))

    log.info('TENSORFLOW SPECIFIC: Loading a model with TensorFlow')
    
    tf.reset_default_graph()
    graph = tf.Graph()
    graph_def = tf.GraphDef()

    with open(path_to_pb_model, "rb") as model_file:
        graph_def.ParseFromString(model_file.read())

    with graph.as_default():
        tf.import_graph_def(graph_def, name='')

    log.info("TensorFlow graph was created")
    
    # We use SSD MobileNet V2 and we know the name of the input 
    input_layer = 'image_tensor'
    
    # And we know names of outputs
    output_layers = ['num_detections', 'detection_classes', 'detection_scores', 'detection_boxes']
    
    collected_inference_time = []
    
    for run in range(number_inference):
        raw_results, inference_time = tf_inference(graph, reshaped_image, input_layer, output_layers)
        collected_inference_time.append(inference_time)
    
    tensorflow_average_inference_time = sum(collected_inference_time) / number_inference
    
    log.info('TENSORFLOW SPECIFIC: Plain inference finished')

    return raw_results, tensorflow_average_inference_time

In [None]:
# Import the Image class from display to show an image
from IPython.display import Image
# Show the image in the notebook
Image(filename=IMAGE)

## Infer the Model on the Real Image

In [None]:
framework = 'TF'
device = 'CPU'
name = '{f} on {d}'.format(f=framework, d=device)

tensorflow_fps_collected = []

# Run inference on TensorFlow
tensorflow_predictions, tensorflow_average_inference_time = tf_main(TF_MODEL, IMAGE, number_inference=NUM_RUNS)
    
log.info('Inference Time of SSD MobileNet V2 {} is {} seconds'.format(name, tensorflow_average_inference_time))

# Calculate FPS from inference time
tensorflow_average_fps = 1 / tensorflow_average_inference_time

log.info('{} FPS: {}'.format(name, tensorflow_average_fps))

In [None]:
print(tensorflow_predictions['num_detections']) # get number of detected objects
print(tensorflow_predictions['detection_classes'][0])# get predicted classes IDs
print(tensorflow_predictions['detection_scores'][0]) # get probabilities for predicted classes
print(tensorflow_predictions['detection_boxes'][0]) # get boxes for predicted objects

In [None]:
# Import utility functions to process images from TensorFlow and draw images
from utils import parse_od_output, draw_image

# Import the Image class from display to show an image
from IPython.display import Image

processed_tensorflow_predictions = parse_od_output(tensorflow_predictions)
draw_image(IMAGE, processed_tensorflow_predictions, TF_RESULT_IMAGE)


# Show the image in the notebook
Image(filename=TF_RESULT_IMAGE)

# Section 6: OpenVINO&trade; Overview

![](pictures/openvino_toolkit.png)

![](pictures/additional_tools.png)

# Section 7: Model Optimizer - Entry to OpenVINO&trade;

![](pictures/model_optimizer.png)

Let's convert the TensorFlow model to the IR format:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/mo.py \
--output_dir=data/public/ssd_mobilenet_v2_coco/FP32 \
--reverse_input_channels \
--model_name=ssd_mobilenet_v2_coco \
--transformations_config=${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
--tensorflow_object_detection_api_pipeline_config=data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/pipeline.config \
--output=detection_classes,detection_scores,detection_boxes,num_detections \
--input_model=data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb

![](./pictures/openvino_support.png)

We can view the Intermediate Representation of the SSD MobileNet V2:

In [None]:
!cat data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.xml
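
The full XML is long, so a summary can be easier to read than the raw `cat` output. The sketch below counts layers by type with the standard `xml.etree.ElementTree` module. It assumes the usual IR layout of `<layer>` elements with a `type` attribute; the `SAMPLE_IR` string is a tiny hand-written imitation rather than a real model, and `count_layer_types` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written fragment that imitates the IR layout (not a real model)
SAMPLE_IR = """
<net name="toy_network" version="10">
  <layers>
    <layer id="0" name="image_tensor" type="Parameter"/>
    <layer id="1" name="conv_1" type="Convolution"/>
    <layer id="2" name="conv_2" type="Convolution"/>
    <layer id="3" name="detections" type="Result"/>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
    <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
    <edge from-layer="2" from-port="1" to-layer="3" to-port="0"/>
  </edges>
</net>
"""


def count_layer_types(ir_xml_text):
    """Return a Counter that maps each layer type to the number of layers of that type."""
    root = ET.fromstring(ir_xml_text.strip())
    return Counter(layer.get('type') for layer in root.iter('layer'))


layer_types = count_layer_types(SAMPLE_IR)
for layer_type, count in layer_types.most_common():
    print(layer_type, count)
```

To summarize the converted model instead of the sample, read the `.xml` file into a string and pass it to the same function.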

# Section 8: Inference of SSD MobileNet V2 on OpenVINO&trade; Inference Engine 

In [None]:
from openvino.inference_engine import IECore, IENetwork
import numpy as np
import time

def ie_inference(path_to_model_xml: str, path_to_model_bin: str, path_to_original_image: str, device='CPU', batch=1):
    """
    Entrypoint to infer with the OpenVINO Inference Engine
    """

    # Create the IECore entity
    log.info("Creating Inference Engine Core")   
    ie = IECore()

    # Read the network (note: the model must already be converted to the IR with the Model Optimizer)
    log.info("Reading IR...")
    net = IENetwork(model=path_to_model_xml, weights=path_to_model_bin)

    # Get input and output blob of the network
    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))

    # Reshape the network to the needed batch
    n, c, h, w = net.inputs[input_blob].shape
    net.reshape({input_blob: (batch, c, h, w)})
    n, c, h, w = net.inputs[input_blob].shape
    
    # Resize the image to the network input size (read_resize_image expects width, then height)
    log.info('COMMON: image preprocessing')
    image = read_resize_image(path_to_original_image, w, h)
    
    # Load the network to the plugin
    log.info("Loading IR to the plugin...")
    exec_net = ie.load_network(network=net, device_name=device, num_requests=2)

    del net

    labels_map = None
    
    # Preprocess the input image: reverse the channel order (OpenCV loads images as BGR)
    image = image[..., ::-1]
    in_frame = image.transpose((2, 0, 1))  # Change data layout from HWC to CHW
    batched_frame = np.array([in_frame for _ in range(batch)])
    log.info('Current shape: {}'.format(batched_frame.shape))

    # Now we run an inference on the target device
    inference_start = time.time()
    res = exec_net.infer(inputs={input_blob: batched_frame})
    inference_end = time.time()

    log.info('INFERENCE ENGINE SPECIFIC: no post-processing')

    return res[out_blob], inference_end - inference_start

In [None]:
def ie_main(xml:str, bin:str, device:str, postfix: str = ''):
    name = '{f} {p} on {d}'.format(f='IE', p=postfix, d=device)

    inference_engine_fps_collected = []

    for i in range(NUM_RUNS):
        # Run an inference on OpenVINO Inference Engine
        predictions, inference_time = ie_inference(xml, bin,
                                                   IMAGE,
                                                   device,
                                                   batch=BATCH)
        
        log.info('Inference Time of SSD MobileNet V2 {}: {}'.format(name, inference_time))
        # Calculate FPS from inference time
        inference_engine_fps = 1 / inference_time
        
        inference_engine_fps_collected.append(inference_engine_fps)

    # Calculate the average FPS for all inferences
    inference_engine_avg_fps = (sum(inference_engine_fps_collected) * BATCH) / (NUM_RUNS)
    
    PERFORMANCE[name] = inference_engine_avg_fps

    log.info('{} FPS: {}'.format(name, inference_engine_avg_fps))
    
    return inference_engine_avg_fps, predictions
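
Note that this notebook averages in two slightly different ways: the TensorFlow path divides 1 by the mean inference time, while `ie_main` averages the per-run FPS values. With a single run the two agree, but with several runs of varying duration they can differ. The sketch below shows the difference on made-up timings; both function names and the sample values are illustrative.

```python
def fps_from_mean_latency(times, batch=1):
    """Throughput computed as batch size divided by the mean time of one run."""
    mean_latency = sum(times) / len(times)
    return batch / mean_latency


def mean_of_per_run_fps(times, batch=1):
    """Throughput computed as the mean of the FPS values of individual runs."""
    return sum(batch / t for t in times) / len(times)


# Made-up inference times in seconds: one fast run and one slow run
sample_times = [0.1, 0.2]

latency_based_fps = fps_from_mean_latency(sample_times)
per_run_based_fps = mean_of_per_run_fps(sample_times)

print('FPS from mean latency: {:.2f}'.format(latency_based_fps))
print('Mean of per-run FPS:   {:.2f}'.format(per_run_based_fps))
```

The mean of per-run FPS is never lower than the value derived from the mean latency, so it is worth using the same formula when comparing frameworks over several runs.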

In [None]:
device = 'CPU'

# Run the inference 
inference_engine_average_fps, inference_engine_predictions = ie_main(IE_MODEL_FP32_XML, 
                                                                     IE_MODEL_FP32_BIN, 
                                                                     device)

In [None]:
inference_engine_predictions[0] # get data for the image

In [None]:
from utils import parse_od_output, draw_image

draw_image(IMAGE, inference_engine_predictions, IE_RESULT_IMAGE, color=(255, 0, 0))

In [None]:
# Import the Image class from display to show an image
from IPython.display import Image

# Show the image in the notebook
Image(filename=IE_RESULT_IMAGE)

In [None]:
# Import functions from Matplotlib to show barcharts
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec


def show_results_interactively(tf_image: str, ie_image: str, combination_image: str, ie_fps:float, tf_fps:float):
    """
    Takes paths to three images and shows them with Matplotlib on one screen.
    """
    _ = plt.figure(figsize=(30, 10))
    gs1 = gridspec.GridSpec(1, 3)
    gs1.update(wspace=0.25, hspace=0.05)

    titles = [
        '(a) TensorFlow',
        '(b) Inference Engine',
        '(c) TensorFlow and Inference Engine\n predictions are identical'
    ]

    for i, path in enumerate([tf_image, ie_image, combination_image]):
        img_resized = cv2.imread(path)
        ax_plot = plt.subplot(gs1[i])
        ax_plot.axis("off")
        addon = ' '
        if i == 1:
            addon += '{:4.3f}'.format(ie_fps) + '(FPS)'
        elif i == 0:
            addon += '{:4.3f}'.format(tf_fps) + '(FPS)'

        ax_plot.text(0.5, -0.5, titles[i] + addon,
                     size=28, ha="center",
                     transform=ax_plot.transAxes)
        ax_plot.imshow(cv2.cvtColor(img_resized, cv2.COLOR_BGR2RGB))

    plt.show()

In [None]:
from utils import draw_image

# Draw inference results from the Inference Engine in the image with TensorFlow inference results
draw_image(TF_RESULT_IMAGE, inference_engine_predictions, COMBO_RESULT_IMAGE, color=(255, 0, 0))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=inference_engine_average_fps,
                           tf_fps=tensorflow_average_fps)

In [None]:
from utils import show_performance

performance_data = {
    'TF on CPU': tensorflow_average_fps,
    'IE on CPU': inference_engine_average_fps
}

show_performance(performance_data)

**Oh, this is good - we got the same results in one image. But it is only ONE image. We need to check accuracy on the whole dataset. So how can we do this?**
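
Before moving to a whole dataset, it helps to have a number for how close two boxes are instead of comparing them by eye. A common measure is intersection over union (IoU). The helper below is a minimal sketch that is not part of the workshop `utils` module or of the Accuracy Checker; it assumes boxes are given as `(x_min, y_min, x_max, y_max)`.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle
    inter_x_min = max(box_a[0], box_b[0])
    inter_y_min = max(box_a[1], box_b[1])
    inter_x_max = min(box_a[2], box_b[2])
    inter_y_max = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap
    inter_w = max(0.0, inter_x_max - inter_x_min)
    inter_h = max(0.0, inter_y_max - inter_y_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


identical = iou((0, 0, 10, 10), (0, 0, 10, 10))
half_shifted = iou((0, 0, 10, 10), (5, 0, 15, 10))
disjoint = iou((0, 0, 10, 10), (20, 20, 30, 30))
print(identical, half_shifted, disjoint)
```

An IoU close to 1 for every pair of matching TensorFlow and Inference Engine boxes would support the claim that the predictions agree on this image.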

# Section 9: [Accuracy Checker](https://github.com/opencv/open_model_zoo/tree/master/tools/accuracy_checker) - OpenVINO&trade; Accuracy Validation Framework

![](pictures/accuracy_check.png)

In [None]:
# Substitute the real path for the variables in the Accuracy Checker config:
!WORKSHOP_PATH=$(pwd) envsubst '\${WORKSHOP_PATH}' <data/configs/accuracy_checker_config_tf_template.yml >data/configs/accuracy_checker_config_tf.yml

# Run the Accuracy Checker:
!accuracy_check -c data/configs/accuracy_checker_config_tf.yml

***Coffee Break***

In [None]:
# Substitute the real path for the variables in the Accuracy Checker config:
!WORKSHOP_PATH=$(pwd) envsubst '\${WORKSHOP_PATH}' <data/configs/accuracy_checker_config_template.yml >data/configs/accuracy_checker_config.yml

# Run the Accuracy Checker:
!accuracy_check -c data/configs/accuracy_checker_config.yml

In [None]:
!cat data/configs/accuracy_checker_config.yml

# Section 10: [Quantize the Model to Low Precision](https://docs.openvinotoolkit.org/latest/_compression_algorithms_quantization_README.html)

![](pictures/quantization.png)

![](pictures/quantize.PNG)

# Section 11: [Post-Training Optimization Toolkit](https://docs.openvinotoolkit.org/latest/_README.html)

The Post-Training Optimization Toolkit includes a standalone command-line tool and a Python\* API that provide the following key features:

* Two supported post-training quantization algorithms: fast [DefaultQuantization](https://docs.openvinotoolkit.org/latest/_compression_algorithms_quantization_default_README.html) and precise [AccuracyAwareQuantization](https://docs.openvinotoolkit.org/latest/_compression_algorithms_quantization_accuracy_aware_README.html), as well as multiple experimental methods, including global optimization.
* Symmetric and asymmetric quantization schemes. For more details, see the [Quantization](https://docs.openvinotoolkit.org/latest/_compression_algorithms_quantization_README.html) section.
* Compression for different hardware targets, such as CPU and GPU.
* Per-channel quantization for Convolutional and Fully-Connected layers.
* Multiple domains: Computer Vision, Recommendation Systems.
* Ability to implement a custom calibration pipeline via the supported [API](https://docs.openvinotoolkit.org/latest/_sample_README.html).
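
To illustrate the difference between the symmetric and asymmetric schemes mentioned in the list, here is a generic textbook-style sketch of how the quantization parameters could be derived from a set of values. It is not the toolkit's implementation: the function names, the 8-bit ranges, and the sample values are assumptions made for the example.

```python
def symmetric_params(values, num_bits=8):
    """Signed range centered on zero, so the zero point is always 0."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    scale = max_abs / q_max
    return scale, 0


def asymmetric_params(values, num_bits=8):
    """Unsigned range that follows the actual minimum and maximum of the data."""
    # The range is extended to include 0.0 so that zero is representable
    low = min(min(values), 0.0)
    high = max(max(values), 0.0)
    q_max = 2 ** num_bits - 1  # 255 for 8 bits
    scale = (high - low) / q_max
    zero_point = round(-low / scale)
    return scale, zero_point


# Made-up values with a range that is not centered on zero
sample_values = [-1.0, 0.0, 0.5, 3.0]

sym_scale, sym_zero_point = symmetric_params(sample_values)
asym_scale, asym_zero_point = asymmetric_params(sample_values)

print('symmetric:  scale={:.5f}, zero_point={}'.format(sym_scale, sym_zero_point))
print('asymmetric: scale={:.5f}, zero_point={}'.format(asym_scale, asym_zero_point))
```

For data that is skewed to one side, the asymmetric scheme gives a smaller step size because it does not spend part of the integer range on values that never occur.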

<br>   

![](pictures/pot.png)

<br>   

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/tools/post_training_optimization_toolkit/main.py \
-c data/configs/default/quantization_config.json \
--output-dir data/public/ssd_mobilenet_v2_coco/INT8/default \
--direct-dump

In [None]:
! cat data/configs/default/quantization_config.json

The DefaultQuantization algorithm performs a fast but at the same time accurate INT8 calibration of neural networks. It consists of three algorithms that are applied to a model sequentially:
*  ActivationChannelAlignment - Used as a preliminary step before quantization. It aligns the ranges of output activations of Convolutional layers in order to reduce the quantization error.
*  MinMaxQuantization - A vanilla quantization method that automatically inserts `FakeQuantize` operations into the model graph based on the specified target hardware and initializes them using statistics collected on the calibration dataset.
*  BiasCorrection - Adjusts biases of Convolutional and Fully-Connected layers based on the quantization error of the layer in order to make the overall error unbiased.
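
The idea behind the last step can be shown on a toy example. The sketch below uses a single neuron with three weights and is only an illustration of the principle under simplified assumptions; it is not the toolkit's algorithm, and all names and numbers are made up. The weights are rounded to a coarse grid, the average output error on a few calibration inputs is measured, and that average is added to the bias.

```python
def quantize_weight(weight, scale):
    """Round a weight to the nearest multiple of the scale."""
    return round(weight / scale) * scale


def layer_output(weights, bias, inputs):
    """Output of a single neuron: dot product plus bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def mean_output_error(weights, bias, q_weights, q_bias, calibration_inputs):
    """Average difference between the original and the quantized neuron outputs."""
    errors = [layer_output(weights, bias, x) - layer_output(q_weights, q_bias, x)
              for x in calibration_inputs]
    return sum(errors) / len(errors)


# Made-up neuron and calibration data
weights = [0.26, -0.41, 0.73]
bias = 0.1
calibration_inputs = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0], [0.0, 1.0, 4.0]]

q_weights = [quantize_weight(w, scale=0.1) for w in weights]

# Error with the original bias, then shift the bias by that average error
error_before = mean_output_error(weights, bias, q_weights, bias, calibration_inputs)
corrected_bias = bias + error_before
error_after = mean_output_error(weights, bias, q_weights, corrected_bias, calibration_inputs)

print('mean error before correction: {:.4f}'.format(error_before))
print('mean error after correction:  {:.4f}'.format(error_after))
```

Individual outputs still differ from the original ones after the correction, but on the calibration inputs the errors now average out to zero.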

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_main(IE_MODEL_DEFAULT_INT8_XML, IE_MODEL_DEFAULT_INT8_BIN, device, 'INT8 D')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tensorflow_average_fps)

show_performance(PERFORMANCE)

In [None]:
# Substitute the real path for the variables in the Accuracy Checker config:
!WORKSHOP_PATH=$(pwd) envsubst '\${WORKSHOP_PATH}' <data/configs/default/accuracy_checker_config_template.yml >data/configs/default/accuracy_checker_config.yml

# Run the Accuracy Checker
!accuracy_check -c data/configs/default/accuracy_checker_config.yml

# Section 12: AccuracyAware Algorithm

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/tools/post_training_optimization_toolkit/main.py \
-c data/configs/accuracy_aware/quantization_config.json \
--output-dir data/public/ssd_mobilenet_v2_coco/INT8/acuracy_aware \
--direct-dump

In [None]:
! cat data/configs/accuracy_aware/quantization_config.json

The AccuracyAwareQuantization algorithm works as follows:

1. The model gets fully quantized using the DefaultQuantization algorithm.
2. The quantized and full-precision models are compared on a subset of the validation set in order to find mismatches in the target accuracy metric. A ranking subset is extracted based on the mismatches.
3. A layer-wise ranking is performed in order to get the contribution of each quantized layer to the accuracy drop.
4. Based on the ranking, the most "problematic" layer is reverted to the original precision. This change is followed by an evaluation of the obtained model on the full validation set in order to get a new accuracy drop.
5. If the accuracy criteria are satisfied for all pre-defined accuracy metrics, the algorithm finishes. Otherwise, it continues by reverting the next "problematic" layer.
6. It may happen that regular reverting does not bring any accuracy improvement or even worsens the accuracy. Then re-ranking is triggered as described in step 3.
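
The control flow of these steps can be sketched in a few lines. This is a heavily simplified illustration, not the toolkit's code: `accuracy_aware_sketch`, `toy_evaluate`, and the penalty values are hypothetical, the ranking subset from step 2 is skipped, and layers are re-ranked on every iteration instead of only when reverting stops helping.

```python
def accuracy_aware_sketch(layers, evaluate, max_drop):
    """
    layers:   names of the layers that can be quantized
    evaluate: function that takes the set of quantized layers and returns accuracy
    max_drop: maximum allowed accuracy drop relative to full precision
    """
    baseline = evaluate(set())   # accuracy of the full-precision model
    quantized = set(layers)      # step 1: start with everything quantized
    reverted = []

    while quantized:
        current = evaluate(quantized)
        if baseline - current <= max_drop:
            break                # step 5: the accuracy criterion is satisfied
        # Step 3: rank layers by how much accuracy returns when each one is reverted
        worst = max(sorted(quantized),
                    key=lambda name: evaluate(quantized - {name}) - current)
        # Step 4: revert the most "problematic" layer to the original precision
        quantized.remove(worst)
        reverted.append(worst)

    return quantized, reverted


# Toy accuracy model: every quantized layer costs a fixed amount of accuracy
PENALTY = {'conv1': 0.001, 'conv2': 0.02, 'fc': 0.004}


def toy_evaluate(quantized_layers):
    return 0.75 - sum(PENALTY[name] for name in sorted(quantized_layers))


kept_quantized, reverted_layers = accuracy_aware_sketch(PENALTY.keys(), toy_evaluate, max_drop=0.01)
print('still quantized:', sorted(kept_quantized))
print('reverted:', reverted_layers)
```

In this toy setup, reverting the single most costly layer is enough to bring the accuracy drop under the limit, so the other layers stay in low precision.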

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_main(IE_MODEL_AA_INT8_XML, IE_MODEL_AA_INT8_BIN, device, 'INT8 AA')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tensorflow_average_fps)

show_performance(PERFORMANCE)

In [None]:
# Substitute the real path for the variables in the Accuracy Checker config:
!WORKSHOP_PATH=$(pwd) envsubst '\${WORKSHOP_PATH}' <data/configs/accuracy_aware/accuracy_checker_config_template.yml >data/configs/accuracy_aware/accuracy_checker_config.yml

# Run the Accuracy Checker
!accuracy_check -c data/configs/accuracy_aware/accuracy_checker_config.yml

# Section 13: [Deep Learning Boost](https://www.intel.ai/intel-deep-learning-boost/)

![](pictures/dl_boost.png)

# Section 14: Get Even Better Performance

Great performance results! But if you want the best performance, use C++.

The Inference Engine samples include a C++ benchmark application. Let's build it from source and try it!

In [None]:
! mkdir -p ${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build
! cd ${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build && cmake .. && make benchmark_app -j8

In [None]:
!${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build/intel64/Release/benchmark_app -h

In [None]:
!${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build/intel64/Release/benchmark_app -m data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.xml

In [None]:
PERFORMANCE['OpenVINO IE Benchmark INT8'] = 139.86
show_performance(PERFORMANCE)

# Section 15: Practice

In [None]:
import os

# Read/write video, work with images
import cv2

# Inference
from openvino.inference_engine import IENetwork, IECore

# Show videos in the notebook
from ipywidgets import Video

In [None]:
MODEL_PATH_XML = IE_MODEL_FP32_XML
MODEL_PATH_BIN = IE_MODEL_FP32_BIN

DEVICE = 'CPU'

DATA_PATH = os.path.join('practice', 'data')
INPUT_VIDEO = os.path.join(DATA_PATH, 'artyom.MP4')
OUTPUT_VIDEO = os.path.join(DATA_PATH, 'out_artyom.MP4')

LABELS_PATH = os.path.join(DATA_PATH, 'coco_labels.txt')

In [None]:
Video.from_file(INPUT_VIDEO)

In [None]:
def prapare_out_video_stream(input_video_stream):
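    # Take the frame size from the input stream so that the output video matches it
    width = int(input_video_stream.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(input_video_stream.get(cv2.CAP_PROP_FRAME_HEIGHT))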
    return cv2.VideoWriter(OUTPUT_VIDEO, cv2.VideoWriter_fourcc(*'X264'), 20, (width, height))

In [None]:
# Create an IECore object
# This class represents an Inference Engine entity
# and lets you work with plugins through unified interfaces
ie = IECore()

# Load the network from the Intermediate Representation
# The IENetwork class contains information about the network model read from the Intermediate Representation
# and lets you change some model parameters, such as layer affinity and output layers
net = IENetwork(model=MODEL_PATH_XML, weights=MODEL_PATH_BIN)

# Get names of input layers of the network
input_blob = next(iter(net.inputs))

print('Input layer of the network is {}'.format(input_blob))

# Get shape (dimensions) of the input layer of the network
# n - number of batches
# c - number of input image channels (usually 3 - R, G and B)
# h - height
# w - width
n, c, h, w = net.inputs[input_blob].shape

print('Input shape of the network: [{}, {}, {}, {}]'.format(n, c, h, w))

# Get names of output layers of the network
out_blob = next(iter(net.outputs))

print('Output layer of the network: {}'.format(out_blob))

# Load names of COCO classes from the file 
with open(LABELS_PATH, 'r') as f:
    labels_map = [x.strip() for x in f]

# Load the network to the device
# The load_network function returns an object of ExecutableNetwork
# This class represents a network instance loaded to plugin and ready for inference
exec_net = ie.load_network(network=net, num_requests=2, device_name=DEVICE)

# Open an input video
input_video_stream = cv2.VideoCapture(INPUT_VIDEO)

# Create an output video stream
out = prapare_out_video_stream(input_video_stream)

feed_dict = {}

cur_request_id = 0
next_request_id = 1

# Loop over frames in the input video
while input_video_stream.isOpened():
    
    # Read the next frame from the input video
    ret, frame = input_video_stream.read()
    # Check if the video is over
    if not ret:
        # Exit from the loop if the video is over
        break 
    # Get height and width of the frame
    frame_h, frame_w = frame.shape[:2]
    
    # Resize the frame to the network input 
    in_frame = cv2.resize(frame, (w, h))
    
    # Change the data layout from HWC to CHW
    in_frame = in_frame.transpose((2, 0, 1))  
    
    # Reshape the frame to the network input 
    in_frame = in_frame.reshape((n, c, h, w))
    
    # Prepare data for the network.
    # This must be a dictionary: 
    #   key - name of the input layer
    #   value - input data (the prepared frame)  
    feed_dict[input_blob] = in_frame
    
    # Start asynchronous inference
    # We must pass request_id (the identifier of the Inference Request)
    # and the input data (the dictionary)
    exec_net.start_async(request_id=cur_request_id, inputs=feed_dict)
    
    # Wait until the Inference Engine finishes processing the request
    if exec_net.requests[cur_request_id].wait(-1) == 0:
        # Read the result of the inference from the output layer of the execution network 
        inference_request_result = exec_net.requests[cur_request_id].outputs[out_blob]
        
        # Iterate over all discovered objects
        for obj in inference_request_result[0][0]:
            # Draw a bounding box only for objects the confidence of which is greater than a specified threshold
            if obj[2] > 0.5:
                # Get coordinates of the discovered object
                # and scale it to the original size of the frame
                xmin = int(obj[3] * frame_w)
                ymin = int(obj[4] * frame_h)
                xmax = int(obj[5] * frame_w)
                ymax = int(obj[6] * frame_h)
                
                # Get class ID of the discovered object
                class_id = int(obj[1])
                
                # Get confidence for the discovered object
                confidence = round(obj[2] * 100, 1)
                
                # Draw a box and label
                color = (min(class_id * 12.5, 255), min(class_id * 7, 255), min(class_id * 5, 255))
                cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)
                
                # Get label of the class
                label = labels_map[class_id]
                
                # Create the title of the object
                text = '{}: {}% '.format(label, confidence)
                
                # Put the title to the frame
                cv2.putText(frame, text, (xmin, ymin - 7), cv2.FONT_HERSHEY_COMPLEX, 2, color, 2)
        
    # Write the resulting frame to the output stream
    out.write(frame)

# Release the input stream and save the resulting video
input_video_stream.release()
out.release()
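
The loop above reads each detection as a row of seven numbers. Judging by the indices it uses, the layout is `[image_id, class_id, confidence, x_min, y_min, x_max, y_max]` with coordinates given as fractions of the frame size; the first element being the image index is an assumption, since the loop never reads it. The sketch below isolates that parsing step in a small function that can be tested without a video. The helper name and the sample rows are illustrative.

```python
def parse_detection_rows(rows, frame_width, frame_height, threshold=0.5):
    """Convert raw detection rows into dictionaries with pixel coordinates."""
    results = []
    for image_id, class_id, confidence, x_min, y_min, x_max, y_max in rows:
        # Keep only detections above the confidence threshold
        if confidence <= threshold:
            continue
        results.append({
            'class_id': int(class_id),
            'confidence': round(confidence * 100, 1),
            # Scale fractional coordinates to the original frame size
            'box': (int(x_min * frame_width), int(y_min * frame_height),
                    int(x_max * frame_width), int(y_max * frame_height)),
        })
    return results


# Hand-written rows that imitate the output layout
sample_rows = [
    [0.0, 17.0, 0.75, 0.25, 0.5, 0.75, 1.0],
    [0.0, 1.0, 0.25, 0.0, 0.0, 1.0, 1.0],
]

parsed = parse_detection_rows(sample_rows, frame_width=640, frame_height=480)
print(parsed)
```

In the loop above, the rows would come from `inference_request_result[0][0]`, and the frame size from `frame_w` and `frame_h`.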

In [None]:
Video.from_file(OUTPUT_VIDEO)