# OpenVINO: Democratizing Neural Network Optimization
![](workshop/pictures/openvino_start.png)

## OpenVINO Toolkit

![](workshop/pictures/openvino_toolkit.png)

The first step of this workshop is to initialize the OpenVINO environment in this Jupyter notebook.
The OpenVINO 2020.1 package has already been installed to `intel/openvino/`.
To initialize the OpenVINO environment, run the script `intel/openvino/bin/setupvars.sh`:

In [None]:
!bash intel/openvino/bin/setupvars.sh

Let's check the environment:

In [None]:
!echo LD_LIBRARY_PATH is $LD_LIBRARY_PATH
!echo
!echo PYTHONPATH is $PYTHONPATH

As you can see, the `LD_LIBRARY_PATH` and `PYTHONPATH` variables contain paths to OpenVINO,
so you can already use it.
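
The same check can also be done from Python. The sketch below is only an illustration: the helper name `openvino_entries` is hypothetical, and matching on the substring "openvino" is a heuristic that assumes the install path contains that word.

```python
import os

def openvino_entries(variable: str, environ=os.environ) -> list:
    """Return the entries of a PATH-like variable that mention OpenVINO."""
    value = environ.get(variable, "")
    # PATH-like variables separate entries with os.pathsep (":" on Linux).
    return [entry for entry in value.split(os.pathsep)
            if entry and "openvino" in entry.lower()]

for variable in ("LD_LIBRARY_PATH", "PYTHONPATH"):
    entries = openvino_entries(variable)
    status = "OK" if entries else "no OpenVINO paths found"
    print("{}: {}".format(variable, status))
```

If either variable reports no OpenVINO paths, the environment was probably not initialized before the notebook kernel started.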

## Open Model Zoo

![](workshop/pictures/models.png)

The OpenVINO package contains tools to easily download a model from the [Open Model Zoo](https://github.com/opencv/open_model_zoo)
and convert it to the Intermediate Representation (IR) that OpenVINO supports.

To see all available models, both public open-source models from the original frameworks (TensorFlow, Caffe, MXNet, PyTorch, etc.)
and models made by Intel, run the downloader with `--print_all`:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py --print_all

To download any of these models, use the Model Downloader. Its options are listed in the help output:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py -h

Let's download an object detection model, `ssd_mobilenet_v2_coco`:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/downloader.py \
    --name ssd_mobilenet_v2_coco \
    --output_dir ./workshop/data

Model Downloader has downloaded the model to `./workshop/data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_<DATE>`

In [None]:
!ls -la ./workshop/data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_*
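
Because the date suffix of the downloaded directory varies, it can be handy to locate it from Python instead of typing it by hand. This is a minimal sketch that assumes the `public/<model name>/<model name>_<DATE>` layout shown above; the helper name `find_model_dirs` is hypothetical.

```python
from pathlib import Path

def find_model_dirs(data_dir: str, model_name: str) -> list:
    """List the dated directories the downloader created for a public model."""
    model_root = Path(data_dir) / "public" / model_name
    if not model_root.is_dir():
        return []
    # The date suffix varies, so match it with a wildcard instead of hard-coding it.
    return sorted(p for p in model_root.glob(model_name + "_*") if p.is_dir())

for model_dir in find_model_dirs("./workshop/data", "ssd_mobilenet_v2_coco"):
    print(model_dir)
```

An empty result means the model has not been downloaded to that directory yet.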

However, the Model Downloader downloaded the model in TensorFlow format.
You need to convert this model to the IR format.
To do this, run the converter script.
The converter script runs the Model Optimizer with the right parameters to convert the model to IR.
Of course, you can run the Model Optimizer directly, but then you need to pass the right arguments to it yourself.
All information about converting

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/open_model_zoo/tools/downloader/converter.py \
                                                         --name ssd_mobilenet_v2_coco \
                                                         --download_dir ./workshop/data \
                                                         --output_dir ./workshop/data \
                                                         --precisions FP32

In [None]:
!ls -la ./workshop/data/public/ssd_mobilenet_v2_coco/FP32/

You can find the command used to run the OpenVINO Model Optimizer in the output of the `converter.py` script.
You can try this command yourself:

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/mo.py \
    --output_dir=./workshop/data/public/ssd_mobilenet_v2_coco/FP32 \
    --reverse_input_channels \
    --model_name=ssd_mobilenet_v2_coco \
    --transformations_config=${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
    --tensorflow_object_detection_api_pipeline_config=./workshop/data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/pipeline.config \
    --output=detection_classes,detection_scores,detection_boxes,num_detections \
    --input_model=./workshop/data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb

Some helper functions are needed for the next part of the workshop:

In [None]:
import cv2

def read_resize_image(path_to_image: str, width: int, height: int):
    """
    Reads an image from disk and resizes it to the given dimensions
    """
    # Load the image (OpenCV returns None if the file cannot be read)
    raw_image = cv2.imread(path_to_image)
    if raw_image is None:
        raise FileNotFoundError('Cannot read image: {}'.format(path_to_image))
    # Return the image resized to (width, height)
    return cv2.resize(raw_image, (width, height), interpolation=cv2.INTER_NEAREST)

In [None]:
import numpy as np
import matplotlib.pyplot as plt

def show_performance(performance_data: dict):
    """
    Takes a dictionary with configuration names as keys and their FPS as values
    and shows them as a bar chart
    """
    y_pos = np.arange(len(performance_data))
    performance = list(performance_data.values())
    plt.bar(y_pos, performance, align='center', alpha=0.5)
    plt.xticks(y_pos, list(performance_data.keys()))
    plt.ylabel('FPS')
    plt.title('Configurations')
    plt.show()

In [None]:
def draw_image(original_image: str, res: tuple, path_to_image: str, prob_threshold: float=0.8, color: tuple=(0, 255, 0)):
    """
    Reads the image at original_image, draws the detected boxes from res on it
    and saves the result to path_to_image
    """
    raw_image = cv2.imread(original_image)
    initial_w = raw_image.shape[1]
    initial_h = raw_image.shape[0]
    # COCO class IDs (only the classes of interest are listed)
    labels_map = {
        17: 'cat',
        18: 'dog'
    }
    for obj in res[0][0]:
        # Draw only objects whose probability exceeds the specified threshold
        if obj[2] > prob_threshold:
            xmin = int(obj[3] * initial_w)
            ymin = int(obj[4] * initial_h)
            xmax = int(obj[5] * initial_w)
            ymax = int(obj[6] * initial_h)
            class_id = int(obj[1])
            confidence = round(obj[2] * 100, 1)
            cv2.rectangle(raw_image, (xmin, ymin), (xmax, ymax), color, 2)
            # Fall back to the numeric class ID for classes missing from labels_map
            det_label = labels_map.get(class_id, str(class_id))
            box_title = '{} {}%'.format(det_label, confidence)
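            # Pass thickness and lineType by keyword: the 7th positional argument is thickness, not lineType
            cv2.putText(raw_image,
                        box_title,
                        (xmin, ymin - 7),
                        cv2.FONT_HERSHEY_COMPLEX, 5, color,
                        thickness=2, lineType=cv2.LINE_AA)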
    cv2.imwrite(path_to_image, raw_image)
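
The coordinate arithmetic inside `draw_image` can be tried on its own. The sketch below assumes the row layout that `draw_image` reads, `[image_id, class_id, confidence, xmin, ymin, xmax, ymax]`, with coordinates normalized to the range 0 to 1. The helper name `to_pixel_box` and the sample row are hypothetical.

```python
def to_pixel_box(detection, image_width: int, image_height: int) -> tuple:
    """Convert one detection row to integer pixel coordinates.

    The row layout [image_id, class_id, confidence, xmin, ymin, xmax, ymax],
    with coordinates normalized to [0, 1], mirrors what draw_image reads.
    """
    xmin, ymin, xmax, ymax = detection[3:7]
    return (int(xmin * image_width), int(ymin * image_height),
            int(xmax * image_width), int(ymax * image_height))

# Hypothetical detection on a 640x480 image: class 18, 90% confidence.
row = [0, 18, 0.9, 0.25, 0.5, 0.75, 1.0]
print(to_pixel_box(row, image_width=640, image_height=480))  # (160, 240, 480, 480)
```

Scaling by the size of the original image, not the 300x300 network input, is what lets the boxes be drawn on the full-resolution picture.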

In [None]:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

def show_results_interactively(tf_image, ie_image, combination_image, ie_fps, tf_fps):
    """
    Takes paths to three images and shows them with matplotlib on one screen
    """
    _ = plt.figure(figsize=(30, 10))
    gs1 = gridspec.GridSpec(1, 3)
    gs1.update(wspace=0.25, hspace=0.05)

    titles = [
        '(a) TensorFlow',
        '(b) Inference Engine',
        '(c) TensorFlow and Inference Engine\n predictions are identical'
    ]

    for i, path in enumerate([tf_image, ie_image, combination_image]):
        img_resized = cv2.imread(path)
        ax_plot = plt.subplot(gs1[i])
        ax_plot.axis("off")
        addon = ' '
        if i == 1:
            addon += '{:4.3f}'.format(ie_fps) + '(FPS)'
        elif i == 0:
            addon += '{:4.3f}'.format(tf_fps) + '(FPS)'

        ax_plot.text(0.5, -0.5, titles[i] + addon,
                     size=28, ha="center",
                     transform=ax_plot.transAxes)
        ax_plot.imshow(cv2.cvtColor(img_resized, cv2.COLOR_BGR2RGB))

    plt.show()

In [None]:
def load_graph(path_to_model: str):
    """
    Loads a frozen TensorFlow model from a file into an in-memory graph
    """
    tf.reset_default_graph()
    graph = tf.Graph()
    graph_def = tf.GraphDef()

    with open(path_to_model, "rb") as model_file:
        graph_def.ParseFromString(model_file.read())

    # Clear device placements so the graph can run on any available device
    for node in graph_def.node:
        node.device = ""

    with graph.as_default():
        tf.import_graph_def(graph_def, name='')

    log.info("tf graph was created")
    return graph

In [None]:
import tensorflow as tf
from tensorflow.python.framework import graph_io

import numpy as np
import logging as log
import time
import os

def children(op_name: str, graph: tf.Graph):
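    """
    Returns the set of operations that consume any output of the given operation
    """
    op = graph.get_operation_by_name(op_name)
    # Use a distinct loop variable so that it does not shadow `op`
    return set(consumer for out in op.outputs for consumer in out.consumers())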

In [None]:
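def summarize_graph(graph_def):
    """
    Finds the inputs (Placeholder nodes) and the likely outputs
    (nodes without consumers) of a TensorFlow graph definition
    """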
    unlikely_output_types = [
        'Const', 'Assign',
        'NoOp', 'Placeholder',
        'Assert', 'switch_t', 'switch_f'
    ]
    placeholders = dict()
    outputs = list()
    graph = tf.Graph()
    with graph.as_default():  # pylint: disable=not-context-manager
        tf.import_graph_def(graph_def, name='')
    for node in graph.as_graph_def().node:  # pylint: disable=no-member
        if node.op == 'Placeholder':
            node_dict = dict()
            node_dict['type'] = tf.DType(node.attr['dtype'].type).name
            new_shape = tf.TensorShape(node.attr['shape'].shape)
            node_dict['shape'] = str(new_shape).replace(' ', '').replace('?', '-1')
            placeholders[node.name] = node_dict
        if len(children(node.name, graph)) == 0:
            if node.op not in unlikely_output_types and \
                node.name.split('/')[-1] not in unlikely_output_types:
                outputs.append(node.name)
    result = dict()
    result['inputs'] = placeholders
    result['outputs'] = outputs
    return result
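
The output detection in `summarize_graph` boils down to one rule: a node is a likely output if nothing consumes it and its type is not on the "unlikely" list. The sketch below shows that rule on a plain dictionary instead of a TensorFlow graph. The toy graph and the helper name `find_outputs` are hypothetical.

```python
def find_outputs(node_inputs: dict, node_ops: dict, unlikely_ops: set) -> list:
    """Return nodes that nothing consumes, skipping ops that are rarely outputs."""
    consumed = {name for inputs in node_inputs.values() for name in inputs}
    return [name for name in node_inputs
            if name not in consumed and node_ops[name] not in unlikely_ops]

# Hypothetical four-node graph: each key maps to the nodes it reads from.
node_inputs = {
    "image_tensor": [],
    "conv": ["image_tensor"],
    "detection_boxes": ["conv"],
    "unused_const": [],
}
node_ops = {"image_tensor": "Placeholder", "conv": "Conv2D",
            "detection_boxes": "Identity", "unused_const": "Const"}

print(find_outputs(node_inputs, node_ops, {"Const", "Placeholder", "NoOp"}))
# ['detection_boxes']
```

The type filter matters because constants or no-op nodes can also have no consumers without being real model outputs.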

In [None]:
def get_refs(graph: tf.Graph, input_data: dict):
    """
    Return TensorFlow model reference results.
    """
    log.info("Running inference with tensorflow ...")
    feed_dict = {}
    summary_info = summarize_graph(graph.as_graph_def())
    input_layers, output_layers = list(summary_info['inputs'].keys()), summary_info['outputs']

    data_keys = list(input_data.keys())
    if sorted(input_layers) != sorted(data_keys):
        raise ValueError('input data keys: {0} do not match input '
                         'layers of network: {1}'.format(data_keys, input_layers))

    for input_layer_name in input_layers:
        tensor = graph.get_tensor_by_name(input_layer_name + ':0')
        feed_dict[tensor] = input_data[input_layer_name]
    output_tensors = []
    for name in output_layers:
        tensor = graph.get_tensor_by_name(name + ':0')
        output_tensors.append(tensor)

    log.info("Running tf.Session")
    os.environ['CUDA_VISIBLE_DEVICES'] = '-1' # force inference on CPU
    with graph.as_default():
        with tf.Session(graph=graph) as session:
            inference_start = time.time()
            outputs = session.run(output_tensors, feed_dict=feed_dict)
            inference_end = time.time()
    res = dict(zip(output_layers, outputs))
    log.info("TensorFlow reference collected successfully\n")
    return res, inference_end - inference_start

In [None]:
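def parse_od_output(data: dict):
    """
    Converts TensorFlow Object Detection API outputs to rows of
    [image_id, class_id, score, xmin, ymin, xmax, ymax] with shape [1, 1, N, 7]
    """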
    predictions = []
    num_batches = len(data['detection_boxes'])
    target_layers = ['num_detections', 'detection_classes',
                     'detection_scores', 'detection_boxes']

    for b in range(num_batches):
        predictions.append([])
        num_detections = int(data['num_detections'][b])
        detection_classes = data['detection_classes'][b]
        detection_scores = data['detection_scores'][b]
        detection_boxes = data['detection_boxes'][b]
        for i in range(num_detections):
            obj = [
                b, detection_classes[i], detection_scores[i],
                detection_boxes[i][1], detection_boxes[i][0],
                detection_boxes[i][3], detection_boxes[i][2]
            ]
            predictions[b].append(obj)
    predictions = np.asarray(predictions)
    new_shape = (1, 1, predictions.shape[0] * predictions.shape[1], predictions.shape[2])
    predictions = np.reshape(predictions, newshape=new_shape)
    parsed_data = {'tf_detections': predictions}
    for layer, blob in data.items():
        if layer not in target_layers:
            parsed_data.update({layer: blob})
    return parsed_data
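
The key step in `parse_od_output` is the reordering of box coordinates: each box arrives as `[ymin, xmin, ymax, xmax]` and is written out as `xmin, ymin, xmax, ymax`. Here is a minimal sketch of that step with plain lists. The helper name `to_ie_style_rows` and the sample values are hypothetical.

```python
def to_ie_style_rows(batch_index, classes, scores, boxes, num_detections):
    """Build [image_id, class, score, xmin, ymin, xmax, ymax] rows.

    Boxes are assumed to arrive as [ymin, xmin, ymax, xmax], which is the
    order parse_od_output swaps out of.
    """
    rows = []
    for i in range(num_detections):
        ymin, xmin, ymax, xmax = boxes[i]
        rows.append([batch_index, classes[i], scores[i], xmin, ymin, xmax, ymax])
    return rows

# Two candidate boxes, but only the first one counts as a detection.
rows = to_ie_style_rows(0, classes=[18.0, 17.0], scores=[0.9, 0.1],
                        boxes=[[0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 1.0, 1.0]],
                        num_detections=1)
print(rows)  # [[0, 18.0, 0.9, 0.2, 0.1, 0.4, 0.3]]
```

After this step the TensorFlow results have the same row layout that `draw_image` expects.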

In [None]:
def tf_main(path_to_model: str, path_to_original_image: str, batch: int = 1):
    """
    Entry point for running inference with TensorFlow
    """
    log.info('COMMON: image preprocessing')
    width = 300
    resized_image = read_resize_image(path_to_original_image, width, width)
    
    reshaped_image = np.reshape(resized_image, (width, width, 3))
    batched_image = np.array([reshaped_image for _ in range(batch)])
    
    log.info('Current shape: {}'.format(batched_image.shape))

    log.info('TENSORFLOW SPECIFIC: Loading a model with TensorFlow')
    graph = load_graph(path_to_model)

    input_data = {
        'image_tensor': batched_image,
    }

    raw_results, delta = get_refs(graph, input_data)
    log.info('TENSORFLOW SPECIFIC: Plain inference finished')

    log.info('TENSORFLOW SPECIFIC: Post processing started')
    processed_results = parse_od_output(raw_results)
    log.info('TENSORFLOW SPECIFIC: Post processing finished')

    return processed_results['tf_detections'], delta

In [None]:
from openvino.inference_engine import IENetwork, IECore

def ie_main(path_to_model_xml: str, path_to_model_bin: str, path_to_original_image: str, device='CPU', batch=1):
    """
    Entry point for running inference with the Inference Engine
    """
    log.info('COMMON: image preprocessing')
    width = 300
    image = read_resize_image(path_to_original_image, width, width)

    # First, read the network (the model must already be converted to IR with the Model Optimizer)
    log.info("Reading IR...")
    net = IENetwork(model=path_to_model_xml, weights=path_to_model_bin)

    # Now create the IECore object
    log.info("Creating Inference Engine Core")
    ie = IECore()


    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))

    n, c, h, w = net.inputs[input_blob].shape
    net.reshape({input_blob: (batch, c, h, w)})
    n, c, h, w = net.inputs[input_blob].shape

    # Now load the network to the target device
    log.info("Loading IR to the plugin...")
    exec_net = ie.load_network(network=net, device_name=device, num_requests=2)

    del net

    labels_map = None
    
    # Read and pre-process input image
    image = image[..., ::-1]
    in_frame = image.transpose((2, 0, 1))  # Change data layout from HWC to CHW
    batched_frame = np.array([in_frame for _ in range(batch)])
    log.info('Current shape: {}'.format(batched_frame.shape))

    # Now we run inference on target device
    inference_start = time.time()
    res = exec_net.infer(inputs={input_blob: batched_frame})
    inference_end = time.time()

    log.info('INFERENCE ENGINE SPECIFIC: no post processing')

    return res[out_blob], inference_end - inference_start
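
The preprocessing in `ie_main` does three things to the image array: it reverses the channel axis, moves the channels in front of height and width (HWC to CHW), and stacks copies to form a batch. The NumPy sketch below shows those steps on a tiny array. The helper name `to_network_input` and the 2x2 test image are hypothetical.

```python
import numpy as np

def to_network_input(image_hwc: np.ndarray, batch: int = 1) -> np.ndarray:
    """Reverse the channel axis, move channels first, and repeat for the batch."""
    reversed_channels = image_hwc[..., ::-1]        # e.g. BGR -> RGB
    chw = reversed_channels.transpose((2, 0, 1))    # HWC -> CHW
    return np.stack([chw] * batch)                  # -> NCHW

# Hypothetical 2x2 "image" whose three channels hold 0, 1 and 2 everywhere.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[..., 1] = 1
image[..., 2] = 2

blob = to_network_input(image, batch=2)
print(blob.shape)        # (2, 3, 2, 2)
print(blob[0, :, 0, 0])  # [2 1 0]: the channel order is reversed
```

The resulting shape has the same `(batch, c, h, w)` order that `ie_main` passes to `net.reshape`.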

In [None]:
import sys
import platform

log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)

NUM_RUNS = 1
BATCH = 1

DATA = os.path.join('.', 'workshop', 'data')

IMAGE = os.path.join(DATA, 'images', 'input', 'dog.jpg')

SSD_ASSETS = os.path.join(DATA, 'public', 'ssd_mobilenet_v2_coco')

TF_MODEL = os.path.join(SSD_ASSETS, 'ssd_mobilenet_v2_coco_2018_03_29', 'frozen_inference_graph.pb')
TF_RESULT_IMAGE = os.path.join(DATA, 'images', 'output', 'tensorflow_output.png')

IE_MODEL_FP32_XML = os.path.join(SSD_ASSETS, 'FP32', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_FP32_BIN = os.path.join(SSD_ASSETS, 'FP32', 'ssd_mobilenet_v2_coco.bin')
IE_MODEL_FP16_XML = os.path.join(SSD_ASSETS, 'FP16', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_FP16_BIN = os.path.join(SSD_ASSETS, 'FP16', 'ssd_mobilenet_v2_coco.bin')
IE_MODEL_DEFAULT_INT8_XML = os.path.join(SSD_ASSETS, 'INT8', 'default', 'optimized', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_DEFAULT_INT8_BIN = os.path.join(SSD_ASSETS, 'INT8', 'default', 'optimized', 'ssd_mobilenet_v2_coco.bin')
IE_MODEL_AA_INT8_XML = os.path.join(SSD_ASSETS, 'INT8', 'acuracy_aware', 'optimized', 'ssd_mobilenet_v2_coco.xml')
IE_MODEL_AA_INT8_BIN = os.path.join(SSD_ASSETS, 'INT8', 'acuracy_aware', 'optimized', 'ssd_mobilenet_v2_coco.bin')


IE_RESULT_IMAGE = os.path.join(DATA, 'images', 'output', 'inference_engine_output.png')

OPENVINO = os.getenv('INTEL_OPENVINO_DIR')
if not OPENVINO:
    print('Please install OpenVINO and initialize the environment')
    sys.exit(1)
    
if platform.system() == 'Linux':
    ext = '.so'
else:
    print('You are running this demo on Windows or macOS. However, this demo is for Linux only.')
    sys.exit(1)

COMBO_RESULT_IMAGE = os.path.join(DATA, 'images', 'output', 'combo_output.png')

PERFORMANCE = {}

In [None]:
def ie_inference(xml:str, bin:str, device:str, postfix: str):
    name = '{f} {p} on {d}'.format(f='IE', p=postfix, d=device)

    ie_fps_collected = []

    for i in range(NUM_RUNS):
        predictions, inf_time = ie_main(xml, bin,
                                        IMAGE,
                                        device,
                                        batch=BATCH)
        ie_fps = 1 / inf_time
        ie_fps_collected.append(ie_fps)

    ie_avg_fps = (sum(ie_fps_collected) * BATCH) / (NUM_RUNS)

    PERFORMANCE[name] = ie_avg_fps

    log.info('{} FPS: {}'.format(name, ie_avg_fps))

    draw_image(IMAGE, predictions, IE_RESULT_IMAGE, color=(0, 0, 255))
    
    return ie_avg_fps, predictions
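
The FPS figure reported by `ie_inference` is the mean of the per-run throughputs, where each run processes `BATCH` frames. The arithmetic can be sketched separately as follows. The helper name `average_fps` and the timings are hypothetical.

```python
def average_fps(inference_times, batch: int = 1) -> float:
    """Average frames per second over several timed runs.

    Each run processes `batch` frames, so its throughput is batch / time.
    """
    if not inference_times:
        raise ValueError("at least one timed run is required")
    per_run_fps = [batch / seconds for seconds in inference_times]
    return sum(per_run_fps) / len(per_run_fps)

# Hypothetical timings in seconds for three runs with a batch of 2.
print(average_fps([0.5, 0.25, 0.5], batch=2))
```

Note that with `NUM_RUNS = 1` the average is just a single measurement. Raising `NUM_RUNS` would smooth out one-off effects such as warm-up, at the cost of a longer run.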

In [None]:
framework = 'TF'
device = 'CPU'
name = '{f} on {d}'.format(f=framework, d=device)

tf_fps_collected = []

for i in range(NUM_RUNS):
    predictions, inf_time = tf_main(TF_MODEL, 
                                    IMAGE,
                                    batch=BATCH)
    tf_fps = 1 / inf_time
    tf_fps_collected.append(tf_fps)

tf_avg_fps = (sum(tf_fps_collected) * BATCH) / (NUM_RUNS)

PERFORMANCE[name] = tf_avg_fps

log.info('{} FPS: {}'.format(name, tf_avg_fps))

draw_image(IMAGE, predictions, TF_RESULT_IMAGE, color=(255, 0, 0))

In [None]:
!accuracy_check -c ./workshop/data/configs/accuracy_checker_config_tf.yml

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_FP32_XML, IE_MODEL_FP32_BIN, device, 'FP32')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)
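
The title of the third panel states that the TensorFlow and Inference Engine predictions are identical, but the cell above only compares them visually. A numeric comparison could look like the sketch below. It assumes that both frameworks return detections in the same order. The tolerance, the sample rows and the helper name `detections_match` are all hypothetical.

```python
import math

def detections_match(rows_a, rows_b, tolerance: float = 1e-2) -> bool:
    """Check two detection lists row by row within an absolute tolerance."""
    if len(rows_a) != len(rows_b):
        return False
    return all(
        len(a) == len(b) and
        all(math.isclose(x, y, abs_tol=tolerance) for x, y in zip(a, b))
        for a, b in zip(rows_a, rows_b)
    )

# Hypothetical rows: [image_id, class_id, score, xmin, ymin, xmax, ymax]
tf_rows = [[0, 18, 0.95, 0.10, 0.20, 0.60, 0.90]]
ie_rows = [[0, 18, 0.949, 0.101, 0.199, 0.60, 0.90]]
print(detections_match(tf_rows, ie_rows))
```

A small tolerance is used instead of exact equality because two different runtimes rarely produce bit-identical floating-point results.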

In [None]:
!accuracy_check -c ./workshop/data/configs/accuracy_checker_config.yml

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/mo.py \
    --data_type FP16 \
    --output_dir=./workshop/data/public/ssd_mobilenet_v2_coco/FP16 \
    --reverse_input_channels \
    '--input_shape=[1,300,300,3]' \
    --model_name=ssd_mobilenet_v2_coco \
    --transformations_config=${INTEL_OPENVINO_DIR}/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
    --tensorflow_object_detection_api_pipeline_config=./workshop/data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/pipeline.config \
    --output=detection_classes,detection_scores,detection_boxes,num_detections \
    --input_model=./workshop/data/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb
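
The `--data_type FP16` option asks for an IR in half precision. To get a feeling for what half precision means for individual values, the sketch below rounds a few numbers to 16-bit floats with Python's standard `struct` module. It only illustrates the number format and is not how the Model Optimizer performs the conversion. The helper name `to_fp16` and the sample values are hypothetical.

```python
import struct

def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" format code packs a 16-bit float; unpacking returns it as a float.
    return struct.unpack("<e", struct.pack("<e", value))[0]

for weight in (0.1, 1.0, 1000.001):
    stored = to_fp16(weight)
    print("{!r} -> {!r} (error {:.2e})".format(weight, stored, abs(weight - stored)))
```

Values that are exactly representable, such as 1.0, survive unchanged, while others pick up a small rounding error that grows with magnitude.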

In [None]:
device = 'GPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_FP16_XML, IE_MODEL_FP16_BIN, device, 'FP16')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/tools/post_training_optimization_toolkit/main.py \
    -c ./workshop/data/configs/default/quantization_config.json \
    --output-dir ./workshop/data/public/ssd_mobilenet_v2_coco/INT8/default \
    --direct-dump \
    -e
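
The command above produces an INT8 model. As a rough intuition for what 8-bit quantization does to values, here is a textbook sketch of symmetric quantization with a single scale. It is not the algorithm the Post-Training Optimization Toolkit runs, which is driven by `quantization_config.json`. The helper name `quantize_symmetric` and the sample weights are hypothetical.

```python
def quantize_symmetric(values, num_bits: int = 8):
    """Map floats to signed integers with a single scale, then back to floats."""
    qmax = 2 ** (num_bits - 1) - 1                  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    dequantized = [q * scale for q in quantized]
    return quantized, dequantized, scale

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
quantized, restored, scale = quantize_symmetric(weights)
print(quantized)
print([round(v, 4) for v in restored])
```

Each restored value differs from the original by at most half a quantization step, which is the source of the small accuracy changes that the accuracy checks in this notebook measure.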

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_DEFAULT_INT8_XML, IE_MODEL_DEFAULT_INT8_BIN, device, 'INT8 D')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)

In [None]:
!accuracy_check -c ./workshop/data/configs/default/accuracy_checker_config.yml | grep %

In [None]:
!python3 ${INTEL_OPENVINO_DIR}/deployment_tools/tools/post_training_optimization_toolkit/main.py \
    -c ./workshop/data/configs/accuracy_aware/quantization_config.json \
    --output-dir ./workshop/data/public/ssd_mobilenet_v2_coco/INT8/acuracy_aware \
    --direct-dump

In [None]:
device = 'CPU'
ie_avg_fps, predictions = ie_inference(IE_MODEL_AA_INT8_XML, IE_MODEL_AA_INT8_BIN, device, 'INT8 AA')

draw_image(TF_RESULT_IMAGE, predictions, COMBO_RESULT_IMAGE, color=(0, 0, 255))

show_results_interactively(tf_image=TF_RESULT_IMAGE,
                           ie_image=IE_RESULT_IMAGE,
                           combination_image=COMBO_RESULT_IMAGE,
                           ie_fps=ie_avg_fps,
                           tf_fps=tf_avg_fps)

show_performance(PERFORMANCE)

In [None]:
!accuracy_check -c ./workshop/data/configs/accuracy_aware/accuracy_checker_config.yml

In [None]:
!${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build/intel64/Release/benchmark_app -h

In [None]:
!${INTEL_OPENVINO_DIR}/deployment_tools/inference_engine/samples/cpp/build/intel64/Release/benchmark_app -m ./workshop/data/public/ssd_mobilenet_v2_coco/FP32/ssd_mobilenet_v2_coco.xml -d GPU