Dive deeper into the Inference Engine, and perform inference in the OpenVINO Toolkit. By the end, you'll know the full workflow for OpenVINO fundamentals and be ready to integrate into an app.

## Introduction

[YouTube Video](https://youtu.be/BUpkwGhboLg)

## The Inference Engine

[YouTube Video](https://youtu.be/dZA4QGbDrs4)

For more detail, see the [Inference Engine Developer Guide](https://docs.openvinotoolkit.org/2019_R3/_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html).

## Supported Devices

[YouTube Video](https://youtu.be/m2d1urdJegA)

[Use the Model Downloader and Model Optimizer for the Intel® Distribution of OpenVINO™ Toolkit on Raspberry Pi*](https://software.intel.com/content/www/us/en/develop/articles/model-downloader-optimizer-for-openvino-on-raspberry-pi.html)

For more detail, see the [Supported Devices](https://docs.openvinotoolkit.org/2019_R3/_docs_IE_DG_supported_plugins_Supported_Devices.html) documentation.

## Using the Inference Engine with an IR

[YouTube Video](https://youtu.be/b90ny0AmQF8)

We can review:
- [IE Python API](https://docs.openvinotoolkit.org/2019_R3/ie_python_api.html)
- [IE Network](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1IENetwork.html)
- [IE Core](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1IECore.html)

## Exercise: Feed an IR to the Inference Engine

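In this exercise, we load an IR (Intermediate Representation) model into the Inference Engine and check that all of its layers are supported on the CPU.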

## Solution: Feed an IR to the Inference Engine

[YouTube Video](https://youtu.be/jEmebNVBlc4)

### Solution Code

In [None]:
### Load the necessary libraries
import os
from openvino.inference_engine import IENetwork, IECore

In [None]:
def load_to_IE(model_xml):
    ### Load the Inference Engine API
    plugin = IECore()

    ### Load IR files into their related class
    model_bin = os.path.splitext(model_xml)[0] + ".bin"
    net = IENetwork(model=model_xml, weights=model_bin)

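    ### Add a CPU extension, if applicable.
    ### Note: CPU_EXTENSION is not defined in this snippet; it must be set
    ### elsewhere to the path of the CPU extension library.
    plugin.add_extension(CPU_EXTENSION, "CPU")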

    ### Get the supported layers of the network
    supported_layers = plugin.query_network(network=net, device_name="CPU")

    ### Check for any unsupported layers, and let the user
    ### know if anything is missing. Exit the program, if so.
    unsupported_layers = [l for l in net.layers.keys() if l not in supported_layers]
    if len(unsupported_layers) != 0:
        print("Unsupported layers found: {}".format(unsupported_layers))
        print("Check whether extensions are available to add to IECore.")
        exit(1)

    ### Load the network into the Inference Engine
    plugin.load_network(net, "CPU")

    print("IR successfully loaded into Inference Engine.")

    return
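
The path handling and the unsupported-layer check can be tried without OpenVINO installed. The following is a minimal sketch, assuming `query_network` returns a dictionary keyed by the names of the supported layers; `ir_paths` and `find_unsupported` are hypothetical helper names for illustration, not part of the Inference Engine API.

```python
import os


def ir_paths(model_xml):
    """Return the (.xml, .bin) pair for an IR, assuming both share a base name."""
    model_bin = os.path.splitext(model_xml)[0] + ".bin"
    return model_xml, model_bin


def find_unsupported(layer_names, supported_layers):
    """List the layers that are missing from the supported-layers mapping."""
    return [name for name in layer_names if name not in supported_layers]


# Plain data standing in for net.layers.keys() and the query_network() result
layers = ["data", "conv1", "custom_op"]
supported = {"data": "CPU", "conv1": "CPU"}

print(ir_paths("models/human-pose-estimation-0001.xml"))
print(find_unsupported(layers, supported))
```

Here only `custom_op` is reported, which is the situation where a CPU extension would need to be added before loading the network.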

### Running Your Implementation

In [None]:
!python feed_network.py -m /home/workspace/models/human-pose-estimation-0001.xml

## Sending Inference Requests to the IE

[YouTube Video](https://youtu.be/wLN8HYZ05rg)

After we load the IENetwork into the IECore, we get back an ExecutableNetwork, which is what we will send inference requests to. There are two types of inference requests we can make: Synchronous and Asynchronous.

With an ExecutableNetwork, synchronous requests just use the infer function, while asynchronous requests begin with start_async, and then we can wait until the request is complete. These requests are InferRequest objects, which will hold both the input and output of the request.

We can review:
- [Executable Network documentation](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1ExecutableNetwork.html)
- [Infer Request documentation](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1InferRequest.html)

## Asynchronous Requests

[YouTube Video](https://youtu.be/JGuUIDpn1PY)

### Synchronous

Synchronous requests wait and do nothing until the inference response is returned. They block the main thread.

### Asynchronous

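Asynchronous requests do not block the main thread. After starting a request, we can continue with other processing while the inference runs, and wait for the result only when we actually need it.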

We can review:
- [About Synchronous vs. Asynchronous blog post](https://whatis.techtarget.com/definition/synchronous-asynchronous-API)
- [Integrate the Inference Engine with Your Application](https://docs.openvinotoolkit.org/2019_R3/_docs_IE_DG_Integrate_with_customer_application_new_API.html)
- [Asynchronous Inference Requests Demo](https://github.com/opencv/open_model_zoo/blob/master/demos/object_detection_demo_ssd_async/README.md)
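
The difference between the two request styles can be illustrated with the Python standard library alone. This is only an analogy using a thread pool, not the Inference Engine API; `slow_inference` is a made-up stand-in that sleeps briefly to simulate inference latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_inference(frame_id):
    """Stand-in for a model call; sleeps briefly to simulate latency."""
    time.sleep(0.05)
    return f"result-{frame_id}"


def run_synchronous():
    events = []
    events.append(slow_inference(0))   # blocks until the result is back
    events.append("other work")        # can only start afterwards
    return events


def run_asynchronous():
    events = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_inference, 0)  # returns immediately
        events.append("other work")              # runs while inference is in flight
        events.append(future.result())           # block only when the result is needed
    return events


print(run_synchronous())
print(run_asynchronous())
```

In the synchronous version the result arrives before any other work can start; in the asynchronous version the other work is done first, while the inference is still running.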

## Exercise: Inference Requests

In this exercise, we make both synchronous and asynchronous inference requests on an ExecutableNetwork.

## Solution: Inference Requests

[YouTube Video](https://youtu.be/QeBpEkkoZ74)

### Synchronous Solution

In [None]:
def sync_inference(exec_net, input_blob, image):
    '''
    Performs synchronous inference
    Return the result of inference
    '''
    result = exec_net.infer({input_blob: image})

    return result

### Asynchronous Solution

In [None]:
import time


def async_inference(exec_net, input_blob, image):
    '''
    Performs asynchronous inference
    Returns the `exec_net`
    '''
    exec_net.start_async(request_id=0, inputs={input_blob: image})
    while True:
        status = exec_net.requests[0].wait(-1)
        if status == 0:
            break
        else:
            time.sleep(1)
    return exec_net

We don't actually need the loop or `time.sleep()` here: calling `wait()` with `-1` blocks until the request has completed, so it already returns only once the result is ready.
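
A version without the polling loop might look like the sketch below. It uses only the `start_async`, `requests` and `wait` calls from the solution above, and treats a status of `0` as success, as the solution does. Raising an exception on any other status is an assumption added for illustration, not part of the original solution.

```python
def async_inference(exec_net, input_blob, image):
    """Start request 0 and block until it finishes, without a polling loop."""
    exec_net.start_async(request_id=0, inputs={input_blob: image})
    # wait(-1) blocks until the result is available and returns a status code
    status = exec_net.requests[0].wait(-1)
    if status != 0:
        # Assumption for this sketch: treat any non-zero status as a failure
        raise RuntimeError(f"Inference request failed with status {status}")
    return exec_net
```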

## Handling Results

[YouTube Video](https://youtu.be/wO_Io3wDwTM)

[InferenceEngine::Blob Class Reference](https://docs.openvinotoolkit.org/2019_R3/classInferenceEngine_1_1Blob.html)

## Integrating into Your App

[YouTube Video](https://youtu.be/vQpLv1Y3pnU)

We can review:
- [Intel®’s IoT Apps Across Industries](https://www.intel.com/content/www/us/en/internet-of-things/industry-solutions.html)
- [Starting Your First IoT Project](https://hackernoon.com/the-ultimate-guide-to-starting-your-first-iot-project-8b0644fbbe6d)
- [OpenVINO™ on a Raspberry Pi and Intel® Neural Compute Stick](https://www.pyimagesearch.com/2019/04/08/openvino-opencv-and-movidius-ncs-on-the-raspberry-pi/)

## Exercise: Integrate into an App

This is the final exercise of this lesson.

## Solution: Integrate into an App

[YouTube Video](https://youtu.be/BIdLJkDD5vM)

**Note:** There is a small change from the code on-screen for running on Linux machines versus Mac. On Mac, `cv2.VideoWriter` uses `cv2.VideoWriter_fourcc('M','J','P','G')` to write an .mp4 file, while Linux uses `0x00000021`.
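
One way to avoid editing the codec by hand is to pick it from the operating system at runtime. The sketch below is a hedged illustration: `pick_codec` is a hypothetical helper, and `fourcc` is a pure-Python stand-in that packs the four characters into an integer the way a FOURCC code is normally laid out, so the example runs without OpenCV. Treating every non-Mac system like Linux is an assumption.

```python
import platform


def fourcc(c1, c2, c3, c4):
    """Pack four characters into a FOURCC integer (first character in the lowest byte)."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)


def pick_codec(system=None):
    """Return the codec value to pass to cv2.VideoWriter for the given OS."""
    system = system or platform.system()
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return fourcc("M", "J", "P", "G")
    # Assumption: any other system uses the value the note gives for Linux
    return 0x00000021


print(hex(pick_codec("Darwin")), hex(pick_codec("Linux")))
```

Calling `pick_codec()` with no argument uses the machine the script is running on.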

### Functions in `inference.py`

The asynchronous request and the wait are split into two separate functions here, rather than combined into one function as in the last exercise.

First, note that the input and output blobs are obtained earlier, when the network model is loaded:

In [None]:
self.input_blob = next(iter(self.network.inputs))
self.output_blob = next(iter(self.network.outputs))
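
The `next(iter(...))` idiom simply takes the first key of a dictionary-like object. A minimal sketch with plain dictionaries standing in for `network.inputs` and `network.outputs` (the layer names here are made up):

```python
# Stand-ins for network.inputs / network.outputs: dictionaries keyed by layer name
inputs = {"data": "input layer info"}
outputs = {"detection_out": "output layer info"}

# iter() walks the keys; next() takes the first one
input_blob = next(iter(inputs))
output_blob = next(iter(outputs))

print(input_blob, output_blob)
```

This shortcut fits models with a single input and a single output; with several of either, the keys would need to be selected by name instead.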

The request itself uses code similar to the previous exercise:

In [None]:
def async_inference(self, image):
    '''
    Makes an asynchronous inference request, given an input image.
    '''
    self.exec_network.start_async(request_id=0, 
        inputs={self.input_blob: image})
    return


def wait(self):
    '''
    Checks the status of the inference request.
    '''
    status = self.exec_network.requests[0].wait(-1)
    return status

We can grab the network output using the appropriate request with the output_blob key:

In [None]:
def extract_output(self):
    '''
    Returns a list of the results for the output layer of the network.
    '''
    return self.exec_network.requests[0].outputs[self.output_blob]

### Functions in `app.py`

The next steps in app.py, before customization, are largely based on using the functions in inference.py:

In [None]:
### Initialize the Inference Engine
plugin = Network()

### Load the network model into the IE
plugin.load_model(args.m, args.d, CPU_EXTENSION)
net_input_shape = plugin.get_input_shape()

...

    ### Pre-process the frame
    p_frame = cv2.resize(frame, (net_input_shape[3], net_input_shape[2]))
    p_frame = p_frame.transpose((2,0,1))
    p_frame = p_frame.reshape(1, *p_frame.shape)

    ### Perform inference on the frame
    plugin.async_inference(p_frame)

    ### Get the output of inference
    if plugin.wait() == 0:
        result = plugin.extract_output()
        ### Update the frame to include detected bounding boxes
        frame = draw_boxes(frame, result, args, width, height)
        # Write out the frame
        out.write(frame)
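
The transpose and reshape steps change the frame layout from height x width x channels to batch x channels x height x width, which is what the `net_input_shape[2]` and `net_input_shape[3]` indexing implies. The sketch below mimics those two NumPy calls on nested lists so the reordering is visible without any dependencies; `hwc_to_nchw` is a hypothetical helper written for illustration only.

```python
def hwc_to_nchw(frame):
    """Reorder a nested-list image from [H][W][C] to [1][C][H][W]."""
    height, width, channels = len(frame), len(frame[0]), len(frame[0][0])
    # Equivalent of transpose((2, 0, 1)): channels move to the front
    chw = [[[frame[y][x][c] for x in range(width)]
            for y in range(height)]
           for c in range(channels)]
    # Equivalent of reshape(1, C, H, W): wrap in a batch dimension of size 1
    return [chw]


# A 1x2 image (one row, two pixels) with 3 channels per pixel
frame = [[[10, 20, 30], [40, 50, 60]]]
batch = hwc_to_nchw(frame)
print(batch)
```

Each channel ends up as its own plane: the first plane holds the first channel of both pixels (`10` and `40`), the second holds `20` and `50`, and so on.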

The draw_boxes function is used to extract the bounding boxes and draw them back onto the input image.

In [None]:
def draw_boxes(frame, result, args, width, height):
    '''
    Draw bounding boxes onto the frame.
    '''
    for box in result[0][0]: # Output shape is 1x1x100x7
        conf = box[2]
        if conf >= 0.5:
            xmin = int(box[3] * width)
            ymin = int(box[4] * height)
            xmax = int(box[5] * width)
            ymax = int(box[6] * height)
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 0, 255), 1)
    return frame
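
The coordinate handling in `draw_boxes` can be checked separately from OpenCV. The sketch below uses the same indices as the function above (confidence at index 2, normalized corners at indices 3 to 6) and returns pixel coordinates instead of drawing. `boxes_to_pixels` is a hypothetical helper, the detections are made-up values, and the meaning given to indices 0 and 1 in the comment is an assumption based on the usual SSD detection layout.

```python
def boxes_to_pixels(result, width, height, threshold=0.5):
    """Return pixel-space boxes whose confidence meets the threshold."""
    boxes = []
    for box in result[0][0]:
        conf = box[2]
        if conf >= threshold:
            # Corners are normalized to [0, 1], so scale by the frame size
            boxes.append((int(box[3] * width), int(box[4] * height),
                          int(box[5] * width), int(box[6] * height)))
    return boxes


# Two made-up detections: [image_id, label, conf, xmin, ymin, xmax, ymax]
result = [[[
    [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0],
    [0, 1, 0.2, 0.0, 0.0, 0.5, 0.5],
]]]
print(boxes_to_pixels(result, width=640, height=480))
```

Only the first detection passes the 0.5 threshold, giving the box `(160, 240, 480, 480)` on a 640x480 frame.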

### Customizing `app.py`

#### Parsing the command line arguments

In [None]:
c_desc = "The color of the bounding boxes to draw; RED, GREEN or BLUE"
ct_desc = "The confidence threshold to use with the bounding boxes"

# ...

optional.add_argument("-c", help=c_desc, default='BLUE')
optional.add_argument("-ct", help=ct_desc, default=0.5)

#### Handle the new arguments

In [None]:
def convert_color(color_string):
    '''
    Get the BGR value of the desired bounding box color.
    Defaults to Blue if an invalid color is given.
    '''
    colors = {"BLUE": (255,0,0), "GREEN": (0,255,0), "RED": (0,0,255)}
    out_color = colors.get(color_string)
    if out_color:
        return out_color
    else:
        return colors['BLUE']

We need to call this with the related argument, as well as make sure the confidence threshold argument is a float value.

In [None]:
args.c = convert_color(args.c)
args.ct = float(args.ct)
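
As an alternative, `argparse` can do the float conversion itself through `type=float`. The following is a self-contained sketch under that assumption, not the course's `app.py`: `build_parser` is a hypothetical name, and accepting lowercase color names through `upper()` is an extra convenience that the original does not have.

```python
import argparse

COLORS = {"BLUE": (255, 0, 0), "GREEN": (0, 255, 0), "RED": (0, 0, 255)}


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the optional arguments")
    parser.add_argument("-c", default="BLUE",
                        help="The color of the bounding boxes to draw; RED, GREEN or BLUE")
    # type=float converts the value during parsing, so no later cast is needed
    parser.add_argument("-ct", type=float, default=0.5,
                        help="The confidence threshold to use with the bounding boxes")
    return parser


def convert_color(color_string):
    """Get the BGR value of the color, defaulting to blue if it is unknown."""
    return COLORS.get(color_string.upper(), COLORS["BLUE"])


args = build_parser().parse_args(["-ct", "0.6", "-c", "green"])
args.c = convert_color(args.c)
print(args.c, args.ct)
```

With `type=float`, an invalid threshold such as `-ct high` is rejected by the parser with a usage message instead of failing later in the program.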

#### Adding customization to `draw_boxes()`

In [None]:
frame = draw_boxes(frame, result, args, width, height)

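Because `args` is already passed to `draw_boxes()`, the call itself does not change. Inside the updated function, `args.ct` replaces the hard-coded threshold and `args.c` replaces the hard-coded color.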

In [None]:
def draw_boxes(frame, result, args, width, height):
    '''
    Draw bounding boxes onto the frame.
    '''
    for box in result[0][0]: # Output shape is 1x1x100x7
        conf = box[2]
        if conf >= args.ct:
            xmin = int(box[3] * width)
            ymin = int(box[4] * height)
            xmax = int(box[5] * width)
            ymax = int(box[6] * height)
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), args.c, 1)
    return frame

#### Running the application

In [None]:
!python app.py -m frozen_inference_graph.xml -ct 0.6 -c BLUE

## Behind the Scenes of Inference Engine

[YouTube Video](https://youtu.be/ZWpNQjXSEEc)

We can review:
- [What is the best programming language for Machine Learning? - Blog Post](https://towardsdatascience.com/what-is-the-best-programming-language-for-machine-learning-a745c156d6b7)
- [Optimization Guide](https://docs.openvinotoolkit.org/2019_R3/_docs_optimization_guide_dldt_optimization_guide.html)

## Recap

[YouTube Video](https://youtu.be/AVmFgZyk0T0)

## Lesson Glossary

### Inference Engine

Performs optimized inference using Intermediate Representation models.

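### Synchronous

Requests that wait and do nothing else until the inference response is returned, blocking the main thread.

### Asynchronous

Requests that let the main thread continue processing while the inference is still running.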

### [IECore](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1IECore.html)

### [IENetwork](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1IENetwork.html)

### [ExecutableNetwork](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1ExecutableNetwork.html)

### [InferRequest](https://docs.openvinotoolkit.org/2019_R3/classie__api_1_1InferRequest.html)