<a id="top"></a>
# Learn OpenVINO™ C++ API

## Introduction

The purpose of this tutorial is to examine a sample computer vision application created using the [Intel® Distribution of Open Visual Inference & Neural Network Optimization (OpenVINO™) toolkit](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html). 

The tutorial demonstrates how to build a simple computer vision application using the OpenVINO C++ API. It was created for educational purposes and does not guarantee the immediate creation of a highly efficient application. You can build your own application based on the code from this tutorial. For an example of a well-organized and well-architected solution that can serve as a basis for your future applications, see the [Benchmark Tool documentation](https://docs.openvino.ai/latest/openvino_inference_engine_samples_benchmark_app_README.html).

The tutorial guides you through the following steps:

1. [Learn about OpenVINO™ inference](#theory)
2. [*Optional*. Download and convert a pretrained model from the Open Model Zoo](#model)
3. [Build the executable](#build)
4. [Execute the application](#execute)
5. [Explore the application](#explore)
6. [Experiment with OpenVINO Runtime C++ API](#experiment)

## 1. Learn about OpenVINO™ Inference <a name="theory"></a>

Inference plays a crucial part in the OpenVINO toolkit. This tutorial covers only the basics, using CPU-specific terminology. Note that the terms may be different for devices other than the CPU.

You can find more detailed inference information in the documentation:
- [Benchmark Tool](https://docs.openvino.ai/latest/openvino_inference_engine_samples_benchmark_app_README.html)
- [OpenVINO Runtime Developer Guide](https://docs.openvino.ai/latest/openvino_docs_OV_Runtime_User_Guide.html)
- [Overview of OpenVINO Runtime Plugin Library](https://docs.openvino.ai/latest/groupie_dev_api.html)

The inference of a network is the execution of a computational graph consisting of different operations. An inference request is an abstraction over neural network execution (inference) used by [OpenVINO Runtime](https://docs.openvino.ai/latest/openvino_docs_OV_Runtime_User_Guide.html) (formerly Inference Engine, IE). Internally, the execution resources are split and pinned into execution streams, and the available cores are evenly distributed between the streams.

- **Streams**: the number of inference requests running in parallel.
- **Batch**: the number of images propagated to the network at a time.
- **Latency**: the time required to process one image. The lower the value, the better.

![](diagram_stream_batch.png)
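
The relationship between cores, streams, and batches can be sketched in plain C++. The helper names below (`cores_per_stream`, `images_in_flight`) are hypothetical and are not part of the OpenVINO API. How the runtime assigns leftover cores when the split is uneven is an assumption made here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `cores` as evenly as possible across `streams`.
// Assumption for illustration: the first (cores % streams) streams receive
// one extra core. The real runtime may handle the remainder differently.
std::vector<std::size_t> cores_per_stream(std::size_t cores, std::size_t streams) {
    assert(streams > 0 && "at least one stream is required");
    std::vector<std::size_t> result(streams, cores / streams);
    for (std::size_t i = 0; i < cores % streams; ++i) {
        ++result[i];
    }
    return result;
}

// Number of images being processed at once when every stream runs one
// inference request that carries a full batch.
std::size_t images_in_flight(std::size_t streams, std::size_t batch_size) {
    return streams * batch_size;
}
```

For instance, with 5 streams and a batch size of 4, `images_in_flight(5, 4)` yields 20 images being processed at the same time.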


### Asynchronous API

An asynchronous inference request runs an inference pipeline asynchronously in one or several task executors, depending on the device pipeline structure. While some infer requests are being processed by OpenVINO Runtime, others can be filled with new frame data and started asynchronously instead of waiting for the current inference to complete. Because multiple requests execute concurrently, throughput is measured in images per second: the number of processed images divided by the total processing time.
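
The throughput calculation described above can be written as a small helper. This is a minimal sketch; the function name `throughput_fps` is an assumption and does not come from the sample application.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Throughput in images per second:
// number of processed images divided by the elapsed wall-clock time.
double throughput_fps(std::size_t images_processed,
                      std::chrono::duration<double> elapsed) {
    assert(elapsed.count() > 0.0 && "elapsed time must be positive");
    return static_cast<double>(images_processed) / elapsed.count();
}
```

For example, `throughput_fps(200, std::chrono::seconds(4))` evaluates to 50 images per second.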

For example, you can run inference and simultaneously encode the resulting or previous frames or run further inference, like emotion detection on top of the face detection results.
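
The idea of overlapping inference with other work can be illustrated with the C++ standard library alone. In the sketch below, `std::async` stands in for an asynchronous inference request, and `fake_infer` is a hypothetical placeholder for real inference. None of this is the OpenVINO API; it only shows the pattern of preparing the next frame while the previous one is still being processed.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real inference call: sums the pixel values.
int fake_infer(std::vector<int> frame) {
    return std::accumulate(frame.begin(), frame.end(), 0);
}

// Process frames so that the "inference" on one frame runs in the background
// while the next frame is being prepared on the calling thread.
std::vector<int> run_pipeline(const std::vector<std::vector<int>>& frames) {
    std::vector<int> results;
    std::future<int> in_flight;  // the request that is currently executing, if any
    for (const auto& frame : frames) {
        std::vector<int> prepared = frame;  // "fill the next request" with frame data
        if (in_flight.valid()) {
            results.push_back(in_flight.get());  // wait for the previous request
        }
        in_flight = std::async(std::launch::async, fake_infer, std::move(prepared));
    }
    if (in_flight.valid()) {
        results.push_back(in_flight.get());  // collect the last result
    }
    assert(results.size() == frames.size());
    return results;
}
```

Results are collected in the same order as the input frames. On some toolchains, code that uses `std::async` must be built with thread support enabled (for example, the `-pthread` flag with GCC).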

> **NOTE**: Changing the number of streams and the batch size is usually most effective in high-performance applications, where many images are processed simultaneously.

## 2. _Optional_. Download and Convert a Pretrained Model from the Open Model Zoo <a name="model"></a>

> **NOTE**: If you already imported a model in the DL Workbench, skip this step and proceed to [the next step](#build).

OpenVINO™ toolkit includes the [Model Optimizer](https://docs.openvino.ai/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) used to convert and optimize trained models into Intermediate Representation (IR) model files, and the [OpenVINO Runtime](https://docs.openvino.ai/latest/openvino_docs_OV_Runtime_User_Guide.html), which uses the IR model files to run inference on hardware devices. The IR model files are created from models trained in popular frameworks, like Caffe\*, TensorFlow\*, and others. 

OpenVINO™ [Model Downloader](https://docs.openvino.ai/latest/omz_tools_downloader.html) downloads common inference models from the [Intel® Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo). 

Before downloading a model, you need to configure a Python* environment to convert the model from the Caffe* framework. To do this, create a new virtual environment and install the required packages.

In [None]:
%%bash
python3 -m pip install -r requirements.txt
python3 -m virtualenv /tmp/virtualenvs/tutorial_sample_application
source /tmp/virtualenvs/tutorial_sample_application/bin/activate

python -m pip install --upgrade pip
pip uninstall openvino openvino_dev -y
pip install --upgrade openvino-dev[caffe]==2022.3.0.dev20221103

Let's download the `squeezenet1.1` model first.

In [None]:
%%bash 
source /tmp/virtualenvs/tutorial_sample_application/bin/activate

omz_downloader \
    --name squeezenet1.1 \
    -o raw_model

The next step is to translate the model into the OpenVINO™ IR format.

In [None]:
%%bash
source /tmp/virtualenvs/tutorial_sample_application/bin/activate

omz_converter \
    --name squeezenet1.1 \
    -d raw_model \
    -o model

## 3. Build the Executable <a id="build"></a>

In [None]:
%%bash

# Remove previous assets if they exist
rm -rf sample_app && rm -rf build

# Create a directory for building
mkdir build

# Build the executable using 'cmake'
cd build
cmake ..
cmake --build .

# Copy the executable from the build directory
cp sample_app ..

## 4. Execute the Application <a id="execute"></a> 

During the execution of a model, streams, as well as inference requests in a stream, can be distributed inefficiently among cores of hardware, which can reduce model speed. The optimal combination of batches and streams is specific to each particular accelerator. Therefore, the easiest way to speed up the model is to try different combinations. 
Refer to the documentation to [learn more](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Single_Inference.html).
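
Once several batch and stream combinations have been measured, picking the best one is a simple comparison. The sketch below assumes you have recorded the throughput of each run yourself. The `Measurement` struct and `best_configuration` function are hypothetical names, not part of the sample application.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One measured run of the application (hypothetical record type).
struct Measurement {
    std::size_t batch_size;
    std::size_t streams;
    double throughput_fps;  // images per second observed for this run
};

// Return the measurement with the highest throughput.
// If several runs tie, the first one encountered is kept.
Measurement best_configuration(const std::vector<Measurement>& runs) {
    assert(!runs.empty() && "need at least one measurement");
    Measurement best = runs.front();
    for (const Measurement& run : runs) {
        if (run.throughput_fps > best.throughput_fps) {
            best = run;
        }
    }
    return best;
}
```

This sketch optimizes for throughput only. If latency matters for your use case, the comparison would need to take it into account as well.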

Copy the configurations from DL Workbench:

![](copy_sample.png)


In [None]:
%%bash

# Executable application accepts several arguments:
# Path to the model (.xml)
MODEL="model/public/squeezenet1.1/FP16/squeezenet1.1.xml"
# Batch size - how many images should be fed to the network at one time
BATCH_SIZE=4
# Number of streams
STREAMS=5
# Device
DEVICE="CPU"

# Usage: ./sample_app path_to_model_xml batch_size number_of_streams device
./sample_app ${MODEL} ${BATCH_SIZE} ${STREAMS} ${DEVICE}
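
The four positional arguments passed above could be read on the C++ side roughly as follows. This is only a sketch under the assumption that the arguments arrive in the order shown in the cell. `AppConfig` and `parse_args` are hypothetical names, and the actual `main.cpp` may parse its arguments differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the four command-line arguments.
struct AppConfig {
    std::string model_path;
    std::size_t batch_size;
    std::size_t streams;
    std::string device;
};

// Parse arguments given in the order: model, batch size, streams, device.
// `args` excludes the program name (argv[0]).
AppConfig parse_args(const std::vector<std::string>& args) {
    if (args.size() != 4) {
        throw std::invalid_argument(
            "expected: <model.xml> <batch_size> <streams> <device>");
    }
    // std::stoul throws std::invalid_argument if the text is not a number.
    AppConfig config{args[0], std::stoul(args[1]), std::stoul(args[2]), args[3]};
    if (config.batch_size == 0 || config.streams == 0) {
        throw std::invalid_argument(
            "batch size and number of streams must be positive");
    }
    return config;
}
```

In a real `main(int argc, char* argv[])`, the vector could be built with `std::vector<std::string> args(argv + 1, argv + argc);` before calling `parse_args(args)`.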

## 5. Explore the Application <a id="explore"></a> 

Inspect and modify the [main](main.cpp) file, which contains an application that uses the OpenVINO Runtime asynchronous C++ API.

![](main_file.png)

To check that you have learned the inference basics, open the file and try to answer the following questions:

1. Where does the inference happen?
2. Where are the inference requests created?
3. Where does the processing of the results in the inference request happen?

## 6. Experiment with OpenVINO Runtime C++ API <a id="experiment"></a>

To continue experimenting with the sample application, return to [Step 3](#build) or proceed to explore DL Workbench functionality.

After learning about OpenVINO™ inference, you can go to the [Benchmark Tool documentation](https://docs.openvino.ai/latest/openvino_inference_engine_samples_benchmark_app_README.html) to find an example of a well-structured solution that may be used as a basis for your future applications.

Congratulations! Now you can proceed to building your own application on the basis of this tutorial and try numerous DL Workbench features, such as:

* [Analyze how the model works and its quality](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Visualize_Accuracy.html)
* [Perform a baseline inference and analyze model performance](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Single_Inference.html)
* [Tune the performance of the model by selecting optimal inference parameters](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Range_of_Inferences.html)
* [Prepare the model for deployment](https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Deploy_and_Integrate_Performance_Criteria_into_Application.html)