# Introduction to DL Streamer

[Deep Learning (DL) Streamer](https://github.com/opencv/gst-video-analytics) is an easy way to construct media analytics pipelines using OpenVINO™. It leverages the open source media framework [GStreamer](https://gstreamer.freedesktop.org/) to provide optimized media operations and OpenVINO™ to provide optimized inference.
The elements packaged in the DL Streamer binary release can be divided into three categories:
- Elements for optimized streaming media operations (USB and IP camera support, file handling, decode, color-space-conversion, scaling, encoding, rendering, etc.). These elements are developed by the larger GStreamer community.

- Elements that integrate the OpenVINO™ inference engine for optimized video analytics (detection, classification, tracking). These elements are provided as part of the OpenVINO™ toolkit.

- Elements that convert and publish inference results to the screen as overlaid bounding boxes, to a file as a list of JSON Objects, or to popular message brokers (Kafka or MQTT) as JSON messages. These elements are provided as part of the OpenVINO™ toolkit.


## OpenVINO version check:
You are currently using the latest development version of Intel® Distribution of OpenVINO™ Toolkit. Alternatively, you can open a version of this notebook for the Intel® Distribution of OpenVINO™ Toolkit LTS version by running the cell below and following the link it generates.

In [None]:
from qarpo import displayMultiversionURL
import os
displayMultiversionURL(os.path.abspath(""), "DL_Streamer_Benchmark.ipynb", "openvino-dev-latest", ['openvino-lts'])

## Tutorial Overview
[Section 1](#about-gstreamer): We will introduce basic GStreamer concepts by defining some terms that you will see and giving a few examples.  
[Section 2](#setup): This is where we will set up for the final two sections. We will go over DevCloud concepts like viewing target platforms and prepare for inferencing by downloading OpenVINO™ models and writing what DL Streamer calls a model_proc file.  
[Section 3](#vehicle-classification): We will create our first pipeline and perform vehicle detection and classification. In this section you will learn to use DL Streamer for inferencing and view the results overlaid onto the video.  
[Section 4](#benchmarking): Finally, we will look at the performance of our inferencing pipelines on various Intel platforms and show how you can improve it. We will modify the pipeline from Section 3 to measure its FPS and then add tracking to see how it can be improved.

Once you complete this tutorial you will be able to quickly prototype media analytics pipelines, evaluate their performance on Intel® platforms and understand some of the configuration options from DL Streamer that can affect performance.

<a id='about-gstreamer'></a>
## About GStreamer
In this section we introduce basic GStreamer concepts that you will use in the rest of the tutorial. If you are already familiar with GStreamer feel free to skip ahead to [Section 2 'Setup'](#setup).  

[GStreamer](https://gstreamer.freedesktop.org/) is a flexible, fast and multiplatform open-source multimedia framework. It has an easy-to-use command line tool for running pipelines, as well as an API with bindings in C, Python, JavaScript and [more](https://gstreamer.freedesktop.org/bindings/).
In this tutorial we will use the GStreamer command line tool `gst-launch-1.0`. For more information and examples please refer to the online documentation [gst-launch-1.0](https://gstreamer.freedesktop.org/documentation/tools/gst-launch.html?gi-language=c).  

### Pipelines
The command line tool `gst-launch-1.0` enables developers to describe a media analytics pipeline as a series of connected elements. The elements, their configuration properties, and their connections are all specified as a list of strings separated by exclamation marks (`!`). `gst-launch-1.0` parses the string and instantiates the software modules which perform the individual media analytics operations. Internally the GStreamer library constructs a pipeline object that contains the individual elements and handles common operations such as clocking, messaging, and state changes.

**Example**:
```gst-launch-1.0 videotestsrc ! ximagesink```
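
Because a pipeline description is plain text, it can also be assembled programmatically. The sketch below is a minimal illustration, assuming a hypothetical helper named `build_pipeline` (it is not part of GStreamer or DL Streamer) that joins element descriptions with ` ! ` in the form `gst-launch-1.0` expects.

```python
def build_pipeline(elements):
    """Join element descriptions into a gst-launch-1.0 style pipeline string.

    `build_pipeline` is a hypothetical helper used for illustration only.
    """
    if not elements:
        raise ValueError("a pipeline needs at least one element")
    # Each element is separated from the next by an exclamation mark.
    return " ! ".join(element.strip() for element in elements)


pipeline = build_pipeline(["videotestsrc", "ximagesink"])
print("gst-launch-1.0 " + pipeline)
```

The printed command matches the example above; running it still requires a machine with GStreamer installed and a display for `ximagesink`.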

### Elements
An [element](https://gstreamer.freedesktop.org/documentation/application-development/basics/elements.html?gi-language=c) is the fundamental building block of a pipeline. Elements perform specific operations on incoming frames and then push the resulting frames downstream for further processing. Elements are linked together textually by exclamation marks (`!`) with the full chain of elements representing the entire pipeline. Each element will take data from its upstream element, process it and then output the data for processing by the next element.

Elements designated as **source** elements provide input into the pipeline from external sources. In this tutorial we use the [filesrc](https://gstreamer.freedesktop.org/documentation/coreelements/filesrc.html?gi-language=c#filesrc) element that reads input from a local file.  

Elements designated as **sink** elements represent the final stage of a pipeline. As an example, a sink element could write transcoded frames to a file on the local disk, open a window to render the video content to the screen, or even restream the content via RTSP. In the benchmarking section of this tutorial our primary focus will be to compare the performance of media analytics pipelines on different types of hardware, so we will use the standard [fakesink](https://gstreamer.freedesktop.org/documentation/coreelements/fakesink.html?gi-language=c#fakesink) element to end the pipeline immediately after the analytics is complete without further processing.

We will also use the [decodebin](https://gstreamer.freedesktop.org/documentation/playback/decodebin.html#decodebin) utility element. The `decodebin` element constructs a concrete set of decode operations based on the given input format and the decoder and demuxer elements available in the system. At a high level the decodebin abstracts the individual operations required to take encoded frames and produce raw video frames suitable for image transformation and inferencing.

The next step in the pipeline after decoding is color space conversion which is handled by the [videoconvert](https://gstreamer.freedesktop.org/documentation/videoconvert/index.html?gi-language=c#videoconvert) element. The exact transformation required is specified by placing a [capsfilter](https://gstreamer.freedesktop.org/documentation/coreelements/capsfilter.html?gi-language=c#capsfilter) on the output of the videoconvert element. In this case we specify BGRx because this is the format used by the detection model.
<a id='dl-streamer'></a>
#### DL Streamer elements
Elements that start with the prefix 'gva' are from DL Streamer and are provided as part of the OpenVINO™ toolkit. There are five DL Streamer elements used in this tutorial which we will describe here along with the properties that will be used. Refer to [DL Streamer elements page](https://github.com/opencv/gst-video-analytics/wiki/Elements) for the list of all DL Streamer elements and usages.  

* [gvadetect](https://github.com/opencv/gst-video-analytics/wiki/gvadetect) - Runs detection with the OpenVINO™ inference engine. We will use it to detect vehicles in a frame and output their bounding boxes.
	- `model` - path to the inference model network file.
	- `device` - device to run inferencing on. 
	- `inference-interval` - interval between inference requests, in frames. Larger values mean fewer inference requests and therefore higher throughput. For example, setting this property to 1 runs detection on every frame, while setting it to 5 runs detection on every fifth frame.
* [gvaclassify](https://github.com/opencv/gst-video-analytics/wiki/gvaclassify) - Runs classification with the OpenVINO™ inference engine. We will use it to label the bounding boxes that `gvadetect` output with the type and color of the vehicle. 
	- `model` - path to the inference model network file.
	- `model-proc` - path to the model-proc file. More information on what a model-proc file is can be found in [section 2.4](#model-proc).
	- `device` - device to run inferencing on. 
	- `reclassify-interval` - how often to reclassify tracked objects. Only valid when used with `gvatrack`.
* [gvawatermark](https://github.com/opencv/gst-video-analytics/wiki/gvawatermark) - Overlays detection and classification results on top of video data. We will use it to parse the detection metadata and draw a bounding box at each vehicle's position, then parse the classification results and label each bounding box with them.  
* [gvafpscounter](https://github.com/opencv/gst-video-analytics/wiki/gvafpscounter) - Measures frames per second (FPS) across multiple streams and prints it to the output. 
	- `starting-frame` - the frame at which to start collecting FPS measurements. In this tutorial, we start at frame 10 so that initialization time is not included in our performance output.
* [gvatrack](https://github.com/opencv/gst-video-analytics/wiki/gvatrack) - Identifies objects in frames where detection is skipped. This allows us to run object detection on fewer frames and increases overall throughput while still tracking the position and type of objects in every frame.
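
To make the effect of `inference-interval` concrete, the following sketch lists the frames on which detection would run for a given interval. It is an illustration only: the helper name `frames_with_detection` is hypothetical, and it assumes frames are numbered from 0 and that the first frame is always inferred, which may not match the exact offset used by `gvadetect`.

```python
def frames_with_detection(total_frames, inference_interval):
    """Return the frame indices on which detection would run.

    Assumption: frames are numbered from 0 and the first frame is inferred.
    The real element may choose a different starting offset.
    """
    if inference_interval < 1:
        raise ValueError("inference_interval must be at least 1")
    return list(range(0, total_frames, inference_interval))


print(frames_with_detection(10, 1))  # detection on every frame
print(frames_with_detection(10, 5))  # detection on every fifth frame
```

With an interval of 5, only one frame in five is sent to the detector; on the remaining frames `gvatrack` can carry the objects forward.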

### Properties
Elements are configured using key, value pairs called properties. As an example the filesrc element has a property named `location` which specifies the file path for input.

**Example**:
```filesrc location=cars_1900.mp4```

The documentation for each element (which can be viewed using the command line tool `gst-inspect-1.0`) describes its properties as well as the valid range of values for each property.
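
As a small illustration of the `key=value` property syntax, the sketch below formats an element name and its properties as they would appear in a pipeline string. The helper `element` is hypothetical (not a GStreamer API); since Python keyword arguments cannot contain hyphens, it converts underscores to hyphens. Values that contain spaces would need extra quoting, which this sketch does not handle.

```python
def element(name, **properties):
    """Format an element and its properties in gst-launch-1.0 syntax.

    Hypothetical helper for illustration. Underscores in keyword names
    are converted to hyphens (inference_interval -> inference-interval).
    """
    parts = [name]
    for key, value in properties.items():
        parts.append("{}={}".format(key.replace("_", "-"), value))
    return " ".join(parts)


print(element("filesrc", location="cars_1900.mp4"))
print(element("gvadetect", device="CPU", inference_interval=5))
```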

<a id='setup'></a>
## Setup
### Import dependencies
Import Python dependencies needed for displaying the results in this notebook  
*Tip: select the cell and use **Shift+Enter** to run the cell*


In [None]:
from qarpo.demoutils import *
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import re

### Target platforms
DevCloud edge has many Intel® platforms that can be used for performance evaluation. In this tutorial we will use the following platforms: Xeon, CPU on Core, GPU on Core, Atom and HDDL.  
To check the available target platforms and their properties, use the following command:
<a id='target-platforms'></a>

In [None]:
!pbsnodes | grep compnode | sort | uniq -c

### Downloading OpenVINO™ models
In this tutorial we use the `pedestrian-and-vehicle-detector-adas-0001` and `vehicle-attributes-recognition-barrier-0039` models provided in the OpenVINO™ Open Model Zoo repository. We can download the OpenVINO™ IR models directly from the Open Model Zoo using the model downloader. To use models not provided in the repository (e.g. *mobilenet-ssd*), first download the model and then use the [Model Optimizer](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) to convert it to the OpenVINO™ IR format.

In [None]:
!downloader.py --name pedestrian-and-vehicle-detector-adas-0001 -o models

In [None]:
!downloader.py --name vehicle-attributes-recognition-barrier-0039 --precisions FP16 -o models

<a id='model-proc'></a>
### Preparing the model_proc file
When performing inferencing, the chosen model has a defined set of inputs and outputs. This information is shared with the inferencing elements in a JSON file called a model_proc file. Based on the model_proc, the elements can perform certain pre- and post-processing operations to prepare the input video for inferencing and to make the output more meaningful. Model_proc files are not always required: they configure an element, and if the default values are sufficient they can be omitted.

We will go over the basics here but you can go to [this page](https://github.com/opencv/gst-video-analytics/wiki/Model-preparation#2-model-pre--and-post-processing-specificaiton-file-) for more details on model_proc files and examples of .json files for various models from Open Model Zoo. 

#### Model_proc format
The model_proc file is split into a few sections. 
First is the schema version. The schema can be found [here](https://github.com/opencv/gst-video-analytics/blob/master/gst/inference_elements/base/model_proc_schema.h). The version you are referencing when creating your model_proc should be reflected as `json_schema_version`.

The input_preproc and output_postproc sections are both arrays of JSON objects, with each object representing a layer of the model. In version 1 of the schema, the input_preproc can be used to set the color format of the video coming in. 

In the output_postproc section, a layer's output can be converted to something meaningful to future elements in the pipeline, the application or the user viewing the data. 
One option for a converter is `tensor_to_bbox_ssd`, which would be used with a single shot detector (SSD) model for tasks like object detection. This converter creates a GstVideoRegionOfInterestMeta metadata object, fills it with the bounding box of the detected object and the confidence data from the model's output, and attaches it to the frame's buffer so it can be used by downstream elements.  
Another converter is `tensor_to_label`, which is used for classification models. It uses the `labels` property to put text labels on the output of the model. This is what we will use below in the model_proc file for our vehicle classification. The color layer of our model only reports which class the detected vehicle belongs to; using our provided labels, `gvaclassify` turns that class into a readable label such as "white".
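
The idea behind `tensor_to_label` with the `max` method can be sketched in a few lines of Python. This is an illustration of the concept, not DL Streamer's implementation: the function name `tensor_to_label_max` is hypothetical, the label list is the color list used in this tutorial's model_proc file, and the scores are made-up values rather than real model output.

```python
def tensor_to_label_max(scores, labels):
    """Pick the label whose score is highest (illustrative only)."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index], scores[best_index]


color_labels = ["white", "gray", "yellow", "red", "green", "blue", "black"]
color_scores = [0.71, 0.12, 0.02, 0.05, 0.01, 0.03, 0.06]  # made-up values

label, confidence = tensor_to_label_max(color_scores, color_labels)
print(label, confidence)
```

The order of the `labels` list matters: each position must correspond to the class at the same position in the model's output layer.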

#### Default behavior
Most commonly, a model takes video input in BGR color format. If no model_proc is provided, or if the input_preproc section is left empty, the element defaults to this format.  
Post-processing is optional. If it is not performed, the output will be GstGVATensorMeta. This will typically be the case when using the `gvainference` element. Post-processing is enabled if it is specified in the model_proc file **or** if the element can automatically detect the model output format. Currently, the only models the inference elements recognize automatically are single shot detector (SSD) models, such as those used for object detection. For these, the element uses the `tensor_to_bbox_ssd` converter to output bounding boxes of detected objects.

We will not supply a model_proc file to our detection model. The input to the model should be BGR format (and we will do this conversion with another GStreamer element). It is an SSD based model so will output bounding boxes by default for the vehicles detected. We do not need to label the vehicles detected since we will be doing this with a classification model anyway. 
#### Make a directory to hold the model_proc file

In [None]:
!mkdir -p model_proc

#### vehicle-attributes-recognition-barrier-0039
Below we will generate a model_proc file for the classification model to configure our `gvaclassify` element. We skip the input_preproc section since our video will already be in the appropriate format for our model. We then provide two sections in the output_postproc for the two layers of the classification model, `color` and `type`. In each layer's section we specify the `tensor_to_label` converter to label our output, and the method `max` tells the element to choose the label with the highest confidence.

In [None]:
%%writefile model_proc/vehicle-attributes-recognition-barrier-0039.json
{
  "json_schema_version": "1.0.0",
  "input_preproc": [],
  "output_postproc": [
    {
      "layer_name": "color",
      "attribute_name": "color",
      "labels": [
        "white",
        "gray",
        "yellow",
        "red",
        "green",
        "blue",
        "black"
      ],
      "converter": "tensor_to_label",
      "method": "max"
    },
    {
      "layer_name": "type",
      "attribute_name": "type",
      "labels": [
        "car",
        "bus",
        "truck",
        "van"
      ],
      "converter": "tensor_to_label",
      "method": "max"
    }
  ]
}
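
Before using a model_proc file in a pipeline, it can be worth checking that it parses as JSON and contains the fields described above. The sketch below is a minimal sanity check under stated assumptions: `summarize_model_proc` is a hypothetical helper, the checks reflect only what this tutorial describes (they are not the official schema validation), and the inline JSON is a shortened stand-in for the file written above.

```python
import json


def summarize_model_proc(text):
    """Parse model_proc JSON text and map each layer to its label count.

    Illustrative checks only; this is not the official schema validation.
    """
    model_proc = json.loads(text)
    if "json_schema_version" not in model_proc:
        raise KeyError("json_schema_version is missing")
    summary = {}
    for entry in model_proc.get("output_postproc", []):
        for key in ("layer_name", "converter"):
            if key not in entry:
                raise KeyError("output_postproc entry is missing " + key)
        # Only label converters need a labels list in this sketch.
        if entry["converter"] == "tensor_to_label" and not entry.get("labels"):
            raise KeyError("tensor_to_label entry needs a non-empty labels list")
        summary[entry["layer_name"]] = len(entry.get("labels", []))
    return summary


example = """
{
  "json_schema_version": "1.0.0",
  "input_preproc": [],
  "output_postproc": [
    {"layer_name": "color", "attribute_name": "color",
     "labels": ["white", "gray", "yellow", "red", "green", "blue", "black"],
     "converter": "tensor_to_label", "method": "max"},
    {"layer_name": "type", "attribute_name": "type",
     "labels": ["car", "bus", "truck", "van"],
     "converter": "tensor_to_label", "method": "max"}
  ]
}
"""
print(summarize_model_proc(example))
```

To check the real file, pass the contents of `model_proc/vehicle-attributes-recognition-barrier-0039.json` to the same function instead of the inline string.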


### The input video file
We will use this input video when running our pipelines. Run the following cell to create a symlink and view the input video.

In [None]:
!ln -sf /data/reference-sample-data/object-detection-python/cars_1900.mp4 
videoHTML('Input Video', ['cars_1900.mp4'])

<a id='vehicle-classification'></a>
## Vehicle detection and classification
### Section Overview
In this section, you will learn how to:
* Construct a media analytics pipeline using GStreamer.
* Submit the pipeline for execution on edge nodes.
* Run the pipeline for vehicle detection and classification.
* Watermark the inference results on the video.
* Save the video file.
* Play the video in Jupyter Notebook.

### Pipeline
The script in the next cell constructs a media analytics pipeline for vehicle detection and classification. For more information on GStreamer basics, see [Section 1 'About GStreamer'](#about-gstreamer) above or visit GStreamer's documentation pages [here](https://gstreamer.freedesktop.org/documentation/).

The pipeline you will create will accept a video file input, decode it and run vehicle detection, followed by classification. It overlays the bounding boxes for detected vehicles and classification results on the video frame, downscales the video for better viewing in a browser and outputs it as an mp4 video file.
```shell
gst-launch-1.0 filesrc location=<input file> ! decodebin ! videoconvert ! \
gvadetect model=<model file> device=<device id> inference-interval=1 ! \
gvaclassify model=<model file 1> model-proc=<model_proc file 1> device=CPU ! \
gvawatermark ! videoconvert ! videoscale ! video/x-raw,width=960,height=540 ! vaapih264enc ! mp4mux ! filesink location=<output file>
```
The input video has 1080p (1920x1080) resolution and is downscaled to 960x540 at the output using the standard GStreamer element `videoscale`. This is done simply to reduce the buffering time for playback in a web browser and to make the bounding boxes and labels easier to see. You can remove `videoscale ! video/x-raw,width=960,height=540` to output the full-resolution video if you prefer.

### Create job script

In [None]:
%%writefile vehicle_detection_and_classification_job.sh

# The default path for the job is your home directory, so we change directory to where the files are.
cd $PBS_O_WORKDIR

# Optional arguments: $1 is the input video file, $2 is the inference device.
DEVICE=${2:-CPU}
DETECT_MODEL=models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml
CLASSIFY_MODEL=models/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml
CLASSIFY_MODEL_PROC=model_proc/vehicle-attributes-recognition-barrier-0039.json
INPUT_FILE=${1:-cars_1900.mp4}
INFERENCE_INTERVAL=1
OUTPUT_FILE=resources/video/output/vehicle_detect_classify_output.mp4

PIPELINE="filesrc location=$INPUT_FILE ! decodebin ! \
    videoconvert n-threads=4 ! capsfilter caps=\"video/x-raw,format=BGRx\" ! \
    gvadetect model=$DETECT_MODEL device=$DEVICE inference-interval=$INFERENCE_INTERVAL ! \
    gvaclassify model=$CLASSIFY_MODEL model-proc=$CLASSIFY_MODEL_PROC device=$DEVICE ! \
    gvawatermark ! videoconvert ! videoscale ! video/x-raw,width=960,height=540 ! vaapih264enc ! mp4mux ! filesink location=$OUTPUT_FILE"
echo "gst-launch-1.0 $PIPELINE"

mkdir -p $(dirname "${OUTPUT_FILE}")
rm -f "$OUTPUT_FILE"
gst-launch-1.0 $PIPELINE

### Submit a job to run the pipeline
In the cell below, we submit the job script created above to an edge node. We specify the node based on properties found with the `pbsnodes` command that was run above in [section 2.2](#target-platforms). The job will run the pipeline and generate the video output. 

In [None]:
vehicle_classify_xeon = "vehicle_classify_xeon"

vehicle_classify_xeon_id = !qsub vehicle_detection_and_classification_job.sh -l nodes=1:idc007xv5 -N $vehicle_classify_xeon
print(vehicle_classify_xeon_id[0])

### Monitor the job queue
We need to wait for the job to complete before we can view the output. Check the progress of the job with the command below. `Q` status stands for `queued`, `R` for `running`. How long a job stays queued depends on the number of users. If the job is no longer listed, it's done. 

In [None]:
liveQstat()

### View the results
After the job is completed, you can play the resulting mp4 video in the following cell. You will see the input video with the detections and classifications overlaid onto it. Each detection appears as a bounding box drawn around the vehicle, labelled with text showing its classifications: in this case, type and color.

In [None]:
videoHTML('Vehicle Classification', 
          ['resources/video/output/vehicle_detect_classify_output.mp4'])

<a id='benchmarking'></a>
## Pipeline Benchmarking
### Section Overview
In this section you will learn how to:
* Construct/adjust the previous GStreamer pipeline to check performance. 
* Run pipelines demonstrating vehicle detection and vehicle tracking.
* Submit pipelines for execution on edge nodes with different hardware configurations.
* Compare the performance of pipelines on different hardware configurations by measuring and reporting frames per second (FPS).

### View job results
After a job is submitted, it will take time to complete and return the output. This is a helper function that will wait for the job to be completed and print out the results. 


In [None]:
import os
import time


def wait_for_job_completion_print_results(job_name, job_id):
    """Wait for a job's output file to appear, then print its FPS lines."""
    if job_id:
        print("Waiting for job {} to complete .".format(job_id[0]), end="")
        # The output file is named <job_name>.o<number>, where <number> is
        # the part of the job ID before the first dot.
        job_number = job_id[0].split(".")[0]
        output_file = "{}.o{}".format(job_name, job_number)
        while not os.path.exists(output_file):  # Wait until the file report is created.
            time.sleep(1)
            print(".", end="")
        print("Done!")
        !cat $output_file | grep -e "FpsCounter" -e "FPSCounter"

### Vehicle Detection
#### Pipeline
This pipeline is similar to the one we created above in [Section 3](#vehicle-classification) with a few key differences. In Section 3 we wanted to view the results of our inferencing. To do that we were watermarking the inferences onto the video, encoding it and saving it to an .mp4 file. Now we will remove all the video processing elements after `gvaclassify` and focus on the performance of inferencing.

The pipeline will still accept a video file input, decode it and run vehicle detection, followed by classification. But instead of processing for video output, we will add the `gvafpscounter` element from DL Streamer to measure the FPS and then fakesink to not process the video any further.

```shell
gst-launch-1.0 filesrc location=<input file> ! decodebin ! videoconvert ! \
gvadetect model=<detection_model_file> device=<device_id> inference-interval=1 ! \
gvaclassify model=<classify_model_file> model-proc=<classify_model_proc> device=<device_id> ! \
gvafpscounter starting-frame=10 ! fakesink
```

There are three DL Streamer elements used. For a reminder of what they are used for see [Section 1.2.1](#dl-streamer).

#### Create a job script for vehicle detection
In the script below we achieve a stream density of four by repeating the pipeline description four times (set by CHANNELS_COUNT), so that a single `gst-launch-1.0` command starts multiple pipeline instances.

The script has two arguments, `$1` is the device type, CPU/GPU/HDDL; `$2` is the number of buffers that will be read from the input file. Lowering this number will reduce the overall running time but should not affect the average FPS.
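
The way the job script builds its final command can be sketched in Python: the same pipeline description is concatenated CHANNELS_COUNT times, with a space between copies, and passed to one `gst-launch-1.0` invocation. The helper name `repeat_pipeline` and the short `videotestsrc` pipeline are hypothetical stand-ins used for illustration; the bash script itself relies on the trailing space at the end of its `PIPELINE` string to separate the copies.

```python
def repeat_pipeline(pipeline, channels_count):
    """Concatenate `channels_count` copies of one pipeline description."""
    if channels_count < 1:
        raise ValueError("channels_count must be at least 1")
    # A space between copies keeps each chain separate on the command line.
    return " ".join([pipeline.strip()] * channels_count)


single = "videotestsrc num-buffers=100 ! fakesink"  # stand-in pipeline
final_pipeline = repeat_pipeline(single, 4)
print("gst-launch-1.0 " + final_pipeline)
```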

In [None]:
%%writefile vehicle_detection_benchmark.sh
# The default path for the job is your home directory, so we change directory to where the files are.
cd $PBS_O_WORKDIR

DEVICE=${1:-CPU}
NUM_BUFFERS=${2:--1}

DETECT_MODEL=models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml
CLASSIFY_MODEL=models/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml
CLASSIFY_MODEL_PROC=model_proc/vehicle-attributes-recognition-barrier-0039.json
INPUT_FILE=cars_1900.mp4
INFERENCE_INTERVAL=1
CHANNELS_COUNT=4
PIPELINE="filesrc location=$INPUT_FILE num-buffers=$NUM_BUFFERS ! decodebin ! \
    videoconvert n-threads=4 ! capsfilter caps=\"video/x-raw,format=BGRx\" ! \
    gvadetect model=$DETECT_MODEL device=$DEVICE inference-interval=$INFERENCE_INTERVAL ! queue ! \
    gvaclassify model=$CLASSIFY_MODEL model-proc=$CLASSIFY_MODEL_PROC device=$DEVICE reclassify-interval=$INFERENCE_INTERVAL ! queue ! \
    gvafpscounter starting-frame=10 ! fakesink "

FINAL_PIPELINE_STR=""
for (( CURRENT_CHANNELS_COUNT=0; CURRENT_CHANNELS_COUNT < $CHANNELS_COUNT; ++CURRENT_CHANNELS_COUNT ))
do
  FINAL_PIPELINE_STR+=$PIPELINE
done
echo "gst-launch-1.0 $FINAL_PIPELINE_STR"
gst-launch-1.0 $FINAL_PIPELINE_STR

#### Submitting jobs
When submitting jobs, we specify the node based on the properties we found when running the `pbsnodes` command that was run above in [section 2.2](#target-platforms).   

In the cells below, we submit the job script to various edge nodes configured with different hardware. 
##### Submit a job to Xeon node
After submission, the job will be in the job queue which we will monitor with the `liveQstat()` function.  
This job will run the vehicle detection pipeline on CPU of a Xeon E3 processor.

In [None]:
job_name_xeon = "detect_xeon"

job_id_xeon = !qsub vehicle_detection_benchmark.sh -l nodes=1:idc007xv5 -N $job_name_xeon
print(job_id_xeon[0])

##### Submit a job to CPU on Core node.
This job will run the vehicle detection pipeline on CPU of a Core processor.

In [None]:
job_name_core_cpu = "detect_core_cpu"

job_id_core_cpu = !qsub vehicle_detection_benchmark.sh -l nodes=1:idc001skl -N $job_name_core_cpu
print(job_id_core_cpu[0])

##### Submit a job to GPU on Core node
This job will run the vehicle detection pipeline on GPU of a Core processor.
Notice that we supply `-F "GPU 500"` to the qsub command. `-F` specifies the arguments that will be sent to the job script. So in this case, it will set the DEVICE to GPU and NUM_BUFFERS to 500 in our job script above. Inferencing is going to run on GPU and we will only process the first 500 buffers from the video through the pipeline. 
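
The sketch below shows how the pieces of such a `qsub` command fit together, using only the options that appear in this tutorial (`-l`, `-N` and `-F`). The helper `build_qsub_command` is hypothetical and only assembles the argument list; it does not submit anything.

```python
import shlex


def build_qsub_command(script, node, job_name, script_args=None):
    """Assemble a qsub argument list (illustrative; nothing is submitted)."""
    command = ["qsub", script, "-l", "nodes=1:" + node, "-N", job_name]
    if script_args:
        # -F takes all script arguments as a single string, so "GPU 500"
        # becomes $1=GPU and $2=500 inside the job script.
        command += ["-F", " ".join(script_args)]
    return command


command = build_qsub_command(
    "vehicle_detection_benchmark.sh", "idc001skl", "detect_core_gpu", ["GPU", "500"]
)
print(" ".join(shlex.quote(part) for part in command))
```

When `script_args` is omitted, no `-F` option is added and the job script falls back to its defaults (`CPU` and `-1`).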

In [None]:
job_name_core_gpu = "detect_core_gpu"

job_id_core_gpu = !qsub vehicle_detection_benchmark.sh -l nodes=1:idc001skl -N $job_name_core_gpu -F "GPU 500"
print(job_id_core_gpu[0])

##### Submit a job to Atom node called UP-SQUARED(UP2)
This job will run the vehicle detection pipeline on CPU of an Atom processor.  

In [None]:
job_name_up2 = "detect_up2"
job_id_up2 = !qsub vehicle_detection_benchmark.sh -l nodes=1:idc008u2g -N $job_name_up2 -F "CPU 500"
print(job_id_up2[0])

##### Submit a job to HDDL-R node
This job will run the vehicle detection pipeline on HDDL-R accelerator.

In [None]:
job_name_hddlr = "detect_hddlr"
job_id_hddlr = !qsub vehicle_detection_benchmark.sh -l nodes=1:idc002mx8 -N $job_name_hddlr -F "HDDL"
print(job_id_hddlr[0])

#### Monitor the job queue
Check the progress of the jobs. `Q` status stands for `queued`, `R` for `running`. How long a job stays queued depends on the number of users. It could take up to 5 minutes for a job to run. If the job is no longer listed, it's done. 

In [None]:
liveQstat()

#### Wait for job completion and print results
After the job is completed, the detection pipeline performance data will be shown for each second that the pipeline ran, followed toward the bottom by the average total FPS across all 4 instances.

In [None]:
wait_for_job_completion_print_results(job_name_xeon, job_id_xeon)

In [None]:
wait_for_job_completion_print_results(job_name_core_cpu, job_id_core_cpu)

In [None]:
wait_for_job_completion_print_results(job_name_core_gpu, job_id_core_gpu)

In [None]:
wait_for_job_completion_print_results(job_name_up2, job_id_up2)

In [None]:
wait_for_job_completion_print_results(job_name_hddlr, job_id_hddlr)

### Vehicle Tracking
#### Pipeline
Here we are going to make a few adjustments to improve our performance. We set the `inference-interval` property on `gvadetect` to 10, meaning that detection will only be performed on every 10th frame. Similarly, we set the `reclassify-interval` on `gvaclassify` to 10. In addition, we add the element `gvatrack` to identify vehicles in the frames where detection is skipped. **This allows us to run detection on fewer frames and increases overall throughput while still tracking the position and type of objects in every frame.**

```shell
gst-launch-1.0 filesrc location=<input file> ! decodebin ! videoconvert ! \
gvadetect model=<detection_model_file> device=<device_id> inference-interval=10 ! \
gvatrack ! gvaclassify model=<classify_model_file> model-proc=<classify_model_proc> device=<device_id> ! \
gvafpscounter starting-frame=10 ! fakesink
```

#### Create a job script for vehicle tracking
After constructing the vehicle tracking pipeline, the script also increases stream density by repeating the pipeline to start multiple pipeline instances like before.

In [None]:
%%writefile vehicle_tracking_benchmark_job.sh
# The default path for the job is your home directory, so we change directory to where the files are.
cd $PBS_O_WORKDIR

DEVICE=${1:-CPU}
NUM_BUFFERS=${2:--1}

DETECT_MODEL=models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml
CLASSIFY_MODEL=models/intel/vehicle-attributes-recognition-barrier-0039/FP16/vehicle-attributes-recognition-barrier-0039.xml
CLASSIFY_MODEL_PROC=model_proc/vehicle-attributes-recognition-barrier-0039.json
INPUT_FILE=cars_1900.mp4
INFERENCE_INTERVAL=10
CHANNELS_COUNT=4
PIPELINE="filesrc location=$INPUT_FILE num-buffers=$NUM_BUFFERS ! decodebin ! \
    videoconvert n-threads=4 ! capsfilter caps=\"video/x-raw,format=BGRx\" ! \
    gvadetect model=$DETECT_MODEL device=$DEVICE inference-interval=$INFERENCE_INTERVAL ! queue ! \
    gvatrack ! queue ! gvaclassify model=$CLASSIFY_MODEL model-proc=$CLASSIFY_MODEL_PROC device=$DEVICE reclassify-interval=$INFERENCE_INTERVAL ! queue ! \
    gvafpscounter starting-frame=10 ! fakesink "
FINAL_PIPELINE_STR=""
for (( CURRENT_CHANNELS_COUNT=0; CURRENT_CHANNELS_COUNT < $CHANNELS_COUNT; ++CURRENT_CHANNELS_COUNT ))
do
  FINAL_PIPELINE_STR+=$PIPELINE
done
echo "gst-launch-1.0 $FINAL_PIPELINE_STR"
gst-launch-1.0 $FINAL_PIPELINE_STR

#### Submitting jobs
Here we submit the vehicle tracking job to the same node types as we did for vehicle detection above.
##### Submit a job to Xeon node
This job will run the vehicle tracking pipeline on CPU of a Xeon E3 processor.

In [None]:
job_name_tracking_xeon = "track_xeon"
job_id_tracking_xeon = !qsub vehicle_tracking_benchmark_job.sh -l nodes=1:idc007xv5 -N $job_name_tracking_xeon
print(job_id_tracking_xeon[0])

##### Submit a job to CPU on Core node.
This job will run the vehicle tracking pipeline on CPU of a Core processor.

In [None]:
job_name_tracking_core_cpu = "track_core_cpu"
job_id_tracking_core_cpu = !qsub vehicle_tracking_benchmark_job.sh -l nodes=1:idc001skl -N $job_name_tracking_core_cpu
print(job_id_tracking_core_cpu[0])

##### Submit a job to GPU on Core node
This job will run the vehicle tracking pipeline on GPU of a Core processor.

In [None]:
job_name_tracking_core_gpu = "track_core_gpu"
job_id_tracking_core_gpu = !qsub vehicle_tracking_benchmark_job.sh -l nodes=1:idc001skl -N $job_name_tracking_core_gpu -F "GPU 500"
print(job_id_tracking_core_gpu[0])

##### Submit a job to Atom node called UP-SQUARED(UP2)
This job will run the vehicle tracking pipeline on CPU of an Atom processor.

In [None]:
job_name_tracking_up2 = "track_up2"
job_id_tracking_up2 = !qsub vehicle_tracking_benchmark_job.sh -l nodes=1:idc008u2g -N $job_name_tracking_up2 -F "CPU 500"
print(job_id_tracking_up2[0])

##### Submit a job to HDDL-R node
This job will run the vehicle tracking pipeline on the HDDL-R accelerator.

In [None]:
job_name_tracking_hddlr = "track_hddlr"
job_id_tracking_hddlr = !qsub vehicle_tracking_benchmark_job.sh -l nodes=1:idc002mx8 -N $job_name_tracking_hddlr -F "HDDL"
print(job_id_tracking_hddlr[0])
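
The submission cells above all follow one pattern: the job script, a node selector after `-l`, a job name after `-N`, and optionally an argument string for the script after `-F`. As a rough sketch of that pattern, the hypothetical helper below (`build_qsub_command` is not part of this notebook) assembles the same command line as a string without submitting anything.

```python
import shlex


def build_qsub_command(script, node, job_name, script_args=None):
    """Assemble a qsub command line in the shape used by the cells above."""
    parts = ["qsub", script, "-l", "nodes=1:{}".format(node), "-N", job_name]
    if script_args:
        # The argument string is passed as a single value after -F.
        parts += ["-F", script_args]
    # Quote each part so a value containing spaces stays one shell word.
    return " ".join(shlex.quote(part) for part in parts)


print(build_qsub_command("vehicle_tracking_benchmark_job.sh",
                         "idc001skl", "track_core_gpu", "GPU 500"))
```

Building the command in one place makes it easier to see which jobs differ only by node name and which also pass script arguments.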

#### Monitor the job queue
Check the progress of the jobs. `Q` status stands for `queued` and `R` for `running`. How long a job stays queued depends on the number of users; it could take up to 5 minutes for a job to start running. If a job is no longer listed, it is done.

In [None]:
liveQstat()

#### Wait for job completion and print results
After each job completes, its output shows the tracking pipeline's performance data for every second that the pipeline ran. Toward the bottom, it shows the average total FPS across all 4 instances.

In [None]:
wait_for_job_completion_print_results(job_name_tracking_xeon, job_id_tracking_xeon)

In [None]:
wait_for_job_completion_print_results(job_name_tracking_core_cpu, job_id_tracking_core_cpu)

In [None]:
wait_for_job_completion_print_results(job_name_tracking_core_gpu, job_id_tracking_core_gpu)

In [None]:
wait_for_job_completion_print_results(job_name_tracking_up2, job_id_tracking_up2)

In [None]:
wait_for_job_completion_print_results(job_name_tracking_hddlr, job_id_tracking_hddlr)

### Performance Comparison Chart
For the jobs run above, we collect the performance for each job and plot a bar chart.
#### Helper functions
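This function parses a job's output file and returns the average FPS. Since we ran four instances of the pipeline simultaneously, it reads the `total` value from the `FPSCounter(average)` line, which adds up the averages of the individual pipelines. It returns `None` if the file does not exist or contains no such line.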

In [None]:
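import os
import re

def find_average_fps(file_name):
    """Return the total average FPS reported in a job output file, or None if unavailable."""
    if not os.path.exists(file_name):
        return None
    with open(file_name, 'r') as fps_file:
        for line in fps_file:
            # Require at least one digit so an empty match can never reach float().
            match_object = re.search(r'FPSCounter\(average\): total=([0-9]+(?:\.[0-9]+)?)', line)
            if match_object:
                return float(match_object.group(1))
    return None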
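
To see what the regular expression extracts, the sketch below applies the same pattern to a string instead of a file. The sample log lines are illustrative only: the numbers are made up, and the exact layout of the counter output is an assumption based on the pattern used above. `parse_average_fps` is a hypothetical name.

```python
import re

# Same idea as the pattern in find_average_fps: capture the number after "total=".
FPS_PATTERN = re.compile(r'FPSCounter\(average\): total=([0-9]+(?:\.[0-9]+)?)')

# Made-up sample output; only the "average" line should match.
sample_log = (
    "FPSCounter(last 1.00sec): total=118.20 fps, number-streams=4, per-stream=29.55 fps\n"
    "FPSCounter(average): total=120.48 fps, number-streams=4, per-stream=30.12 fps\n"
)


def parse_average_fps(text):
    """Return the first average total FPS found in `text`, or None."""
    for line in text.splitlines():
        match = FPS_PATTERN.search(line)
        if match:
            return float(match.group(1))
    return None


print(parse_average_fps(sample_log))
```

The per-second lines are skipped because the pattern requires the literal text `FPSCounter(average)`.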

This function creates the bar chart and table for two sets of average FPS numbers: one for the vehicle detection pipeline and one for the vehicle tracking pipeline.

In [None]:
def display_results(cols, rows, detection_fps, tracking_fps):
    """Plot detection and tracking FPS as grouped bars with a table of the values underneath."""
    fig, ax = plt.subplots(figsize=(10,5))
    # One table row per pipeline, one column per platform.
    cell_text = []
    cell_text.append(detection_fps)
    cell_text.append(tracking_fps)
    results_table = plt.table(cellText=cell_text, rowLabels=rows, colLabels=cols, cellLoc="center")
    results_table.scale(1,4)
    results_table.auto_set_font_size(False)
    results_table.set_fontsize(12)
    # One bar group per platform, with the two bars drawn side by side.
    x = np.arange(len(detection_fps))
    width = 0.4
    det_bar = plt.bar(x - width/2, detection_fps, width=width, color="xkcd:blue", label='Detection')
    track_bar = plt.bar(x + width/2, tracking_fps, width=width, color="xkcd:azure", label='Tracking')
    # Hide the x-axis ticks and labels; the table's column headers name each platform.
    plt.tick_params(axis='x', which='both', bottom=False, top=False, labelbottom=False)
    plt.title('Vehicle Detection and Tracking')
    plt.ylabel('Frames Per Second')
    plt.legend()
    plt.show()

#### Displaying the Results
The bar chart created below shows the average FPS numbers for all designated Intel® platforms. For each platform, the vehicle detection and vehicle tracking results are drawn side by side to show the performance improvement.  

**In general, the vehicle tracking pipeline performs better than the vehicle detection pipeline because of its larger inference-interval (10 instead of 1). Since each vehicle is tracked across frames, we do not need to run inference on every frame, and throughput increases. Tune this value for your use case: it should be high enough to give the best performance, yet low enough that new objects entering the frame are still captured.**
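
As a back-of-the-envelope illustration of why a larger interval reduces work, the sketch below counts detection inferences for a given number of frames. It assumes inference runs on the first frame and then on every N-th frame after it, and it ignores the cost of tracking and classification. `inference_calls` is a hypothetical helper, not a DL Streamer function.

```python
import math


def inference_calls(total_frames: int, inference_interval: int) -> int:
    """Count detection inferences, assuming one inference every `inference_interval` frames."""
    if inference_interval < 1:
        raise ValueError("inference_interval must be at least 1")
    return math.ceil(total_frames / inference_interval)


frames = 1000  # arbitrary frame count chosen for illustration
for interval in (1, 10):
    calls = inference_calls(frames, interval)
    print("inference-interval={}: {} detection inferences for {} frames".format(
        interval, calls, frames))
```

Under these assumptions, an interval of 10 runs the detector on one tenth of the frames. The same arithmetic shows the trade-off: an object that appears right after an inferred frame can go unnoticed for up to nine frames.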

In [None]:
arch_list = [('core_cpu', 'Intel Core\ni5-6500TE\nCPU'),
             ('core_gpu', 'Intel Core\ni5-6500TE\nGPU'),
             ('xeon', 'Intel Xeon\nE3-1268L v5\nCPU'),
             ('hddlr', 'IEI Mustang\nV100-MX8\nVPU'),
             ('up2', 'Intel Atom\nx7-E3950\nUP2/GPU')]

detection_fps_results = []
tracking_fps_results = []
column_names = []
row_names = ('Vehicle Detection\nInference-interval=1','Vehicle Tracking\nInference-interval=10')

for arch, a_name in arch_list:
    column_names.append(a_name)
    if 'job_id_'+arch in vars():
        fps = find_average_fps('{}.o{}'.format(vars()['job_name_'+arch], vars()['job_id_'+arch][0].split(".")[0]))
        if fps:
            detection_fps_results.append(fps)
        else:
            detection_fps_results.append(0)
    else:
        detection_fps_results.append(0)
    if 'job_id_tracking_'+arch in vars():
        fps = find_average_fps('{}.o{}'.format(vars()['job_name_tracking_'+arch], vars()['job_id_tracking_'+arch][0].split(".")[0]))
        if fps:
            tracking_fps_results.append(fps)
        else:
            tracking_fps_results.append(0)
    else:
        tracking_fps_results.append(0)
        
display_results(column_names, row_names, detection_fps_results, tracking_fps_results)
display(widgets.HTML(value=defaultDisclaimer()))
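
The loop above locates each job's output file by combining the job name with the numeric part of the ID returned by `qsub`. The sketch below isolates that naming step. The helper name `job_output_file` and the sample job ID are hypothetical and used only for illustration.

```python
def job_output_file(job_name: str, job_id: str) -> str:
    """Build the '<job_name>.o<number>' file name used to look up job output."""
    # Keep only the part of the job ID before the first dot.
    numeric_id = job_id.split(".")[0]
    return "{}.o{}".format(job_name, numeric_id)


# Hypothetical job ID in the '<number>.<server>' shape the loop above expects.
print(job_output_file("track_xeon", "12345.example-server"))
```

If a chart column shows 0, checking whether the file with this name exists is a quick way to tell a missing job from a job whose output contains no average FPS line.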