# Object Detection Demo: Car Detection

This sample reference implementation showcases object detection (cars, in this case) with SSD and the Async API.
The Async API improves the overall frame rate of the application: instead of waiting for inference to complete, the host continues doing other work while the accelerator is busy.
Specifically, this code demonstrates two parallel infer requests: the current frame is processed while the next input frame is being captured. This essentially hides the latency of capturing.
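
The overlap described above can be illustrated with a minimal Python sketch. It is not the Inference Engine API: `capture_frame` and `infer` are hypothetical stand-ins that only sleep, and a single worker thread plays the role of the accelerator. The point is the ordering: the request is started, the next frame is captured, and only then does the host wait for the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def capture_frame(index):
    """Hypothetical stand-in for reading one frame from a video source."""
    time.sleep(0.01)  # pretend capturing takes time
    return f"frame-{index}"


def infer(frame):
    """Hypothetical stand-in for one inference request on an accelerator."""
    time.sleep(0.01)  # pretend the accelerator is busy
    return f"detections for {frame}"


def run_pipelined(num_frames):
    """Overlap inference on the current frame with capture of the next one."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as accelerator:
        current = capture_frame(0)
        for index in range(num_frames):
            # Start inference without waiting for it to finish.
            pending = accelerator.submit(infer, current)
            # While the "accelerator" is busy, the host captures the next frame.
            if index + 1 < num_frames:
                current = capture_frame(index + 1)
            # Only now block on the result of the in-flight request.
            results.append(pending.result())
    return results


if __name__ == "__main__":
    for line in run_pipelined(3):
        print(line)
```

In a synchronous loop the capture and inference times would add up for every frame; here the capture of frame N+1 runs while frame N is still being processed.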

## Overview of how it works
The inference executable (tutorial1) reads the command line arguments and loads a network and an image from the video input to the Inference Engine (IE) plugin.
A job must be submitted to run the inference executable on a hardware accelerator (Intel® Core CPU, Intel® HD Graphics GPU, and/or Intel® Movidius™ Neural Compute Stick).
After the inference is completed, the output videos are stored in the `results/` directory, where they can be viewed within the Jupyter Notebook instance.

## Demonstration objectives
* Video as input is supported using **OpenCV**
* Inference performed on actual edge hardware
* **OpenCV** provides the bounding boxes, labels and other information
* Visualization of the resulting bounding boxes
* Demonstrate the Async API in action


## Step 1: Compile the code

The code in this demo is separated into two parts.
The first part is responsible for reading the input stream and running the object detection inference workload on the stream.
This part outputs Region Of Interest (ROI), in terms of coordinates, for each frame.
The source code for this part can be found in [main.cpp](./main.cpp), and the executable will be named "tutorial1".
Output ROI will be written into a text file, "ROIs.txt".

The second part reads the ROIs.txt file, and overlays boxes on each frame of the stream based on the coordinates.
Then the output video is written into a file. 
The source code for this step is in [ROI_writer.cpp](./ROI_writer.cpp).
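
As a rough illustration of the hand-off between the two parts, the sketch below groups boxes by frame so that a writer could overlay them frame by frame. The line layout used here (`<frame> <xmin> <ymin> <xmax> <ymax> <label>`) is purely an assumption for the example; the actual layout of ROIs.txt is whatever main.cpp writes, and `parse_rois` is a hypothetical helper, not part of the demo.

```python
from collections import defaultdict


def parse_rois(lines):
    """Group boxes by frame index.

    Assumes a hypothetical line format:
        "<frame> <xmin> <ymin> <xmax> <ymax> <label>"
    The real ROIs.txt layout is defined by main.cpp and may differ.
    """
    boxes_by_frame = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        frame, xmin, ymin, xmax, ymax = (int(value) for value in fields[:5])
        boxes_by_frame[frame].append((xmin, ymin, xmax, ymax, fields[5]))
    return dict(boxes_by_frame)


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = ["0 10 20 110 220 car", "0 300 40 380 120 car", "", "1 12 22 112 222 car"]
    for frame, boxes in sorted(parse_rois(sample).items()):
        print(frame, boxes)
```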

We have provided a Makefile for compiling the examples. Run the following cell to compile the application.
(tip: use **ctrl+enter** to run the cell)

In [None]:
from IPython.display import HTML
import matplotlib.pyplot as plt
import os
import time
import sys
from pathlib import Path
sys.path.insert(0, str(Path().resolve().parent))
from demoTools.demoutils import *

In [None]:
!/opt/intel/computer_vision_sdk/deployment_tools/model_downloader/downloader.py --print_all

In [None]:
!/opt/intel/computer_vision_sdk/deployment_tools/model_downloader/downloader.py --name mobilenet-ssd -o raw_models

In [None]:
!/opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/mo.py \
--input_model raw_models/object_detection/common/mobilenet-ssd/caffe/mobilenet-ssd.caffemodel \
--data_type FP32 \
-o models/mobilenet-ssd/FP32 \
--scale 256 \
--mean_values [127,127,127] 

In [None]:
!/opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/mo.py \
--input_model raw_models/object_detection/common/mobilenet-ssd/caffe/mobilenet-ssd.caffemodel \
--data_type FP16 \
-o models/mobilenet-ssd/FP16 \
--scale 256 \
--mean_values [127,127,127] 

In [None]:
!make 

### Commandline flags

The two executables, tutorial1 and ROI_writer, take a number of command-line arguments.

Run the following cells to see the list of the available arguments: 

In [None]:
!./tutorial1 -h

In [None]:
!./ROI_writer -h

## Step 2: Running the inference

Now we are ready to run the inference workload. In this step we will be submitting the workload as a job to the job queue.

Currently, you are on what is called a "devnode". On this system, you are allocated just one core on a large Xeon CPU. The purpose of this node is to develop code and run minimal Jupyter notebooks; it is not meant for compute jobs like deep learning inference. So we need to request additional resources from the cluster to run the inference, and this is done through the job queue.

To put an item on the job queue, we must first create a bash script that runs the workload we want. Run the following cell to create the bash script [object_detection_job.sh](object_detection_job.sh), which will be our job script. 

### Writing the job script

In [None]:
%%writefile object_detection_job.sh

# The default path for the job is your home directory, so we change directory to where the files are.
cd $PBS_O_WORKDIR

# Object detection script writes output to a file inside a directory. We make sure that this directory exists.
#  The output directory is the first argument of the bash script
mkdir -p $1
ROIFILE=$1/ROIs.txt
OVIDEO=$1/output.mp4

if [ "$2" = "HETERO:FPGA,CPU" ]; then
    source /opt/intel/computer_vision_sdk/bin/setup_hddl.sh
    aocl program acl0 /opt/intel/computer_vision_sdk_2018.4.420/bitstreams/a10_vision_design_bitstreams/4-0_PL1_FP11_MobileNet_ResNet_VGG_Clamp.aocx
fi

if [ "$3" = "FP32" ]; then
    config_file="conf_fp32.txt"
else
    config_file="conf_fp16.txt"
fi

# Running the object detection code
SAMPLEPATH=$PBS_O_WORKDIR
./tutorial1 -i /data/reference-sample-data/object-detection-python/cars_1900.mp4 \
            -m /data/reference-sample-data/models/mobilenet-ssd/$3/mobilenet-ssd.xml \
            -d $2 \
            -o $1 \
            -fr 3000 

# Converting the text output to a video
./ROI_writer -i /data/reference-sample-data/object-detection-python/cars_1900.mp4 \
             -o $1 \
             -ROIfile $ROIFILE \
             -l pascal_voc_classes.txt \
             -r 2.0 # output in half res

To put this script on the job queue, we use the command `qsub`.
There are three important arguments we use with this command.

First, the `-l` flag.
This flag is used to specify what type of resources to request from the cluster.
For example, this can be used to request an Intel Xeon system, or it can be used to request a system with an FPGA.
The syntax is `-l nodes=1:<tag>` where `<tag>` is the descriptor tag for the resource you want.
For example, `-l nodes=1:iei-tank-xeon` will request an Intel Xeon system.
To see the list of available tags, and the number of available systems, run the following cell.

In [None]:
!pbsnodes | grep properties | sort | uniq -c
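
For readers less familiar with the shell pipeline in the cell above, the following sketch does roughly the same counting in Python: it keeps the lines that mention `properties` and tallies identical ones. The sample text is made up for illustration and is not real cluster output; `count_properties` is a hypothetical helper. Unlike `uniq -c`, it strips leading whitespace before comparing lines.

```python
from collections import Counter


def count_properties(pbsnodes_output):
    """Count identical 'properties' lines, similar to `grep | sort | uniq -c`."""
    lines = [
        line.strip()
        for line in pbsnodes_output.splitlines()
        if "properties" in line
    ]
    return Counter(lines)


if __name__ == "__main__":
    # Made-up sample text, for illustration only.
    sample = """node-a
     state = free
     properties = iei-tank-xeon
node-b
     properties = iei-tank-core
node-c
     properties = iei-tank-core
"""
    for line, count in sorted(count_properties(sample).items()):
        print(count, line)
```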

Then there is the `-F` flag, which is used to pass in arguments to the job script.
The [object_detection_job.sh](object_detection_job.sh) script takes three arguments:

1. the path to the directory for the output video and performance stats
2. the targeted device (e.g. CPU, GPU, MYRIAD)
3. the floating-point precision to use for inference (FP32 or FP16)

The job scheduler passes the contents of the `-F` flag to the job script as its arguments.


Finally, the `-N` flag is used to name the job itself. 
By default the jobs take on the name of the job script, which in this case would be "object_detection_job.sh".
But because we are submitting these jobs with different arguments, it is useful for record-keeping to name the job differently based on the arguments.
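
To make the role of the three flags concrete, here is a small sketch that assembles a `qsub` command line from its parts. `build_qsub_command` is a hypothetical helper written for this explanation, not part of `demoutils.py`; it only builds a string and does not submit anything. `shlex.quote` is used so that the space-separated `-F` arguments stay together as one shell argument.

```python
import shlex


def build_qsub_command(script, node_tag, script_args, job_name):
    """Assemble a qsub command line from the -l, -F and -N flags."""
    return " ".join([
        "qsub",
        shlex.quote(script),
        "-l", shlex.quote(f"nodes=1:{node_tag}"),      # resource request
        "-F", shlex.quote(" ".join(script_args)),      # arguments for the job script
        "-N", shlex.quote(job_name),                   # job name
    ])


if __name__ == "__main__":
    print(build_qsub_command(
        "object_detection_job.sh",
        "iei-tank-xeon",
        ["results/xeon", "CPU", "FP32"],
        "obj_det_xeon",
    ))
```

Changing only `node_tag`, `script_args` and `job_name` is enough to target a different device, which is why naming each job after its arguments helps with record-keeping.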

The following line requests an Intel Xeon system, passes "results/xeon CPU FP32" to the job script, and names the job "obj_det_xeon". Run the cell to submit this job. 

In [None]:
!qsub object_detection_job.sh -l nodes=1:iei-tank-xeon -F "results/xeon CPU FP32" -N obj_det_xeon

#### Submitting to a node with Intel® Core CPU

In [None]:
print("Submitting job to Intel Core CPU...")
#Submit job to the queue
job_id_core = !qsub object_detection_job.sh -l nodes=1:iei-tank-core -F "results/core CPU FP32" -N obj_det_core

#Progress indicators
if job_id_core:
    progressIndicator('results/core', 'i_progress_'+job_id_core[0]+'.txt', "Inference", 0, 100)
    progressIndicator('results/core', 'v_progress_'+job_id_core[0]+'.txt', "Rendering", 0, 100)

#### Submitting to a node with Intel® Xeon CPU

In [None]:
print("Submitting job to Intel Xeon CPU...")
#Submit job to the queue
job_id_xeon = !qsub object_detection_job.sh -l nodes=1:iei-tank-xeon -F "results/xeon CPU FP32" -N obj_det_xeon

#Progress indicators
if job_id_xeon:
    progressIndicator('results/xeon', 'i_progress_'+job_id_xeon[0]+'.txt', "Inference", 0, 100)
    progressIndicator('results/xeon', 'v_progress_'+job_id_xeon[0]+'.txt', "Rendering", 0, 100)

#### Submitting to a node with Intel® Core CPU and using the onboard Intel GPU

In [None]:
print("Submitting job to Intel Core CPU with Intel GPU...")
#Submit job to the queue
job_id_gpu = !qsub object_detection_job.sh -l nodes=1:iei-tank-core -F "results/gpu GPU FP32" -N obj_det_gpu

#Progress indicators
if job_id_gpu:
    progressIndicator('results/gpu', 'i_progress_'+job_id_gpu[0]+'.txt', "Inference", 0, 100)
    progressIndicator('results/gpu', 'v_progress_'+job_id_gpu[0]+'.txt', "Rendering", 0, 100)

#### Submitting to a node with Intel® Movidius Stick

In [None]:
print("Submitting job to Intel Movidius NCS...")
#Submit job to the queue
job_id_myriad = !qsub object_detection_job.sh -l nodes=1:iei-tank-movidius -F "results/myriad MYRIAD FP16" -N obj_det_myriad

#Progress indicators
if job_id_myriad:
    progressIndicator('results/myriad', 'i_progress_'+job_id_myriad[0]+'.txt', "Inference", 0, 100)
    progressIndicator('results/myriad', 'v_progress_'+job_id_myriad[0]+'.txt', "Rendering", 0, 100)

#### Submitting to a node with Intel FPGA HDDL-F (High Density Deep Learning)

In [None]:
print("Submitting job to node with Intel FPGA HDDL-F...")
#Submit job to the queue
job_id_fpga = !qsub object_detection_job.sh -l nodes=1:iei-tank-fpga -F "results/fpga HETERO:FPGA,CPU FP32" -N obj_det_fpga
    
#Progress indicators
if job_id_fpga:
    progressIndicator('results/fpga', 'i_progress_'+job_id_fpga[0]+'.txt', "Inference", 0, 100)
    progressIndicator('results/fpga', 'v_progress_'+job_id_fpga[0]+'.txt', "Rendering", 0, 100)

### Check if the jobs are done

Run the following cell to bring up the custom qstat widget. 

In [None]:
liveQstat()

You should see the jobs you have submitted (referenced by `Job ID`).
It should also show the Jupyter notebook job.
### Before moving to step 3, make sure that all the obj_det_*  jobs submitted to the queue are completed.
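
As a rough complement to the queue status, one could also check which result directories already contain an output video. The sketch below assumes the directory names used in the `-F` arguments above and the `output.mp4` file name from the job script; `missing_outputs` is a hypothetical helper. A file that exists may still be in the middle of being written, so the queue status remains the authoritative signal.

```python
from pathlib import Path


def missing_outputs(results_dir, targets, filename="output.mp4"):
    """Return the targets whose output video has not been written yet."""
    root = Path(results_dir)
    return [name for name in targets if not (root / name / filename).is_file()]


if __name__ == "__main__":
    targets = ["core", "xeon", "gpu", "myriad", "fpga"]
    print("Still missing:", missing_outputs("results", targets))
```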

## Step 3: Results

Once the jobs are complete, the stdout and stderr are stored in files with names of the form (based on our `-N` argument):

`obj_det_{type}.o{JobID}`

`obj_det_{type}.e{JobID}`
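
The naming pattern above can be written as a small helper. This is a sketch under an assumption: that `{JobID}` in the file name is the part of the ID returned by `qsub` before the first dot. The ID in the example is made up, and `job_log_names` is a hypothetical function, not part of `demoutils.py`.

```python
def job_log_names(job_name, job_id):
    """Return the (stdout, stderr) file names for a job.

    Assumes the number used in the file names is everything before the
    first dot of the job ID, e.g. "12345.server" -> "12345".
    """
    number = job_id.split(".", 1)[0]
    return f"{job_name}.o{number}", f"{job_name}.e{number}"


if __name__ == "__main__":
    # "12345.server" is a made-up job ID, for illustration only.
    stdout_name, stderr_name = job_log_names("obj_det_core", "12345.server")
    print(stdout_name)
    print(stderr_name)
```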

But for this script, the main output is the mp4 videos which are stored in the `results/` directory.
We wrote a short utility script that will display these videos in the notebook.
See `demoutils.py` if interested in the script.

Run the following cell to see the results.

In [None]:
videoHTML('IEI Tank (Intel Core CPU)', ['results/core/output.mp4'], 'results/core/stats.txt')

In [None]:
videoHTML('IEI Tank Xeon (Intel Xeon CPU)', ['results/xeon/output.mp4'] ,'results/xeon/stats.txt')

In [None]:
videoHTML('IEI Intel GPU (Intel Core + Onboard GPU)', ['results/gpu/output.mp4'], 'results/gpu/stats.txt')

In [None]:
videoHTML('IEI Tank + (Intel CPU + Movidius)', ['results/myriad/output.mp4'])

In [None]:
videoHTML('IEI Tank + Intel FPGA HDDL-F', ['results/fpga/output.mp4'])

In [None]:
summaryPlot({'results/core/stats.txt':'Core', 'results/xeon/stats.txt':'Xeon', 'results/gpu/stats.txt':'GPU', 'results/fpga/stats.txt':'FPGA', 'results/myriad/stats.txt':'Myriad'}, 'Architecture', 'Time, seconds', 'Inference engine processing time' )