# Smart Queue Monitoring System - Retail Scenario

In this project, you will build a people counter app to reduce congestion in queuing systems by guiding people to the least congested queue. You will have to use Intel's OpenVINO API and the person detection model from their open model zoo to build this project. It demonstrates how to create a smart video IoT solution using Intel® hardware and software tools. This solution detects people in a designated area, providing the number of people in the frame.

## Overview of how it works
Your code should read the equivalent of command line arguments and load a network and an image from the video input into the Inference Engine (IE) plugin. A job is submitted to an edge compute node with a hardware accelerator such as an Intel® HD Graphics GPU, an Intel® Movidius™ Neural Compute Stick 2, or an Intel® Arria® 10 FPGA.
After the inference is completed, the output videos are appropriately stored in the /results/[device] directory, which can then be viewed within the Jupyter Notebook instance.

## Demonstration objectives
* Video as input is supported using **OpenCV**
* Inference performed on edge hardware (rather than on the development node hosting this Jupyter notebook)
* **OpenCV** provides the bounding boxes, labels and other information
* Visualization of the resulting bounding boxes


## Step 0: Set Up

### 0.1: Import dependencies

Run the cell below to import the Python dependencies needed for displaying the results in this notebook.
(tip: select the cell and use **Ctrl+enter** to run the cell)

In [1]:
#Import your dependencies here
from demoTools.demoutils import *
import matplotlib.pyplot as plt

### 0.2 (Optional step): Original video without inference

If you are curious to see the input video, run the following cell to view the original video stream used for inference and people counter.

In [2]:
videoHTML('People Counter Video', ['./resources/retail.mp4'])

## Step 1: Using Intel® Distribution of OpenVINO™ toolkit

We will be using Intel® Distribution of OpenVINO™ toolkit Inference Engine (IE) to locate people in frame.
There are five steps involved in this task:

1. Download the model using the open_model_zoo
2. Choose a device and create IEPlugin for the device
3. Read the Model using IENetwork
4. Load the IENetwork into the Plugin
5. Run inference.
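
Once step 5 has run, the raw network output still has to be turned into bounding boxes. The sketch below shows one way to do that in plain Python. It assumes the detection layout documented for person-detection-retail-0013 in the Open Model Zoo, where each row is `[image_id, label, confidence, x_min, y_min, x_max, y_max]` with coordinates normalized to the range 0 to 1. The function name `extract_boxes` and the sample rows are hypothetical and only serve as an illustration.

```python
def extract_boxes(detections, frame_width, frame_height, threshold=0.7):
    """Keep confident detections and scale them to pixel coordinates.

    Assumption: each row is
    [image_id, label, confidence, x_min, y_min, x_max, y_max]
    with coordinates normalized to [0, 1].
    """
    boxes = []
    for _image_id, _label, confidence, x_min, y_min, x_max, y_max in detections:
        if confidence < threshold:
            continue  # discard low-confidence detections
        boxes.append((
            int(x_min * frame_width),
            int(y_min * frame_height),
            int(x_max * frame_width),
            int(y_max * frame_height),
        ))
    return boxes


# Two made-up detections: only the first one passes the 0.7 threshold.
rows = [
    [0, 1, 0.95, 0.25, 0.50, 0.50, 0.75],
    [0, 1, 0.40, 0.50, 0.50, 0.75, 0.75],
]
boxes = extract_boxes(rows, frame_width=1920, frame_height=1080)
print(boxes)
```

The resulting pixel boxes are what a drawing call such as OpenCV's `cv2.rectangle` would consume when visualizing the result.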

### 1.1 Downloading Model

Write a command to download the **person-detection-retail-0013** model in IR format.

In [None]:
## Write your command here

# !wget https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1/person-detection-retail-0013/FP16/person-detection-retail-0013.xml
# !wget https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1/person-detection-retail-0013/FP16/person-detection-retail-0013.bin    
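
As an alternative to `wget`, the same two files could be fetched from Python. The sketch below reuses the URL layout from the commented commands above; whether that server still hosts the files, and whether other precisions follow the same layout, are assumptions. The helper names `model_urls` and `download_model` are hypothetical.

```python
import os
import urllib.request

# Base URL taken from the wget commands in the cell above.
BASE_URL = ("https://download.01.org/opencv/2020/openvinotoolkit/2020.1/"
            "open_model_zoo/models_bin/1")


def model_urls(name, precision="FP16"):
    """Return the .xml and .bin URLs for one model and precision."""
    return [f"{BASE_URL}/{name}/{precision}/{name}.{ext}" for ext in ("xml", "bin")]


def download_model(name, precision="FP16", target_dir="model"):
    """Download both IR files into target_dir (requires network access)."""
    os.makedirs(target_dir, exist_ok=True)
    for url in model_urls(name, precision):
        destination = os.path.join(target_dir, os.path.basename(url))
        urllib.request.urlretrieve(url, destination)


urls = model_urls("person-detection-retail-0013")
print("\n".join(urls))
```

Calling `download_model("person-detection-retail-0013")` would place both files in the `model` folder that the next cell lists.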


In [2]:
!ls -la model

total 1768
drwx------ 2 u40443 u40443    4096 Apr  4 14:45 .
drwxr-xr-x 9 u40443 u40443    4096 Apr 10 11:06 ..
-rw------- 1 u40443 u40443 1445736 Jan 24 03:21 person-detection-retail-0013.bin
-rw------- 1 u40443 u40443  354107 Jan 24 03:21 person-detection-retail-0013.xml


## Step 2 : Inference on a video

By now you should have already completed the inference code in <a href="person_detect.py">person_detect.py</a>. If you haven't done so already, then you should do it now.

The Python code should take in command line arguments for video, model etc.

While the choice of command line options is up to you, the command below is an example:

```
python3 person_detect.py -m ${MODELPATH} \
                         -i ${INPUT_FILE} \
                         -o ${OUTPUT_FILE} \
                         -d ${DEVICE} \
                         -pt ${THRESHOLD}
```

##### Description of the arguments used in the example command
* -m location of the pre-trained IR model, which has been pre-processed using the Model Optimizer. This argument supports both FP32 and FP16 models, which target different hardware
* -i location of the input video stream
* -o location where the output file with inference needs to be stored (results/[device])
* -d type of Hardware Acceleration (CPU, GPU, MYRIAD, HDDL or HETERO:FPGA,CPU)
* -pt probability threshold value for the person detection
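
A minimal sketch of a parser for the example flags above, using Python's standard `argparse` module. The long option names (`--model`, `--input`, `--output`, `--device`, `--prob_threshold`) and the default values are assumptions made for illustration, since the option names are left up to you.

```python
import argparse


def build_parser():
    """Build a parser for the example flags -m, -i, -o, -d and -pt."""
    parser = argparse.ArgumentParser(description="Person detection on a video")
    parser.add_argument("-m", "--model", required=True,
                        help="location of the pre-trained IR model")
    parser.add_argument("-i", "--input", required=True,
                        help="location of the input video stream")
    parser.add_argument("-o", "--output", default="results",
                        help="location where the output file is stored")
    parser.add_argument("-d", "--device", default="CPU",
                        help="CPU, GPU, MYRIAD, HDDL or HETERO:FPGA,CPU")
    parser.add_argument("-pt", "--prob_threshold", type=float, default=0.6,
                        help="probability threshold for person detection")
    return parser


# Parse an explicit argument list so the example also runs inside a notebook.
args = build_parser().parse_args([
    "-m", "model/person-detection-retail-0013.xml",
    "-i", "resources/retail.mp4",
    "-o", "results/retail/CPU",
    "-d", "CPU",
    "-pt", "0.7",
])
print(args.device, args.prob_threshold)
```

In a script, calling `parse_args()` without an argument list reads the options from the command line instead.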

### 2.1 Creating job file

To run inference on the video, we need more compute power.
We will run the workload on several edge compute nodes present in the IoT DevCloud. We will send work to the edge compute nodes by submitting the corresponding non-interactive jobs into a queue. For each job, we will specify the type of the edge compute server that must be allocated for the job.

The job file is written in Bash, and will be executed directly on the edge compute node.
You will have to create the job file by running the cell below.

In [3]:
%%writefile person_detect_job.sh
# The writefile magic command can be used to create and save a file

MODEL=$1
DEVICE=$2
VIDEO=$3
QUEUE=$4
OUTPUT=$5
PEOPLE=$6
THRESHOLD=$7

mkdir -p "$OUTPUT"

if [ "$DEVICE" = "HETERO:FPGA,CPU" ]; then
    #Environment variables and compilation for edge compute nodes with FPGAs
    source /opt/intel/init_openvino.sh
    aocl program acl0 /opt/intel/openvino/bitstreams/a10_vision_design_sg1_bitstreams/2019R4_PL1_FP16_MobileNet_Clamp.aocx
fi

echo "Running person_detect.py"
echo "Model: $MODEL"
echo "Device: $DEVICE"
echo "Video: $VIDEO"
echo "Queue: $QUEUE"
echo "Output: $OUTPUT"
echo "People: $PEOPLE"
echo "Threshold: $THRESHOLD"

python3 person_detect.py  --model ${MODEL} \
                          --visualise \
                          --queue_param ${QUEUE} \
                          --device ${DEVICE} \
                          --video ${VIDEO} \
                          --output_path ${OUTPUT} \
                          --max_people ${PEOPLE} \
                          --threshold ${THRESHOLD}

Overwriting person_detect_job.sh


### 2.2 Understand how jobs are submitted into the queue

Now that we have the job script, we can submit the jobs to edge compute nodes. In the IoT DevCloud, you can do this using the `qsub` command.
We can submit the job to several different types of edge compute nodes simultaneously or to just one node at a time.

There are three options of the `qsub` command that we use for this:
- `-l` : this option lets us select the number and the type of nodes using `nodes={node_count}:{property}`.
- `-F` : this option lets us send arguments to the bash script.
- `-N` : this option lets us name the job so that it is easier to distinguish between them.

Example using `qsub` command:

`!qsub person_detect_job.sh -l nodes=1:tank-870:i5-6500te -d . -F "models/intel/PATH-TO-MODEL DEVICE resources/retail.mp4 bin/queue_param/retail.npy results/retail/DEVICE MAX-PEOPLE THRESHOLD" -N JOB-NAME`

You will need to change the following placeholders, `models/intel/PATH-TO-MODEL`, `DEVICE`, `results/retail/DEVICE`, `MAX-PEOPLE`, `THRESHOLD`, and `JOB-NAME`, to the appropriate values.
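
To make the structure of the command explicit, here is a small sketch that assembles it from its parts. The helper `build_qsub_command` is hypothetical, and the values passed to it are examples. The seven forwarded values match the positional parameters `$1` to `$7` read by the job script.

```python
import shlex


def build_qsub_command(node, script_args, job_name, script="person_detect_job.sh"):
    """Return the qsub command line as one shell-safe string."""
    # Everything passed via -F reaches the job script as $1, $2, ...
    forwarded = " ".join(str(arg) for arg in script_args)
    parts = ["qsub", script,
             "-l", f"nodes=1:{node}",   # one node with the requested properties
             "-d", ".",                 # run the job in the current directory
             "-F", forwarded,
             "-N", job_name]
    return " ".join(shlex.quote(part) for part in parts)


command = build_qsub_command(
    node="tank-870:i5-6500te",
    script_args=["model/person-detection-retail-0013.xml", "CPU",
                 "resources/retail.mp4", "bin/queue_param/retail.npy",
                 "results/retail/CPU", 2, 0.7],
    job_name="RETAIL_CPU",
)
print(command)
```

Quoting with `shlex.quote` keeps the space-separated `-F` string together as a single argument when the command is passed to a shell.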

If you are curious to see the available types of nodes on the IoT DevCloud, run the following optional cell.

In [None]:
!pbsnodes | grep compnode | awk '{print $3}' | sort | uniq -c

Here, the properties describe the node, and the number on the left is the number of available nodes of that architecture.

### 2.3 Job queue submission

Each of the cells below should submit a job to different edge compute nodes.
The output of the cell is the `JobID` of your job, which you can use to track progress of a job.

**Note** You can submit all jobs at once or one at a time. 

After submission, they will go into a queue and run as soon as the requested compute resources become available. 
(tip: **shift+enter** will run the cell and automatically move you to the next cell. So you can hit **shift+enter** multiple times to quickly run multiple cells)

If your job successfully runs and completes, it will output a video, `out.mp4`, and a text file, `stats.txt`, in the `results/retail/DEVICE` folder.

#### Constants & helper functions

In [4]:
MODEL_PATH = 'model/person-detection-retail-0013.xml'
VIDEO = 'resources/retail.mp4'
QUEUE = 'bin/queue_param/retail.npy'
PEOPLE = 2
THRESHOLD = 0.7


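def submit(device, node):
    """Submit person_detect_job.sh to one edge node and return the job number."""
    # Use a short suffix for the FPGA target so job and folder names stay simple
    device_postfix = device
    if device.startswith('HETERO:FPGA'):
        device_postfix = 'FPGA'

    job_name = 'RETAIL_' + device_postfix
    output = 'results/retail/' + device_postfix
    # Positional arguments read by the job script as $1..$7
    params = '{} {} {} {} {} {} {}'.format(MODEL_PATH, device, VIDEO, QUEUE, output, PEOPLE, THRESHOLD)

    job_id = !qsub person_detect_job.sh -l nodes=1:{node} -d . -F "{params}" -N {job_name}
    # qsub prints an ID such as '32111.v-qsvr-1.devcloud-edge'; keep the numeric part
    job_id_number = job_id[0].split('.')[0]

    print(f'Job ID: {job_id}, #: {job_id_number}')
    return job_id_number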

#### Submitting to an edge compute node with an Intel® CPU
In the cell below, write a script to submit a job to an <a 
    href="https://software.intel.com/en-us/iot/hardware/iei-tank-dev-kit-core">IEI 
    Tank* 870-Q170</a> edge node with an <a 
    href="https://ark.intel.com/products/88186/Intel-Core-i5-6500TE-Processor-6M-Cache-up-to-3-30-GHz-">Intel® Core™ i5-6500TE processor</a>. The inference workload will run on the CPU.

In [5]:
#Submit job to the queue
job_number_cpu = submit('CPU', 'tank-870:i5-6500te')

Job ID: ['32111.v-qsvr-1.devcloud-edge'], #: 32111


#### Submitting to an edge compute node with Intel® Core CPU and using the onboard Intel® GPU
In the cell below, write a script to submit a job to an <a 
    href="https://software.intel.com/en-us/iot/hardware/iei-tank-dev-kit-core">IEI 
    Tank* 870-Q170</a> edge node with an <a href="https://ark.intel.com/products/88186/Intel-Core-i5-6500TE-Processor-6M-Cache-up-to-3-30-GHz-">Intel® Core i5-6500TE</a>. The inference workload will run on the Intel® HD Graphics 530 card integrated with the CPU.

In [6]:
#Submit job to the queue
job_number_gpu = submit('GPU', 'tank-870:i5-6500te:intel-hd-530')

Job ID: ['32112.v-qsvr-1.devcloud-edge'], #: 32112


#### Submitting to an edge compute node with Intel® NCS 2 (Neural Compute Stick 2)
In the cell below, write a script to submit a job to an <a 
    href="https://software.intel.com/en-us/iot/hardware/iei-tank-dev-kit-core">IEI 
    Tank 870-Q170</a> edge node with an <a href="https://ark.intel.com/products/88186/Intel-Core-i5-6500TE-Processor-6M-Cache-up-to-3-30-GHz-">Intel Core i5-6500TE CPU</a>. The inference workload will run on an <a 
    href="https://software.intel.com/en-us/neural-compute-stick">Intel Neural Compute Stick 2</a> installed in this node.

In [7]:
#Submit job to the queue
job_number_ncs2 = submit('MYRIAD', 'tank-870:i5-6500te:intel-ncs2')

Job ID: ['32113.v-qsvr-1.devcloud-edge'], #: 32113


#### Submitting to an edge compute node with IEI Mustang-F100-A10 (Intel® Arria® 10 FPGA)
In the cell below, write a script to submit a job to an <a 
    href="https://software.intel.com/en-us/iot/hardware/iei-tank-dev-kit-core">IEI 
    Tank 870-Q170</a> edge node with an <a href="https://ark.intel.com/products/88186/Intel-Core-i5-6500TE-Processor-6M-Cache-up-to-3-30-GHz-">Intel Core™ i5-6500TE CPU</a>. The inference workload will run on the <a href="https://www.ieiworld.com/mustang-f100/en/">IEI Mustang-F100-A10</a> card installed in this node.

In [8]:
#Submit job to the queue
job_number_fpga = submit('HETERO:FPGA,CPU', 'tank-870:i5-6500te:iei-mustang-f100-a10')

Job ID: ['32114.v-qsvr-1.devcloud-edge'], #: 32114


### 2.4 Check if the jobs are done

Use a command to check the status of the jobs that were submitted.

Column `S` shows the state of your running jobs.

For example:
- If `JOB ID` is in Q state, it is in the queue waiting for available resources.
- If `JOB ID` is in R state, it is running.

In [9]:
# Enter your command here to check the status of your jobs
%matplotlib inline
liveQstat()

***Wait!***

Please wait for the inference jobs and video rendering to complete before proceeding to the next step.

## Step 3: View Results

Write a short utility script that will display these videos within the notebook.

*Tip*: See `demoutils.py` if you want to understand how the results are displayed in the notebook.

In [10]:
#Write your script for Intel Core CPU video results here
!cat RETAIL_CPU.o{job_number_cpu}
videoHTML('Results on CPU', ['./results/retail/CPU/out.mp4'])


########################################################################
#      Date:           Fri Apr 10 11:09:52 PDT 2020
#    Job ID:           32111.v-qsvr-1.devcloud-edge
#      User:           u40443
# Resources:           neednodes=1:tank-870:i5-6500te,nodes=1:tank-870:i5-6500te,walltime=01:00:00
########################################################################

[setupvars.sh] OpenVINO environment initialized
Running person_detect.py
Model: model/person-detection-retail-0013.xml
Device: CPU
Video: resources/retail.mp4
Queue: bin/queue_param/retail.npy
Output: results/retail/CPU
People: 2
Threshold: 0.7
Model loaded
Core created
Network loaded
Input key: data input shape: [1, 3, 320, 544]
Output key: detection_out
Model loaded, loading time: 0:00:01.587233
Total frames 167, processing time: 0:00:04.516149

########################################################################
# End of output for job 32111.v-qsvr-1.devcloud-edge
# Date: Fri Ap

In [11]:
#Write your script for Intel Core CPU +GPU video results here
!cat RETAIL_GPU.o{job_number_gpu}
videoHTML('Results on GPU', ['./results/retail/GPU/out.mp4'])


########################################################################
#      Date:           Fri Apr 10 11:09:56 PDT 2020
#    Job ID:           32112.v-qsvr-1.devcloud-edge
#      User:           u40443
# Resources:           neednodes=1:tank-870:i5-6500te:intel-hd-530,nodes=1:tank-870:i5-6500te:intel-hd-530,walltime=01:00:00
########################################################################

[setupvars.sh] OpenVINO environment initialized
Running person_detect.py
Model: model/person-detection-retail-0013.xml
Device: GPU
Video: resources/retail.mp4
Queue: bin/queue_param/retail.npy
Output: results/retail/GPU
People: 2
Threshold: 0.7
Model loaded
Core created
Network loaded
Input key: data input shape: [1, 3, 320, 544]
Output key: detection_out
Model loaded, loading time: 0:00:36.118535
Total frames 167, processing time: 0:00:05.499037

########################################################################
# End of output for job 32112.v-qsvr-1.dev

In [12]:
#Write your script for Intel CPU + Intel NCS2 video results here
!cat RETAIL_MYRIAD.o{job_number_ncs2}
videoHTML('Results on NCS2', ['./results/retail/MYRIAD/out.mp4'])


########################################################################
#      Date:           Fri Apr 10 11:10:00 PDT 2020
#    Job ID:           32113.v-qsvr-1.devcloud-edge
#      User:           u40443
# Resources:           neednodes=1:tank-870:i5-6500te:intel-ncs2,nodes=1:tank-870:i5-6500te:intel-ncs2,walltime=01:00:00
########################################################################

[setupvars.sh] OpenVINO environment initialized
Running person_detect.py
Model: model/person-detection-retail-0013.xml
Device: MYRIAD
Video: resources/retail.mp4
Queue: bin/queue_param/retail.npy
Output: results/retail/MYRIAD
People: 2
Threshold: 0.7
Model loaded
Core created
Network loaded
Input key: data input shape: [1, 3, 320, 544]
Output key: detection_out
Model loaded, loading time: 0:00:02.566522
Total frames 167, processing time: 0:00:24.505567

########################################################################
# End of output for job 32113.v-qsvr-1.d

In [13]:
#Write your script for Intel® Arria® 10 FPGA video results here
!cat RETAIL_FPGA.o{job_number_fpga}
videoHTML('Results on FPGA', ['./results/retail/FPGA/out.mp4'])


########################################################################
#      Date:           Fri Apr 10 11:10:02 PDT 2020
#    Job ID:           32114.v-qsvr-1.devcloud-edge
#      User:           u40443
# Resources:           neednodes=1:tank-870:i5-6500te:iei-mustang-f100-a10,nodes=1:tank-870:i5-6500te:iei-mustang-f100-a10,walltime=01:00:00
########################################################################

[setupvars.sh] OpenVINO environment initialized
INTELFPGAOCLSDKROOT is set to /opt/altera/aocl-pro-rte/aclrte-linux64. Using that.

aoc was not found, but aocl was found. Assuming only RTE is installed.

AOCL_BOARD_PACKAGE_ROOT is set to /opt/intel/openvino/bitstreams/a10_vision_design_sg1_bitstreams/BSP/a10_1150_sg1. Using that.
Adding /opt/altera/aocl-pro-rte/aclrte-linux64/bin to PATH
Adding /opt/altera/aocl-pro-rte/aclrte-linux64/host/linux64/lib to LD_LIBRARY_PATH
Adding /opt/intel/openvino/bitstreams/a10_vision_design_sg1_bitstreams/BSP/a10_1150_sg1

## Step 4: Assess Performance

This is where you need to write code to assess how well your model is performing. You will use the `stats.txt` file located in your results directory.
You need to compare the following metrics for the model across all 4 devices:

- Model loading time
- Average Inference Time
- FPS

Show your results in the form of a bar chart using matplotlib.
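
The average inference time and the FPS both follow from the frame count and the total processing time that each job logs. The sketch below works through the CPU run (`Total frames 167, processing time: 0:00:04.516149`). It assumes the time string uses the `H:MM:SS.ffffff` layout seen in the job output; the helper names are hypothetical.

```python
from datetime import timedelta


def parse_duration(text):
    """Convert an 'H:MM:SS.ffffff' string to seconds."""
    hours, minutes, seconds = text.split(":")
    duration = timedelta(hours=int(hours), minutes=int(minutes),
                         seconds=float(seconds))
    return duration.total_seconds()


def inference_metrics(total_frames, processing_time):
    """Return (milliseconds per frame, frames per second)."""
    seconds = parse_duration(processing_time)
    ms_per_frame = 1000.0 * seconds / total_frames
    fps = total_frames / seconds
    return ms_per_frame, fps


# Values taken from the CPU job output shown earlier.
ms, fps = inference_metrics(167, "0:00:04.516149")
print(f"{ms:.3f} ms/frame, {fps:.3f} FPS")  # 27.043 ms/frame, 36.978 FPS
```

Note that the two numbers are reciprocals of each other (up to the factor of 1000), so they carry the same information in different units.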

In [17]:
#TODO Write your code here for model loading time on all 4 device types
#TODO Write your code here for model average inference time on all 4 device types
#TODO Write your code here for model FPS on all 4 device types

def get_stats(device):
    output_file = 'results/retail/' + device + '/stats.txt'

    load_time = !cat {output_file} | grep 'Model loading time'
    ind = len('[\'Model loading time: ')
    load_time = str(load_time)[ind:-2]
    
    average_frame_time = !cat {output_file} | grep 'Inference time per frame'
    ind = len('[\'Inference time per frame (ms): ')
    average_frame_time = str(average_frame_time)[ind:-2]

    fps = !cat {output_file} | grep 'Inference FPS: '
    ind = len('[\'Inference FPS: ')
    fps = str(fps)[ind:-2]
    
    return load_time, average_frame_time, fps
    

load_time_cpu, ave_cpu, fps_cpu = get_stats('CPU')
load_time_gpu, ave_gpu, fps_gpu = get_stats('GPU')
load_time_ncs2, ave_ncs2, fps_ncs2 = get_stats('MYRIAD')
load_time_fpga, ave_fpga, fps_fpga = get_stats('FPGA')

print('Device\t Loading time \t   Inference/frame(ms)\t Inference FPS')    
print('{}\t {}\t   {}\t\t {}'.format('CPU', load_time_cpu, ave_cpu, fps_cpu))
print('{}\t {}\t   {}\t\t {}'.format('GPU', load_time_gpu, ave_gpu, fps_gpu))
print('{}\t {}\t   {}\t\t {}'.format('MYRIAD', load_time_ncs2, ave_ncs2, fps_ncs2))
print('{}\t {}\t   {}\t\t {}'.format('FPGA', load_time_fpga, ave_fpga, fps_fpga))



Device	 Loading time 	   Inference/frame(ms)	 Inference FPS
CPU	 0:00:01.587233	   27.043		 36.978
GPU	 0:00:36.118535	   32.928		 30.369
MYRIAD	 0:00:02.566522	   146.740		 6.815
FPGA	 0:00:29.120876	   19.558		 51.130
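
One possible way to turn these numbers into the requested bar charts is sketched below. The values are copied from the table above; the names `STATS`, `to_seconds`, and `plot_stats` are hypothetical. matplotlib is imported inside the plotting function so that the conversion helper can be used on its own.

```python
# (loading time, inference time per frame in ms, FPS), copied from the table above.
STATS = {
    "CPU": ("0:00:01.587233", 27.043, 36.978),
    "GPU": ("0:00:36.118535", 32.928, 30.369),
    "MYRIAD": ("0:00:02.566522", 146.740, 6.815),
    "FPGA": ("0:00:29.120876", 19.558, 51.130),
}


def to_seconds(duration):
    """Convert an 'H:MM:SS.ffffff' string to seconds."""
    hours, minutes, seconds = duration.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def plot_stats(stats):
    """Draw one bar chart per metric, with one bar per device."""
    import matplotlib.pyplot as plt

    devices = list(stats)
    series = [
        ("Model loading time (s)", [to_seconds(stats[d][0]) for d in devices]),
        ("Inference time per frame (ms)", [stats[d][1] for d in devices]),
        ("Inference FPS", [stats[d][2] for d in devices]),
    ]
    fig, axes = plt.subplots(1, len(series), figsize=(15, 4))
    for ax, (title, values) in zip(axes, series):
        ax.bar(devices, values)
        ax.set_title(title)
        ax.set_xlabel("Device")
    fig.tight_layout()
    return fig
```

Calling `plot_stats(STATS)` in a notebook cell should render the three charts side by side.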
