intel/media-analytics

Media Analytics Software Stack

This project provides samples that demonstrate the use of Intel GPUs in simplified real-life media analytics scenarios. It leverages a Software Stack which consists of the following ingredients:

doc/pic/media-analytics-sw-stack.png

The provided samples focus on the key aspects of proper Intel software setup and integration with other popular tools you will likely use in your final product. In particular, they demonstrate how to connect media components to AI inference and share video data efficiently without CPU involvement.

This repo provides two groups of samples. One focuses on GStreamer command-line examples connecting media and inference elements; the other shows how to write your own C++ application with OpenVINO and oneVPL.

In these samples we focus on Object Detection + Object Classification pipelines which use the models listed further below.

See the architecture diagram for the VA Sample below.

doc/pic/va-sample-full-pipeline-arch.png

To run these samples, you need to:

  1. Have a system with an Intel GPU supported by the stack (refer to the respective component documentation for the list of supported GPUs)
  2. Run a Linux OS with an up-to-date kernel supporting the underlying Intel GPU
  3. Have Docker version 17.05 or later installed and configured (see instructions)
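A quick sanity check of these prerequisites can be scripted; a minimal sketch, assuming the default render node path used later in this README:

```shell
# Sanity-check the prerequisites before building the images.

# 1. Look for an Intel GPU render node (the path may differ on multi-GPU systems)
DEVICE=${DEVICE:-/dev/dri/renderD128}
if [ -e "$DEVICE" ]; then
  echo "GPU render node found: $DEVICE"
else
  echo "no render node at $DEVICE - check the GPU driver installation"
fi

# 2. Verify Docker is installed (version 17.05 or later is required)
if command -v docker >/dev/null 2>&1; then
  docker --version
else
  echo "docker not found - install it before building the images"
fi
```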

Coming soon

The above drivers and Linux OSes might not currently support ATS-M Intel GPU cards. Specific setup instructions for systems with these cards will be provided as soon as they are available.

These samples use raw AVC or HEVC input video streams. Streams in container formats (like .mp4) are not supported by VA Samples. Note that GStreamer elements supporting such streams might not be included in the docker images either.
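If your source video is in a container format, you can demux it into a raw elementary stream on the host first. A minimal sketch using ffmpeg (ffmpeg itself and the `input.mp4` file name are assumptions, not part of the samples):

```shell
# Demux an H.264 elementary stream out of an mp4 container without re-encoding.
# h264_mp4toannexb converts the stream to the Annex B format raw decoders expect.
demux_h264() {
  ffmpeg -i "$1" -c:v copy -bsf:v h264_mp4toannexb -an -f h264 "$2"
}
# Usage (input.mp4 is a placeholder for your own file):
#   demux_h264 input.mp4 output.h264
```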

Please make sure to select input video streams containing objects which the AI models are capable of recognizing. Arbitrary input streams might not contain such objects. For reference, we use the following models in our samples:

  1. Object detection with SSD MobileNetV1:
    • Model used: SSD MobileNetV1
    • Default resolution: 300x300
  2. Object detection with Yolov4:
    • Model used: Yolov4
    • Default resolution: 608x608
    • Recommended resolution: 416x416 (model supports dynamic reshape)
  3. Object classification:
    • Model used: Resnet50 v1.5
    • Default resolution: 224x224

Note that the input video will be converted (scaled and/or color converted to NV12 or RGBP) to satisfy the inference model requirements highlighted above.

Hint: to install Docker, refer to the Docker install instructions.

Samples are available in the form of Docker images which you need to build locally.

To build the docker image with VA Samples, run:

docker build \
  $(env | grep -E '(_proxy=|_PROXY)' | sed 's/^/--build-arg /') \
  --file docker/va-samples/ubuntu20.04/intel-gfx/Dockerfile \
  -t intel-va-samples \
  .
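The `env | grep | sed` fragment above forwards any proxy settings from your environment into the build as `--build-arg` flags; you can preview what it expands to (the proxy URL here is just a demonstration value):

```shell
# Preview the --build-arg flags generated from proxy environment variables.
# A sample http_proxy value is set inline purely for demonstration.
http_proxy=http://proxy.example.com:911 \
  env | grep -E '(_proxy=|_PROXY)' | sed 's/^/--build-arg /'
```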

To build the docker image with DL Streamer, run:

docker build \
  $(env | grep -E '(_proxy=|_PROXY)' | sed 's/^/--build-arg /') \
  --file docker/gst-gva/ubuntu20.04/intel-gfx/Dockerfile \
  -t intel-gva-samples \
  .

To build the "developer" docker image (which contains both VA Samples and DL Streamer samples, plus tools to download/convert/quantize models), run:

docker build \
  $(env | grep -E '(_proxy=|_PROXY)' | sed 's/^/--build-arg /') \
  --file docker/devel/ubuntu20.04/intel-gfx/Dockerfile \
  -t intel-va-devel \
  .

The above dockerfiles self-build OpenVINO and DL Streamer and fetch binary packages for the media and compute stacks from the Intel Graphics Package Repository.

The dockerfiles are generated from m4 templates via the cmake build system. Refer to the generating dockerfiles document for further details.

Note that the examples given below for VA Samples and DL Streamer are applicable to the "developer" container build as well. For details on the additional tools exposed in the "developer" container, see the OpenVINO tools article.

VA Samples leverage Intel Media SDK and OpenVINO to run the media analytics pipeline.

To run VA Samples, enter the container you've built, allowing GPU access from inside the container:

DEVICE=${DEVICE:-/dev/dri/renderD128}
DEVICE_GRP=$(ls -g $DEVICE | awk '{print $3}' | \
  xargs getent group | awk -F: '{print $3}')
docker run --rm -it \
  -e DEVICE=$DEVICE --device $DEVICE --group-add $DEVICE_GRP \
  --cap-add SYS_ADMIN \
  intel-va-samples
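The `ls -g | awk | getent` chain above resolves the numeric group ID that owns the render node; an equivalent lookup can be done with a single `stat` call (`/dev/null` stands in for the render node here so the snippet also runs on machines without a GPU):

```shell
# stat -c %g prints the numeric group ID that owns a file; docker run needs
# this GID (via --group-add) to grant GPU access inside the container.
# On a real system: DEVICE_GRP=$(stat -c %g /dev/dri/renderD128)
DEVICE_GRP=$(stat -c %g /dev/null)
echo "numeric group id: $DEVICE_GRP"
```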

The major artifact produced by the samples is the output.csv file with the following format:

channel#, frame#, object#, left, top, right, bottom, id, probability

where

  • channel# - inference channel number
  • frame# - frame number, starting from 1
  • object# - index of the object detected on a frame
  • left, top, right, bottom - coordinates of the detected object, normalized by frame width/height
  • id - classification id
  • probability - probability with which the object was detected or classified

doc/pic/va-sample-od-arch.png

Object detection can be executed with:

ObjectDetection -c 1 -d -t 5 \
  -codec 264 -i /opt/data/embedded/pexels-1388365.h264 \
  -m_detect $DEMO_MODELS/ssd_mobilenet_v1_coco_INT8/ssd_mobilenet_v1_coco

Example of output.csv file:

$ cat output.csv | head -10
0, 1, 0, 0.002710, 0.538312, 0.281266, 0.924779, 3, 0.920898
0, 1, 1, 0.889920, 0.329635, 0.934364, 0.382974, 10, 0.903809
0, 1, 2, 0.766202, 0.336393, 0.796227, 0.377465, 10, 0.833984
0, 2, 0, 0.003211, 0.538931, 0.284941, 0.923290, 3, 0.932617
0, 2, 1, 0.890183, 0.327957, 0.934696, 0.380550, 10, 0.904785
0, 2, 2, 0.767770, 0.336121, 0.796690, 0.377289, 10, 0.844727
0, 3, 0, 0.003667, 0.539677, 0.288079, 0.924863, 3, 0.939453
0, 3, 1, 0.890039, 0.327210, 0.934449, 0.380946, 10, 0.916016
0, 3, 2, 0.769145, 0.334644, 0.797796, 0.374019, 10, 0.857422
0, 4, 0, 0.003526, 0.543937, 0.292021, 0.921377, 3, 0.928223

For detailed command line options see man ObjectDetection.
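The flat CSV format is easy to post-process with standard tools. A small sketch filtering detections by probability (the rows in the heredoc are copied from the sample output above; the 0.9 threshold is arbitrary):

```shell
# Reproduce a few rows of the sample output.csv in a temporary file,
# then keep only detections whose probability (9th field) exceeds 0.9.
cat > sample_output.csv <<'EOF'
0, 1, 0, 0.002710, 0.538312, 0.281266, 0.924779, 3, 0.920898
0, 1, 1, 0.889920, 0.329635, 0.934364, 0.382974, 10, 0.903809
0, 1, 2, 0.766202, 0.336393, 0.796227, 0.377465, 10, 0.833984
EOF
awk -F', ' '$9 > 0.9' sample_output.csv   # prints the first two rows
```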

doc/pic/va-sample-full-pipeline-arch.png

End to end pipeline can be executed with:

SamplePipeline -c 1 -b 1 \
  -codec 264 -i /opt/data/embedded/pexels-1388365.h264 \
  -m_classify $DEMO_MODELS/resnet-50-tf_INT8/resnet-50-tf_i8 \
  -m_detect $DEMO_MODELS/ssd_mobilenet_v1_coco_INT8/ssd_mobilenet_v1_coco

If a frame contains multiple objects, each one is classified separately. Example of output.csv file:

$ cat output.csv | head -10
0, 1, 0, 0.002710, 0.538312, 0.281266, 0.924779, 657, 0.252905
0, 1, 1, 0.889920, 0.329635, 0.934364, 0.382974, 921, 0.696968
0, 1, 2, 0.766202, 0.336393, 0.796227, 0.377465, 921, 0.390088
0, 2, 0, 0.003211, 0.538931, 0.284941, 0.923290, 437, 0.278607
0, 2, 1, 0.890183, 0.327957, 0.934696, 0.380550, 921, 0.398567
0, 2, 2, 0.767770, 0.336121, 0.796690, 0.377289, 921, 0.509662
0, 3, 0, 0.003667, 0.539677, 0.288079, 0.924863, 657, 0.354671
0, 3, 1, 0.890039, 0.327210, 0.934449, 0.380946, 921, 0.351903
0, 3, 2, 0.769145, 0.334644, 0.797796, 0.374019, 921, 0.531966
0, 4, 0, 0.003526, 0.543937, 0.292021, 0.921377, 657, 0.251537

For detailed command line options see man SamplePipeline.

DL Streamer is a streaming media analytics framework, based on the GStreamer multimedia framework, for creating complex media analytics pipelines. It ensures pipeline interoperability and provides optimized media and inference operations using the Intel® Distribution of OpenVINO™ Toolkit Inference Engine backend.

To run DL Streamer samples, enter the container you've built, allowing GPU access from inside the container:

DEVICE=${DEVICE:-/dev/dri/renderD128}
DEVICE_GRP=$(ls -g $DEVICE | awk '{print $3}' | \
  xargs getent group | awk -F: '{print $3}')
docker run --rm -it \
  -e DEVICE=$DEVICE --device $DEVICE --group-add $DEVICE_GRP \
  --cap-add SYS_ADMIN \
  intel-gva-samples

Object detection can be executed with:

gst-launch-1.0 \
  filesrc location=/opt/data/embedded/pexels-1388365.h264 ! \
  h264parse ! \
  vaapih264dec ! \
  gvadetect model=$DEMO_MODELS/ssd_mobilenet_v1_coco_INT8/ssd_mobilenet_v1_coco.xml \
  device=GPU ! \
  gvafpscounter ! \
  fakesink async=false

A complex pipeline can be executed with:

gst-launch-1.0 \
  filesrc location=/opt/data/embedded/pexels-1388365.h264 ! \
  h264parse ! \
  vaapih264dec ! \
  gvadetect model=$DEMO_MODELS/ssd_mobilenet_v1_coco_INT8/ssd_mobilenet_v1_coco.xml \
  device=GPU ! \
  queue ! \
  gvaclassify model=$DEMO_MODELS/resnet-50-tf_INT8/resnet-50-tf_i8.xml \
  device=GPU ! \
  gvafpscounter ! \
  fakesink async=false

DL Streamer includes inference elements as well as some helper elements. Please refer to the official documentation for more info.

You can use the following command to get help on a GStreamer element:

gst-inspect-1.0 <element_name>

Here is a brief description of some of the elements.

  • gvainference - performs inference using the provided model and passes raw results down the pipeline.
  • gvadetect - performs object detection using the provided model.
  • gvaclassify - performs object classification using the provided model.
  • gvafpscounter - measures frames per second and outputs the result to the console.
  • gvametaconvert and gvametapublish - can be used to publish metadata (inference results) produced by samples to an output file.
  • gvapython - allows the user to execute custom Python code on GStreamer buffers and the metadata attached to them.

Example of a complex DL Streamer pipeline:

gst-launch-1.0 \
  filesrc location=/opt/data/embedded/pexels-1388365.h264 ! \
  h264parse ! \
  vaapih264dec ! \
  gvadetect model=$DEMO_MODELS/ssd_mobilenet_v1_coco_INT8/ssd_mobilenet_v1_coco.xml device=GPU ! \
  queue ! \
  gvaclassify model=$DEMO_MODELS/resnet-50-tf_INT8/resnet-50-tf_i8.xml device=GPU ! \
  gvafpscounter ! \
  gvametaconvert format=json json-indent=2 ! \
  gvametapublish file-path=/opt/data/artifacts/out.json ! \
  fakesink async=false

A media analytics pipeline is composed of media + inference components connected in a generic sense, as shown in the figure below.

doc/pic/media_analytics_pipeline_detailed.png

The solution pipeline can easily be tailored to different use cases by changing the inference components and AI models. For example, the pipeline can be used for pure classification, face recognition, smart city, VMC Summarization, and other use cases as shown in the figure below.

doc/pic/use_cases.png

Other samples: