
FFmpeg-DNN for Media AI workloads (openvino, pytorch, tensorflow)

FocusLuo edited this page Jun 6, 2024 · 4 revisions

Architecture Diagram

[Image: architecture diagram]

Model Support List

Based on open_model_zoo release 2023.0

* Detection

| Models | Framework | Validated on iGPU (Inference) | Validated on dGPU (Inference) |
|---|---|---|---|
| retinanet | OpenVINO | FP16, FP32 | FP16, FP32 |
| face-detection-0200 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-0202 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-0204 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-adas-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-retail-0004 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-retail-0005 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-retail-0044 | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-ssd | OpenVINO | FP16, FP32 | FP16, FP32 |
| pedestrian-and-vehicle-detector-adas-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| pedestrian-detection-adas-0002 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| pelee-coco | OpenVINO | FP16, FP32 | FP16, FP32 |
| person-detection-0200 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-0201 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-0202 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-retail-0013 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-vehicle-bike-detection-2000 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-vehicle-bike-detection-2001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-vehicle-bike-detection-2002 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-vehicle-bike-detection-crossroad-0078 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-vehicle-bike-detection-crossroad-1016 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| product-detection-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| ssd_mobilenet_v1_coco | OpenVINO | FP16, FP32 | FP16, FP32 |
| ssd_mobilenet_v1_fpn_coco | OpenVINO | FP16, FP32 | FP16, FP32 |
| ssd300 | OpenVINO | FP16, FP32 | FP16, FP32 |
| ssd512 | OpenVINO | FP16, FP32 | FP16, FP32 |
| ssdlite_mobilenet_v2 | OpenVINO | FP16, FP32 | FP16, FP32 |
| vehicle-detection-0200 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-detection-0201 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-detection-0202 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-detection-adas-0002 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-license-plate-detection-barrier-0106 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-license-plate-detection-barrier-0123 | OpenVINO | FP16, FP32 | FP16, FP32 |
| face-detection-0205 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-detection-0206 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| mobilenet-yolo-v4-syg | OpenVINO | FP16, FP32 | FP16, FP32 |
| person-detection-0106 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-0203 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-0301 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-0302 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-detection-0303 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-vehicle-bike-detection-2003 | OpenVINO | FP16, FP32 | FP16, FP32 |
| person-vehicle-bike-detection-2004 | OpenVINO | FP16, FP32 | FP16, FP32 |
| person-vehicle-bike-detection-crossroad-yolov3-1020 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| yolo-v3-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| yolo-v3-tiny-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| yolo-v4-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| yolo-v2-ava-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-ava-sparse-35-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-ava-sparse-70-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-tiny-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| yolo-v2-tiny-ava-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-tiny-ava-sparse-30-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-tiny-ava-sparse-60-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v2-tiny-vehicle-detection-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| yolo-v4-tiny-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| yolo-v1-tiny-tf | OpenVINO | FP16, FP32 | FP16, FP32 |

* Classification

| Models | Framework | Validated on iGPU (Inference) | Validated on dGPU (Inference) |
|---|---|---|---|
| alexnet | OpenVINO | FP16, FP32 | FP16, FP32 |
| anti-spoof-mn3 | OpenVINO | FP16, FP32 | FP16, FP32 |
| caffenet | OpenVINO | FP16, FP32 | FP16, FP32 |
| convnext-tiny | OpenVINO | FP16, FP32 | FP16, FP32 |
| densenet-121 | OpenVINO | FP16, FP32 | FP16, FP32 |
| densenet-121-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| dla-34 | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v1 | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v1-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v2 | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v2-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v3 | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v3-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| googlenet-v4-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| hbonet-0.25 | OpenVINO | FP16, FP32 | FP16, FP32 |
| hbonet-1.0 | OpenVINO | FP16, FP32 | FP16, FP32 |
| inception-resnet-v2-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| netvlad-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| nfnet-f0 | OpenVINO | FP16, FP32 | FP16, FP32 |
| octave-resnet-26-0.25 | OpenVINO | FP16, FP32 | FP16, FP32 |
| open-closed-eye-0001 | OpenVINO | FP16, FP32 | FP16, FP32 |
| regnetx-3.2gf | OpenVINO | FP16, FP32 | FP16, FP32 |
| se-inception | OpenVINO | FP16, FP32 | FP16, FP32 |
| se-resnet-50 | OpenVINO | FP16, FP32 | FP16, FP32 |
| se-resnext-50 | OpenVINO | FP16, FP32 | FP16, FP32 |
| shufflenet-v2-x0.5 | OpenVINO | FP16, FP32 | FP16, FP32 |
| shufflenet-v2-x1.0 | OpenVINO | FP16, FP32 | FP16, FP32 |
| Sphereface | OpenVINO | FP16, FP32 | FP16, FP32 |
| squeezenet1.0 | OpenVINO | FP16, FP32 | FP16, FP32 |
| squeezenet1.1 | OpenVINO | FP16, FP32 | FP16, FP32 |
| swin-tiny-patch4-window7-224 | OpenVINO | FP16, FP32 | FP16, FP32 |
| t2t-vit-14 | OpenVINO | FP16, FP32 | FP16, FP32 |
| vehicle-reid-0001 | OpenVINO | FP16, FP32 | FP16, FP32 |
| vgg16 | OpenVINO | FP16, FP32 | FP16, FP32 |
| vgg19 | OpenVINO | FP16, FP32 | FP16, FP32 |
| repvgg-a0 | OpenVINO | FP16, FP32 | FP16, FP32 |
| repvgg-b1 | OpenVINO | FP16, FP32 | FP16, FP32 |
| repvgg-b3 | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnest-50-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnet-18-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnet18-xnor-binary-onnx-0001 | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnet-34-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnet50-binary-0001 | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnet-50-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| resnet-50-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| efficientnet-v2-b0 | OpenVINO | FP16, FP32 | FP16, FP32 |
| efficientnet-v2-s | OpenVINO | FP16, FP32 | FP16, FP32 |
| efficientnet-b0-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| levit-128s | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v1-1.0-224-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v2-pytorch | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v1-0.25-128 | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v1-1.0-224 | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v2 | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v2-1.0-224 | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v2-1.4-224 | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v3-large-1.0-224-paddle | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v3-large-1.0-224-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v3-small-1.0-224-paddle | OpenVINO | FP16, FP32 | FP16, FP32 |
| mobilenet-v3-small-1.0-224-tf | OpenVINO | FP16, FP32 | FP16, FP32 |
| person-reidentification-retail-0277 | OpenVINO | FP16, FP32 | FP16, FP32 |
| person-reidentification-retail-0288 | OpenVINO | FP16, FP32 | FP16, FP32 |
| image-retrieval-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| weld-porosity-detection-0001 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| face-reidentification-retail-0095 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| facial-landmarks-35-adas-0002 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-attributes-recognition-crossroad-0234 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-reidentification-retail-0286 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| person-reidentification-retail-0287 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-attributes-recognition-barrier-0039 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |
| vehicle-attributes-recognition-barrier-0042 | OpenVINO | FP16-INT8, FP16, FP32 | FP16-INT8, FP16, FP32 |

* Image Processing

| Models | Framework | Validated on iGPU (Inference) | Validated on dGPU (Inference) |
|---|---|---|---|
| basicVSR | libtorch | - | inference |
| basicVSR++ | libtorch | - | inference |
| RVRT | libtorch | - | inference |
| fast-neural-style-mosaic-onnx | OpenVINO | FP16, FP32 | FP16, FP32 |
| fbcnn | OpenVINO | FP16, FP32 | FP16, FP32 |

command example

dnn_detect

ssd

ffmpeg -i input.png -vf format=rgb24,dnn_detect=dnn_backend=openvino:model=intel/person-detection-0200/FP32/person-detection-0200.xml -y output.jpg
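The model paths used on this page appear to follow a `<group>/<model>/<precision>/<model>.xml` layout, with the precision taken from the tables above. A minimal sketch under that assumption; `model_path` is a hypothetical helper, and the FP16 directory is assumed to exist next to FP32:

```shell
#!/bin/sh
# Hypothetical helper: build an IR model path from group, model name and precision.
# Assumes the <group>/<model>/<precision>/<model>.xml layout seen in the commands on this page.
model_path() {
    printf '%s/%s/%s/%s.xml\n' "$1" "$2" "$3" "$2"
}

# Same model as the ssd command above, switched to FP16.
MODEL=$(model_path intel person-detection-0200 FP16)

# Print the resulting command for inspection instead of running it.
printf 'ffmpeg -i input.png -vf format=rgb24,dnn_detect=dnn_backend=openvino:model=%s -y output.jpg\n' "$MODEL"
```

Printing the command first makes it easy to check the path before handing it to ffmpeg.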

yolov1-v2

ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model_type=yolo:cell_w=19:cell_h=19:nb_classes=80:model=public/yolo-v2-tf/FP32/yolo-v2-tf.xml:anchors="0.57273f&0.677385f&1.87446f&2.06253f&3.33843f&5.47434f&7.88282f&3.52778f&9.77052f&9.16828f" -y output.jpg

yolov3

ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model=public/yolo-v3-tf/FP32/yolo-v3-tf.xml:nb_classes=80:model_type=yolov3:anchors="10&13&16&30&33&23&30&61&62&45&59&119&116&90&156&198&373&326" -y output.jpg

yolov4

ffmpeg -v verbose -i input.jpg -vf dnn_detect=dnn_backend=openvino:model_type=yolov4:nb_classes=80:model=public/yolo-v4-tf/FP32/yolo-v4-tf.xml:anchors="36&75&76&55&72&146&142&110&192&243&459&401&12&16&19&36&40&28" -y output.jpg
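The anchors option in the YOLO commands above takes the anchor values joined with '&', which must be quoted so the shell does not treat '&' as a control operator. A minimal sketch of assembling that string from a readable list; `join_anchors` is a hypothetical helper, and the values are the ones from the yolov4 command above:

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with '&', the separator used by the anchors option.
# The body runs in a subshell so its variables do not leak into the caller.
join_anchors() (
    out=$1
    shift
    for value in "$@"; do
        out="$out&$value"
    done
    printf '%s\n' "$out"
)

# Anchor values copied from the yolov4 command above.
ANCHORS=$(join_anchors 36 75 76 55 72 146 142 110 192 243 459 401 12 16 19 36 40 28)

FILTER="dnn_detect=dnn_backend=openvino:model_type=yolov4:nb_classes=80:model=public/yolo-v4-tf/FP32/yolo-v4-tf.xml:anchors=$ANCHORS"

# Print the command; quoting the whole -vf argument keeps '&' away from the shell.
printf 'ffmpeg -i input.jpg -vf "%s" -y output.jpg\n' "$FILTER"
```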

dnn_classify

ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model=public/mobilenet-ssd/FP32/mobilenet-ssd.xml,dnn_classify=dnn_backend=openvino:model=public/googlenet-v3/FP32/googlenet-v3.xml:labels=public/googlenet-v3/imagenet_slim_labels.txt -y ./output.jpg

dnn_processing

ffmpeg -i input.jpg -vf format=rgb24,dnn_processing=dnn_backend=openvino:model=public/fast-neural-style-mosaic-onnx/FP32/fast-neural-style-mosaic-onnx.xml:input_resizable=1 -y output.png

command options

input and output

If you want to use only some of the model's I/O ports, or to use the ports in a particular order, specify them with the input and output options. Separate multiple port names with '&'.

ffmpeg -i input.jpg -vf format=rgb24,dnn_detect=dnn_backend=openvino:model=public/yolo-v3-tf/FP32/yolo-v3-tf.xml:nb_classes=80:input=input_1:output="conv2d_74/Conv2D/YoloRegion&conv2d_66/Conv2D/YoloRegion&conv2d_58/Conv2D/YoloRegion":model_type=yolov3:anchors="10&13&16&30&33&23&30&61&62&45&59&119&116&90&156&198&373&326" -y output.jpg

If the name of an input or output port contains a special character such as ':', escape it with a backslash ('\') and wrap the value in single quotes.

ffmpeg -v verbose -i input.jpg -vf dnn_detect=dnn_backend=openvino:model_type=yolov4:nb_classes=80:model=public/yolo-v4-tf/FP32/yolo-v4-tf.xml:input=image_input:output='Func/StatefulPartitionedCall/output/_542\:0&Func/StatefulPartitionedCall/output/_543\:0&Func/StatefulPartitionedCall/output/_544\:0':anchors="36&75&76&55&72&146&142&110&192&243&459&401&12&16&19&36&40&28" -y output.jpg
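Escaping port names by hand is error-prone when there are several of them. A minimal sketch that adds the backslash before each ':' automatically; `escape_port` is a hypothetical helper, and it handles only ':' as in the command above, not other characters that are special in filter descriptions:

```shell
#!/bin/sh
# Hypothetical helper: put a backslash before every ':' so the filter option
# parser does not read it as an option separator.
escape_port() {
    printf '%s\n' "$1" | sed 's/:/\\:/g'
}

# Port name taken from the yolov4 command above (unescaped form).
PORT=$(escape_port 'Func/StatefulPartitionedCall/output/_542:0')

# Show the option as it would appear inside the single-quoted filter value.
printf "output='%s'\n" "$PORT"
```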

showinfo

For the detect and classify filters, the results are stored in the bounding-box side data of each AVFrame. Use the "showinfo" filter to print them to the log.

ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model=intel/person-detection-0200/FP32/person-detection-0200.xml,showinfo -f null -
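ffmpeg writes its log to stderr, so the showinfo output can be saved to a file and filtered afterwards. A minimal sketch, assuming the log lines from the filter contain the filter name `showinfo`; the log file name `detect.log` is hypothetical, and the run is skipped when ffmpeg or `input.jpg` is missing:

```shell
#!/bin/sh
FILTER="dnn_detect=dnn_backend=openvino:model=intel/person-detection-0200/FP32/person-detection-0200.xml,showinfo"
LOG=detect.log   # hypothetical log file name

# Run only when ffmpeg and the input image are present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f input.jpg ]; then
    # Redirect stderr (where ffmpeg logs) into the log file.
    ffmpeg -i input.jpg -vf "$FILTER" -f null - 2> "$LOG"
    # Keep only the lines that mention the showinfo filter.
    grep 'showinfo' "$LOG"
else
    echo "ffmpeg or input.jpg not found; skipping run"
fi
```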

drawbox

Use the "drawbox" and "drawtext" filters to draw the detected bounding boxes and write their labels on the frame.

ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model=intel/person-detection-0200/FP32/person-detection-0200.xml,drawbox=box_source=side_data_detection_bboxes,drawtext=text_source=side_data_detection_bboxes -y output.jpg

device

Use the backend option "device" to choose the target device (CPU, GPU, or NPU).

ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model=intel/person-detection-0200/FP32/person-detection-0200.xml:device=GPU -y output.jpg
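To compare devices, the same filter can be run once per target. A minimal sketch that prints one command for each device named above; `detect_cmd` and the `output_<device>.jpg` file names are hypothetical, and which devices are actually available depends on the machine and the OpenVINO installation:

```shell
#!/bin/sh
MODEL=intel/person-detection-0200/FP32/person-detection-0200.xml

# Hypothetical helper: print the detection command for one target device.
detect_cmd() {
    printf 'ffmpeg -i input.jpg -vf dnn_detect=dnn_backend=openvino:model=%s:device=%s -y output_%s.jpg\n' "$MODEL" "$1" "$1"
}

# Print one command per device listed in the text; run them manually as needed.
for dev in CPU GPU NPU; do
    detect_cmd "$dev"
done
```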