
Update Intel's Inference Engine deep learning backend#11587

Merged
vpisarev merged 3 commits into opencv:3.4 from dkurt:dnn_ie_future on May 31, 2018

Conversation

@dkurt (Member) commented May 24, 2018

This pull request changes:

  • Updates Intel's Inference Engine backend support (OpenVINO). (This PR is not compatible with beta releases of the Inference Engine.)
  • Adds an Intel® Movidius™ Neural Compute Stick computational target (DNN_TARGET_MYRIAD).

Measured performance in milliseconds (median times):

| Model name | CV CPU | CV GPU, fp32 | CV GPU, fp16 | IE CPU | IE GPU, fp32 | IE GPU, fp16 | IE NCS |
|---|---|---|---|---|---|---|---|
| AlexNet | 13.896 | 17.253 | 11.999 | 11.784 | 12.074 | 8.554 | 82.426 |
| DenseNet_121 | 53.407 | 92.322 | 114.797 | 27.770 | 60.182 | --- | --- |
| ENet | 43.899 | 45.791 | --- | --- | --- | --- | --- |
| GoogLeNet | 14.436 | 26.842 | 27.057 | 8.635 | 20.407 | 15.927 | 105.209 |
| Inception_5h | 15.595 | 32.168 | 31.484 | --- | --- | --- | --- |
| Inception_v2_SSD_TensorFlow | 42.287 | 79.025 | 89.259 | 33.557 | --- | --- | 336.046 |
| MobileNet_SSD_Caffe | 18.972 | 22.117 | 26.790 | 7.960 | 25.772 | 20.071 | 94.525 |
| MobileNet_v1_SSD_TensorFlow | 22.11 | 24.26 | 28.49 | 14.95 | --- | --- | 114.91 |
| MobileNet_v2_SSD_TensorFlow | 30.50 | 49.98 | 56.03 | 21.51 | --- | --- | 243.43 |
| OpenFace | 3.524 | 12.573 | 14.801 | 3.019 | --- | --- | --- |
| OpenPose_pose_coco | 874.148 | 1056.600 | 1160.916 | 536.416 | 762.814 | 437.578 | --- |
| OpenPose_pose_mpi | 855.136 | 1037.866 | 1146.179 | 530.876 | 753.180 | 431.936 | --- |
| OpenPose_pose_mpi_faster_4_stages | 585.762 | 746.806 | 805.990 | 376.030 | 528.614 | 291.959 | --- |
| ResNet_50 | 34.316 | 37.396 | 41.666 | 19.430 | 45.441 | 30.939 | 210.283 |
| SSD, VGG16 | 259.178 | 393.397 | 358.791 | 193.240 | 242.875 | 135.590 | 1739.619 |
| SqueezeNet_v1_1 | 3.672 | 9.470 | 10.894 | 2.512 | 7.865 | 6.924 | 45.237 |
| YOLOv3 | 207.946 | 361.693 | 366.220 | 180.727 | --- | --- | --- |
| opencv_face_detector | 13.784 | 33.526 | 39.373 | 7.631 | 18.165 | 14.665 | 106.338 |

CV CPU: DNN_BACKEND_DEFAULT, DNN_TARGET_CPU
CV GPU, fp32: DNN_BACKEND_DEFAULT, DNN_TARGET_OPENCL
CV GPU, fp16: DNN_BACKEND_DEFAULT, DNN_TARGET_OPENCL_FP16
IE CPU: DNN_BACKEND_INFERENCE_ENGINE, DNN_TARGET_CPU
IE GPU, fp32: DNN_BACKEND_INFERENCE_ENGINE, DNN_TARGET_OPENCL
IE GPU, fp16: DNN_BACKEND_INFERENCE_ENGINE, DNN_TARGET_OPENCL_FP16
IE NCS: DNN_BACKEND_INFERENCE_ENGINE, DNN_TARGET_MYRIAD

CPU: Intel® Core™ i7-6700K CPU @ 4.00GHz x 8
GPU: Intel® HD Graphics 530 (Skylake GT2)
NCS: Intel® Movidius™ Neural Compute Stick

@pengli commented May 25, 2018

@dkurt, hi, could you share the test method used to get the results in the table?

@dkurt (Member, Author) commented May 25, 2018

@pengli, this is a summary of the https://github.com/opencv/opencv/blob/3.4/modules/dnn/perf/perf_net.cpp performance tests. You may build OpenCV with -DBUILD_PERF_TESTS=ON and run
./bin/opencv_perf_dnn --gtest_filter=DNNTestNetwork.*

To download the models, run https://github.com/opencv/opencv_extra/blob/3.4/testdata/dnn/download_models.py and export the following paths into the environment:

export OPENCV_TEST_DATA_PATH=/path/to/opencv_extra/testdata/
export OPENCV_DNN_TEST_DATA_PATH=/path/to/opencv_extra/testdata/

Instructions for enabling the Inference Engine backend in OpenCV: https://github.com/opencv/opencv/wiki/Intel%27s-Deep-Learning-Inference-Engine-backend

Please note that the resulting table is formatted in GitHub's Markdown syntax. Two models, MobileNet_v1_SSD_TensorFlow and MobileNet_v2_SSD_TensorFlow, are not represented in the tests (I measured them by replacing the model paths in the MobileNet_SSD_TensorFlow test case).

set(INF_ENGINE_LIBRARIES "")

set(ie_lib_list inference_engine)
set(ie_lib_list inference_engine cpu_extension)
@mshabunin (Contributor) commented on this diff:
Do we really need to link with CPU extension directly?

@dkurt (Member, Author) replied:
@mshabunin, yes, we can omit it.

@dkurt (Member, Author) commented May 29, 2018

Removed the cpu_extension dependency, which means we still split networks into multiple Inference Engine subgraphs at unsupported layers. By default this causes performance gaps (see OpenFace and YOLOv3). When there are several Inference Engine graphs on the CPU, we need to set OMP_WAIT_POLICY to PASSIVE to improve the efficiency of the pipelines. This way we can achieve similar performance (see the last two columns) whether we use a single IE graph with OpenCV layers wrapped as custom ones or multiple IE graphs.

On average, OMP_WAIT_POLICY=PASSIVE is about 10% slower in the following measurements.

| Model | with custom layers, unset OMP_WAIT_POLICY | w/o custom layers, unset OMP_WAIT_POLICY | with custom layers, OMP_WAIT_POLICY=PASSIVE | w/o custom layers, OMP_WAIT_POLICY=PASSIVE |
|---|---|---|---|---|
| AlexNet | 12.156 | 12.076 | 12.184 | 12.253 |
| DenseNet_121 | 27.459 | 27.968 | 30.719 | 31.235 |
| GoogLeNet | 8.791 | 8.787 | 9.392 | 9.381 |
| Inception_v2_SSD_TensorFlow | 33.635 | 33.914 | 35.629 | 36.115 |
| MobileNet_SSD_Caffe | 8.188 | 8.235 | 8.887 | 8.951 |
| OpenFace | 3.113 | 18.084 | 4.139 | 4.439 |
| OpenPose_pose_coco | 532.661 | 542.425 | 533.296 | 540.710 |
| OpenPose_pose_mpi | 523.934 | 533.354 | 523.928 | 537.293 |
| OpenPose_pose_mpi_faster_4_stages | 371.634 | 374.209 | 371.636 | 374.838 |
| ResNet_50 | 19.645 | 19.572 | 19.929 | 19.879 |
| SSD | 194.348 | 198.954 | 191.108 | 197.539 |
| SqueezeNet_v1_1 | 2.562 | 2.548 | 2.815 | 2.855 |
| YOLOv3 | 180.793 | 502.741 | 180.965 | 172.600 |
| opencv_face_detector | 7.685 | 7.724 | 8.427 | 8.551 |
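As a side note, OMP_WAIT_POLICY is read only once, when the OpenMP runtime starts up, so it has to be in the environment before the library that uses OpenMP is loaded. A minimal Python sketch of the idea (assuming the cv2 module is the consumer; set the variable before the first import):

```python
import os

# OMP_WAIT_POLICY is consumed at OpenMP runtime initialization,
# so export it before importing cv2 (and, through it, the
# Inference Engine CPU plugin).
os.environ["OMP_WAIT_POLICY"] = "PASSIVE"

# import cv2  # import only after the variable is set
```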

@dkurt (Member, Author) commented May 30, 2018

Added OpenCL target tests for the YOLOv3 model:

| Model name | CV CPU | CV GPU, fp32 | CV GPU, fp16 | IE CPU | IE GPU, fp32 | IE GPU, fp16 | IE NCS |
|---|---|---|---|---|---|---|---|
| YOLOv3 | 206.46 ms | 341.41 ms | 354.62 ms | 166.85 ms | 224.59 ms | 132.78 ms | --- |

@vpisarev (Contributor) commented:

@dkurt, is it ready to be merged? I do not have any objections from my side.

@dkurt (Member, Author) commented May 31, 2018

@vpisarev, maybe the only thing is that I did not find a way to disable active thread waiting other than through the OMP_WAIT_POLICY environment variable, so for now it has to be set by the user.

@vpisarev (Contributor) commented:

@dkurt, it looks like there is no standard way to do it :( https://stackoverflow.com/questions/32970102/how-to-control-global-openmp-settings-from-c-c. Let's merge your patch in 👍

@vpisarev vpisarev merged commit f96f934 into opencv:3.4 May 31, 2018
@alalek alalek mentioned this pull request Jun 4, 2018
@dkurt dkurt deleted the dnn_ie_future branch August 27, 2018 14:46