
CUDA backend for the opencv_dnn #1010

Open
pi-null-mezon opened this issue Feb 20, 2017 · 17 comments


7 participants
@pi-null-mezon

commented Feb 20, 2017

It is not a bug, but a question about a new feature. After some experiments with Caffe and opencv_dnn, I have found that at the moment Caffe with CUDA performs forward propagation (on average, across different networks) 25 times faster than opencv_dnn with LAPACK or OpenCL. So it is evident that CUDA gives a great speed advantage for this task. Could anybody add a CUDA backend to opencv_dnn?

System information (version)
  • OpenCV => 3.2
  • Windows 64 Bit
  • Compiler => Visual Studio 2015 / Visual Studio 2013 / Mingw 5.2
@dkurt

Member

commented Aug 2, 2017

@pi-null-mezon, we've added a Halide backend since this issue was opened. It lets us choose an OpenCL computational target and run networks on a GPU (even an NVIDIA one). We'll experiment with a CUDA target and compare efficiency later.
On the other hand, the default CPU efficiency has been dramatically improved recently. You may see an efficiency comparison in the table.

@pi-null-mezon

Author

commented Aug 2, 2017

Hello @dkurt! Thanks for the good news! Am I right that, according to the performance table you've provided above, the fastest backend now is DNN C++ rather than DNN Halide?

@dkurt

Member

commented Aug 3, 2017

@pi-null-mezon, you are right. In most cases the default backend is more efficient on the CPU. But the Halide backend is currently the one and only way to run models on the GPU. So if you have a powerful GPU on board, you can use OpenCV to run networks on it.

@pi-null-mezon

Author

commented Aug 14, 2017

@dkurt, how can I switch the opencv_dnn backend from C++ to Halide if I am working on Windows? Am I right that I need to download the Halide binaries and rebuild OpenCV with some kind of USE_HALIDE flag turned on?

@dkurt

Member

commented Aug 14, 2017

@pi-null-mezon, unfortunately, the worst part is LLVM: there are no pre-compiled LLVM binaries. But you may try to use a truncated version of it (I downloaded it via svn co on Ubuntu). We have some instructions for Windows in the tutorial "How to enable Halide backend for improve efficiency".

As far as I remember, we have no Halide in our testing system for Windows, only for Linux with OpenCL. So we could have missed some bugs there. Anyway, you may create an issue if something won't work out.

@pi-null-mezon

Author

commented Aug 30, 2017

@dkurt, hello! I have finally built OpenCV with Halide on Windows. At least it works, but one thing I cannot find in the tutorials is how to select between the different GPUs on a machine to perform the calculations. For instance, I've got two GPUs: an Intel HD Graphics and an AMD Radeon. How can I force OpenCV to use a particular one?

@dkurt

Member

commented Aug 30, 2017

@pi-null-mezon, according to the Halide documentation, you may select the device id just by an environment variable: export HL_GPU_DEVICE=1 on Linux or set HL_GPU_DEVICE=1 on Windows. I tested locally that it switches between the CPU and the GPU (in short, between the devices of the clinfo output on Linux).

@pi-null-mezon

Author

commented Aug 31, 2017

@dkurt thanks! GPU computations work! But the results after dnn::Net::forward() are not the same as in the CPU version. I need to run more tests and maybe will open a new issue. Thanks!

@TechnikEmpire


commented Dec 21, 2017

@pi-null-mezon how did your tests work out? I'm wondering if I should bother putting in the effort to build the Halide backend on Windows.

@pi-null-mezon

Author

commented Dec 22, 2017

@TechnikEmpire you definitely should try it, but watch out for #9530.

@TechnikEmpire


commented Dec 22, 2017

@pi-null-mezon Cool, thanks, but if it fails completely with the GPU backend then that sort of defeats the purpose for me. I get a decent framerate using the default backend on the CPU with the Yahoo NSFW model, but I'm looking for a portable way to speed that up on the GPU when available. Last time I checked, the Halide backend on the CPU didn't perform as well.

@baoson202


commented Dec 11, 2018

> @dkurt thanks! GPU computations work! But results after dnn::Net::forward() are not similar to CPU version. I need to make more tests and maybe will open new issue. Thanks!

You say the GPU computations work. Did you call cv::dnn::Net::setHalideScheduler? I skipped the setHalideScheduler call and it crashed.

@kaangoksal


commented Dec 11, 2018

Does it work out of the box? How do we configure the CUDA backend for this?

@TechnikEmpire


commented Dec 11, 2018

Everyone here: stop messing about with CUDA and Halide and just use the Inference Engine, which is now open source.

This is the best possible performance you can squeeze out of DNN, and it does not disappoint.

@dkurt

Member

commented Dec 11, 2018

@TechnikEmpire, the IE cannot run deep learning models on NVIDIA GPUs. And OpenCV has no CUDA backend for now either. One of the possible ways is to test the Halide backend with a CUDA target.

@TechnikEmpire


commented Dec 11, 2018

@dkurt Yeah, I know. I was just throwing it out there that the IE is a very good, well-optimized backend targeting the CPU. I was letting people know because I was blown away by the performance. I realize a GPU-accelerated backend can still outperform a CPU backend.

@garybradski

Contributor

commented Feb 1, 2019

We plan on leading a Google Summer of Code project to add a GPU backend for DNN. If you can help, see the idea page for OpenCV GSoC.
