Running caffe on AMD GPUs #2
Comments
I also replied to an issue in "ROCm" github tracker: ROCm/ROCm#86 |
For the Titan X, you are comparing against a stack that uses cuDNN, which is an optimized set of solvers for deep learning. Currently our Caffe framework just uses IM2COL and GEMM, so it's not an apples-to-apples test. We have a piece of the puzzle that attacks the critical performance bottleneck in deep learning: optimized solvers, which will remove the performance gap. MIOpen is how we are approaching deep learning solvers; it will have all of the core functionality at launch, and we will build from there. We just have not released it yet, but we are not far from that happening. The Caffe library was released not to show how fast it is, but so that we can start the conversation about upstreaming the libraries. But I am glad you could build it and run it; this is the kind of feedback we are looking for from developers. |
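For readers unfamiliar with the IM2COL + GEMM approach mentioned above, here is a minimal pure-Python sketch of the idea. It is not hipCaffe's or MIOpen's code: the function names are made up for illustration, and it assumes a single-channel image, stride 1, and no padding. Every kernel-sized patch is unrolled into a column, so the whole convolution becomes one matrix multiply.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one column of a matrix."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    patches = []
    for i in range(out_h):
        for j in range(out_w):
            patches.append([image[i + di][j + dj]
                            for di in range(kh) for dj in range(kw)])
    # patches holds one patch per row; transpose so each patch is a column
    cols = [list(row) for row in zip(*patches)]
    return cols, out_h, out_w


def gemm(a, b):
    """Plain matrix multiply: (m x k) times (k x n)."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def conv2d_im2col(image, kernels):
    """Convolve (cross-correlate) one image with several same-sized kernels."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    cols, out_h, out_w = im2col(image, kh, kw)
    # One flattened kernel per row: (num_kernels x kh*kw)
    weights = [[v for row in k for v in row] for k in kernels]
    flat = gemm(weights, cols)  # (num_kernels x out_h*out_w)
    # Reshape each output row back into an out_h x out_w feature map
    return [[r[i * out_w:(i + 1) * out_w] for i in range(out_h)]
            for r in flat]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0],
            [0, 1]]]
print(conv2d_im2col(image, kernels))
```

The unrolled matrix repeats overlapping pixels, so this generic route spends extra memory and bandwidth. That overhead is one reason a tuned convolution library can outperform it.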
@gstoner looking forward to the release of MIOpen! |
@gstoner thanks for the reply. I am interested in whether anyone gets better results on AMD GPUs compared to CPU-only training. If I set my i7 6700 8-thread CPU to use only 4 threads (
Does AMD's release of the OpenCL source code have any influence on hipCaffe? If this "issue" is not meant to be open, please close it. |
I just wanted to report again that when training my own network, I noticed a MUCH better speedup on the Radeon vs. the CPU. (I just updated to ROCm 1.5.80.)
when using rx 480
Notice that cpu usage was small (sys) |
Excellent. MNIST is a small workload, so it will show minimal GPU acceleration. Also, this version of Caffe is not yet optimized with MIOpen, which will boost performance significantly.
On May 18, 2017, at 7:42 AM, gsedej <notifications@github.com> wrote:
I just wanted to report again that when training my own network, I noticed a MUCH better speedup on the Radeon vs. the CPU. (I just updated to ROCm 1.5.80.)
this is when using the CPU (i7 6700, all 8 threads)
real 9m11.601s
user 23m6.120s
sys 47m11.336s
when using rx 480
real 3m37.433s
user 3m5.044s
sys 0m6.564s
Notice that cpu usage was small (sys)
|
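To put the `time` figures quoted in this thread into perspective, here is a small sketch that parses them and derives the wall-clock speedup and the average number of busy CPU cores, taken as (user + sys) / real. The helper names `parse_time` and `busy_cores` are made up for illustration, and the parser assumes the bash-style `NmS.SSSs` format shown in the thread.

```python
import re


def parse_time(text):
    """Parse bash `time` lines such as 'real 9m11.601s' into seconds."""
    result = {}
    pattern = r"(real|user|sys)\s+(\d+)m([\d.]+)s"
    for key, minutes, seconds in re.findall(pattern, text):
        result[key] = int(minutes) * 60 + float(seconds)
    return result


def busy_cores(t):
    """Average number of CPU cores kept busy over the run."""
    return (t["user"] + t["sys"]) / t["real"]


# Figures copied from the report in this thread
cpu = parse_time("real 9m11.601s\nuser 23m6.120s\nsys 47m11.336s")
gpu = parse_time("real 3m37.433s\nuser 3m5.044s\nsys 0m6.564s")

speedup = cpu["real"] / gpu["real"]
print(f"wall-clock speedup (RX 480 vs CPU): {speedup:.2f}x")
print(f"busy cores, CPU run: {busy_cores(cpu):.2f}")
print(f"busy cores, GPU run: {busy_cores(gpu):.2f}")
```

With these numbers the RX 480 run finishes about 2.5x faster in wall-clock time. The CPU-only run keeps nearly all 8 hardware threads busy, while the GPU run uses less than one core on average.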
@gsedej Do you mind sharing how you got ROCm up and running? I've been trying for about a week and I'm having just about zero luck. I've put some gists up to document what I'm doing, and I'm not quite sure what is going wrong. It's probably something pretty stupid, but I'm new to the AMD side.
https://gist.github.com/daveselinger/9504a8496bef102a5b60613106255621 |
Hi @daveselinger! I haven't used hipCaffe for some time, since I am using SegNet (Caffe-based, for segmentation), which has its own layers that (probably) don't work with hipCaffe (somebody would need to rewrite the added layers to .hpp). Are you familiar with ordinary Caffe? I was working with ordinary Caffe and just copied the data and prototxt files over to hipCaffe, and it worked. I only have Ubuntu 16.04 and ROCm, which I installed as described in the instructions. hipCaffe needs to be compiled. No fglrx or amdgpu-pro. I will check whether it still runs and report back. |
So the old compiled hipCaffe does not work, so I tried compiling from source, but it breaks because I have updated the Mesa 3D OpenGL drivers, which rely on LLVM 5.0 (clang), and ROCm/HIP probably does not support LLVM 5.0 yet. Anyway, you need to have libraries installed such as: hip_base, hip_hcc, miopen-hip, hipblas. What kind of error do you get? |
So I did manage to compile and run the MNIST example, but I had to disable OpenCV in Makefile.config (uncomment USE_OPENCV := 0), probably due to the clang/LLVM version and the OpenCV version. |
@gsedej THANKS! So I'm obviously overthinking the software part. I'm now going through the process of testing other MBs and other GPUs. I'll keep you posted. THANK YOU SO MUCH for the quick response! |
@gsedej OK, so switching cards apparently makes it work. For whatever reason, the RX550 does not work, but the RX580 works fine. I was pretty surprised by this, but I'm pretty amazed at how easy it is once you make that switch! :) I've also tried the OpenCL driver on the 550 with Theano and that works about as well as my concrete shoes are good at swimming... |
Hello!
I am opening this issue here because I couldn't find a better place to discuss running Caffe on AMD GPUs. If there is a better place for this discussion, please say so.
I was able to install ROCm 1.5 on my Ubuntu 16.04 machine with an i7 6700 and a Radeon RX 480 8GB.
I managed to build hipCaffe and run the MNIST example on the AMD GPU. I also ran the same example on the CPU (using multithreaded OpenBLAS), and the results were not that impressive.
I am testing using the command
time ./examples/mnist/train_lenet.sh
When using CPU only (8 threads), the result is:
When using GPU (rx 480 + ROCm 1.5):
The speed is better, and the CPU is also less loaded.
Compared to our "gpu grid" server when using 1x NVIDIA Titan X:
I do understand that the Titan X is much faster than the RX 480 and that NVIDIA has put MUCH more resources into deep learning optimisation.
Are those results expected, or should I be getting better ones? Has anyone else tried hipCaffe on any AMD GPU?
Your system configuration
Operating system: Ubuntu 16.04 + ROCm 1.5 (kernel 4.9-kfd) + RX 480 8GB
BLAS: OpenBLAS (and amd hip BLAS variant)
Python or MATLAB version (for pycaffe and matcaffe respectively): no python/matlab