Running caffe on AMD GPUs #2

gsedej · 2017-05-12T10:10:00Z

Hello!
I am opening this issue here because I couldn't find a better place to discuss running Caffe on AMD GPUs. If there is a better place for this discussion, please say so.

I was able to install ROCm 1.5 on Ubuntu 16.04, running on an i7 6700 with a Radeon RX 480 8GB.

I managed to build hipCaffe and run the MNIST example on the AMD GPU. I also ran the same example on the CPU (using multithreaded OpenBLAS), and the GPU speedup was not that impressive.

I am testing with the command: time ./examples/mnist/train_lenet.sh
With CPU only (8 threads), the result is:

real	7m38.942s
user	22m0.692s
sys	34m6.000s

When using GPU (rx 480 + ROCm 1.5):

real	5m46.945s
user	5m27.120s
sys	0m7.256s

The GPU run is faster, and it also leaves the CPU mostly free.
For comparison, our "gpu grid" server with 1x NVIDIA Titan X gives:

real	2m0.855s
user	1m34.332s
sys	0m43.948s

I do understand that the Titan X is much faster than the RX 480 and that NVIDIA has put MUCH more resources into deep learning optimisation.
Are these results expected, or should I be getting better ones? Has anyone else tried hipCaffe on an AMD GPU?
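To put the three wall-clock figures above side by side, here is a minimal sketch that converts the "real" values printed by time into seconds and computes the ratios. The helper names parse_time and speedup are made up for this illustration; the durations are copied from the runs reported above.

```python
import re

def parse_time(text):
    """Convert a `time`-style duration such as '7m38.942s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Wall-clock ("real") times copied from the three MNIST runs above.
real = {
    "cpu_8_threads": parse_time("7m38.942s"),
    "rx480_rocm": parse_time("5m46.945s"),
    "titan_x": parse_time("2m0.855s"),
}

def speedup(baseline, candidate):
    """How many times faster `candidate` finished than `baseline`."""
    return real[baseline] / real[candidate]

print(f"RX 480 vs CPU:     {speedup('cpu_8_threads', 'rx480_rocm'):.2f}x")
print(f"Titan X vs RX 480: {speedup('rx480_rocm', 'titan_x'):.2f}x")
```

On these numbers the RX 480 finishes about 1.32x faster than the 8-thread CPU run, and the Titan X about 2.87x faster than the RX 480.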

Your system configuration

Operating system: Ubuntu 16.04 + ROCm 1.5 (kernel 4.9-kfd) + RX 480 8GB
BLAS: OpenBLAS (and the AMD HIP BLAS variant)
Python or MATLAB version (for pycaffe and matcaffe respectively): no python/matlab


gsedej · 2017-05-12T10:11:27Z

I also replied to an issue in the "ROCm" GitHub tracker: ROCm/ROCm#86

gstoner · 2017-05-12T13:10:29Z

For the Titan X, you are comparing against a stack that leverages cuDNN, which is an optimized set of solvers for deep learning. Currently our Caffe port just uses IM2COL and GEMM. So it's not an apples-to-apples test.
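For readers unfamiliar with the IM2COL and GEMM approach mentioned here, the following toy sketch shows the idea: every input patch is unrolled into a row of a matrix, and the convolution then becomes a single matrix multiply. This is an illustration only, not hipCaffe's actual code. It assumes a single channel, stride 1 and no padding, and all function names are made up for the example.

```python
def im2col(image, k):
    """Unroll every k x k patch of a 2-D image into one row (stride 1, no padding)."""
    height, width = len(image), len(image[0])
    rows = []
    for i in range(height - k + 1):
        for j in range(width - k + 1):
            rows.append([image[i + di][j + dj] for di in range(k) for dj in range(k)])
    return rows

def gemm(a, b):
    """Plain matrix multiply: (m x n) times (n x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def conv2d_im2col(image, kernel):
    """Cross-correlation (Caffe's 'convolution') as im2col followed by one GEMM."""
    k = len(kernel)
    out_w = len(image[0]) - k + 1
    patches = im2col(image, k)                      # (out_h * out_w) x (k * k)
    weights = [[w] for row in kernel for w in row]  # (k * k) x 1 column vector
    flat = [r[0] for r in gemm(patches, weights)]
    return [flat[i:i + out_w] for i in range(0, len(flat), out_w)]

def conv2d_direct(image, kernel):
    """Reference version with explicit loops, used only to check the result."""
    k = len(kernel)
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(len(image[0]) - k + 1)]
            for i in range(len(image) - k + 1)]

image = [[1, 2, 3, 0], [4, 5, 6, 1], [7, 8, 9, 2], [0, 1, 2, 3]]
kernel = [[1, 0], [0, -1]]
print(conv2d_im2col(image, kernel))
```

The appeal of this formulation is that it reuses an existing tuned GEMM routine; the cost is the extra memory and copying needed to build the patch matrix.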

We have a piece of the puzzle that attacks a critical performance bottleneck in deep learning: optimized solvers. This will remove the performance gap. Our solution, MIOpen, is how we are approaching deep learning solvers; it will have all of the core functionality at launch, and we will build from there. We just have not released it yet, but we are not far from that happening.

The Caffe library was released not to show how fast it is, but so that we can start the conversation about upstreaming the libraries. Still, I am glad you could build it and run it; this is the kind of feedback we are currently looking for from developers.

vicproon · 2017-05-15T11:46:36Z

@gstoner looking forward to the release of MIOpen!

gsedej · 2017-05-15T12:00:37Z

@gstoner thanks for the reply.
I do understand the limits of comparing the RX 480 with the Titan X, in both performance and drivers.

What I am interested in is whether anyone gets better training results on AMD GPUs than with CPU only. If I set my i7 6700 8-thread CPU to use only 4 threads (OPENBLAS_NUM_THREADS=4), I get better results than with HIP + RX 480:

export OPENBLAS_NUM_THREADS=4
time ./examples/mnist/train_lenet.sh
real	4m54.193s
user	9m47.540s
sys	4m49.708s
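Since the thread count clearly matters here (4 threads beat 8 in the runs above), a small sweep can make the comparison repeatable. The sketch below is only one way to do it: it sets OPENBLAS_NUM_THREADS for a child process and measures wall-clock time. The function names are hypothetical, and the list of thread counts is an arbitrary choice.

```python
import os
import subprocess
import sys
import time

def time_command(cmd, threads):
    """Run cmd with OPENBLAS_NUM_THREADS set; return wall-clock seconds."""
    env = dict(os.environ, OPENBLAS_NUM_THREADS=str(threads))
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)  # raises if the command fails
    return time.perf_counter() - start

def sweep(cmd, thread_counts):
    """Time the same command once per thread count."""
    return {n: time_command(cmd, n) for n in thread_counts}

# Example, run from the Caffe source root (script path taken from this thread):
#   for n, seconds in sweep(["./examples/mnist/train_lenet.sh"], [1, 2, 4, 8]).items():
#       print(f"{n} threads: {seconds:.1f}s")
```

A single run per setting is noisy, so repeating each measurement a few times would give a fairer picture.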

Does AMD's release of its OpenCL source code have any influence on hipCaffe?

If this "issue" is not meant to be open, please close it.

gsedej · 2017-05-18T12:42:05Z

I just wanted to report again that when training my own network, I noticed a MUCH better speedup on the Radeon vs the CPU. (I just updated to ROCm 1.5.80.)
This is when using the CPU (i7 6700, all 8 threads):

real	9m11.601s
user	23m6.120s
sys	47m11.336s

When using the RX 480:

real	3m37.433s
user	3m5.044s
sys	0m6.564s

Notice that CPU usage (sys) was small on the GPU run.
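The CPU-usage observation can be made concrete by dividing CPU time (user + sys) by wall-clock time (real), which gives the average number of cores kept busy. The sketch below does this for the two runs just reported; the function names are hypothetical and the inputs are the time outputs copied from this comment.

```python
import re

LINE = re.compile(r"^(real|user|sys)\s+(\d+)m([\d.]+)s$", re.MULTILINE)

def parse_time_output(text):
    """Parse the three-line output of the shell `time` keyword into seconds."""
    return {name: int(m) * 60 + float(s) for name, m, s in LINE.findall(text)}

def busy_cores(t):
    """Average number of CPU cores kept busy: (user + sys) / real."""
    return (t["user"] + t["sys"]) / t["real"]

# Outputs copied from the CPU and RX 480 runs above.
cpu = parse_time_output("real\t9m11.601s\nuser\t23m6.120s\nsys\t47m11.336s")
gpu = parse_time_output("real\t3m37.433s\nuser\t3m5.044s\nsys\t0m6.564s")

print(f"CPU run:    {busy_cores(cpu):.2f} cores busy")
print(f"RX 480 run: {busy_cores(gpu):.2f} cores busy")
print(f"Wall-clock speedup: {cpu['real'] / gpu['real']:.2f}x")
```

On these numbers the CPU run keeps roughly 7.6 cores busy, while the GPU run uses under one core and finishes about 2.5x sooner.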

bensander · 2017-05-18T13:06:56Z

Excellent. MNIST is a small workload, so it will show minimal GPU acceleration. Also, this version of Caffe is not yet optimized with MIOpen, which will boost performance significantly.

daveselinger · 2017-08-30T00:40:53Z

@gsedej Do you mind sharing how you got ROCm up and running? I've been trying for about a week and I'm having just about zero luck. I've put some gists up to document what I'm doing, and I'm not quite sure what is going wrong. It's probably something pretty stupid, but I'm new to the AMD side.

Do I use the amdgpu-pro driver?
Do I need to install the APP SDK as well?
I am installing the ROCm kernel using apt and the instructions on the ROCm website...

https://gist.github.com/daveselinger/9504a8496bef102a5b60613106255621
https://gist.github.com/daveselinger/8cba6d41eaa70b220725091390ff52c1

gsedej · 2017-08-30T10:57:01Z

Hi @daveselinger! I haven't used hip-caffe for some time, since I am using SegNet (Caffe-based, for segmentation), which has its own layers that (probably) don't work with hip-caffe. (Somebody would need to rewrite the added layers to .hpp.)

Are you familiar with ordinary Caffe? I was working with ordinary Caffe and just copied the data and prototxt files over to hip-caffe, and it worked.

I only have Ubuntu 16.04 and ROCm, which I installed as described in the instructions. hipCaffe needs to be compiled.

No fglrx or amdgpu-pro.

I will check whether it still runs and report back.

gsedej · 2017-08-30T11:54:56Z

The old compiled hipCaffe does not work, so I tried compiling from source. That breaks because I have updated the Mesa 3D OpenGL drivers, which rely on LLVM 5.0 (Clang), and ROCm/HIP probably does not support LLVM 5.0 yet.

Anyway, you need to have libraries like these installed: hip_base, hip_hcc, miopen-hip, hipblas

What kind of error do you get?

gsedej · 2017-08-30T12:17:37Z

So I did manage to compile and run the MNIST example. But I had to disable OpenCV in Makefile.config (uncomment USE_OPENCV := 0), probably due to the Clang/LLVM version and the OpenCV version.

daveselinger · 2017-08-30T23:16:58Z

@gsedej THANKS! So I'm obviously over-thinking the software part. I'm now going through the process of testing other MBs and other GPUs. I'll keep you posted. THANK YOU SO MUCH for the quick response!

daveselinger · 2017-08-31T05:41:41Z

@gsedej OK, so switching cards apparently makes it work. For whatever reason, the RX550 does not work, but the RX580 works fine. I was pretty surprised by this, but I'm pretty amazed at how easy it is once you make that switch! :) I've also tried the OpenCL driver on the 550 with Theano and that works about as well as my concrete shoes are good at swimming...

cathalgarvey mentioned this issue May 31, 2017

Everything fails, hardcoded jenkins paths in binaries #3

Closed

gstoner closed this as completed Jul 6, 2017
