
Accelerating darch using MKL #18

Closed

lizhongz opened this issue Aug 17, 2016 · 4 comments

lizhongz commented Aug 17, 2016

Hi, my group wants to use darch for our DNN. Training takes very long, about 1.5 days for our use case, so we decided to speed it up by exploiting Intel MKL, which can automatically offload some computations to our Xeon Phi coprocessors.

I have recompiled R with MKL and linked it against MKL's BLAS and LAPACK. MKL is able to offload computations to the Xeon Phi for operations like matrix multiplication. However, MKL's automatic offloading does not happen when darch is running. I was wondering whether darch uses R's default BLAS and LAPACK (in this case, MKL's BLAS and LAPACK) or its own implementation. If not, is there a way to exploit MKL and the Xeon Phi?
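
For reference, this is the minimal sanity check I use (nothing darch-specific, just base R; the matrix size is arbitrary). A large dense matrix product goes through R's linked BLAS, so with MKL it should be fast and, with automatic offloading enabled, show activity on the coprocessor:

a <- matrix(rnorm(4000 * 4000), 4000, 4000)  # large dense matrix
system.time(a %*% a)  # %*% dispatches to the BLAS that R is linked against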

Thanks,
-- Lizhong

@saviola777
Collaborator

Hello,

As detailed here, MKL support is working on my test machine when gputools has been (left) disabled. darch uses R's default implementations for matrix multiplication in most cases, but some algorithms have been rewritten in C++. That provides a speedup on single-core systems and may cause a slowdown when using MKL, but definitely not to the degree that "automatic offloading does not happen". Maybe I should provide parameters to disable these C++ implementations.

Please provide more details about the parameters and dataset used to run darch so that I may reproduce the MKL issue. What behavior do you see when using darch 0.10?

@lizhongz
Author

lizhongz commented Aug 17, 2016

@saviola777 Thanks for your quick reply. We are running darch 0.12.0, and gputools is not installed.

Session info

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] doRNG_1.6 rngtools_1.2.4 pkgmaker_0.22 registry_0.3 foreach_1.4.3
[6] darch_0.12.0

darch DNN command

darchModel_50_10_1 <- darch(llrRankedTrain[, 1:50], llrRankedTrain$target, layers = c(50, 50, 11, 2), darch.unitFunction = exponentialLinearUnit)

Input data

The input data size is about 1,000,000 x 50
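
Our dataset is not public, but a synthetic stand-in of the same shape should work for reproduction (the llrRankedTrain generated below is a hypothetical placeholder, not our real data):

set.seed(1)
n <- 1e6  # ~1,000,000 rows; reduce if memory is tight
llrRankedTrain <- as.data.frame(matrix(rnorm(n * 50), n, 50))  # 50 numeric features
llrRankedTrain$target <- sample(c(0, 1), n, replace = TRUE)    # binary target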

Is version 0.10 the version without the single-core C++ optimizations?

@saviola777
Collaborator

Thanks for the feedback. I think the C++ implementation of the unit functions (more specifically of the ELU) is to blame for the lack of multi-threading in this case. I will have to investigate how I can make use of multi-threading from within the C++ code, but I'm afraid that it's going to be non-trivial (also considering that I'm not very experienced when it comes to writing C++ code).

Version 0.10 does not include the C++ optimizations, but it also lacks many of the newer features (e.g., it does not support ELU) and contains a number of bugs and problems that were fixed in 0.12. You can of course add your own unit functions dynamically in 0.10 if you want, as sketched below.
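
For illustration, a pure-R ELU could look something like the following. This is an unverified sketch; it assumes darch's unit-function convention of receiving the layer input matrix and returning a list of activations and first derivatives, so check the documentation for the exact interface before using it:

exponentialLinearUnitR <- function(input, alpha = 1, ...) {
  # ELU: x for x > 0, alpha * (exp(x) - 1) otherwise
  activations <- ifelse(input > 0, input, alpha * (exp(input) - 1))
  # Derivative: 1 for x > 0, alpha * exp(x) = activations + alpha otherwise
  derivatives <- ifelse(input > 0, 1, activations + alpha)
  list(activations, derivatives)
}

Passing such a function instead of the built-in one would bypass the C++ implementation entirely.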

There are two possible solutions to this problem:

  • introduce a switch to disable the C++ implementations of the unit functions, which would be simple
  • make use of multi-threading from within the C++ code, which would probably be the best solution

I can't promise an update with a fix on CRAN for a while, and I'm not sure when I'll get around to fixing this, but I will try to implement the first solution within the next few weeks so that you can check whether it solves the problem.

@saviola777
Collaborator

Just a couple of… weeks later, this should finally be fixed: I moved most of the C++ functions to RcppParallel, so you should see a significant speedup.
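
If you want to control how many threads RcppParallel uses, you can set that from R before calling darch; setThreadOptions() is RcppParallel's documented control, and by default all available cores are used. A minimal sketch:

library(RcppParallel)
# Cap the worker threads used by RcppParallel-backed code (default: all cores)
setThreadOptions(numThreads = 4)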
