Matlab users who employ "properly" vectorized code may notice a dramatic slowdown in going to Julia because most Matlab vector operations are multithreaded while Julia operations are not. There is an intrinsic difficulty here because Julia code cannot be multithreaded (see issue #1790), but as a stopgap measure I suppose that you could call out to C code that uses OpenMP to parallelize important vector operations.
For example, compare the following Matlab (R2012a) code (on a 16-core x86_64 3GHz AMD Opteron 4280, Debian GNU/Linux):

```
>> tic; x = rand(10000000,1); toc
Elapsed time is 0.211827 seconds.
>> tic; y = exp(x); toc
Elapsed time is 0.035116 seconds.
>> tic; y = x.^2 + 3*x.^3; toc
Elapsed time is 0.159840 seconds.
```
with the Julia equivalent:
```
julia> x = rand(10000000,1);

julia> tic(); y = exp(x); toc()
elapsed time: 0.3979671001434326 seconds

julia> tic(); y = x.^2 + 3*x.^3; toc()
elapsed time: 2.3250949382781982 seconds
```
Julia is more than 10x slower in both cases. The difference is almost entirely due to multithreading: if I run Matlab with -singleCompThread, I get:
```
>> tic; y = exp(x); toc
Elapsed time is 0.406924 seconds.
>> tic; y = x.^2 + 3*x.^3; toc
Elapsed time is 2.005534 seconds.
```
We would also need a way to specify the number of threads to use (e.g. defaulting to the standard OMP_NUM_THREADS environment variable, while providing some way to change it from within Julia). The nice thing about using OpenMP is that OpenBLAS and FFTW (and possibly other libraries?) can also be built with OpenMP, so all three could share the same thread pool.