Enable FFTW threading by default (to match up to performance of octave and others) #17000

Closed
loganwilliams opened this issue Jun 18, 2016 · 64 comments
Labels
needs decision A decision on this change is needed performance Must go faster

Comments

@loganwilliams

I've noticed that Julia is an order of magnitude slower to compute FFTs than GNU Octave. This discrepancy in speed confuses me, given that both Octave and Julia ought to be calling the same FFTW library. Is this expected?

Times for Julia:

julia> R = rand(512,512);
julia> @time fft(R);
  0.042149 seconds (76 allocations: 8.003 MB)

julia> R = rand(5000,5000);
julia> @time fft(R);
  6.212666 seconds (76 allocations: 762.943 MB, 1.17% gc time)

Times for Octave:

>> R = rand(512,512);
>> tic; fft2(R); toc;
Elapsed time is 0.00377011 seconds.

>> R = rand(5000,5000);
>> tic; fft2(R); toc;
Elapsed time is 0.556037 seconds.

After calling FFTW.set_num_threads(2) and using rfft instead of fft, I saw a small improvement in Julia's performance, but a large discrepancy still remains.

julia> R = rand(512,512);
julia> @time rfft(R);
  0.018692 seconds (93 allocations: 2.012 MB)

julia> R = rand(5000,5000);
julia> @time rfft(R);
  1.736385 seconds (99 allocations: 190.816 MB, 0.95% gc time)

I have reproduced this issue on my personal computer (OS X 10.11.5), and on a Google Compute Engine VM running Ubuntu 16.04. Here is my Julia versioninfo():

Julia Version 0.4.5
Commit 2ac304d (2016-03-18 00:58 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
@loganwilliams loganwilliams changed the title from "FFT operation an order of magnitude slower than in Octave" to "FFT in Julia is an order of magnitude slower than in Octave" Jun 18, 2016
@ufechner7

ufechner7 commented Jun 18, 2016

Please read: http://docs.julialang.org/en/release-0.4/manual/performance-tips/
In particular: 1) put the code you want to benchmark into functions, and 2) run the function twice, because the first call includes compilation time.
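
A minimal sketch of these two tips, assuming a current Julia with the FFTW.jl package installed (on Julia 0.4, `fft` is exported from Base and the `using` line is not needed). The helper name `time_fft` is hypothetical and only serves the illustration:

```julia
using FFTW  # assumption: current Julia; on 0.4 `fft` lives in Base

# Time a single FFT call after a warm-up run, so that compilation
# of `fft` for this argument type is not counted.
function time_fft(R)
    fft(R)                  # warm-up call, result discarded
    return @elapsed fft(R)  # timed call, in seconds
end

R = rand(512, 512)
t = time_fft(R)
println("fft of a 512x512 matrix: ", t, " s")
```

Wrapping the call in a function mainly matters when the benchmarked code uses global variables; for a single library call such as `fft(R)` the warm-up run is the more important of the two tips.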

@loganwilliams
Author

I had already read both tips. Wrapping fft(R) (which is a single function call) in a second function did nothing to improve performance. I also already ran each timing statement twice -- I just excluded the first, unhelpful, timing result for brevity's sake.

@ViralBShah
Member

Cc @stevengj

@ufechner7

ufechner7 commented Jun 18, 2016

Hello,
I benchmarked your example on my computer. My hardware: i7-2600K CPU @ 3.40GHz × 4
OS: Ubuntu Linux 14.04, 64 bits
CPU governor: performance (!)
Results:
R=rand(512,512)
Julia, 1 thread rfft(R): 1.6 - 1.7 ms
Julia, 2 threads rfft(R): 1.3 ms
Julia, 4 threads rfft(R): 1.1 ms
Octave fft2(R): 1.1 ms

R=rand(5000,5000)
Julia, 1 thread rfft(R): 0.31 .. 0.32 s
Julia, 2 threads rfft(R): 0.17 s
Julia, 4 threads rfft(R): 0.10 .. 0.11 s
Octave fft2(R): 0.17 s

Summary: Julia with two threads is about as fast as Octave. Julia with four threads is
much faster than Octave, but only for large problems.
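
A thread sweep like the one summarized above could be scripted roughly as follows. This is only a sketch, assuming FFTW.jl on a current Julia; `best_time` is a hypothetical helper, and the matrix is smaller than in the measurements above so that the script runs quickly:

```julia
using FFTW  # assumption: FFTW.jl on a current Julia

# Best-of-`repeats` wall time of `rfft(R)` with a given FFTW thread count.
function best_time(R, nthreads; repeats = 3)
    FFTW.set_num_threads(nthreads)  # affects plans created after this call
    rfft(R)                         # warm-up run, not timed
    return minimum(@elapsed(rfft(R)) for _ in 1:repeats)
end

R = rand(1024, 1024)
timings = Dict(n => best_time(R, n) for n in (1, 2, 4))
for n in (1, 2, 4)
    println(n, " thread(s): ", round(timings[n] * 1e3, digits = 2), " ms")
end
```

Taking the minimum over a few repeats reduces noise from garbage collection and other processes; the numbers depend on the machine and on how FFTW was built.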

Versioninfo:
Julia Version 0.4.2
Commit bb73f34 (2015-12-06 21:47 UTC)
Platform Info:
System: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
WORD_SIZE: 64
BLAS: libopenblas (NO_AFFINITY SANDYBRIDGE)
LAPACK: liblapack.so.3
LIBM: libopenlibm
LLVM: libLLVM-3.3

GNU Octave, version 3.8.1
Copyright (C) 2014 John W. Eaton and others.
This is free software; see the source code for copying conditions.
There is ABSOLUTELY NO WARRANTY; not even for MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. For details, type 'warranty'.

Octave was configured for "x86_64-pc-linux-gnu".

@ufechner7

I upgraded to Julia 0.4.5 from the Ubuntu ppa, but the timing results do not change.
julia> versioninfo()
Julia Version 0.4.5
Commit 2ac304d (2016-03-18 00:58 UTC)
Platform Info:
System: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
WORD_SIZE: 64
BLAS: libopenblas (NO_AFFINITY SANDYBRIDGE)
LAPACK: liblapack.so.3
LIBM: libopenlibm
LLVM: libLLVM-3.3

@loganwilliams
Author

loganwilliams commented Jun 18, 2016

Hmmm. Yesterday, I had thought I was able to reproduce this on a VM, but I must have been mistaken, because attempting it again now on an Ubuntu Google Cloud instance I get the same results that you have shown.

@ufechner7

You are right, I had a typo. It is 0.17 s for the 5000x5000 matrix with Octave. Does this mean that the issue can be closed, or is there still a problem on OS X?

@loganwilliams
Author

I'm still experiencing the issue on my computer. Is there any additional information I can provide to help diagnose?

@StefanKarpinski
Sponsor Member

Since this sort of issue has come up a few times, perhaps we should add documentation to the various FFT functions in Julia about what they are comparable to in other languages (Matlab, R, etc.).

@yuyichao
Contributor

I agree it should be documented better but I don't think that was ever the issue here. (The very first post of this issue uses fft in julia and fft2 in octave).
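
To illustrate why that comparison is like for like: Julia's `fft` with no dimension argument transforms along every dimension of its input, so for a matrix it is a full 2-D transform, which is what `fft2` computes in Octave (Octave's plain `fft` transforms columns only). The following is a small sketch, assuming FFTW.jl on a current Julia, that also shows how `rfft` relates to `fft` for real input:

```julia
using FFTW  # assumption: FFTW.jl on a current Julia

R = rand(8, 6)

# `fft` without a dimension argument transforms along all dimensions,
# i.e. it is a full 2-D transform of a matrix.
full2d = fft(R)
by_dims = fft(fft(R, 1), 2)  # along dimension 1 first, then dimension 2

# `rfft` of real input keeps only the non-redundant half of the
# spectrum along the first dimension.
half = rfft(R)
nkeep = div(size(R, 1), 2) + 1

println(full2d ≈ by_dims)
println(half ≈ full2d[1:nkeep, :])
```

Because `rfft` computes and stores roughly half of the output, comparing `rfft` in Julia against `fft2` in Octave favors Julia somewhat; `fft` against `fft2` is the closer match.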

@ViralBShah ViralBShah added the needs docs Documentation for this change is required label Jun 18, 2016
@kshyatt kshyatt added the performance Must go faster label Jun 19, 2016
@loganwilliams
Author

loganwilliams commented Jun 19, 2016

Alright, I found another Mac OS X computer to test this on, but it was quite old, running OS X 10.8.5.

for 512x512 matrix
.0089 seconds on average in Julia (4 threads, using rfft)
.0070 seconds on average in Octave (default threads (not sure what that is) using fft2)

for 5000x5000 matrix
1.27 seconds on average in Julia
1.19 seconds on average in Octave

Here, Julia seems just slightly slower than Octave.

Julia version info:

Julia Version 0.4.5
Commit 2ac304d (2016-03-18 00:58 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i5-3330S CPU @ 2.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Octave version info (running an older version because that was all I could get running quickly on OS X 10.8.5):

----------------------------------------------------------------------
GNU Octave Version 3.2.3
GNU Octave License: GNU General Public License
Operating System: Darwin 12.5.0 Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64
----------------------------------------------------------------------
no packages installed.

My computer running OS X 10.11.5 continues to exhibit the order of magnitude performance difference. Can anyone else reproduce this?

@loganwilliams
Author

Here's the result of profiling the rfft of a 5000x5000 matrix on my 10.11.5 computer: http://pastebin.com/nmr5Nyvn

@jaakkor2
Contributor

Julia on my MacBook is 3x slower than Octave.

julia> versioninfo()
Julia Version 0.4.5
Commit 2ac304d (2016-03-18 00:58 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM)2 Duo CPU     P7350  @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Penryn)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
julia> FFTW.set_num_threads(1);R=rand(5000,5000);@time a=rfft(R);
  7.072357 seconds (99 allocations: 190.816 MB, 0.32% gc time)

julia> FFTW.set_num_threads(2);R=rand(5000,5000);@time a=rfft(R);
  3.810004 seconds (99 allocations: 190.816 MB, 0.65% gc time)
octave:5> R=rand(5000,5000); tic, a=fft2(R); toc
Elapsed time is 1.35603 seconds.
octave:6> ver
----------------------------------------------------------------------
GNU Octave Version 3.8.2
GNU Octave License: GNU General Public License
Operating System: Darwin 15.5.0 Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64 x86_64
----------------------------------------------------------------------
no packages installed.

Maybe of interest, size of libraries in /usr/local/octave/3.8.2/lib

# ls -al libfftw3f.3.dylib 
-rwxr-xr-x  1 root  admin  1678704 20 Elo  2014 libfftw3f.3.dylib

and in `/Applications/Julia-0.4.5.app/Contents/Resources/julia/lib/julia`

# ls -al libfftw3f.3.dylib
-rwxr-xr-x@ 1 jaakko  admin  6666908 18 Maa 03:12 libfftw3f.3.dylib

Used installation binaries
https://sourceforge.net/projects/octave/files/Octave%20MacOSX%20Binary/2014-09-25-Binary-of-GNU-Octave-3.8.2-for-OSX-10.9.5/
https://s3.amazonaws.com/julialang/bin/osx/x64/0.4/julia-0.4.5-osx10.7+.dmg

@loganwilliams
Author

Replacing the FFTW libraries from the Julia Mac OS X package with the FFTW libraries from the Octave Mac OS X package fixes the issue. Julia is now faster than Octave.

julia> R = rand(512,512);

julia> FFTW.set_num_threads(2);

julia> @time rfft(R);
  0.347049 seconds (390.80 k allocations: 19.633 MB, 1.41% gc time)

julia> @time rfft(R);
  0.001880 seconds (93 allocations: 2.012 MB)

julia> @time rfft(R);
  0.002026 seconds (93 allocations: 2.012 MB)

julia> @time rfft(R);
  0.003032 seconds (93 allocations: 2.012 MB, 788.36% gc time)

julia> R = rand(5000,5000);

julia> @time rfft(R);
  0.339954 seconds (99 allocations: 190.816 MB, 0.27% gc time)

julia> @time rfft(R);
  0.337303 seconds (99 allocations: 190.816 MB, 2.07% gc time)

julia> @time rfft(R);
  0.330447 seconds (99 allocations: 190.816 MB, 9.41% gc time)

@timholy
Sponsor Member

timholy commented Jun 19, 2016

It would seem interesting to know the difference in how the two libraries were compiled.

@zhmz90
Contributor

zhmz90 commented Jun 20, 2016

The speed of rfft on my machine with the default Julia FFTW libraries is close to that in @loganwilliams's post above.

julia> FFTW.set_num_threads(2);

julia> R = rand(512,512);

julia> @time rfft(R);
  0.003687 seconds (88 allocations: 2.013 MB)

julia> @time rfft(R);
  0.003386 seconds (88 allocations: 2.013 MB)

julia> @time rfft(R);
  0.003434 seconds (88 allocations: 2.013 MB)

julia> @time rfft(R);
  0.003323 seconds (88 allocations: 2.013 MB)
julia> R = rand(5000,5000);

julia> @time rfft(R);
  0.257353 seconds (93 allocations: 190.816 MB, 0.50% gc time)

julia> @time rfft(R);
  0.254915 seconds (93 allocations: 190.816 MB, 6.54% gc time)

julia> @time rfft(R);
  0.271649 seconds (93 allocations: 190.816 MB, 36.09% gc time)

julia> @time rfft(R);
  0.270884 seconds (93 allocations: 190.816 MB, 0.61% gc time)

julia> @time rfft(R);
  0.274651 seconds (93 allocations: 190.816 MB, 0.59% gc time)

julia> @time rfft(R);
  0.268775 seconds (93 allocations: 190.816 MB, 0.62% gc time)

The above test was run on a master build of Julia that is a few days old.
I updated my Julia to the latest today but the performance is similar.

julia> versioninfo()
Julia Version 0.5.0-dev+4877
Commit 02ac2b1* (2016-06-20 22:32 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, haswell)


octave:1>  R=rand(5000,5000); tic, a=fft2(R); toc
Elapsed time is 0.22766 seconds.
octave:2>  R=rand(5000,5000); tic, a=fft2(R); toc
Elapsed time is 0.201882 seconds.
octave:3>  R=rand(5000,5000); tic, a=fft2(R); toc
Elapsed time is 0.173801 seconds.
octave:4> ver
----------------------------------------------------------------------
GNU Octave Version 3.8.1
GNU Octave License: GNU General Public License
Operating System: Linux 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64
----------------------------------------------------------------------
Package Name  | Version | Installation directory
--------------+---------+-----------------------
          io  |   2.2.9 | /home/guo/octave/io-2.2.9
         mpi *|   1.1.1 | /usr/share/octave/packages/mpi-1.1.1
  statistics  |   1.2.4 | /home/guo/octave/statistics-1.2.4

@ufechner7

@zhmz90: Could you please first mention which computer (CPU, clock speed) you use, and second, also test the speed with Octave?

@zhmz90
Contributor

zhmz90 commented Jun 21, 2016

@ufechner7 I have added my test to the above post. The result shows fft2 in Octave is faster than rfft in Julia. In Julia, I called FFTW.set_num_threads(2), while in Octave I did nothing, since I am not familiar with Octave.

@ufechner7

@zhmz90: Is it a Mac or a Linux machine? If Linux, which distribution and version?

@zhmz90
Contributor

zhmz90 commented Jun 22, 2016

@ufechner7

guo@x02:~$ uname -a
Linux x02 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

@loganwilliams
Author

Who maintains the OS X package distribution?

@tkelman
Contributor

tkelman commented Jun 24, 2016

which one? exactly how did you install julia and which package distribution are you referring to?

@loganwilliams
Author

loganwilliams commented Jun 24, 2016

@tkelman Uh, the one displayed very prominently on Julia's web page: http://julialang.org/downloads/

Exact version info is in my first post.

I have resolved my personal issue by replacing the libraries with libraries from Octave's distribution, but as @timholy noted, "It would seem interesting to know the difference in how the two libraries were compiled."

@tkelman
Contributor

tkelman commented Jun 24, 2016

Just wanted to check that you weren't getting it from homebrew or similar. So that build is produced from running a complete source build on our mac buildbots. Our makefile flags for fftw can be found under deps (maybe a handful of related flags in Make.inc but I think those are mostly enabling or disabling different dependencies).

@fguevaravas

I can report similar performance issues on my machine (Mac OS X 10.11.5, 4GHz Intel Core i7). Octave is about 4 times faster than Julia to compute FFTs. As @loganwilliams suggested I copied the fftw libraries from the Octave package and this improved things. But Julia is still about 50% slower than Octave. See below for the results.

In Octave (freshly installed from: https://sourceforge.net/projects/octave/files/Octave%20MacOSX%20Binary/2016-06-06-binary-octave-4.0.2/octave_gui_402.dmg/download )

>> fftw('threads')
ans =  8
>> x = randn(5000,5000); tic; y=fft2(x); toc
Elapsed time is 0.348789 seconds.
>> x = randn(5000,5000); tic; y=fft2(x); toc
Elapsed time is 0.37518 seconds.
>> x = randn(5000,5000); tic; y=fft2(x); toc
Elapsed time is 0.369055 seconds.
>> ver
----------------------------------------------------------------------
GNU Octave Version: 4.0.2
GNU Octave License: GNU General Public License
Operating System: Darwin 15.5.0 Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64 x86_64
----------------------------------------------------------------------
no packages installed.

Julia with FFTW shipping with Julia package: https://s3.amazonaws.com/julialang/bin/osx/x64/0.4/julia-0.4.6-osx10.7+.dmg

julia> FFTW.set_num_threads(8)

julia> x = randn(5000,5000); @time y=fft(x);
  1.178509 seconds (77 allocations: 762.943 MB, 2.92% gc time)

julia> x = randn(5000,5000); @time y=fft(x);
  1.201879 seconds (76 allocations: 762.943 MB, 4.64% gc time)

julia> x = randn(5000,5000); @time y=fft(x);
  1.327285 seconds (76 allocations: 762.943 MB, 6.36% gc time)

julia> versioninfo()
Julia Version 0.4.6
Commit 2e358ce (2016-06-19 17:16 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Julia with FFTW from octave package.

julia> FFTW.set_num_threads(8)

julia> x = randn(5000,5000); @time y=fft(x);
  0.512738 seconds (76 allocations: 762.943 MB, 20.49% gc time)

julia> x = randn(5000,5000); @time y=fft(x);
  0.506314 seconds (76 allocations: 762.943 MB, 12.40% gc time)

julia> x = randn(5000,5000); @time y=fft(x);
  0.503848 seconds (76 allocations: 762.943 MB, 16.77% gc time)

@raminammour
Contributor

If it helps anyone, I had the same issue and I copied the fftw3* library files from julia 0.4.5 to julia 0.4.6 and recovered similar runtime to Matlab.

@ranjanan
Contributor

ranjanan commented Aug 2, 2016

I have a related question: isn't Julia starting up only with 1 FFTW thread by default? If so, why is that done when OpenBLAS is made to start with multiple threads?

@stevengj
Member

On MacOS with Julia master built from source, I get

julia> unsafe_string(cglobal((:fftw_cc, FFTW.libfftw), UInt8))
"clang -stdlib=libc++ -mmacosx-version-min=10.7 -m64  -O3 -fomit-frame-pointer -mtune=native -fstrict-aliasing -fno-schedule-insns -ffast-math"

julia> unsafe_string(cglobal((:fftw_codelet_optim, FFTW.libfftw), UInt8))
""

which looks okay.

@stevengj
Member

With the official Julia 0.4.6 binary on MacOS, I get

julia> bytestring(cglobal((:fftw_cc, FFTW.libfftw), UInt8))
"clang -stdlib=libc++ -mmacosx-version-min=10.7 -march=core2 -integrated-as -m64  -I/usr/local/include"

julia> bytestring(cglobal((:fftw_codelet_optim, FFTW.libfftw), UInt8))
""

which looks like zero compiler optimizations, which will definitely hurt performance.

@tkelman
Contributor

tkelman commented Aug 23, 2016

This is the same issue as #17751 (comment). The mac buildbot has a whole bunch of profile scripts that set CFLAGS and a number of other environment variables that are overriding things. It's potentially a fixable bug in various upstream build systems as in JuliaMath/openlibm#142, but that's way messier to do for all of our upstreams than to remove the profile scripts.

@stevengj
Member

(Here is how FFTW picks its cflags: https://github.com/FFTW/fftw3/blob/master/m4/ax_cc_maxopt.m4)

@tkelman
Contributor

tkelman commented Aug 23, 2016

Though the lack of optimization flags due to buildbot misconfiguration is a separate issue from the current title of

Enable FFTW threading by default

it is contributing to the difference in performance even when using the same number of threads.

@ufechner7

So a second issue should be created, like "Lack of optimization of FFTW due to buildbot misconfiguration"

@ViralBShah
Member

Or just change the title of this issue?

@tkelman
Contributor

tkelman commented Aug 24, 2016

Enable FFTW threading by default

Is a valid, but separate, issue to the buildbot flags problem. The latter should now be resolved, I believe.

@goszlanyi

It is not only MacOS.
Both Win64 and generic Linux precompiled binaries (version 0.4.5) return:

julia> bytestring(cglobal((:fftw_codelet_optim, FFTW.libfftw), UInt8))
""

@stevengj
Member

@GaborOszlanyi, that just means that the codelets are compiled with the same flags as the rest of FFTW, which is not necessarily a problem. Look at bytestring(cglobal((:fftw_cc, FFTW.libfftw), UInt8)).

@goszlanyi

fftw_cc is OK in both cases.
Sorry for the misunderstanding.

@ufechner7

I would still like to create a second issue, because these are two different issues:
a) enabling multithreading by default; this needs a decision
b) lack of optimization of FFTW due to buildbot misconfiguration; this just needs to be fixed, and perhaps backported to 0.4 and 0.5
If nobody opposes my proposal, I will open a second issue.

@tkelman
Contributor

tkelman commented Aug 24, 2016

@ufechner7 the second issue was already fixed, and does not require any changes in this repository.

@ufechner7

So issue b) will be fixed in the next binary releases of 0.4 and 0.5? That would be nice.

You write that fixing the buildbots "does not require any changes in this repository". Is there a separate repository for the buildbots?

@staticfloat
Sponsor Member

Yep, the main one is wittily named julia-buildbot

@tkelman
Contributor

tkelman commented Aug 25, 2016

issue b) will be fixed in the next binary releases of 0.4 and 0.5

Should be, if the fix was complete and correct. Once we resolve #18079 it should also be testable with 0.6-dev nightlies.

@raminammour
Contributor

As of yesterday the 0.4.6 Linux 64 binaries (julia-2e358ce975) shipped with the slow fftw libraries, which looked the same as the ones in 0.5 binaries.

As I noted before, the binaries from 0.4.5 (2ac304d) are good; using those in 0.4.6 buys you a factor of 5 speedup.

Cheers!

@ViralBShah
Member

@tkelman Is this something we can fix in 0.5.x?

@tkelman
Contributor

tkelman commented Aug 26, 2016

Are you asking about multithreading, or are you asking about the other issue which now has its own #18245?

Enabling threading by default is a bit too much of a behavior change to backport, I think.

@ViralBShah
Member

I thought #18245 was referring here for the fix for single-threaded perf.

I don't think we should backport threading by default, but we should probably do it on master sooner rather than later for 0.6.

@tkelman
Contributor

tkelman commented Aug 26, 2016

Let's keep the discussions separate from now on. This issue is titled

Enable FFTW threading by default

and should stay focused on that going forward if we can.

@jongwook

jongwook commented Sep 7, 2016

Just a side question to @tkelman out of curiosity: will FFTW move to a package in favor of other FFT implementations like MKL? Is it related to the GPL licensing of FFTW? Thanks 😃

@ViralBShah ViralBShah added this to the 0.6.0 milestone Sep 10, 2016
@tkelman tkelman removed this from the 0.6.0 milestone Jan 5, 2017
@tkelman
Contributor

tkelman commented Jul 21, 2017

This issue should be reopened on FFTW.jl

@tkelman tkelman closed this as completed Jul 21, 2017
@ViralBShah
Member

FFTW integration with julia's partr threads: JuliaMath/FFTW.jl#105
