
Disable FFT cache #12307

Closed
wants to merge 1 commit into from

Conversation

@mreineck (Contributor) commented Jun 4, 2020

This addresses #12297 by disabling the internal plan cache of the FFT module completely.
While this may sound like a drastic solution, I'm pretty sure the performance consequences will be minor.
1D FFTs will become somewhat slower, but they have never been a strong point of pocketfft anyway. Users who require really high 1D FFT performance are better served with pyFFTW.
(Note, however, that for FFTs which are executed only once, the difference is pretty small.)
Multi-D FFTs will not suffer noticeably from a disabled plan cache, and I expect that most CPU time spent in scipy FFT calls will be inside multi-D transforms.

This is just a proposal for a simple fix to the problem raised in the issue. I'm sure there are more efficient approaches, but they will most likely be nontrivial, so maybe we should just apply this for the time being.

@larsoner (Member) commented Jun 4, 2020

Rather than disable it entirely, I'd rather at least make it opt-in. For example, start with a cache size / number of elements of zero, but allow changing this with some public function to be larger.

@peterbell10 any ideas?
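
For illustration, a minimal sketch of what such an opt-in knob could look like (everything here is hypothetical; scipy.fft exposes no such function today):

```python
# Hypothetical sketch of an opt-in plan cache; not an existing scipy API.
import threading
from collections import OrderedDict

class _PlanCache:
    """LRU cache for FFT plans, disabled by default (max_size == 0)."""

    def __init__(self, max_size=0):
        self._lock = threading.Lock()
        self._max_size = max_size
        self._plans = OrderedDict()   # e.g. (length, dtype) -> plan

    def set_size(self, max_size):
        with self._lock:
            self._max_size = max_size
            while len(self._plans) > max_size:
                self._plans.popitem(last=False)   # evict oldest entry

    def get(self, key, build_plan):
        with self._lock:
            if key in self._plans:
                self._plans.move_to_end(key)      # mark as recently used
                return self._plans[key]
        plan = build_plan()                       # build outside the lock
        with self._lock:
            if self._max_size > 0:
                while len(self._plans) >= self._max_size:
                    self._plans.popitem(last=False)
                self._plans[key] = plan
        return plan

_cache = _PlanCache()   # starts at size 0, i.e. caching is opt-in

def set_plan_cache_size(n):
    """Public knob: 0 (the proposed default) disables caching entirely."""
    _cache.set_size(n)
```

An actual implementation would presumably have to route this through the C++ pocketfft layer where the plan cache actually lives, but the shape of the interface would be similar.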

@mreineck (Contributor, Author) commented Jun 4, 2020

> Rather than disable it entirely, I'd rather at least make it opt-in. For example, start with a cache size / number of elements of zero, but allow changing this with some public function to be larger.

I'm not necessarily disagreeing, but there may be unexpected complications: for example, scipy.fft may be used twice in the same Python job by completely unrelated modules. The second one to configure the FFT cache will overwrite the choices made by the first one. This could be quite a can of worms.
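
To make the hazard concrete, reusing the hypothetical set_plan_cache_size() knob sketched above:

```python
# Hypothetical illustration of the global-state problem; scipy.fft has no
# set_plan_cache_size() function today.

# library_a, tuned for many repeated 1D transforms of the same length:
set_plan_cache_size(128)

# library_b, imported later in the same process, trying to bound memory:
set_plan_cache_size(0)   # silently throws away library_a's choice (and plans)
```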

@peterbell10 (Member) commented:

Can we see the benchmark results for a range of different 1D FFT sizes? I expect smaller transforms benefit the most anyway, so only caching transforms smaller than a fixed size might be better than disabling the cache completely.
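
A sweep along these lines would do (sketch; run it once on master and once with this patch applied, then compare):

```python
# Sketch: time repeated 1D transforms over a range of sizes, so the effect
# of the plan cache (which only helps on repeats) is visible.
import timeit
import numpy as np
import scipy.fft

for n in [64, 256, 1024, 4096, 16384, 65536]:
    x = np.random.rand(n)
    t = timeit.timeit(lambda: scipy.fft.fft(x), number=1000)
    print(f"n={n:6d}: {1e6 * t / 1000:8.2f} us per transform")
```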

@mreineck (Contributor, Author) commented Jun 4, 2020

> Can we see the benchmark results for a range of different 1D FFT sizes?

Sorry, I don't have an environment available at the moment where I can easily build scipy. I do my own benchmarking and testing with my standalone FFT package, but this isn't representative, since it has evolved quite a bit.

@peterbell10 (Member) commented:

I've benchmarked this and it does consistently slow down 1D transforms by 15-25%, unfortunately. There also doesn't seem to be much of a length dependence here, so setting a maximum plan size to cache would still significantly hurt 1D transform performance.

User-configurable cache sizes could be a good option, and I think could be done without too much extra code.

@larsoner (Member) commented Jul 7, 2020

> This could be quite a can of worms.

Agreed it is a bit of a can of worms, but if we can help speed up the code of 95% of users (I think some reasonable, enabled-by-default caching would do this) and give the remaining 5% a way to disable or configure things, I think we should.

> User-configurable cache sizes could be a good option, and I think could be done without too much extra code.

Sounds good to me. Not sure what the default cache size should be (unlimited as it is now? some percentage of the available memory?). But in any case, people wanting no caching would just set the cache size to zero, saving them from any potential can-of-worms (and known memory) problems. Some way to clear the cache but keep it the same size might be nice, though I guess setting it to zero then back to a non-zero value would accomplish this, so maybe an explicit clearing function is not needed.
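
With a knob like the hypothetical set_plan_cache_size() sketched above, that workflow would be:

```python
# Hypothetical usage; scipy.fft exposes no such public function today.
import scipy.fft

scipy.fft.set_plan_cache_size(0)    # opt out of caching entirely

# "clear but keep the size": shrink to zero, then restore
scipy.fft.set_plan_cache_size(0)
scipy.fft.set_plan_cache_size(16)
```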

@mreineck (Contributor, Author) commented Jul 7, 2020

> I've benchmarked this and it does consistently slow down 1D transforms by 15-25% unfortunately.

This is much less of a slowdown than I'd expected. My personal choice would definitely be to have no cache in this situation, given the tradeoff between additional complexity (both in the implementation and the user interface) and the potential gains. The decision is of course entirely up to you (i.e. the scipy developers); I just wanted to give my 2ct.

(As an anecdote: before the C version of pocketfft was integrated, numpy had an FFT cache, which actually made short 1D FFTs slower by integral factors, because the cache lookup was so slow. Nobody complained about this - but people will complain about machines starting to swap. numpy doesn't cache FFTs any more ... :)

@peterbell10 (Member) commented:

At the end of the day it comes down to the user's use case. If they are using multi-dimensional transforms, then the cache is probably no more than a few % performance gain, so dropping the cache would be fine. If their use case is lots of repeated 1D transforms, then a 15-25% performance gain is IMO still valuable, though obviously not as drastic as the likes of FFTW's plans.

@mreineck (Contributor, Author) commented Jan 3, 2022

I think this fell victim to the master -> main transition. For me it's fine to leave it closed; whenever someone notices high memory consumption, they can still be pointed to this PR for a potential way of solving it.

Once scipy allows C++17 extensions, I'll be able to offer a better alternative :-)

@rgommers (Member) commented Jan 3, 2022

Sounds good, thanks @mreineck

> Once scipy allows C++17 extensions, I'll be able to offer a better alternative :-)

Maybe in one or two years - C++14 is okay now. Blockers for C++17 are documented in http://scipy.github.io/devdocs/dev/toolchain.html#c-language-standards

@mreineck (Contributor, Author) commented Jan 3, 2022

Thanks for the link!
I just double-checked, and unfortunately I cannot really downgrade my code to C++14 ... the main problem is the usage of if constexpr.
Even if potential inclusion is quite some time away, do you think it makes sense to open an issue (or even a WIP PR) now to discuss the technical details, play around with benchmarks, etc.?

@rgommers (Member) commented Jan 3, 2022

> Even if potential inclusion is quite some time away, do you think it makes sense to open an issue (or even a WIP PR) now to discuss the technical details, play around with benchmarks, etc.?

That's a question for @peterbell10 I think.
