
Disable FFT cache #12307

Closed
wants to merge 1 commit into from

Conversation

@mreineck (Contributor) commented Jun 4, 2020

This addresses #12297 by disabling the internal plan cache of the FFT module completely.
While this may sound like a drastic solution, I'm pretty sure the performance consequences will be minor.
1D FFTs will become somewhat slower, but they have never been a strong point of pocketfft anyway. Users who require really high 1D FFT performance are better served with pyFFTW.
(Note, however, that for FFTs which are executed only once, the difference is pretty small.)
Multi-D FFTs will not suffer noticeably from a disabled plan cache, and I expect that most CPU time spent in scipy FFT calls will be inside multi-D transforms.

This is just a proposal for a simple fix to the problem raised in the issue. I'm sure there are more efficient approaches, but they will most likely be nontrivial, so maybe we should just apply this for the time being.

@larsoner (Member) commented Jun 4, 2020

Rather than disable it entirely, I'd rather at least make it opt-in. For example, start with a cache size / number of elements of zero, but allow changing this with some public function to be larger.

@peterbell10 any ideas?
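
For illustration, a minimal sketch of what such an opt-in knob could look like (everything here is hypothetical; scipy.fft exposes no such function today):

```python
# Hypothetical sketch of an opt-in plan cache; not an existing scipy API.
import threading
from collections import OrderedDict

class _PlanCache:
    """LRU cache for FFT plans, disabled by default (max_size == 0)."""

    def __init__(self, max_size=0):
        self._lock = threading.Lock()
        self._max_size = max_size
        self._plans = OrderedDict()   # e.g. (length, dtype) -> plan

    def set_size(self, max_size):
        with self._lock:
            self._max_size = max_size
            while len(self._plans) > max_size:
                self._plans.popitem(last=False)   # evict oldest entry

    def get(self, key, build_plan):
        with self._lock:
            if key in self._plans:
                self._plans.move_to_end(key)      # mark as recently used
                return self._plans[key]
        plan = build_plan()                       # build outside the lock
        with self._lock:
            if self._max_size > 0:
                while len(self._plans) >= self._max_size:
                    self._plans.popitem(last=False)
                self._plans[key] = plan
        return plan

_cache = _PlanCache()   # starts at size 0, i.e. caching is opt-in

def set_plan_cache_size(n):
    """Public knob: 0 (the proposed default) disables caching entirely."""
    _cache.set_size(n)
```

An actual implementation would presumably have to route this through the C++ pocketfft layer where the plan cache actually lives, but the shape of the interface would be similar.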

@mreineck (Contributor, Author) commented Jun 4, 2020

> Rather than disable it entirely, I'd rather at least make it opt-in. For example, start with a cache size / number of elements of zero, but allow changing this with some public function to be larger.

I'm not necessarily disagreeing, but there may be unexpected complications: for example, scipy.fft may be used twice in the same Python job by completely unrelated modules. The second one to configure the FFT cache will overwrite the choices made by the first one. This could be quite a can of worms.
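
To make the hazard concrete, reusing the hypothetical set_plan_cache_size() knob sketched above:

```python
# Hypothetical illustration of the global-state problem; scipy.fft has no
# set_plan_cache_size() function today.

# library_a, tuned for many repeated 1D transforms of the same length:
set_plan_cache_size(128)

# library_b, imported later in the same process, trying to bound memory:
set_plan_cache_size(0)   # silently throws away library_a's choice (and plans)
```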

@peterbell10 (Member) commented:

Can we see the benchmark results for a range of different 1D FFT sizes? I expect smaller transforms benefit the most anyway, so only caching transforms smaller than a fixed size might be better than disabling the cache completely.
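
A sweep along these lines would do (sketch; run it once on master and once with this patch applied, then compare):

```python
# Sketch: time repeated 1D transforms over a range of sizes, so the effect
# of the plan cache (which only helps on repeats) is visible.
import timeit
import numpy as np
import scipy.fft

for n in [64, 256, 1024, 4096, 16384, 65536]:
    x = np.random.rand(n)
    t = timeit.timeit(lambda: scipy.fft.fft(x), number=1000)
    print(f"n={n:6d}: {1e6 * t / 1000:8.2f} us per transform")
```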

@mreineck (Contributor, Author) commented Jun 4, 2020

> Can we see the benchmark results for a range of different 1D FFT sizes?

Sorry, I don't have an environment available at the moment where I can easily build scipy. I do my own benchmarking and testing with my standalone FFT package, but this isn't representative, since it has evolved quite a bit.

@peterbell10 (Member) commented:

I've benchmarked this and it does consistently slow down 1D transforms by 15-25%, unfortunately. There also doesn't seem to be much of a length dependence here, so setting a maximum plan size to cache would still significantly hurt 1D transform performance.

User-configurable cache sizes could be a good option, and I think could be done without too much extra code.

@larsoner (Member) commented Jul 7, 2020

> This could be quite a can of worms.

Agreed it is a bit of a can of worms, but if we can help speed up the code of 95% of users (I think some reasonable, enabled-by-default caching would do this) and give the remaining 5% a way to disable or configure things, I think we should.

> User-configurable cache sizes could be a good option, and I think could be done without too much extra code.

Sounds good to me. Not sure what the default cache size should be (unlimited as it is now? some percentage of the available memory?). But in any case, people wanting no caching would just set the cache size to zero, saving them from any potential can-of-worms (and known memory) problems. Some way to clear the cache but keep it the same size might be nice, though I guess setting it to zero then back to a non-zero value would accomplish this, so maybe an explicit clearing function is not needed.
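
With a knob like the hypothetical set_plan_cache_size() sketched above, that workflow would be:

```python
# Hypothetical usage; scipy.fft exposes no such public function today.
import scipy.fft

scipy.fft.set_plan_cache_size(0)    # opt out of caching entirely

# "clear but keep the size": shrink to zero, then restore
scipy.fft.set_plan_cache_size(0)
scipy.fft.set_plan_cache_size(16)
```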

@mreineck (Contributor, Author) commented Jul 7, 2020

> I've benchmarked this and it does consistently slow down 1D transforms by 15-25% unfortunately.

This is much less of a slowdown than I'd expected. My personal choice would definitely be to have no cache in this situation, given the tradeoff between additional complexity (both in the implementation and the user interface) and the potential gains. The decision is of course entirely up to you (i.e. the scipy developers); I just wanted to give my 2ct.

(As an anecdote: before the C version of pocketfft was integrated, numpy had an FFT cache, which actually made short 1D FFTs slower by integral factors, because the cache lookup was so slow. Nobody complained about this - but people will complain about machines starting to swap. numpy doesn't cache FFTs any more ... :)

@peterbell10 (Member) commented:

At the end of the day it comes down to the user's use case. If they are using multi-dimensional transforms, then the cache is probably no more than a few % performance gain, so dropping the cache would be fine. If their use case is lots of repeated 1D transforms, then a 15-25% performance gain is IMO still valuable, though obviously not as drastic as the likes of FFTW's plans.

@mreineck (Contributor, Author) commented Jan 3, 2022

I think this fell victim to the master -> main transition. For me it's fine to leave it closed; whenever someone notices high memory consumption, they can still be pointed to this PR for a potential way of solving it.

Once scipy allows C++17 extensions, I'll be able to offer a better alternative :-)

@rgommers (Member) commented Jan 3, 2022

Sounds good, thanks @mreineck

> Once scipy allows C++17 extensions, I'll be able to offer a better alternative :-)

Maybe in one or two years - C++14 is okay now. Blockers for C++17 are documented in http://scipy.github.io/devdocs/dev/toolchain.html#c-language-standards

@mreineck (Contributor, Author) commented Jan 3, 2022

Thanks for the link!
I just double-checked, and unfortunately I cannot really downgrade my code to C++14 ... the main problem is the usage of if constexpr.
Even if potential inclusion is quite some time away, do you think it makes sense to open an issue (or even a WIP PR) now to discuss the technical details, play around with benchmarks, etc.?

@rgommers (Member) commented Jan 3, 2022

> Even if potential inclusion is quite some time away, do you think it makes sense to open an issue (or even a WIP PR) now to discuss the technical details, play around with benchmarks, etc.?

That's a question for @peterbell10 I think.
