scipy.fft: Add a `plan` argument #10302
I've implemented a simple cache with a FIFO eviction policy. `get_plan` function:

```cpp
#include <cstddef>
#include <memory>
#include <mutex>

// Return a cached plan for the given length, creating and caching one if
// needed. Thread-safe; evicts the oldest entry (FIFO) when the cache is full.
template <typename T, size_t size = 4>
std::shared_ptr<T> get_plan(size_t len)
{
    static std::shared_ptr<T> cache[size];
    static size_t keys[size] = {0};
    static size_t head = 0;
    static std::mutex mut;
    {
        std::lock_guard<std::mutex> lock(mut);
        for (size_t i = 0; i < size; ++i)
        {
            if (keys[i] == len)
            {
                if (cache[i] != nullptr)
                    return cache[i];
                break;
            }
        }
    }
    // Plan construction is expensive, so do it without holding the lock.
    auto plan = std::make_shared<T>(len);
    {
        std::lock_guard<std::mutex> lock(mut);
        // If someone else beat us to it, use theirs
        for (size_t i = 0; i < size; ++i)
        {
            if (keys[i] == len)
            {
                if (cache[i] != nullptr)
                    return cache[i];
                break;
            }
        }
        keys[head] = len;
        cache[head] = plan;
        head++;
        if (head >= size)
            head = 0;
    }
    return plan;
}
```
Benchmarks: without cache vs. with cache (timing results not captured in this extract).
A flexible API would be: …

@peterbell10 WDYT, both in terms of usability and in terms of compatibility with the PyFFTW / PyCUDA APIs?
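The concrete proposal was lost in extraction; as a rough, hypothetical reconstruction (all names are assumptions, guided only by the "auto" case mentioned later in the thread), a flexible `plan` argument might distinguish three kinds of values:

```python
# Hypothetical sketch of how a `plan` argument could be resolved; `Plan`,
# `resolve_plan`, and `_plan_cache` are illustrative names, not SciPy API.

_plan_cache = {}  # length -> Plan; stand-in for a real internal plan cache


class Plan:
    """Stand-in for a precomputed plan for a 1-D transform of length n."""
    def __init__(self, n):
        self.n = n  # a real plan would hold twiddle factors, etc.


def resolve_plan(n, plan="auto"):
    """Resolve the `plan` argument for a length-n transform.

    plan=None    -> plan from scratch, no caching
    plan="auto"  -> use the internal plan cache
    plan=Plan    -> use a user-supplied precomputed plan
    """
    if plan is None:
        return Plan(n)
    if plan == "auto":
        return _plan_cache.setdefault(n, Plan(n))
    if isinstance(plan, Plan):
        return plan
    raise ValueError("plan must be None, 'auto', or a Plan instance")
```

With `plan="auto"`, repeated transforms of the same length reuse one cached plan, while `plan=None` preserves today's plan-per-call behavior.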
I think having that level of customizability is good. An alternative API would be to have something like #4607, where the plan is an object that is itself callable.
As far as compatibility with pyFFTW goes, I can't see any problems. For CuPy, there doesn't seem to be any support for plan objects, but it has some internal cache. I also believe the underlying CUDA library mimics the FFTW interface, so it should be possible.
Actually, I found that CuPy already supports a `plan` argument. I also noticed that these are symmetrical plans (i.e. …).
Yes, the CuPy core devs did not want to deviate from the NumPy API in the … module. Also, see cupy/cupy#1936, which discusses adding R2C and C2R support to the n-dimensional planning.
@peterbell10 this does indeed seem worth adding. Maybe an LRU instead of a FIFO, though, assuming there is not a big performance hit from adding it? Does the API I proposed above make the most sense to you? Or do you prefer the callable approach from #4607 (I haven't looked closely at it)?
Sure, I only wrote it as a FIFO for the sake of quick prototyping. I wouldn't expect the LRU bookkeeping to add a significant performance overhead.

I think both are reasonable and I don't have any real preference. The callable approach is similar to pyfftw's FFTW objects, and your functional API is quite similar to CuPy's approach. Perhaps @grlee77 will have some insight, as it sounds like he's been involved in both of those?
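For illustration, the LRU bookkeeping being discussed can be sketched in a few lines of pure Python (names hypothetical; the real cache would live alongside `get_plan` in C++):

```python
from collections import OrderedDict


class LRUPlanCache:
    """Keep the `maxsize` most recently used plans, evicting the oldest.

    Illustrative sketch only: plans are keyed by transform length, and
    `make_plan` is whatever factory builds a plan for that length.
    """

    def __init__(self, maxsize=4):
        self.maxsize = maxsize
        self._plans = OrderedDict()  # length -> plan, in recency order

    def get(self, length, make_plan):
        if length in self._plans:
            self._plans.move_to_end(length)  # mark as most recently used
            return self._plans[length]
        plan = make_plan(length)
        self._plans[length] = plan
        if len(self._plans) > self.maxsize:
            self._plans.popitem(last=False)  # evict least recently used
        return plan
```

Unlike the FIFO version above, a cache hit refreshes the entry, so lengths that are used repeatedly are never the ones evicted.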
I think the API Eric suggested is good. pyFFTW has "builders" which return a callable like in #4607, but it also has plan caching that can be used by the scipy and numpy interfaces and works basically like the "auto" case proposed in Eric's API.

I started a branch with a CUFFT plan cache for CuPy (see https://github.com/grlee77/cupy/commits/cufft_cache), but haven't started a PR for it. It uses a key-based dictionary and tracks the total amount of memory used by the work arrays currently in the cache, so that the user could potentially set an upper memory limit for the cache. If I recall correctly, pocketfft does not keep large work arrays allocated within the plans, so I don't think similar memory-limit concerns would apply here.
This should help. Also, for users who are really concerned with memory, they can always use ….

So @peterbell10, I think if you are satisfied with the API I proposed above and think it's sufficiently future-compatible with the backend system we want to build, it's probably worth opening a PR to add this feature.
... also, I noticed that https://13652-1460385-gh.circle-artifacts.com/0/html-scipyorg/roadmap.html#fourier-transform-enhancements now needs to be updated :)
It's correct that it doesn't store the work arrays, but it stores the full twiddle factors, which usually have almost the size of the array to be transformed, and in some cases (Bluestein!) several times the array size. So the memory consumption is not unacceptably high, but you can still swamp your computer if you run a few very large 1D FFTs with different lengths.
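To put back-of-envelope numbers on this (assuming complex128 twiddle factors roughly the size of the input, as described above; `twiddle_bytes` is an illustrative helper, not part of any library):

```python
def twiddle_bytes(n, factor=1.0):
    """Rough twiddle-factor memory for a cached length-n plan.

    Assumes storage comparable to the input array (complex128 = 16 bytes),
    scaled by `factor` for cases like Bluestein that store several copies.
    """
    return int(n * 16 * factor)


# A single cached plan for a length-2**26 transform holds ~1 GiB of twiddles:
print(twiddle_bytes(2**26) / 2**30)            # -> 1.0
# A Bluestein-style plan at several times the array size is worse:
print(twiddle_bytes(2**26, factor=4) / 2**30)  # -> 4.0
```

So even a small FIFO/LRU cache of a handful of very large, distinct lengths can pin several GiB of memory.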
One aspect missing from the API is plan creation: …
I would say a single one, with a transform type.

I would say as specific as the args to ….
We cache now, so I'll close this.
What about pre-planned transforms?
Feel free to reopen and re-title, or open a separate issue for that. We can also wait to implement it until someone wants it.
Reopening as ….
Hello guys, while the discussion of supporting a … is still ongoing: what if we add a keyword-only argument …?
@leofang adding that kwarg and behavior to …
I would think it is fine as long as it is not declared deprecated. For the proposed change, there is no new feature implemented or change of behavior, after all. I do not insist on touching this module, though.

We could say something in the docs along the lines of "this argument is reserved for FFT backend providers and is not used by SciPy itself."
I don't understand this. I thought it is obvious in the FFT community that when a plan is supplied you should get the same result, just faster.

Alternatively, we may also just close this issue so as to guarantee that SciPy will not add the argument ….
@peterbell10 @larsoner Any comments? Thanks.
If all goes well then, yes, you should just get the same FFT computation but hopefully faster. I'm more concerned about the edge cases where the FFT can or cannot be computed. How should errors be reported? Are the edge cases the same on all implementations? etc. These types of details can be important if we are keeping the interface generic.

That being said, I suppose since you aren't proposing SciPy add a …
Thanks Peter! I agree these are legit concerns. But I also think if we're gonna raise …
That's right. I am only proposing a (placeholder) consumer API. The producer APIs like …
I added this as an alternative solution in my earlier post. It could be the simplest and fastest way out.
I'm okay with adding a ….

That both provides an API for other packages to follow, and leaves the door open for us to eventually allow passing plans via ….
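Concretely, the "reserve the keyword" idea being discussed could look something like the following (a sketch only, not the actual SciPy implementation; the signature apart from `plan` mirrors the existing `scipy.fft.fft` parameters):

```python
def fft(x, n=None, axis=-1, norm=None, *, plan=None):
    """Sketch of reserving a keyword-only `plan` argument.

    SciPy's own implementation would not consume `plan`; it is reserved
    for backend providers, so any non-None value is rejected here.
    """
    if plan is not None:
        raise NotImplementedError(
            "Passing a precomputed plan is not supported by SciPy's own "
            "FFT implementation; `plan` is reserved for backend providers.")
    # ... the pocketfft-based computation would go here ...
    return list(x)  # placeholder result for this sketch
```

Existing callers are unaffected (the default is `None`), while backends registered for `scipy.fft` are free to give `plan` a real meaning.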
Thanks Eric! One last question to both of you before I start drafting a PR: do we want to touch the `scipy.fftpack` module as well?
fftpack is deprecated and we shouldn't change it anymore. Just wait to hear from @peterbell10 that he's okay with this plan before making a PR.
As far as I recall, it is not officially deprecated, because eventually removing it would break too much code.
Okay, just reserving a keyword with … then.

@mreineck I agree. I believe the phrasing used was "long-term legacy", not deprecated. The intention being that it won't be removed but will also not receive new features. With that in mind, it should be left as-is, without a `plan` argument.
Thanks a lot to you all, @peterbell10 @larsoner @mreineck @grlee77, for your comments and help! 🙏
See #10238 (comment)
`scipy.fft` currently lacks any plan caching. For repeated transforms, this does a significant amount of duplicate work and makes `scipy.fft` slower than `scipy.fftpack` for repeated regular-sized FFTs. (For one-off FFTs, pocketfft is still much faster.)

Things to look into/discuss: …