Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Improve fftn performance #10490

Merged
merged 1 commit into from Jul 19, 2019

Conversation

@peterbell10
Copy link
Member

commented Jul 19, 2019

What does this implement/fix?

fftpack's _init_nd_shape_and_axes is quite slow due to it's use of very small length numpy arrays. Moving over to list comprehensions gives a nice speed improvement (except possibly for very high dimensional transforms).

I put this in _pocketfft instead of modifying fftpack because the fftpack users of this function assume the return value is an array which _pocketfft doesn't.

Additional information

This simple benchmark shows the improvements: up to 40 us (98%!) faster in some cases, particularly those involving None:

In [1]: from scipy.fft._pocketfft.basic import _init_nd_shape_and_axes as pocket_init
In [2]: from scipy.fftpack.helper import _init_nd_shape_and_axes as pack_init
In [3]: trials = [(None, None), (None, (1,)), ((10, 20, 30), None), ((10, 20, 30), (2, 1, 0))]
In [4]: import numpy as np; x = np.random.random((128, 64, 64))
In [5]: for s, a in trials:
...:     print('Trial fttpack: ', s, a)
...:     %timeit pack_init(x, s, a)
...:     print('Trial pocketfft: ', s, a)
...:     %timeit pocket_init(x, s, a)
...:
Trial fttpack:  None None
43.4 µs ± 2.69 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Trial pocketfft:  None None
930 ns ± 69.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Trial fttpack:  None (1,)
37.8 µs ± 2.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Trial pocketfft:  None (1,)
6.23 µs ± 332 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Trial fttpack:  (10, 20, 30) None
40.6 µs ± 3.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Trial pocketfft:  (10, 20, 30) None
6.72 µs ± 398 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Trial fttpack:  (10, 20, 30) (2, 1, 0)
35.7 µs ± 3.08 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Trial pocketfft:  (10, 20, 30) (2, 1, 0)
13.6 µs ± 572 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

@peterbell10 peterbell10 force-pushed the peterbell10:nd-shape-axes branch from 433b5ea to 901e88b Jul 19, 2019

@grlee77
Copy link
Contributor

left a comment

This is a good change. I had noticed in the past that calling fftn on small 1D arrays was tens of microseconds slower than calling fft, but hadn't looked into the cause.

To avoid duplication of the logic, can't you have scipy.fftpack.helper._init_nd_shape_and_axes just wrap this new version and cast the outputs to ndarray?

@peterbell10

This comment has been minimized.

Copy link
Member Author

commented Jul 19, 2019

To avoid duplication of the logic, can't you have scipy.fftpack.helper._init_nd_shape_and_axes just wrap this new version and cast the outputs to ndarray?

I'm trying to avoid circular dependecies between the two. However, I suppose if I just move the function into fftpack then we can have it both ways.

ENH: Improve fftn performance
This function is a bad use case for numpy arrays because the sizes are normally
very small. Moving over to list comprehensions gives a nice speed improvement.

@peterbell10 peterbell10 force-pushed the peterbell10:nd-shape-axes branch from ccd1b0a to 9c5e335 Jul 19, 2019

@larsoner

This comment has been minimized.

Copy link
Member

commented Jul 19, 2019

I'm trying to avoid circular dependecies between the two. However, I suppose if I just move the function into fftpack then we can have it both ways.

The nicer way to separate it I think would be to do a nested import in the fftpack code.

That way we stick as close as possible to the idea that "still maintained/improved code" lives in scipy.fft and legacy code lives in scipy.fftpack. Moreover, eventually hopefully someday soon importing scipy.fft won't require import anything from scipy.fftpack.

@peterbell10

This comment has been minimized.

Copy link
Member Author

commented Jul 19, 2019

The nicer way to separate it I think would be to do a nested import in the fftpack code.

By that do you mean just moving the imports inside functions? In that case I would say there's still a circular dependency; it just doesn't manifest itself as import issues.

eventually hopefully someday soon importing scipy.fft won't require import anything from scipy.fftpack.

If we get dct/dst into _pocketfft (#10493) then fftpack can depend on fft without issue, so hopefully we can fix this very soon.

@larsoner

This comment has been minimized.

Copy link
Member

commented Jul 19, 2019

Ahh okay, we can take care of it later then.

@larsoner

This comment has been minimized.

Copy link
Member

commented Jul 19, 2019

Thanks @peterbell10

@larsoner larsoner merged commit 2f34eaf into scipy:master Jul 19, 2019

9 checks passed

ci/circleci: build_docs Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
scipy.scipy Build #20190719.8 succeeded
Details
scipy.scipy (Linux_Python_36_32bit_full) Linux_Python_36_32bit_full succeeded
Details
scipy.scipy (Windows Python35-64bit-full) Windows Python35-64bit-full succeeded
Details
scipy.scipy (Windows Python36-32bit-full) Windows Python36-32bit-full succeeded
Details
scipy.scipy (Windows Python36-64bit-full) Windows Python36-64bit-full succeeded
Details
scipy.scipy (Windows Python37-64bit-full) Windows Python37-64bit-full succeeded
Details

@peterbell10 peterbell10 deleted the peterbell10:nd-shape-axes branch Jul 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.