-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for #3596: Make fftconvolve threadsafe #3647
Conversation
Maybe instead of duplicating the |
The problem with that is that it would force even single-threaded code to use the slower FFT routines, even though they don't need to, so it would make fftconvolve slower in general, with all current versions of NumPy.. The advantage of doing it this way is that it uses the faster RFFT routines whenever possible, only falling back to the complex FFT routines if it actually does encounter multiple threads running at the same time. It should therefore be about the same speed as before for all single-threaded programs. |
ret = ret.real | ||
if _rfft_mt_safe: | ||
ret = irfftn(rfftn(in1, fshape) * | ||
rfftn(in2, fshape), fshape)[fslice].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be followed by ret = ret.real
? (purpose is to strip away negligible imaginary parts?) I don't know why irfftn would return anything but real, but it was there in the original...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll admit this omission was unintentional, but it turns out that since we're doing an inverse real-FFT transform, the output is already a real array by definition, so that step is redundant anyway (and didn't need to be in the original)..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I guess that's my fault: 5567c9c It previously existed to strip away negligible imaginary parts from ifftn, but irfftn output should always be real already, so this line isn't needed with irfftn at all.
In other words, I think this is equivalent, but doesn't duplicate the same operation in multiple places?
I don't know if the 2 irfftn calls can be similarly combined, though, because of the |
So it took some doing but I think I've figured out a way to consolidate the duplicated code without making things entirely impossible to follow.. let me know what you think of the updated version.. |
Looks good to me |
While you're changing things, I don't see any reason for the Also there's a completely inconsequential typo in I don't see any real problems, though. The comments are very helpful |
BUG: fftpack: make fftconvolve threadsafe Fixes gh-3596
Thanks, merged. |
First pass at a fix for #3596: For NumPy versions prior to 1.9.0, will make sure that numpy.fft.rfftn / irfftn are only called from one thread at a time. Will fall back to the slower SciPy complex-FFT routines for all other simultaneous threads.