Parallelization #6
Comments
Sorry for the late response on this! The FFTs can be completely parallelized, exactly like libfqfft's. The core loop that does all the work is basically the same, so the OpenMP parallelization can be done identically. You could also switch it back to libfqfft rather easily. (In fact, I think the version of the library in the first commit already did that, so you could just copy and paste its code.) The reason I reimplemented the multiplicative FFTs was to take advantage of two significant optimizations:
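For readers wondering what "parallelizing the core loop with OpenMP" might look like, here is a minimal sketch. It is not the code from this repository or from libfqfft: it assumes a toy prime field (P = 998244353 with generator 3) and a textbook radix-2 butterfly loop, and the names `parallel_fft`, `mul_mod`, and `pow_mod` are hypothetical. The only point it illustrates is that the butterflies within one stage touch disjoint pairs of elements, so a single `#pragma omp parallel for` over them is safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy field parameters, chosen only for illustration (an assumption, not Fractal's field).
constexpr std::uint64_t P = 998244353;  // prime, P - 1 = 119 * 2^23
constexpr std::uint64_t G = 3;          // generator of the multiplicative group mod P

std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b) { return a * b % P; }

std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= P;
    while (exp > 0) {
        if (exp & 1) result = mul_mod(result, base);
        base = mul_mod(base, base);
        exp >>= 1;
    }
    return result;
}

// In-place radix-2 FFT over F_P. The size must be a power of two dividing P - 1.
void parallel_fft(std::vector<std::uint64_t>& a) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0 && (P - 1) % n == 0);
    for (auto& x : a) x %= P;
    if (n == 1) return;

    // Bit-reversal permutation, done serially to keep the sketch short.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }

    // Precompute the powers of a primitive n-th root of unity.
    const std::uint64_t root = pow_mod(G, (P - 1) / n);
    std::vector<std::uint64_t> tw(n / 2);
    tw[0] = 1;
    for (std::size_t k = 1; k < n / 2; ++k) tw[k] = mul_mod(tw[k - 1], root);

    for (std::size_t len = 2; len <= n; len <<= 1) {
        const std::size_t half = len / 2;
        const std::size_t stride = n / len;
        // Each iteration reads and writes only a[i] and a[i + half], and
        // distinct iterations use distinct pairs, so the loop has no races.
        #pragma omp parallel for
        for (std::int64_t t = 0; t < static_cast<std::int64_t>(n / 2); ++t) {
            const std::size_t block = static_cast<std::size_t>(t) / half;
            const std::size_t off = static_cast<std::size_t>(t) % half;
            const std::size_t i = block * len + off;
            const std::uint64_t u = a[i];
            const std::uint64_t v = mul_mod(a[i + half], tw[off * stride]);
            a[i] = (u + v) % P;
            a[i + half] = (u + P - v) % P;
        }
    }
}
```

Built with `-fopenmp` the butterfly loop is split across threads; without the flag the pragma is ignored and the same code runs serially. Parallelizing each stage separately, as here, puts an implicit barrier after every stage, so it is the simplest approach rather than the most efficient one.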
Thanks so much for the answer, this is super helpful!
No problem! One thing I probably should have said is that these FFTs can also be parallelized better than libfqfft's. You can divide the FFT into
Hi,
I have a quick question. I noticed that in the paper for Fractal, the benchmarks are done single-threaded. When I was looking at a benchmark on my machine, I saw that multiplicative_FFT_wrapper often took the most time, so I looked at the code. It seems like it's taken from libfqfft. Is there any reason that libfqfft wasn't used directly to take advantage of its OpenMP support? Would it be (somewhat) easy to drop in a call to libfqfft?

Thanks!
PS: thanks for sharing all this. It's really awesome stuff.