
Supporting Eigen & FFTW when STAN_THREADS is enabled #3025

Open
rok-cesnovar opened this issue Nov 10, 2023 · 6 comments

@rok-cesnovar (Member) commented Nov 10, 2023

Description:

The FFTW library can be used seamlessly as the backend of Eigen's FFT implementation, which makes it simple to use with Stan. Using FFTW has proven to yield huge speedups in our model.

There is, however, a problem when using threading. When using FFTW in multi-threaded applications, you need to call fftw_make_planner_thread_safe() once before calling fft().

This means I need to add

#ifdef EIGEN_FFTW_DEFAULT
    fftw_make_planner_thread_safe();
#endif

in main() of cmdstan.

Do we think this might be worth adding for all users? Or is this too niche?

Current Version:

v2.33.1

@WardBrian (Member) commented:

I would be in favor of this. I think you'd want the condition to actually be #if defined EIGEN_FFTW_DEFAULT && defined STAN_THREADS.

I'm curious about your application that got faster - I had previously tried to use FFTW for a problem which required a 2-D fft, but that was not much faster in Stan-Math because Eigen (as of 3.4, anyway) only exposes a 1-D FFT function, so our 2-D FFTs are just loops still.
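For reference, a minimal sketch of what the combined guard could look like at the top of an entry point. fftw_make_planner_thread_safe() is FFTW's actual API; everything else (the surrounding main() and its body) is illustrative, not the actual CmdStan code:

```cpp
#include <fftw3.h>  // declares fftw_make_planner_thread_safe()

int main(int argc, char** argv) {
  // FFTW's planner is not thread safe by default. When Eigen's FFT is
  // backed by FFTW (EIGEN_FFTW_DEFAULT) and Stan runs with threading
  // (STAN_THREADS), the planner must be made thread safe once, up front,
  // before any thread calls fft().
#if defined(EIGEN_FFTW_DEFAULT) && defined(STAN_THREADS)
  fftw_make_planner_thread_safe();
#endif
  // ... rest of the entry point ...
  return 0;
}
```

Since this is a conditional-compilation fragment, it is a no-op unless both macros are defined at build time.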

@rok-cesnovar (Member, Author) commented Nov 16, 2023

Oh yeah, #if defined EIGEN_FFTW_DEFAULT && defined STAN_THREADS sounds good.

The application is running a bunch of 1-D convolutions of vectors y (sizes between 800 and 1k) with filters x that vary in size (typically 100-400). Both are parameters.

The speedup of just the fft call is around 2x (but in my model that is a ton, because the convolution, as you can imagine, is about 90% of the runtime).

Gradient evaluation times for fft() are shown below; the x-axis is the size of y.

[plot: gradient evaluation time vs. size of y]
black = non-fft convolution
red = native Eigen fft
green = FFTW

@WardBrian (Member) commented:

@rok-cesnovar would it make more sense for this to be in stan::math::init_threadpool_tbb()?

@rok-cesnovar (Member, Author) commented:

Hmm, I think you are right, because that way it's not limited to CmdStan. Will move to math.

@rok-cesnovar rok-cesnovar transferred this issue from stan-dev/cmdstan Feb 14, 2024
@WardBrian (Member) commented:

For a model using 2-D FFTs, I saw about a 1.5x speedup by using FFTW. This also required some changes to our FFT code, since Eigen exposes the 2-D transforms conditionally:

https://github.com/WardBrian/math/tree/experiment/eigen-fftw

@rok-cesnovar (Member, Author) commented:

Awesome! Yeah, FFTW is the real deal :)
