[POC | not ready for merge] Speed up bootstrap using multiprocessing #185

rebecca-burwei · 2023-07-15T21:30:30Z

Trying out some things I learned at SciPy.

Locally, I'm seeing a 4-5x speed-up on the tests. I wonder what the speed-up would be in our production environments.

I understand we're introducing a bit more non-determinism here, but I bet it's possible to make the difference small enough.

If this works, we should implement it for the Bayesian bootstrap too.

Other tips:

- This can probably be done with multithreading, which will use less memory and take less time.
- Another way to speed things up is to use numba on `stat_fn`s that are not just simple compositions of numpy functions.
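As a rough illustration of the multithreading tip, the sketch below runs the resampling in a thread pool so that all threads share one copy of the data. `threaded_bootstrap` is a hypothetical name, not something in this PR. Whether threads actually beat processes depends on how much of the time in `stat_fn` is spent in code that releases the GIL, so this would need to be benchmarked.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_bootstrap(data, stat_fn=np.mean, n_reps=1000, n_threads=4, seed=0):
    """Hypothetical thread-based bootstrap; data is shared, not copied per worker."""
    data = np.asarray(data)
    n = len(data)
    # One independent child seed per thread, derived from a single parent seed.
    seeds = np.random.SeedSequence(seed).spawn(n_threads)
    base, extra = divmod(n_reps, n_threads)
    counts = [base + (1 if i < extra else 0) for i in range(n_threads)]

    def work(count, seed_seq):
        rng = np.random.default_rng(seed_seq)
        return np.array(
            [stat_fn(data[rng.integers(0, n, size=n)]) for _ in range(count)],
            dtype=float,
        )

    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        chunks = list(executor.map(work, counts, seeds))  # order is preserved
    return np.concatenate(chunks)


if __name__ == "__main__":
    reps = threaded_bootstrap(np.arange(1000.0), n_reps=2000, n_threads=4, seed=42)
    print(reps.mean())
```

Because threads share memory, `stat_fn` does not need to be picklable here, so lambdas and closures work.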

Tagging some folks for feedback on this proof of concept.

scholtzan · 2023-07-18T21:40:59Z

how does #177 compare to this?

mikewilli · 2024-02-27T19:05:31Z

@rebecca-burwei Should this be closed?

feat: freq bootstrap with mp

22be5fe

rebecca-burwei requested review from scholtzan, danielkberry and m-d-bowerman July 15, 2023 21:31
