
Requiring numba? #8

Closed · HDembinski opened this issue May 13, 2020 · 13 comments

@HDembinski (Member)

The speed of the jackknife and bootstrap functions could be greatly enhanced with numba. While it is possible to maintain two implementations of each function, one requiring numba and one requiring only numpy, I think it would make sense to require numba. It used to be a pain to install, but nowadays one can simply pip install it, and of course it is also available from Anaconda.

I know numba well enough to work on this. Another option (less preferred) would be to write the implementations in C++ and wrap the code for Python with pybind11. I could do that as well, but then the package is no longer pure Python and has to be compiled for every target platform. That is a big added burden, and the C++ code may not even run faster than numba (in some tests I did a while ago, numba beat my C++ implementation).

@dsaxton (Collaborator) commented May 13, 2020

What kind of speedups do you see from using numba? If they are significant, requiring it could make sense (I'd prefer that to including C++ code).

@HDembinski (Member, Author)

In general, the speed-ups can be an order of magnitude or more, even if the code already uses numpy. It is hard to say without trying, but I would expect similar speed-ups for this library.

So maybe the best course of action is to write some benchmarks with the current code, then rewrite parts of it to use numba and compare. I can do that.
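A minimal benchmark sketch along these lines (the function names and the choice of the mean as the statistic are just illustrations, not the package's API; timings depend on the machine):

```python
# Compare a plain-numpy jackknife of the mean, which builds the full
# N x (N-1) matrix, against a numba-jitted loop that reuses one buffer.
import timeit

import numba as nb
import numpy as np


def jackknife_mean_numpy(a):
    n = len(a)
    resampled = np.empty((n, n - 1))
    for i in range(n):
        resampled[i, :i] = a[:i]
        resampled[i, i:] = a[i + 1:]
    return resampled.mean(axis=1)


@nb.njit
def jackknife_mean_numba(a):
    n = len(a)
    out = np.empty(n)
    buf = np.empty(n - 1)
    for i in range(n):
        buf[:i] = a[:i]
        buf[i:] = a[i + 1:]
        out[i] = buf.mean()
    return out


a = np.random.default_rng(1).normal(size=2_000)
jackknife_mean_numba(a)  # warm up the JIT so compilation is not timed
print("numpy:", timeit.timeit(lambda: jackknife_mean_numpy(a), number=10))
print("numba:", timeit.timeit(lambda: jackknife_mean_numba(a), number=10))
```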

@HDembinski (Member, Author)

We can then decide whether numba is optional or required, ok?

@HDembinski (Member, Author)

I tried numba on jackknife, which seemed most promising for acceleration. The speedup is about 10x on my machine, depending strongly on the number of samples in the original array. The speedup is probably also fairly platform-dependent.

https://github.com/HDembinski/essays/blob/master/resample%20and%20numba.ipynb

@HDembinski (Member, Author) commented May 21, 2020

Based on this evidence, I think moving to numba is beneficial. Perhaps the code can be made even faster. The jackknife in particular could become much faster in the future, once numba starts to support arbitrary Python callables (currently it does not). Already today, numba can call functions that expose a ctypes or cffi interface. In that case we would not need to generate N arrays of N-1 elements at once, but only a single N-1 array whose allocated memory is reused N times.
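A rough sketch of the buffer-reuse idea, assuming the user's statistic is itself numba-compiled (a narrower case than the general Python callables mentioned above); the factory name and signature are made up for illustration:

```python
# One (N-1)-element scratch buffer is filled and passed to the compiled
# statistic N times, instead of materializing N leave-one-out arrays.
import numba as nb
import numpy as np


def make_jackknife(statistic):
    # `statistic` is assumed to be an @nb.njit-compiled function of a 1-D array.
    @nb.njit
    def jackknife(a):
        n = len(a)
        out = np.empty(n)
        buf = np.empty(n - 1)  # reused float64 scratch buffer
        for i in range(n):
            buf[:i] = a[:i]      # everything before the left-out element
            buf[i:] = a[i + 1:]  # everything after it
            out[i] = statistic(buf)
        return out

    return jackknife


@nb.njit
def mean(x):
    return x.mean()


jackknife_mean = make_jackknife(mean)
print(jackknife_mean(np.arange(5.0)))  # leave-one-out means
```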

@HDembinski (Member, Author)

I just updated the numba example with a better implementation.

@HDembinski (Member, Author)

Since you are ok with making breaking changes, I think the version of jackknife that returns the N x (N-1) array should be removed. For N > 10000 it becomes dangerous, because it allocates a huge amount of memory: with double precision, N = 10^4 already needs about 0.8 GB and N = 10^5 about 80 GB. Algorithms with O(N^2) memory requirements on inputs of arbitrary N can easily bring down a computer.

@dsaxton (Collaborator) commented May 21, 2020

Thanks for working on this, that does look like a nice speedup.

I think I'm fine with removing the N x (N - 1) output option. Currently the jackknife is only used internally with a non-None statistical function, so things should still work with that feature removed. It also feels a bit cleaner for the shape of the output not to depend on the size of the input (I have the same question about the behavior of the bootstrap function, but it is less problematic there because you can control the number of bootstrap replications).

@HDembinski (Member, Author)

I agree. I think we should always require, for jackknife or bootstrap, that the user passes the statistical function, to avoid holding all the resampled data in memory at once. For the bootstrap it is less of a problem, agreed.

I am concerned about the balanced bootstrap. I think the implementation requires one to hold all the N x M values in memory at once.

@dsaxton (Collaborator) commented May 22, 2020

> I am concerned about the balanced bootstrap. I think the implementation requires one to hold all the N x M values in memory at once.

Yes, I think you are right. Off the top of my head, I'm wondering whether it would be reasonable, instead of copying a permutation of all the data repeated M times, to do the same with the integer positions and then use some kind of sliding window to calculate the statistical function. Or maybe that could be done lazily, so the positions are only "realized" during the calculation?
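A rough sketch of that idea (the names are illustrative, not the package's API): permute the integer positions once, then realize only one replicate at a time, so the duplicated data values are never all materialized.

```python
# Balanced bootstrap: every original position appears exactly M times across
# the M replicates; each replicate is realized only when its statistic is computed.
import numpy as np


def balanced_bootstrap_lazy(a, m, statistic, seed=None):
    rng = np.random.default_rng(seed)
    n = len(a)
    # N * M integer positions (int32 to keep the footprint down), shuffled once.
    positions = np.tile(np.arange(n, dtype=np.int32), m)
    rng.shuffle(positions)
    out = np.empty(m)
    for j in range(m):
        idx = positions[j * n:(j + 1) * n]  # positions for replicate j
        out[j] = statistic(a[idx])          # realize only this replicate
    return out


data = np.random.default_rng(0).normal(size=1_000)
print(balanced_bootstrap_lazy(data, 100, np.mean, seed=1).std())
```

This still keeps the full N x M index array in memory, but only as small integers rather than as duplicated data values.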

@HDembinski (Member, Author)

I have to look into this algorithm again, but I think one has to store at least the full N x M index array at once, where N is the number of samples and M the number of replicas. If the data values are doubles, one could save a bit by using a smaller integer type for the indices, say int32.
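Back-of-the-envelope numbers for illustration (the sizes are made up):

```python
n, m = 10_000, 10_000                           # samples, replicates
print(n * m * 8 / 1e9, "GB for int64/float64")  # 0.8 GB
print(n * m * 4 / 1e9, "GB for int32")          # 0.4 GB
```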

@HDembinski (Member, Author)

I am closing this, since I added numba :)

@dsaxton (Collaborator) commented May 29, 2020

> I am closing this, since I added numba :)

Makes sense, we can open specific issues for other applications of numba in the future.
