Numba energy permutation test #28
I would appreciate some help with these doctests, I can't work out how to run them individually. Nor can I work out why they are changing from |
I think we should test the performance against the NumPy version. NumPy is VERY fast, and the Numba rewrite loses some functionality and is harder to maintain, so I would want to see a significant improvement before accepting the rewrite (we may even want to keep BOTH versions, to preserve the old functionality).
Check that you are compiling numba in |
I was using |
Have you excluded the compilation step from the numba timings?
No I haven't, but that's kind of the point, isn't it? We want to find the problem size at which the numba speedup (if it exists) wins against numpy despite the flagfall cost of compilation. And it seems like that never happens.
No, because the compilation only happens once, while the function may be called multiple times.
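To make the point about compilation concrete, here is a minimal sketch of timing the first (compiling) call separately from later calls. The kernel `pairwise_abs_mean` and the helper `time_call` are hypothetical names made up for this illustration, not functions from this PR, and the `try`/`except` fallback is only there so the snippet still runs where Numba is not installed (in which case both timings measure plain Python).

```python
import time

import numpy as np

try:
    from numba import njit  # real Numba decorator, if available
except ImportError:
    def njit(func):  # assumption: no-op fallback so the sketch still runs
        return func


@njit
def pairwise_abs_mean(x, y):
    """Mean absolute difference over all pairs (hypothetical kernel)."""
    total = 0.0
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            total += abs(x[i] - y[j])
    return total / (x.shape[0] * y.shape[0])


def time_call(func, *args):
    """Return the result of func(*args) and the elapsed wall time in seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


rng = np.random.RandomState(0)
x = rng.normal(size=200)
y = rng.normal(size=200)

# The first call pays the one-off JIT compilation cost (when Numba is present).
first_result, first_time = time_call(pairwise_abs_mean, x, y)
# Later calls with the same argument types reuse the compiled code.
second_result, second_time = time_call(pairwise_abs_mean, x, y)

print(f"first call (with compilation): {first_time:.4f} s")
print(f"second call (compiled only):   {second_time:.4f} s")
```

Reporting both numbers lets a reader judge either scenario: a single call in a fresh process, or many calls that amortise the compilation.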
Note: this builds on #27, and changes from that branch will appear here until that is merged.
This re-implements most of the energy distance functions and permutation tests using numba. This provides significant performance improvements. I have some benchmarks below, which compare numba to pure Python (note: this isn't comparing numba to original code that used numpy tricks, it's comparing my changes with and without the JIT). The results suggest that numba improves performance for any number of permutations above 250. I expect this will also be true of multiple different permutation tests in the same program.
And with slightly higher limits:
However, the costs are:
- The `average` parameter is now a string, which is either `mean` or `median`.
- We can no longer use a `RandomState` object, and have to rely only on `np.random.seed()`.
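As a rough illustration of what these two costs look like in practice, here is a sketch of a JIT-friendly permutation test. It is not the code from this PR: `pairwise_average`, `energy_statistic` and `permutation_test` are hypothetical names, the median variant of the statistic is only an assumption about how an `average` option might be used, and the `try`/`except` fallback exists purely so the snippet runs without Numba. One thing that is documented Numba behaviour: Numba keeps its own random generator state, so `np.random.seed()` has to be called from inside jitted code to affect jitted random calls.

```python
import numpy as np

try:
    from numba import njit  # real Numba decorator, if available
except ImportError:
    def njit(func):  # assumption: no-op fallback so the sketch still runs
        return func


@njit
def pairwise_average(a, b, average):
    """Average of |a_i - b_j| over all pairs; `average` is a plain string."""
    n, m = a.shape[0], b.shape[0]
    dists = np.empty(n * m)
    for i in range(n):
        for j in range(m):
            dists[i * m + j] = abs(a[i] - b[j])
    if average == "mean":
        return np.mean(dists)
    elif average == "median":
        return np.median(dists)
    else:
        raise ValueError("average must be 'mean' or 'median'")


@njit
def energy_statistic(x, y, average):
    """Energy-distance style statistic: 2*A(x, y) - A(x, x) - A(y, y)."""
    return (2.0 * pairwise_average(x, y, average)
            - pairwise_average(x, x, average)
            - pairwise_average(y, y, average))


@njit
def permutation_test(x, y, n_permutations, average, seed):
    np.random.seed(seed)  # seeds the generator used by the jitted code
    observed = energy_statistic(x, y, average)
    pooled = np.concatenate((x, y))
    n = x.shape[0]
    count = 0
    for _ in range(n_permutations):
        shuffled = np.random.permutation(pooled)
        if energy_statistic(shuffled[:n], shuffled[n:], average) >= observed:
            count += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (count + 1) / (n_permutations + 1)


# Ordinary NumPy generator, used only to build example data outside the JIT.
data_rng = np.random.RandomState(0)
x = data_rng.normal(0.0, 1.0, size=20)
y = data_rng.normal(3.0, 1.0, size=20)

stat_a, p_a = permutation_test(x, y, 99, "mean", 123)
stat_b, p_b = permutation_test(x, y, 99, "mean", 123)  # same seed, same result
stat_m, p_m = permutation_test(x, y, 99, "median", 123)
print(stat_a, p_a)
```

Passing an integer seed instead of a generator object keeps the call reproducible, at the price of resetting global random state on every call.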