GIL release #14
Comments
@robbmcleod - good idea, do you want to give it a try and see if it leads to any improvements in performance? |
Initially it doesn't help when I tried multi-threading with Those Also we could explicitly create |
Releasing the GIL isn't useful, as the histogram calculation is too fast for context switches to be worthwhile. My experience with NumExpr is that even with pthreads you're looking at about 2-5 µs per thread to pass the thread barrier. Bypassing the Python checks to call the C-API directly is the best speed-up I found:
The only other thing that strikes me as a possible optimization would be SIMD-vectorizing the loops. Neither MSVC nor GCC is vectorizing the loops as they are now (which is fairly obvious from both the uncompelling float32 benchmarks and the compiler messages). The calls to Test code:
Changes can be seen at my fork: |
Shall we close this? |
@robbmcleod - thanks for investigating this! I'd like to keep this open as I want to take some time to go over the changes and make sure there's nothing we want to include. |
@robbmcleod - one of the changes here that does make a big difference for large arrays is the 32-bit version you implemented for 32-bit arrays (to avoid the increased memory usage), so I was wondering if you would be happy to open a pull request to add those changes in? (if you don't have time I can pull in the commits you made). |
I'll take a look when I have time. |
I'm going to close this since 32-bit floating point arrays should be faster now (#23) |
Nominally, in C code that takes some time to run, one can release the GIL with:
It looks like it would be pretty trivial to enclose your for loops with these macros, is this something you're interested in?
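For reference, the standard CPython macros for this are `Py_BEGIN_ALLOW_THREADS` and `Py_END_ALLOW_THREADS`, presumably the ones meant here. Below is a minimal sketch of wrapping a histogram loop with them. It is not this project's code: the names `histogram_kernel`, `py_histogram1d`, the module name `gil_sketch` and the `WITH_PYTHON` build switch are all made up for illustration, and the wrapper assumes the input is a contiguous buffer of C doubles. The one real constraint is that nothing between the two macros may touch Python objects, which is why the loop lives in a plain C function.

```c
/* Sketch only: all names here are hypothetical, not the project's API.
 * Build the Python wrapper with -DWITH_PYTHON and the Python headers;
 * without that flag only the plain C kernel is compiled. */
#ifdef WITH_PYTHON
#define PY_SSIZE_T_CLEAN
#include <Python.h>   /* must come before the standard headers */
#endif
#include <stddef.h>
#include <stdlib.h>

/* Pure C kernel: no Python objects are touched, so it is safe to run
 * without holding the GIL. Values outside [xmin, xmax) are ignored. */
static void histogram_kernel(const double *x, size_t n,
                             double xmin, double xmax,
                             long *counts, size_t nbins)
{
    const double scale = (double)nbins / (xmax - xmin);
    for (size_t i = 0; i < n; i++) {
        if (x[i] >= xmin && x[i] < xmax) {
            size_t bin = (size_t)((x[i] - xmin) * scale);
            if (bin >= nbins) {
                bin = nbins - 1;   /* guard against rounding at the top edge */
            }
            counts[bin]++;
        }
    }
}

#ifdef WITH_PYTHON
static PyObject *py_histogram1d(PyObject *self, PyObject *args)
{
    Py_buffer data;        /* assumed to hold contiguous C doubles */
    Py_ssize_t nbins;
    double xmin, xmax;
    PyObject *result = NULL;
    long *counts = NULL;

    if (!PyArg_ParseTuple(args, "y*ndd", &data, &nbins, &xmin, &xmax)) {
        return NULL;
    }
    if (nbins <= 0 || !(xmax > xmin) || data.len % sizeof(double) != 0) {
        PyErr_SetString(PyExc_ValueError, "invalid bins, range or buffer size");
        goto done;
    }
    counts = calloc((size_t)nbins, sizeof(long));
    if (counts == NULL) {
        PyErr_NoMemory();
        goto done;
    }

    /* Release the GIL around the loop; no Python API calls in between. */
    Py_BEGIN_ALLOW_THREADS
    histogram_kernel((const double *)data.buf,
                     (size_t)data.len / sizeof(double),
                     xmin, xmax, counts, (size_t)nbins);
    Py_END_ALLOW_THREADS

    /* The GIL is held again here, so building the result list is safe. */
    result = PyList_New(nbins);
    for (Py_ssize_t i = 0; result != NULL && i < nbins; i++) {
        PyObject *item = PyLong_FromLong(counts[i]);
        if (item == NULL) {
            Py_CLEAR(result);
        } else {
            PyList_SET_ITEM(result, i, item);   /* steals the reference */
        }
    }

done:
    free(counts);
    PyBuffer_Release(&data);
    return result;
}

static PyMethodDef methods[] = {
    {"histogram1d", py_histogram1d, METH_VARARGS,
     "Count values into equal-width bins."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT, "gil_sketch", NULL, -1, methods
};

PyMODINIT_FUNC PyInit_gil_sketch(void)
{
    return PyModule_Create(&moduledef);
}
#endif /* WITH_PYTHON */
```

Built as an extension, this could be called from Python with any bytes-like object of doubles, for example `gil_sketch.histogram1d(array.array('d', values), 3, 0.0, 3.0)`. Whether releasing the GIL pays off is a separate question: as noted in this thread, the loop may finish too quickly for other threads to benefit.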