Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: release the GIL in hotloops #244

Merged
merged 1 commit into from
Aug 19, 2024
Merged

PERF: release the GIL in hotloops #244

merged 1 commit into from
Aug 19, 2024

Conversation

neutrinoceros
Copy link
Owner

This allows multi-threading parallelism to beat single-thread walltime performances (on Python 3.12). By my measurements, it's possible to obtain better performances on multiple threads for sorted datasets and up to a couple (2 or 3) fields. I suspect this means cache misses are the main hurdle, which hinders the benefit of doing multi-threaded computations, but it's still the only option for datasets that are too large to be copied accross proceses, so I think it's worth it.

@neutrinoceros neutrinoceros marked this pull request as ready for review August 19, 2024 08:23
@neutrinoceros neutrinoceros merged commit 1ee0f77 into main Aug 19, 2024
16 checks passed
@neutrinoceros neutrinoceros deleted the perf/release_gil branch August 19, 2024 08:23
@neutrinoceros
Copy link
Owner Author

Turns out this makes perfs drastically worse for a large-ish number of fields (tested with 8). Single thread perf seems unbeatable in this regime but multi-threading still does about 10 times worse with this patch, so I "unmerged" it but suppressing the merge commit from main.

@neutrinoceros neutrinoceros restored the perf/release_gil branch August 30, 2024 11:51
@neutrinoceros neutrinoceros deleted the perf/release_gil branch August 30, 2024 11:52
@neutrinoceros
Copy link
Owner Author

I restored this merge for two reasons:

  • I don't want to pollute changelogs generated with gh changelog new with unmerged-PRs
  • I think releasing the GIL is still the right thing to do at the C level; the responsibility of using a lock or not should be left to user code, and it's simple enough to implement on that side that I don't think I should add more complexity within the library.

@neutrinoceros neutrinoceros added this to the v2.0.0 milestone Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant