-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-44422: Fix threading.enumerate() reentrant call #26727
Conversation
@pablogsal @pitrou @serhiy-storchaka @methane: I'm not sure about this change, would you mind to have a look? |
If the CI tests pass, I will try the buildbot label to check on more platforms. |
# | ||
# bpo-44422: Use a reentrant lock to allocate reentrant calls to functions like | ||
# threading.enumerate(). | ||
_active_limbo_lock = RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RLock() is the right way to go to prevent self-deadlock.
We could temporarily turn GC off but that seems hard to do in a way that reliably restores the prior enabled/disabled mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could temporarily turn GC off but that seems hard to do in a way that reliably restores the prior enabled/disabled mode.
Sure, that's the other option that I considered. But gc.disable() is process-wide: it affects all Python threads, and so it might have surprising side effects. Some code might rely on the current exact GC behavior.
Another option would be to rewrite the code in C. The problem is that other functions rely on this lock, like active_count(). I would prefer to not have to rewrite "half" of threading.py in C. Using a RLock is less intrusive.
🤖 New build scheduled with the buildbot fleet by @vstinner for commit 2907ca88719503493acd930790ffdb978e73dbbb 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
(@iritkatriel : a concrete example of GC-induced reentrancy issue) |
I ran manual tests on this PR (my laptop has 8 logical CPUs, 4 physical Intel CPUs):
|
Oops, there is a typo in a comment :-( "Use a reentrant lock to allocate reentrant calls": you should read "to allow". I will update my PR later, but I prefer to wait until the buildbot jobs complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, a RLock
seems like the best effort solution here. Presumably, only the non-mutating code protected by _active_limbo_lock
can be called reentrantly (why would a reentrate call mutate the locks table?).
Still, a reminder that a lot of things can unfortunately happen in destructors and trigger reentrancy into unsuspecting code.
These bugs are surprising. Nobody expects the GC reentrancy! In OpenStack, it's even more surprising since threading.current_thread() is monkey-patched for even more fun! |
Really? For what kind of fun? :-o |
Python became too deterministic and boring. It's time to make it non deterministic again! |
9 Refleak builds failed:
test__xxsubinterpreters and/or test_threading failed:
|
The same lock is used in a few other places. Maybe in one of them it's not right to make it re-entrant? |
There is also another |
Ah, the https://bugs.python.org/issue42972#msg385297 bug strikes back. I wrote #26727 to fix the leak. |
Oh, nicely stopped @iritkatriel! I will update the PR, once #26727 is merged. |
Ooops sorry, the PR to fix the leak is: #26734 |
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call.
Issue spotted by Irit.
I fixed _after_fork() and the typo (allocate => allow), and I rebased it on top of the merged #26734 fix. @iritkatriel: Would you mind to review the updated PR? |
This comment is now obsolete:
|
Right @iritkatriel, I plan to write a second change only for that, but only change it in the main branch to avoid any risk of regression in stable branches. Oh I forgot to mention it here, I have a local patch for it ;-) So @iritkatriel, does it look good to you? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.9. |
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call. (cherry picked from commit 243fd01) Co-authored-by: Victor Stinner <vstinner@python.org>
GH-26737 is a backport of this pull request to the 3.10 branch. |
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call. (cherry picked from commit 243fd01) Co-authored-by: Victor Stinner <vstinner@python.org>
GH-26738 is a backport of this pull request to the 3.9 branch. |
Thanks everyone for your useful reviews ;-) I wasn't confident at all that this change was safe. At least, the responsibility of any possible regression is now distributed :-D |
@iritkatriel: I created #26741 to use again the |
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call.
The threading.enumerate() function now uses a reentrant lock to
prevent a hang on reentrant call.
https://bugs.python.org/issue44422