bpo-44422: Fix threading.enumerate() reentrant call #26727

vstinner · 2021-06-14T21:29:14Z

The threading.enumerate() function now uses a reentrant lock to
prevent a hang on reentrant call.

https://bugs.python.org/issue44422

vstinner · 2021-06-14T21:29:47Z

@pablogsal @pitrou @serhiy-storchaka @methane: I'm not sure about this change, would you mind having a look?

vstinner · 2021-06-14T21:31:49Z

If the CI tests pass, I will try the buildbot label to check on more platforms.

rhettinger · 2021-06-14T21:44:19Z

Lib/threading.py

+#
+# bpo-44422: Use a reentrant lock to allocate reentrant calls to functions like
+# threading.enumerate().
+_active_limbo_lock = RLock()


The RLock() is the right way to go to prevent self-deadlock.
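To illustrate the self-deadlock mentioned here, a minimal sketch follows. The helper name `can_reacquire` is hypothetical and not part of threading.py; it uses a non-blocking acquire so that the plain lock reports failure instead of hanging.

```python
import threading


def can_reacquire(lock):
    """Return True if the thread already holding `lock` can take it again."""
    with lock:
        # Non-blocking attempt: a plain Lock returns False here,
        # where a blocking acquire would hang forever.
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()
        return got_it


plain_result = can_reacquire(threading.Lock())
reentrant_result = can_reacquire(threading.RLock())
print("Lock re-acquired by owner:", plain_result)
print("RLock re-acquired by owner:", reentrant_result)
```

With a blocking acquire, the plain Lock case is the hang this PR is about: the owning thread waits on itself.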

We could temporarily turn GC off but that seems hard to do in a way that reliably restores the prior enabled/disabled mode.

We could temporarily turn GC off but that seems hard to do in a way that reliably restores the prior enabled/disabled mode.

Sure, that's the other option that I considered. But gc.disable() is process-wide: it affects all Python threads, and so it might have surprising side effects. Some code might rely on the current exact GC behavior.

Another option would be to rewrite the code in C. The problem is that other functions rely on this lock, like active_count(). I would prefer to not have to rewrite "half" of threading.py in C. Using a RLock is less intrusive.
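For comparison, the gc.disable() alternative discussed above could be sketched as a save/restore context manager. The name `gc_paused` is hypothetical. The sketch only restores the state seen on entry, and because the GC switch is process-wide it cannot tell whether another thread toggled it in the meantime, which is the reliability concern raised in this thread.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic GC for the duration of the block (process-wide)."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Re-enable only if the GC was enabled on entry. A change made by
        # another thread while the block ran is silently overwritten or lost.
        if was_enabled:
            gc.enable()


with gc_paused():
    print("GC enabled inside block:", gc.isenabled())
```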

bedevere-bot · 2021-06-14T21:53:46Z

🤖 New build scheduled with the buildbot fleet by @vstinner for commit 2907ca88719503493acd930790ffdb978e73dbbb 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

pitrou · 2021-06-14T22:00:40Z

(@iritkatriel : a concrete example of GC-induced reentrancy issue)

vstinner · 2021-06-14T22:05:56Z

I ran manual tests on this PR (my laptop has 8 logical CPUs, 4 physical Intel CPUs):

Run ./python rec_threading.py: SUCCESS (rec_threading.py is attached to bpo-44422)
Run ./python -m test -j10 -r twice: SUCCESS
Run ./python -m test test_threading -F -j10 --timeout=60 for 15 minutes: SUCCESS (no timeout)

vstinner · 2021-06-14T22:06:04Z

Oops, there is a typo in a comment :-( "Use a reentrant lock to allocate reentrant calls": you should read "to allow". I will update my PR later, but I prefer to wait until the buildbot jobs complete.

pitrou

Indeed, a RLock seems like the best-effort solution here. Presumably, only the non-mutating code protected by _active_limbo_lock can be called reentrantly (why would a reentrant call mutate the locks table?).

Still, a reminder that a lot of things can unfortunately happen in destructors and trigger reentrancy into unsuspecting code.
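A small, self-contained sketch of how a destructor can re-enter lock-protected code. All names here (`_items_lock`, `list_items`, `Noisy`) are hypothetical stand-ins rather than the threading.py internals, and the explicit gc.collect() call stands in for an automatic collection triggered by an allocation while the lock is held.

```python
import gc
import threading

_items_lock = threading.RLock()  # stand-in for a module-level registry lock
_items = ["main"]
seen_by_destructor = []


def list_items():
    """Read-only access to the registry, protected by the lock."""
    with _items_lock:
        return list(_items)


class Noisy:
    def __init__(self):
        self.me = self  # reference cycle: only the cyclic GC can free it

    def __del__(self):
        # Runs inside whatever code happened to trigger the collection.
        seen_by_destructor.append(list_items())


def list_items_with_gc_inside():
    with _items_lock:
        gc.collect()  # the destructor re-enters list_items() on this thread
        return list(_items)


Noisy()
result = list_items_with_gc_inside()
print(result, seen_by_destructor)
```

If `_items_lock` were a plain threading.Lock, the nested `list_items()` call would block on a lock its own thread already holds.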

vstinner · 2021-06-14T22:07:17Z

(@iritkatriel : a concrete example of GC-induced reentrancy issue)

These bugs are surprising. Nobody expects the GC reentrancy! In OpenStack, it's even more surprising since threading.current_thread() is monkey-patched for even more fun!

pitrou · 2021-06-14T22:08:01Z

In OpenStack, it's even more surprising since threading.current_thread() is monkey-patched for even more fun!

Really? For what kind of fun? :-o

vstinner · 2021-06-14T22:14:44Z

Really? For what kind of fun? :-o

Python became too deterministic and boring. It's time to make it non-deterministic again!

vstinner · 2021-06-15T11:37:51Z

9 Refleak builds failed:

buildbot/AMD64 Fedora Stable Refleaks PR
buildbot/AMD64 RHEL7 Refleaks PR
buildbot/AMD64 RHEL8 Refleaks PR
buildbot/PPC64LE Fedora Stable Refleaks PR
buildbot/PPC64LE RHEL7 Refleaks PR
buildbot/PPC64LE RHEL8 Refleaks PR
buildbot/s390x Fedora Refleaks PR
buildbot/s390x RHEL7 Refleaks PR
buildbot/s390x RHEL8 Refleaks PR

test__xxsubinterpreters and/or test_threading failed:

test__xxsubinterpreters leaked [55, 55, 55] references, sum=165
test__xxsubinterpreters leaked [31, 31, 31] memory blocks, sum=93
...
test_threading leaked [110, 110, 110] references, sum=330
test_threading leaked [62, 62, 62] memory blocks, sum=186

iritkatriel · 2021-06-15T11:59:03Z

The same lock is used in a few other places. Maybe in one of them it's not right to make it re-entrant?

iritkatriel · 2021-06-15T12:05:01Z

There is also another _active_limbo_lock = _allocate_lock() assignment in _after_fork:
https://github.com/python/cpython/blob/2907ca88719503493acd930790ffdb978e73dbbb/Lib/threading.py#L1562

vstinner · 2021-06-15T12:44:08Z

test__xxsubinterpreters and/or test_threading failed

Ah, the https://bugs.python.org/issue42972#msg385297 bug strikes back.

I wrote #26727 to fix the leak.

vstinner · 2021-06-15T12:44:39Z

There is also another _active_limbo_lock = _allocate_lock() assignment in _after_fork:

Oh, nicely spotted @iritkatriel! I will update the PR, once #26727 is merged.

vstinner · 2021-06-15T12:45:17Z

Ooops sorry, the PR to fix the leak is: #26734

vstinner · 2021-06-15T13:15:00Z

I fixed _after_fork() and the typo (allocate => allow), and I rebased it on top of the merged #26734 fix.

@iritkatriel: Would you mind reviewing the updated PR?

iritkatriel · 2021-06-15T13:39:57Z

This comment is now obsolete:

                    # We don't call self._delete() because it also
                    # grabs _active_limbo_lock.
                    del _active[get_ident()]

vstinner · 2021-06-15T14:07:08Z

This comment is now obsolete

Right @iritkatriel, I plan to write a second change only for that, but only change it in the main branch to avoid any risk of regression in stable branches. Oh I forgot to mention it here, I have a local patch for it ;-)

So @iritkatriel, does it look good to you?

iritkatriel

LGTM

miss-islington · 2021-06-15T14:14:26Z

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.9.
🐍🍒⛏🤖

The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call. (cherry picked from commit 243fd01) Co-authored-by: Victor Stinner <vstinner@python.org>

bedevere-bot · 2021-06-15T14:14:36Z

GH-26737 is a backport of this pull request to the 3.10 branch.

bedevere-bot · 2021-06-15T14:14:43Z

GH-26738 is a backport of this pull request to the 3.9 branch.

vstinner · 2021-06-15T14:15:23Z

Thanks everyone for your useful reviews ;-) I wasn't confident at all that this change was safe. At least, the responsibility of any possible regression is now distributed :-D

vstinner · 2021-06-15T16:43:50Z

This comment is now obsolete: (...)

@iritkatriel: I created #26741 to use the _delete() method again in _bootstrap_inner().

vstinner added the "needs backport to 3.9" and "needs backport to 3.10" labels Jun 14, 2021

the-knights-who-say-ni added the CLA signed label Jun 14, 2021

bedevere-bot added the awaiting core review label Jun 14, 2021

rhettinger approved these changes Jun 14, 2021

bedevere-bot added awaiting merge and removed awaiting core review labels Jun 14, 2021

rhettinger reviewed Jun 14, 2021

vstinner added the "🔨 test-with-buildbots" label Jun 14, 2021

bedevere-bot removed the "🔨 test-with-buildbots" label Jun 14, 2021

pitrou approved these changes Jun 14, 2021

serhiy-storchaka approved these changes Jun 15, 2021

vstinner mentioned this pull request Jun 15, 2021

bpo-42972: _thread.RLock type implements tp_traverse #26734

Merged

vstinner added 2 commits June 15, 2021 15:10

bpo-44422: Fix threading.enumerate() reentrant call

28fa7d7

The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call.

Fix typo

adb0a4d

Fix _after_fork()

20a0d9a

Issue spotted by Irit.

iritkatriel approved these changes Jun 15, 2021

vstinner merged commit 243fd01 into python:main Jun 15, 2021

bedevere-bot removed the awaiting merge label Jun 15, 2021

vstinner deleted the enumerate_rlock branch June 15, 2021 14:14

bedevere-bot removed the "needs backport to 3.10" label Jun 15, 2021

bedevere-bot removed the "needs backport to 3.9" label Jun 15, 2021

jdevries3133 pushed a commit to jdevries3133/cpython that referenced this pull request Jun 19, 2021

bpo-44422: Fix threading.enumerate() reentrant call (pythonGH-26727)

1d63579

The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call.

pablogsal mentioned this pull request Jul 18, 2021

[3.10] Correct the order of regen-abidump #27228

Closed
