Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache #106974

Merged

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrently ericsnowcurrently commented Jul 21, 2023

This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules. The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL. It turns out it's tricky to use the same thread state for different threads. Who could have guessed?

We solve the problem by eliminating the one object we were still sharing between interpreters. We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter.

We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.

(This approach does not have the same potential performance penalty as the linked list we used in gh-106899.)

@ericsnowcurrently
Copy link
Member Author

@Yhg1s, any objections to this change in 3.12 (pending a review here)? I suspect the current use of dict will eventually cause a crash somewhere and more-so as folks use subinterpreters more. I'm asking here because I'd prefer 3.12 to match 3.13.

Copy link
Member

@Yhg1s Yhg1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this fixes a fundamental issue, I'm okay with backporting it to 3.12rc1, if it can get into a decent shape in time.

Python/import.c Show resolved Hide resolved
Python/import.c Outdated Show resolved Hide resolved
Python/import.c Outdated Show resolved Hide resolved
Python/import.c Outdated Show resolved Hide resolved
Python/import.c Outdated
goto finally;
}

PyObject *extensions = EXTENSIONS.dict;
if (extensions == NULL) {
key = hashtable_key_from_2_strings(filename, name, ':');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

: is valid in filenames (on most filesystems). Why not /, which isn't? Also, since it's used in two places and not easy to miss/misinterpret, it should probably be a #define/static constant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An module name (identifier) will never have ":" or "/" in it. I figured the separator didn't matter as long as it wasn't a valid character in identifiers. I thought "/" might be a little confusing when debugging since it will likely be in the filename, so I picked the next character that made sense to me in this context, ":".

If you think there's an advantage to "/" (or any other character), or a disadvantage to ":" or you feel strongly about it, I'd be glad to change the separator. It isn't critical to me.

Python/import.c Show resolved Hide resolved
Python/import.c Outdated
goto finally;
}
res = 0;
/* This decref would be problematic if the module def were
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That brings up a good point: shouldn't the PyModuleDefs placed in the shared cache be made immortal? Otherwise how can they sensibly be shared at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. We don't use module defs like the usual objects and generally assume they are statically defined, but it would be good to play it safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ericsnowcurrently ericsnowcurrently merged commit 8ba4df9 into python:main Jul 28, 2023
22 checks passed
@miss-islington
Copy link
Contributor

Thanks @ericsnowcurrently for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @ericsnowcurrently, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 8ba4df91ae60833723d8d3b9afeb2b642f7176d5 3.12

@ericsnowcurrently ericsnowcurrently deleted the fix-extensions-tstate-alt branch July 28, 2023 20:39
ericsnowcurrently added a commit to ericsnowcurrently/cpython that referenced this pull request Jul 28, 2023
…hongh-106974)

This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules.  The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL.  It turns out it's tricky to use the same thread state for different threads.  Who could have guessed?

We solve the problem by eliminating the one object we were still sharing between interpreters.  We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter.

We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.

(cherry picked from commit 8ba4df9)
@bedevere-bot
Copy link

GH-107412 is a backport of this pull request to the 3.12 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.12 bug and security fixes label Jul 28, 2023
@ericsnowcurrently
Copy link
Member Author

Thanks for the reviews, @Yhg1s!

ericsnowcurrently added a commit that referenced this pull request Jul 28, 2023
…-106974) (gh-107412)

gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache (gh-106974)

This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules.  The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL.  It turns out it's tricky to use the same thread state for different threads.  Who could have guessed?

We solve the problem by eliminating the one object we were still sharing between interpreters.  We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter.

We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.

(cherry picked from commit 8ba4df9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants