-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[subinterpreters] Eliminate PyInterpreterState.modules. #72597
Comments
tl;dr PyInterpreterState does not need a "modules" field. Attached is a patch that removes it. During interpreter startup [1] the sys module is imported using the same C API [2] as any other builtin module. That API only requires one bit of import state, sys.modules. Obviously, while the sys module is being imported, sys.modules does not exist yet. To accommodate this situation, PyInterpreterState has a "modules" field. The problem is that PyInterpreterState.modules remains significant in the C-API long after it is actually needed during startup. This creates the potential for sys.modules and PyInterpreterState.modules to be out of sync. Currently I'm working on an encapsulation of the import state. PyInterpreterState.modules complicates the scene enough that I'd like to see it go away. The attached patch does so by adding private C-API functions that take a modules arg, rather than getting it from the interpreter state. These are used during interpreter startup, rendering PyInterpreterState.modules unnecessary and allowing sys.modules to become the single source of truth. If this patch lands, we can close bpo-12633. [1] see _Py_InitializeEx_Private() and Py_NewInterpreter() in Python/pylifecycle.c |
(added Graham Dumpleton to the nosy list to ask if this change may impact mod_wsgi for 3.7) +1 on the general idea, but given that the current field is a public part of the interpreter state, the replacement access API should really be public as well - we can't be sure folks will always be going through the "PyImport_GetModuleDict()" API. If you replace the addition of the Folks accessing this field directly can then define their own shim function if PyInterpreterState_GetModuleCache isn't defined. (The rationale for the GetModuleDict -> GetModuleCache change is that "ModuleDict" is ambiguous - every module has a dict. For PyImport_* we're stuck with it, but the "PyImport" prefix at least gives a hint that the reference might be to the sys.modules cache. That affordance doesn't exist for the "PyInterpeterState" prefix. |
I always use PyImport_GetModuleDict(). So long as that isn't going away I should be good. |
What's the benefit to adding PyInterpreterState_GetModuleCache()? TBH, it should only be needed in this short period during startup when the import system hasn't been bootstrapped yet. After that code can import sys and access sys.modules from there. (For that matter, PyImport_GetModuleDict() doesn't buy all that much either...) I think all this would be clearer in a world with PEP-432. :) FWIW, I'm inclined to encourage new APIs where it makes sense that take an explicit interp arg. I just don't think a new public API is warranted here. If we didn't already have PyImport_GetModuleDict(), I'd probably just move the code to Python/pylifecycle.c, inlined or in a small static function. And +1 to ModuleCache vs. ModuleDict. :) |
Hmm, actually _PyImport_GetModuleDict() isn't needed to solve the startup issue. It's still rather internally focused but the same could be said for PyImport_GetModuleDict(). I guess I'm still not sold on adding a new public API function for what amounts to a rename of PyImport_GetModuleDict(). Furthermore, wouldn't it make more sense to keep it in the PyImport_* namespace? Perhaps we could set the precedent that explicitly-arg'ed functions should be in the PyInterpreterState_* (or PyInterpreter_*) namespace? |
Meh, there really isn't any need for _PyImport_GetModuleDict(). I'll drop it. Problem solved! :) |
I just checked the docs, and it turns out I'm wrong about this being a previously public API: "There are no public members in this structure." From https://docs.python.org/3/c-api/init.html#c.PyInterpreterState That means the only externally supported API that needs to be preserved is PyImport_GetModuleDict() to get the current thread's module cache, and your original patch already did that. |
As Nick pointed out, PyInterpreterState's fields are private so do what you want. :) |
FYI, this broke some (very) corner cases. See issue bpo-31404. |
We're reverting this (see bpo-31404), so back to the drawing board... |
@ericsnowcurrently: is this issue and the linked PR still something you are working on, or has it been superseded by other more recent issues? |
It is still alive, just way back on my TODO list. 🙂 |
@ericsnowcurrently I'll pick this one up. |
I have a WIP branch 1, it passes all the tests but has refleak. I'll investigate more once I get some time till then you can try it. Footnotes |
Be cautious. There are some potentially tricky bits to this (as evidenced by #72597 (comment)). FWIW, there is no urgency with this at the moment. |
Okay, it was on low priority anyways but we now have a updated branch against main to work with (if we want to). My top priority is Per Interpreter GIL for 3.12~3.13 and it's precursors which is more urgent and interesting than this ;) |
I've closed my PR (gh-3606) but will probably get back to it some time down the road. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: