-
-
Notifications
You must be signed in to change notification settings - Fork 31.6k
_ast module: get_global_ast_state() doesn't work with Mercurial lazy import #85797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Building Mercurial with Python 3.9.0rc1 fails with the error: SystemError: <built-in function compile> returned NULL without setting an error The problem comes from the PyAST_Check() function. This function calls get_global_ast_state() which gets the state of the _ast module. If the module is not imported yet, it is imported. The problem is that Mercurial monkey-patches the __import__() builtin function to implement lazy imports. get_global_ast_state() calls PyImport_Import() which returns a Mercurial _LazyModule(name='_ast', ...) object. Calling get_ast_state() (PyModule_GetState()) on it is unsafe since it's not the _ast extension module, but another module (which has no state, PyModule_GetState() returns NULL). https://bugzilla.redhat.com/show_bug.cgi?id=1871992#c1 -- The _ast extension module was modified multiple times recently:
I did the PEP-489 change to fix a regression caused by the first change (PEP-384), two bugs in fact: |
Given how close are we to a release for 3.9 we should try to maximize stability. |
Regarding ac46eb4: ---- So, the main issue here is that the AST types are not used only by the _ast module, but by the interpreter itself: the compile() builtin and Py_CompileStringObject. I see two ways of fixing this properly:
|
Would that impact performance considerably? |
Also, option 1 is virtually equivalent to the state of the _ast module prior to the recent changes except that the symbols are in a shared object instead of the binary or libpython. The advantage here is not moving them out of the shared object, is making them having static storage. |
Also, adding them into a module that needs access through Python had a bootstrap problem: Basically: initializing import system -> initialize codec -> compile -> ast init -> init ast module -> 💥 |
What I'm not yet clear on: when is that shared object is initialized and destroyed?
Only for PyCF_ONLY_AST, which is, AFAIK, not used by the interpreter itself. |
I am assuming that you mean at the Python level and not at the linker level. Then:
|
With this I mean import-> dlopen -> dlsym for init function -> call init function |
At commit ac46eb4 ("bpo-38113: Update the Python-ast.c generator to PEP-384 (gh-15957)"): static struct PyModuleDef _astmodule = {
PyModuleDef_HEAD_INIT,
"_ast",
NULL,
sizeof(astmodulestate),
NULL,
NULL,
astmodule_traverse,
astmodule_clear,
astmodule_free,
};
#define astmodulestate_global ((astmodulestate *)PyModule_GetState(PyState_FindModule(&_astmodule)))
PyMODINIT_FUNC
PyInit__ast(void)
{
PyObject *m;
if (!init_types()) return NULL;
m = PyState_FindModule(&_astmodule);
...
return m;
} => I investigated bpo-41194 crash again. It seems like the root issue is more than PyInit__ast() returns a *borrowed* reference, rather than a strong reference. => The _ast module can only be loaded once, astmodule_free() and astmodule_clear() are not called at Python exit. At commit 91e1bc1 ("bpo-41194: The _ast module cannot be loaded more than once (GH-21290)"): static astmodulestate * #define get_global_ast_state() (&global_ast_state)
static struct PyModuleDef _astmodule = {
PyModuleDef_HEAD_INIT,
.m_name = "_ast",
.m_size = -1,
.m_traverse = astmodule_traverse,
.m_clear = astmodule_clear,
.m_free = astmodule_free,
};
PyMODINIT_FUNC
PyInit__ast(void)
{
PyObject *m = PyModule_Create(&_astmodule);
if (!m) {
return NULL;
}
astmodulestate *state = get_ast_state(m);
...
}
At commit b1cc6ba ("bpo-41194: Convert _ast extension to PEP-489 (GH-21293)"): static astmodulestate*
get_global_ast_state(void)
{
...
PyObject *module = PyImport_GetModule(name);
if (module == NULL) {
if (PyErr_Occurred()) {
return NULL;
}
module = PyImport_Import(name);
if (module == NULL) {
return NULL;
}
}
...
astmodulestate *state = get_ast_state(module);
Py_DECREF(module);
return state;
}
static struct PyModuleDef _astmodule = {
PyModuleDef_HEAD_INIT,
.m_name = "_ast",
.m_size = sizeof(astmodulestate),
.m_slots = astmodule_slots,
.m_traverse = astmodule_traverse,
.m_clear = astmodule_clear,
.m_free = astmodule_free,
}; PyMODINIT_FUNC
|
I also looked into Mercurial. They have a large list of modules [0] that don't work with the lazy-loading scheme. I don't think adding one more should be a big problem for them; but we'll reach out to them and confirm that. [0] https://www.mercurial-scm.org/repo/hg/file/tip/hgdemandimport/__init__.py#l26 |
We need this bug solved for 3.9.0 rc2. Łukasz, you're the one to make the call about the approach; how can we make your job easier? My view is:
But Pablo an Victor should chime in as well. |
I couldn’t reproduce the problem with a freshly compiled Python 3.9.0rc1 on Arch Linux. Was this ever reproduced on a non-Red Hat system? |
That is interesting. The original post here doesn't mention that the problem occurs in Mercurial's "make all", specifically when building documentation, in the command:
Could you confirm |
I was running "make all" and I also ran the documentation generator command without an error. However, I tried it again and now it failed the same way as reported. With a debug build, I get "Python/Python-ast.c:231: get_ast_state: Assertion `state != NULL' failed.". Sorry for the noise! |
About subinterpreters. In Python 3.8, _ast.AST type is a static type: static PyTypeObject AST_type = {...}; In Python 3.9, it's now a heap type: static PyType_Spec AST_type_spec = {...}; In Python 3.8, the same _ast.AST type was shared by all interpreters. With my PR 21961, the _ast.AST type is a heap type but it is also shared by all interpreters. Compared to Python 3.8, PR 21961 has no regression related to subinterpreters. It's not better nor worse. The minor benefit is that _ast.AST is cleared at Python exit (see _PyAST_Fini()). While I would be nice to make _ast.AST per interpreter, it would be an *enhancement*. I consider that it can wait for Python 3.10. |
Oh. I forgot that static types cannot be modified (in Python, but it's possible in C). So my PR still changed the behavior compared to 3.8: import _testcapi
import _ast
res = _testcapi.run_in_subinterp("import _ast; _ast.AST.x = 1")
if res != 0:
raise Exception("bug")
print(_ast.AST.x) On Python 3.8, this code snippet fails with: TypeError: can't set attributes of built-in/extension type '_ast.AST' On master, it is possible to modify or add an _ast.AST attribute. Since PR 21961 moves back to a global strange, Petr is correct that a subinterpreter can now modify the state of another subinterpreter the _ast.AST type. So I modified my PR 21961 to revert partially the change which converted AST_type type from a static type to a heap type. My PR 21961 converts AST_type back to a static type, to avoid these problems. Sadly, it makes _ast module incompatible with PEP-384, but the priority is to fix this 3rd regression. We can reconsider converting PEP-384 back to a heap type later, but we will have to be careful with not reintroducing all these bugs. |
I marked bpo-41766 as a duplicate of this issue. |
I close the issue. I tested manually: the fix works as expected. I reproduced https://bugzilla.redhat.com/show_bug.cgi?id=1871992#c1 bug (I had to comment the workaround of the mercurial package in Fedora) on 3.9: make[1]: Entering directory '/home/vstinner/mercurial/mercurial-5.4/doc' I confirm that the fix in 3.9 works as expected:
=> No crash with this fix. I also reproduced the bug in the master branch (without the fix), and I also confirm that the fix in master does fix the crash. |
I also tested bpo-41261 reproducer and https://bugs.python.org/issue41194#msg372863 reproducer on 3.9 and master branches: Python doesn't crash. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: