-
-
Notifications
You must be signed in to change notification settings - Fork 31.6k
PEP 432: Rewrite Py_Main() #76211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Python has a lot of code for its initialization. It's very hard to touch this code without risking to break something. It's hard to move code since many parts of the code are interdepent. The code rely on global "Py_xxx" configuration variables like Py_IsolateFlag (set by -I command line option). Moreover, currently Python uses the "Python runtime" early. For example, the code to parse the -W command line option uses PyUnicode_FromWideChar() and PyList_Append(). We need a stricter separation for the code before the "Python runtime" is initialized, at least partially initialized. Nick Coghlan and Eric Snow are already working on all these issues as part of the implementation of PEP-432. They redesigned Py_Initialize() and Py_Finalize(). I would like to finish the work on the step before: the Py_Main() function. Attached PR is a work-in-progress to rework deeply the Py_Main() function. I have different goals:
Since pymain_free() now wants to release the memory, we need to force a memory allocator for PyMem_RawMalloc(), since pymain_core() changes the memory allocator. The main() already does something similar, but with simpler code since main() is a private function, whereas Py_Main() seems to be part of the public C API! |
I rewrote Py_Main() to prepare the code to be able to implement my "-X dev" idea: The problem is that currently the code parsing command line options and the code setting the memory allocator (handle PYTHONMALLOC environment variable) are mixed, it's not possible to touch this code. I had similar technical issues when trying to implement properly my PEP-540 idea (Add a new UTF-8 mode): it's hard to change the Python filesystem encoding to UTF-8 after parsing command line arguments, since the current code to parse command line arguments already rely on the Python filesystem encoding and other parts of the Python runtime like the memory allocators. If I implemented my PR correctly, it should become much more simpler to control PYTHONMALLOC and the filesystem encoding from the command line: from an -X option. |
While it doesn't necessarily need to be in this patch, something else I recently realised (by breaking it *cough* [1]) is that the interaction between our command line options and our environment variables isn't really clearly defined anywhere. d7ac061 restored the handling of simple on/off toggles as "toggle enabled = env var is set OR CLI flag is passed", but I noticed the other day that the interaction between PYTHONWARNINGS, the
The ordering makes *sense* (where sys.warnoptions just lists filter definitions in the order they're given, and later filters take priority over earlier ones, just as they do for any "warnings.filterwarnings" call), but it isn't immediately intuitive (since the outcome relies on filters being prepended by default). That said, I've checked and the current warnings configuration behaviour *is* explicitly covered by the test suite (in https://github.com/python/cpython/blob/master/Lib/test/test_warnings/__init__.py), so a passing test suite provides confidence we haven't broken anything on that front. |
I wrote a new "_PyInitError" type to report more information when something goes wrong:
Example: $ PYTHONHASHSEED=x ./python
Fatal Python error: _Py_HashRandomization_Init: PYTHONHASHSEED must be "random" or an integer in range [0; 4294967295] => Python doesn't fail with abort() anymore Previously, Python called abort() and so might dump a core file: $ PYTHONHASHSEED=x python3
Fatal Python error: PYTHONHASHSEED must be "random" or an integer in range [0; 4294967295] Aborted (core dumped) |
Nick: $ PYTHONWARNINGS=always,default python3 -Wignore -Wonce
(...)
>>> sys.warnoptions
['always', 'default', 'ignore', 'once']
>>> [f[0] for f in _warnings.filters[:4]]
['once', 'ignore', 'default', 'always']
""" IMHO the command line must have the priority over environment variables, since environment variables are inherited, whereas the command line can be finely tuned *on purpose*. So it's correct, no? It's just that warnoptions gives options in the reverse order than the "expected" order, no? At least, with my commit, if you want to try to change the priority, you just have to exchange two lines in pymain_add_warnings_options() :-)
-- By the way, I'm not sure that it was a good idea to expose sys.warnoptions to users, but I don't think that we can remove it anymore :-) |
I rewrote Py_Main() in two large commits. The code isn't perfect yet, for example, init_sys_streams() still reads the PYTHONIOENCODING environment variable. IMHO it should be read earlier. But as soon, my changes should allow to fix such issues much more easily. Since I was now able to implement my -X dev option in bpo-32043, I don't plan to push further changes in short term on Py_Main(). I keep this issue open a little bit to check if everything is fine on buildbots, and check if someone has complains :-) |
Victor, *please* don't add the external import settings to CoreConfig. That struct should only contain the absolute bare minimum of settings needed to get an interpreter that *can't* access the filesystem, such that builtin modules and frozen modules work, but nothing else does. If you need some extra structures to hold command line and environment state, that's fine, but the full external import system attributes should go in the main interpreter config, as described in https://www.python.org/dev/peps/pep-0432/#supported-configuration-settings |
Nick: Ok, I created PR 4511. |
Nice, thanks for that. Good call on keeping the current data types for now, so we can focus on consolidating the configuration settings first, and then look at upgrading from C level types to Python level types later. |
New changeset 0ea395a by Victor Stinner in branch 'master': |
I like the new shape of Py_Main(). The main parts of Py_Main() are now well identified:
|
Summary of the visible changes:
Summary of the most important changes:
|
Oh, the following command does crash: PYTHONMALLOC=pymalloc ./Programs/_testembed forced_io_encoding |
Hum, _PyCoreConfig.ignore_environment is redundant with Py_IgnoreEnvironmentFlag. I don't recall why I added it to _PyCoreConfig. Maybe it should be removed. |
_PyCoreConfig.ignore_environment was part of the initial PEP-432 implementation that I wrote. It's that due to the design goal that once the refactoring is complete, an embedding application should be able to control *all* the settings through the config structs, without *ever* touching the global variables directly. |
I made most, if not all, changes that I wanted to do. It's time to close this huge issue to continue the work in new more specific issues. Notes on Py_Main(). (*) _PyPathConfig_Init() is called even if it's not needed (if all "Path configuration outputs" fileds of PyCoreConfig are filled). (*) pymain_cmdline() uses _Py_CommandLineDetails structure which contains a copy of each global configuration variable like Py_UTF8Mode. Internally, the function has to "set" or "get" these variables when calling some functions like _PyPathConfig_Init(). This code is fragile. *But* pymain_read_conf() is complex, it had to read again the whole configuration a second time if the encoding changed. It would be nice to remove global variables from _Py_CommandLineDetails to avoid the get/set dance which introduces a risk of loosing changes by mistake. But I'm not sure that it's doable? (*) Should we make Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors configurable in _PyCoreConfig? Same question for _Py_StandardStreamEncoding, _Py_StandardStreamErrors and PYTHONIOENCODING environment variable. |
I close the issue. |
Oh, my latest commit introduced a regression in test_distutils: bpo-32652. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: