[C API] Add _PyInterpreterState_SetConfig(): reconfigure an interpreter #86426
This issue is a follow-up of PEP-587, which introduced the PyConfig C API, and is related to PEP-432, which wants to rewrite Modules/getpath.c in Python. I would like to add a new PyInterpreterState_SetConfig() function to be able to reconfigure a Python interpreter in C. One example is to write a custom sys.path, to implement a virtual environment (a common request for embedded Python), etc. Currently, it's really complex to tune the Python configuration. The use case is to tune Python for embedded Python. First, I would like to add new functions to the C API for that:
The second step will be to expose these two functions in Python (I'm not sure where for now), and give the ability to tune the Python configuration in pure Python. The site module already does that for sys.path, but it runs "too late" in the Python initialization. Here the idea is to configure Python before it accesses any file on disk, after the "core" initialization and before the "main" initialization. One concrete example would be to reimplement Modules/getpath.c in Python, convert it to a frozen module, and run it at Python startup to populate sys.path. It would allow moving some of the site code into this module to run it earlier. Pseudo-code in C:

    void init_core(void)
    {
        // "Core" initialization
        PyConfig config;
        PyConfig_InitPythonConfig(&config);
        config._init_main = 0;
        Py_InitializeFromConfig(&config);
        PyConfig_Clear(&config);
    }

    void tune_config(void)
    {
        PyConfig config;
        PyConfig_InitPythonConfig(&config);
        // Get a copy of the current configuration
        // ... put your code to tune config ...
        // dummy example, currently not possible in Python
        // Reconfigure Python with the updated configuration
        PyInterpreterState_SetConfig(&config); // <=== NEW API!
        PyConfig_Clear(&config);
    }

    int main()
    {
        init_core();
        tune_config(); // <=== THE USE CASE!
        _Py_InitializeMain();
        return Py_RunMain();
    }

In this example, tune_config() is implemented in C. But later, it will be possible to convert the configuration to a Python dict and run Python code to tune the configuration. The PEP-587 added a "Multi-Phase Initialization Private Provisional API":
|
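The idea of converting the configuration to a Python dict and tuning it with Python code could look roughly like the sketch below. It is only an illustration under stated assumptions: the dict-based layout, the `tune_config` helper and the `/opt/myapp/lib` directory are hypothetical. Only the key names `module_search_paths` and `module_search_paths_set` mirror real PyConfig fields.

```python
def tune_config(config, app_dir="/opt/myapp/lib"):
    """Return a tuned copy of a configuration dict.

    Hypothetical helper: the dict layout is an assumption, only the key
    names are borrowed from PyConfig fields.
    """
    tuned = dict(config)
    # Put the application directory first on the module search path.
    old_paths = list(config.get("module_search_paths", []))
    tuned["module_search_paths"] = [app_dir] + old_paths
    # Mark the search path as explicitly set, so it would not be recomputed.
    tuned["module_search_paths_set"] = 1
    return tuned


current = {"module_search_paths": ["/usr/lib/python3.10"], "module_search_paths_set": 0}
updated = tune_config(current)
```

The helper returns a new dict instead of mutating its argument, so the caller can compare the old and new configuration before applying anything.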
If we remove Modules/getpath.c, it will no longer be possible to automatically compute the path configuration when one of the following getter functions is called:
It means that these functions would now return NULL if called before Python is initialized, but return the expected string once Python is initialized. Moreover, Py_SetPath() would no longer automatically compute the "program full path" (sys.executable). |
It is not really an incompatible change according to the documentation: "Note: The following functions should not be called before Py_Initialize(): Py_EncodeLocale(), Py_GetPath(), Py_GetPrefix(), Py_GetExecPrefix(), Py_GetProgramFullPath(), Py_GetPythonHome(), Py_GetProgramName() and PyEval_InitThreads().". |
The main drawback of rewriting Modules/getpath.c as Lib/_getpath.py (and removing getpath.c) is that PyConfig_Read() could no longer compute the Python Path Configuration. It would return an "empty" path configuration. |
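To give a feel for what a Python-level path computation might produce, here is a heavily simplified sketch. It assumes the default POSIX layout only; the function name `toy_module_search_paths` is hypothetical and is not taken from any `_getpath.py` draft. The real getpath logic also deals with PYTHONHOME, pyvenv.cfg, build trees and Windows, none of which is modelled here.

```python
import posixpath


def toy_module_search_paths(prefix, exec_prefix, version=(3, 10)):
    """Build a default-looking POSIX module search path (toy model)."""
    major, minor = version
    stdlib_dir = posixpath.join(prefix, "lib", f"python{major}.{minor}")
    return [
        # Zipped standard library (the file usually does not exist).
        posixpath.join(prefix, "lib", f"python{major}{minor}.zip"),
        # Pure-Python standard library.
        stdlib_dir,
        # Extension modules live under the exec_prefix.
        posixpath.join(exec_prefix, "lib", f"python{major}.{minor}", "lib-dynload"),
    ]


paths = toy_module_search_paths("/usr", "/usr")
```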
Responding to your request for feedback on Python-Dev: We embed Python dynamically by finding the libPython DLL, loading it, and looking up the required symbols. We make appropriate defines so that the Python headers (and NumPy headers) point to our functions, which in turn point to the looked-up symbols. Our launcher works on Linux, macOS, and Windows and works with many environments including standard Python, conda, and brew. It also supports virtual environments in most cases. Also, a single executable [per platform] is able to work with Python versions 3.7 - 3.9 (3.6 was recently dropped, but only for external reasons). So my comment is not directly addressing the usefulness of configuring Python initialization - but I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism. As another note, the main issues we run into are configuring the Python path to properly find packages and DLLs. A goal of ours is to be able to provide the base application as a drag-and-drop style installer with its own full embedded Python distribution (but still loaded dynamically) and then be able to supply additional plug-in packages (Python packages) by drag and drop. This is somewhat similar to conda packaging but without support for command line tools. |
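As a rough Python-level analogy of "load the library and look up the required symbols at runtime", the sketch below resolves a C API function by name through `ctypes.pythonapi`. This is not how the launcher described above works (it is written in C and would presumably use the platform's dynamic loader); `lookup_version` is a hypothetical helper name, and the example assumes a CPython build that exports its C API symbols.

```python
import ctypes
import sys


def lookup_version():
    """Resolve Py_GetVersion by name at runtime and call it."""
    # Attribute access on ctypes.pythonapi looks the symbol up in the
    # already-loaded Python library, similar in spirit to dlsym().
    get_version = ctypes.pythonapi.Py_GetVersion
    get_version.restype = ctypes.c_char_p
    get_version.argtypes = []
    return get_version().decode("utf-8")


version_string = lookup_version()
```

Because the symbol is resolved by name rather than linked at build time, the same code runs unchanged against different Python versions, which is the property the comment above asks to preserve.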
I don't plan to remove any feature :-)
Do you mean sys.path? If yes, that's one of the goals of this issue: to allow you to write your own Python code to configure sys.path, rather than having to write C code, before the first (external) import. How do you configure sys.path currently? Do you parse a configuration file? Do you use a registry key on Windows? |
We have several launch scenarios - but for the currently most common one, which is to launch using a separate, existing Python environment, we call Py_SetPythonHome and Py_SetPath with the home directory of the environment. Then, presumably, the more complete path gets set in either Py_Initialize or when we call PyImport_ImportModule("sys"). I might have tracked the details down once, but I don't recall them. By the time our Python code starts running, sys.path is reasonably populated. However, in another scenario, we launch with an embedded Python environment, essentially a virtual environment. In that case, we have a config file to explicitly add lib, DLLs, and site packages. But something goes wrong [cannot find/load the unicode DLL IIRC] unless we call site.addsitedir for each directory already in sys.path near the start of our Python portion of code. My notes point to two issues to explain this: https://bugs.python.org/issue22213 and https://bugs.python.org/issue35706. |
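The workaround described above (calling site.addsitedir for each directory already in sys.path) might look like the following sketch. The helper name `readd_site_dirs` is hypothetical; the poster's actual code is not shown in this thread. `site.addsitedir()` appends the directory to sys.path if needed and processes any `.pth` files it contains.

```python
import os
import site
import sys


def readd_site_dirs(directories=None):
    """Call site.addsitedir() on every existing directory.

    Defaults to a snapshot of sys.path, because addsitedir() may append
    new entries to sys.path while we iterate.
    """
    if directories is None:
        directories = list(sys.path)
    for directory in directories:
        # Skip zip files, empty strings and missing directories.
        if os.path.isdir(directory):
            site.addsitedir(directory)
```

Calling `readd_site_dirs()` with no argument near the start of the application's Python code would correspond to the scenario in the comment above.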
I am glad to hear that. I'm somewhat nervous about it nevertheless. In particular, the implementation of Py_DECREF changed from 3.7 to 3.8 to 3.9. 3.7 worked entirely in a header; but 3.8 had a quirky definition of _Py_Dealloc which used _Py_Dealloc_inline but was defined out of order (used before defined). This was somewhat addressed in https://github.com/python/cpython/pull/18361/files; however 3.9 now has another mechanism that defines _Py_Dealloc in Objects/object.c. This isn't a major problem because it has the same implementation as before, but changes like this have the potential to make the launcher binary be version specific. Again, not a deal breaker, but it still makes me nervous. |
Please don't use PyDict_GetItemString(), it will be deprecated. You can use _PyDict_GetItemStringWithError(). Also, always check the raised exception type before overwriting the exception, so that you do not swallow MemoryError or other unexpected errors. |
I opened a thread on python-dev about this issue: |
The initial issue, adding an API to reconfigure an interpreter, is implemented: I added _PyInterpreterState_SetConfig(). But I failed to find time to finish the larger project "rewrite getpath.c in Python" (PR 23169). It requires changing the C API of the PEP-587, which is not easy. Also, I'm not fully convinced that there is a strong need to change getpath.c. I would be interested in moving code from site.py to _getpath.py, but it's also not obvious whether there is a strong benefit. |
Summary: With changes to the PyConfig API in 3.10, `PyConfig_SetBytesArgv` needs to be called prior to `PyConfig_Read` to properly pass arguments to the runtime. Python 3.10 [changelog](https://docs.python.org/3.10/whatsnew/changelog.html#id173) > [bpo-42260](python/cpython#86426): The PyConfig_Read() function now only parses PyConfig.argv arguments once: PyConfig.parse_argv is set to 2 after arguments are parsed. Since Python arguments are stripped from PyConfig.argv, parsing arguments twice would parse the application options as Python options. Reviewed By: andrewjcg Differential Revision: D42762310 fbshipit-source-id: d0e529ca00d48d5bbaa056f5a3f531631b28a178
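The reasoning in the changelog entry quoted above can be illustrated with a toy model. This is not CPython's argument parser: `ToyConfig` and `toy_read` are hypothetical names, and the "parser" only strips leading dash-prefixed flags after `argv[0]`. It shows why a second parse would misread application options, and how marking the configuration as already parsed avoids that.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfig:
    argv: list
    parse_argv: int = 1  # 1: parse argv; 2: already parsed
    python_options: list = field(default_factory=list)


def toy_read(config):
    """Toy stand-in for PyConfig_Read(): parse argv at most once."""
    if config.parse_argv != 1:
        return config  # already parsed: leave argv untouched
    rest = config.argv[1:]
    # Leading flags are taken as Python options and stripped from argv.
    while rest and rest[0].startswith("-"):
        config.python_options.append(rest.pop(0))
    config.argv = rest
    config.parse_argv = 2
    return config


cfg = toy_read(ToyConfig(argv=["python", "-I", "app.py", "-v"]))
first_argv = list(cfg.argv)          # "-v" still belongs to app.py
toy_read(cfg)                        # second read is a no-op
second_argv = list(cfg.argv)

cfg.parse_argv = 1                   # simulate the old behaviour
toy_read(cfg)                        # "-v" is now eaten as a Python option
```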