sys.path[0] Is Set Differently From the Rest of sys.path #109853

Open
ericsnowcurrently opened this issue Sep 25, 2023 · 9 comments
Labels
3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API topic-subinterpreters type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Member

ericsnowcurrently commented Sep 25, 2023

Feature or enhancement

Currently sys.path[0] is set by pymain_run_python() (in Modules/main.c). This happens after pymain_init(), which initializes the runtime, including the rest of sys.path (via getpath.py and site.py). This makes it harder to reason about and introduces extra complexity for subinterpreters. (See gh-109793 and gh-109794.)

We should consider calculating sys.path[0] and setting it to its own PyConfig field via getpath.py, when the rest of the base sys.path is calculated. We may need a later check to verify that there is a matching importer, as pymain_run_python() does. (FWIW, it isn't clear that there's any value to storing the sys.path[0] value on the global _PyPathConfig.)

Also, we currently wait to actually set sys.path[0] (for the main interpreter) until after the readline/rlcompleter modules are imported in pymain_run_python(). We'd need to factor that in.

CC @zooba, @vstinner, @ncoghlan


@ericsnowcurrently ericsnowcurrently added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters topic-C-API 3.13 bugs and security fixes labels Sep 25, 2023
@vstinner
Member

We should consider calculating sys.path[0] and setting it to its own PyConfig field via getpath.py

There are two cases:

  • (A) PyConfig.run_filename is set: pymain_get_importer() computes sys.path[0]
  • (B) Otherwise, _PyPathConfig_ComputeSysPath0() computes sys.path[0]

For case (A), do you want to execute PyImport_GetImporter() twice: once in Py_InitializeFromConfig(), and then again in pymain_run_python()? The second call is there to decide whether pymain_run_module() or pymain_run_file() should be used.

For case (B), this code path can be easily moved to Py_InitializeFromConfig(). When I designed and implemented PyConfig, I tried to minimize changes. But apparently, now the dust has settled, and we can go further :-)
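To make the two cases concrete, here is a rough Python sketch of the documented outcomes, not the actual C logic. The function name `sketch_sys_path0` and its `mode` argument are hypothetical, and the importer check in case (A) is approximated with a directory/zip test rather than a real PyImport_GetImporter() call.

```python
import os
import zipfile


def sketch_sys_path0(mode, target=None, cwd=None):
    """Approximate the value that ends up as sys.path[0] (hypothetical helper).

    mode is one of "command" (python -c / REPL), "module" (python -m)
    or "file" (python <target>).
    """
    cwd = cwd if cwd is not None else os.getcwd()
    if mode == "command":
        # Case (B): -c and the REPL prepend an empty string.
        return ""
    if mode == "module":
        # Case (B): -m prepends the current working directory.
        return cwd
    if mode == "file":
        # Case (A), approximated: a directory or zip archive is itself
        # a valid path entry, so the target is prepended as given.
        if os.path.isdir(target) or zipfile.is_zipfile(target):
            return target
        # Case (B): a plain script prepends its directory,
        # with symbolic links resolved.
        return os.path.dirname(os.path.realpath(target))
    raise ValueError(f"unknown mode: {mode!r}")
```

For instance, `sketch_sys_path0("module", cwd="/work")` gives `"/work"`, while `sketch_sys_path0("command")` gives an empty string.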

@zooba
Member

zooba commented Sep 26, 2023

I think we should move most of the default sys.path calculation into python.c, including the running of getpath.py (we'd need to expose the ability to create and then close a runtime that can't import anything).

If we're able to fully initialise the search path using only our public APIs, we'll have a much better interface for embedders to use.

@vstinner
Member

I would love for sys.path to be fully initialized before the site module is loaded. Currently, sys.path is still modified by the site module in many ways, and so python -S gives a different sys.path :-(

site changes:

  • Make paths absolute (why not do that earlier?)
  • Add user site directory (is it complicated to move the logic to getpath?)
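One way to observe the difference is to ask a child interpreter for its sys.path with and without -S. This is only a small sketch: the helper name `child_sys_path` is made up for illustration, and the exact entries that differ depend on the installation.

```python
import json
import subprocess
import sys


def child_sys_path(*flags):
    """Return sys.path as seen by a fresh interpreter started with flags."""
    code = "import json, sys; print(json.dumps(sys.path))"
    proc = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True, text=True, check=True,
    )
    # Use the last line in case startup code printed something earlier.
    return json.loads(proc.stdout.splitlines()[-1])


with_site = child_sys_path()
without_site = child_sys_path("-S")

# Entries that are only present once the site module has run.
only_with_site = [p for p in with_site if p not in without_site]
print(only_with_site)
```

On a typical install the printed list contains the site-packages directories, but that is an expectation about the environment, not a guarantee.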

@zooba
Member

zooba commented Sep 26, 2023

I expect most of the site module can move into getpath. Venv and pth sure can (though we'd have to defer code execution in pth files until after initialization finishes). Some of the interactive mode features probably can't, but I'd also like to treat those as something specific to python.c and separate from libpython (i.e. part of the Python program not the Python interpreter).

ericsnowcurrently added a commit that referenced this issue Oct 2, 2023
This change makes sure sys.path[0] is set properly for subinterpreters. Before, it wasn't getting set at all. This PR does not address the broader concerns from gh-109853.
ericsnowcurrently added a commit to ericsnowcurrently/cpython that referenced this issue Oct 11, 2023
…-109994)

This change makes sure sys.path[0] is set properly for subinterpreters.  Before, it wasn't getting set at all.

This change does not address the broader concerns from pythongh-109853.

(cherry-picked from commit a040a32)
ericsnowcurrently added a commit that referenced this issue Nov 27, 2023
…-110701)

This change makes sure sys.path[0] is set properly for subinterpreters.  Before, it wasn't getting set at all.

This change does not address the broader concerns from gh-109853.

(cherry-picked from commit a040a32)
@encukou
Member

encukou commented Feb 19, 2024

This adds PyConfig.sys_path_0 as public API.
Should we add some documentation for it, or mark it internal (add an underscore)?

@zooba
Member

zooba commented Feb 19, 2024

Need @ericsnowcurrently to confirm, but I suspect marking it internal is better. When the calculation gets refactored into getpath.py then there shouldn't be any need to store it separately.

@ericsnowcurrently
Member Author

This adds PyConfig.sys_path_0 as public API. Should we add some documentation for it, or mark it internal (add an underscore)?

We should mark it as internal at least for now. We'd need to sort out the complexity I described above before this would become meaningful config.

@ncoghlan
Contributor

The sys.path[0] initialisation semantics are even worse than @ericsnowcurrently describes, since runpy may mutate the value if it gets invoked via -m or path entry execution.
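The path entry execution case can be illustrated from pure Python with runpy.run_path(), which temporarily inserts the target at the front of sys.path while the `__main__.py` inside it runs. This is a sketch of the library-level behaviour only; it does not reproduce what happens when the interpreter itself is started with -m or a directory argument, and the function name `path0_seen_by_main` is made up for the example.

```python
import os
import runpy
import sys
import tempfile


def path0_seen_by_main():
    """Run a throwaway directory as a path entry and report sys.path[0]."""
    with tempfile.TemporaryDirectory() as tmp:
        # A minimal __main__.py that records the sys.path[0] it observes.
        with open(os.path.join(tmp, "__main__.py"), "w") as f:
            f.write("import sys\nseen = sys.path[0]\n")
        before = list(sys.path)
        result = runpy.run_path(tmp)  # returns the executed module globals
        restored = list(sys.path) == before
    return tmp, result["seen"], restored


tmp, seen, restored = path0_seen_by_main()
print(seen == tmp, restored)
```

While the code runs, sys.path[0] is the directory itself; afterwards runpy removes the entry again.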

There's an intrinsic problem here in that sys.path[0] is not semantically identical to other sys.path entries (it can be set from a much wider variety of sources, including being dropped entirely when running in isolated mode), but once the desired value is figured out, we do want it to be treated the same as any other entry for module import purposes (hence it being in the list rather than stored somewhere else).
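As a small illustration of one of those sources, the sketch below compares a default `-c` invocation (which prepends an empty string) with isolated mode (which prepends nothing). The helper name `child_sys_path` is hypothetical, and PYTHONSAFEPATH is stripped from the child environment on the assumption that it would otherwise suppress the entry in the default case.

```python
import json
import os
import subprocess
import sys


def child_sys_path(*flags):
    """Return sys.path from a fresh interpreter started with flags."""
    # Drop PYTHONSAFEPATH so the default case is not affected by it.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    code = "import json, sys; print(json.dumps(sys.path))"
    proc = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True, text=True, check=True, env=env,
    )
    # Use the last line in case startup code printed something earlier.
    return json.loads(proc.stdout.splitlines()[-1])


default_path = child_sys_path()
isolated_path = child_sys_path("-I")

print(repr(default_path[0]))   # the extra entry added for -c
print("" in isolated_path)     # whether isolated mode kept such an entry
```

The two lists differ only in how the first entry was chosen; once it is in the list, the import system treats it like any other entry.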

@zooba
Member

zooba commented Apr 25, 2024

I wonder if we can make from . import <mod> work from __main__ easily so there's a way to transition towards -P (no sys.path[0] by default)? Or if that's even worth attempting?

Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
This change makes sure sys.path[0] is set properly for subinterpreters. Before, it wasn't getting set at all. This PR does not address the broader concerns from pythongh-109853.
Projects
Status: Todo

5 participants