Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys.path[0] breaks out of runfile tree. #382

Open
dillon-giacoppo opened this issue Nov 19, 2020 · 6 comments
Open

sys.path[0] breaks out of runfile tree. #382

dillon-giacoppo opened this issue Nov 19, 2020 · 6 comments
Assignees
Labels
core-rules Issues concerning the stubs for native rules type: bug

Comments

@dillon-giacoppo
Copy link
Contributor

dillon-giacoppo commented Nov 19, 2020

🐞 bug report

Affected Rule

py_binary, py_test.

Is this a regression?

No

Description

When Python initializes it adds the directory of the script to the sys.path:

As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.

This would imply that a py target such as {REPO_DIR}/bazel-bin/{target_path}/{target}.runfiles/{script_path}/{python_script}.py would cause {REPO_DIR}/bazel-bin/{target_path}/{target}.runfiles/{script_path}/ to be inserted as sys.path[0].

However, because the script is a symlink to the real python script in the repository, Python follows the symlink and appends the actual underlying directory to sys.path[0], in this case: {REPO_DIR}/{script_path}/. This issue was raised in Python's bug tracker and noted as expected behaviour that won't be fixed issue17639.

This allows the script to "break out" of the runfiles tree and import code from the same directory which it does not necessarily have access to via src or deps.

This causes issues when creating multiple targets in the same directory which should not automatically depend on each other.

For example, given:

repo/
  src/
    main.py
    foo.py

main can depend on foo without the file being explicitly listed in the BUILD file.

🔬 Minimal Reproduction

src/BUILD

package(default_visibility = ["//visibility:public"])

py_binary(
    name = "main",
    srcs = ["main.py"],
    python_version = "PY3",
    srcs_version = "PY3",
)

src/main.py

import sys

import foo

print(sys.path[0])

src/foo.py

print("foo imported")

The output of bazel run //src:main will be:

{REPO}/src/
foo imported

Which should not be possible where foo.py is not an input to //src:main.

🌍 Your Environment

Operating System:

  
macOS 11.0.1 (20B29)
  

Output of bazel version:

  
Build label: 3.7.0-homebrew
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Sat Nov 14 13:29:41 2020 (1605360581)
Build timestamp: 1605360581
Build timestamp as int: 1605360581
  

Rules_python version:

  
0.1.0
  

Anything else relevant?
The most obvious solution that comes to mind for this would be to copy into the runfile tree rather than symlinking. It seems unlikely this is going to change upstream in Python.

@thundergolfer
Copy link
Collaborator

Thanks for the report. This same problem is currently documented here bazelbuild/bazel#7091 but I cannot find a counterpart already in this project.

The Python stub template has an example of a 'hack' to fix the problem https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt#L19. But that only fixes the problem in that python process, and the stub template starts the user's process and the problem is reintroduced like you observe.

@thundergolfer thundergolfer added core-rules Issues concerning the stubs for native rules type: bug labels Nov 19, 2020
@person142
Copy link
Contributor

It seems like the stub template could use the -s option to the Python interpreter:

https://docs.python.org/3/using/cmdline.html#cmdoption-s

when it is building the command to execute:

https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt#L349

@dillon-giacoppo
Copy link
Contributor Author

@person142, neither -s or -S appears to stop the directory of the script being added to the path, rather, it prevents site wide packages being added to the path (perhaps this should be done anyway or provided as an option). For example, on my machine, it prevents /usr/local/lib/python3.7/site-packages' being added. This should probably be a separate issue ticket.

The problem will still exist because {REPO_DIR}/{script_path}/ is still added at sys.path[0]. It doesn't look like there is an interpreter option to disable this behavior.

There are some potential solutions noted in the bazel issue:

  1. Copy the runfiles instead of symlink.
  2. Invoke the script with -c or -m which does not set sys.path[0].
  3. Potentially reuse the interpreter of the wrapper rather than creating a new one (which already deletes sys.path[0]).

I am unsure which would be the most appropriate in this case. I think 1 might not be trivial because IIUC the runfiles behaviour is part of core Bazel and not rules_python specific.

@github-actions
Copy link

github-actions bot commented Jun 6, 2021

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Jun 6, 2021
@aaliddell
Copy link
Contributor

This should probably be kept open since it's still a concern in the upstream issue

@github-actions github-actions bot removed the Can Close? Will close in 30 days if there is no new activity label Jun 7, 2021
@thundergolfer thundergolfer self-assigned this Jun 9, 2021
@Colecf
Copy link

Colecf commented Jun 8, 2022

In the latest python 3.11 beta there's a -P flag and PYTHONSAFEPATH environment variable that prevent the python interpreter from prepending this path. So this could very easily be fixed by added PYTHONSAFEPATH to new_env in python_stub_template.txt. It will only fix it for python 3.11+ though, and it's only expected to release in October.

copybara-service bot pushed a commit to bazelbuild/bazel that referenced this issue Jul 21, 2022
See: bazelbuild/rules_python#382

Closes #15701.

PiperOrigin-RevId: 462362643
Change-Id: I80fa61076e43153b2567b7ed68b3280722e85905
jrguzman-ms pushed a commit to msft-mirror-aosp/platform.system.apex that referenced this issue Jan 24, 2023
There are a couple issues when running
pkgutil.get_data('apexer', 'mke2fs.conf') from a bazel-built
apexer binary:

- Soong will take the path to mke2fs.conf as being relative
  to the filegroup that included it, so mke2fs.conf ends up at
  the root of the python zip file, and next to apexer.py.
  Bazel doesn't do this, and instead keeps the mke2fs.conf file
  under a system/extras/ext4_utils folder. Use a genrule to
  explicitly copy mke2fs.conf to the current directory to work
  around this issue.

- pkgutil.get_data() uses python's normal rules for locating
  modules in order to find where the file is. This means it will
  look through sys.path again to find the apexer module. Bazel
  currently has a bug where the first entry in sys.path is the
  location of apexer.py in the source tree, not in the runfiles
  directory. So pkgutil will find apexer.py there, and then
  start looking for mke2fs.conf next to it, and not find it.
  Work around this by deleting sys.path[0].
  bazelbuild/rules_python#382

Bug: 204244290
Test: Presubmits
Change-Id: I9fb9502bff2590abfde95f7f7cc9ebf31157836d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-rules Issues concerning the stubs for native rules type: bug
Projects
None yet
Development

No branches or pull requests

5 participants