py_binary rules default to importing non-sandboxed code #7091
Comments
Thanks for the summary. Another consequence of this issue is #793, where code that relies on importing unqualified names from the same directory as the script will fail depending on whether the script is generated. My guess is that the easiest solution is for us to hack around this in the stub file somehow.
It looks like Looking at the CPython source, apparently |
It seems we are running into a similar issue when trying to auto-generate Right now we have to patch the packages to rename these files, e.g. to It would be nice not having to do this workaround and just work out of the box with these pip packages. |
Quoting http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#the-double-import-trap: """ There's a reason the general "no package directories on sys.path" guideline exists, and the fact that the interpreter itself doesn't follow it when determining sys.path[0] is the root cause of all sorts of grief. """ Also relevant: bazelbuild/bazel#7091 This is addressed toward issue #539 PiperOrigin-RevId: 268058051
Quoting http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#the-double-import-trap: """ There's a reason the general "no package directories on sys.path" guideline exists, and the fact that the interpreter itself doesn't follow it when determining sys.path[0] is the root cause of all sorts of grief. """ Also relevant: bazelbuild/bazel#7091 This is addressed toward issue tensorflow#539 PiperOrigin-RevId: 268058051
Is there any plan to merge the proposed fixes for this issue? This is really bad, as it completely breaks sandboxing for Python binaries and libraries.
Just wanted to add, for anyone who comes across this like me, that this bug also affects |
Oh no, looks like I spoke too soon! I was bamboozled since I came back to this after not having looked at it for a while. The test I was running was importing code from a different module and hence not seeing these issues. The wrapper script that was updated does indeed prune |
We cannot use bazel until bazelbuild/bazel#7091 is resolved since bazel invokes Python tests via `python path/to/test.py` and this causes python to add `path/to` as the first entry in sys.path. This means for any test in `haiku/_src` we have a collision between `typing` in the standard library and `haiku/_src/typing.py` causing tests to fail. Running Haiku tests by passing modules to python (e.g. `python -m path.to.test`) works, as does running via `pytest`. We prefer `pytest` since it discovers tests and has support for running tests in parallel. PiperOrigin-RevId: 291146369 Change-Id: I01165f0d0dc20f1c520c02ae9b1da155dbbf0eb6
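The difference between the two invocation styles described above can be observed directly. The following is a minimal sketch, not Bazel's or Haiku's actual setup: the package name `pkg` and the file `probe.py` are hypothetical stand-ins for `path/to/test.py`. It runs the same file once as a script and once with `-m`, and reports the first `sys.path` entry in each case.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical probe file: prints the resolved first entry of sys.path.
PROBE_SOURCE = "import os, sys\nprint(os.path.realpath(sys.path[0]))\n"


def sys_path0_for_both_invocations():
    """Return (root, pkg_dir, path0_as_script, path0_as_module), all resolved."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "pkg")  # hypothetical package directory
        os.mkdir(pkg_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "probe.py"), "w") as f:
            f.write(PROBE_SOURCE)

        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)  # keep the default sys.path[0] behavior

        def run(args):
            result = subprocess.run(
                [sys.executable] + args,
                cwd=root, env=env, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        as_script = run([os.path.join("pkg", "probe.py")])  # python pkg/probe.py
        as_module = run(["-m", "pkg.probe"])                # python -m pkg.probe
        return os.path.realpath(root), os.path.realpath(pkg_dir), as_script, as_module


if __name__ == "__main__":
    root, pkg_dir, as_script, as_module = sys_path0_for_both_invocations()
    print("script invocation puts the package dir first:", as_script == pkg_dir)
    print("module invocation puts the working dir first:", as_module == root)
```

In script mode the package directory itself lands at the front of `sys.path`, which is how a file such as `typing.py` inside it can shadow the standard library module of the same name.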
How can this be P3? This is a major issue when assessing rebuildability of a py_test or py_binary target. Is there any plan to actually fix this?
Passing |
What is the status of this issue? Like @thekyz, I would also like to challenge this bug being P3. One of the basic principles of Bazel is hermeticity and correctness. With this bug, sandboxing in Python is simply broken.
Note that we solved this for NodeJS by patching the stdlib to make bazel's sandbox appear as regular files rather than symlinks. https://github.com/bazelbuild/rules_nodejs/tree/stable/packages/node-patches
Same problem in Ruby.
This rule should not work because I bet it's the same problem with Perl if you use This problem is unsettling. |
I think that assigning this problem to team-Rules-Python is not a good choice. Isn't this problem about all local execution and sandboxing?
The initial example confused me, as it uses
This bug interacts particularly perniciously with |
This bug was also reported as bazelbuild/rules_python#382, and f99cebe fixed it for Python 3.11+ (but note that Python 3.11 is not released yet; it is only coming in October).
I had an idea for another possible solution here. It's not pretty, but should contain edits to the stub template and work with earlier versions. Basically, use a combination of compile(), exec() and sys.modules (maybe I'm pretty sure I got the idea here from PAR files loading things from zip files. It's implemented in Python, but has to play tricks to load the original main and trigger the |
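One possible reading of the `compile()` / `exec()` / `sys.modules` idea is sketched below. This is an assumption about what such a stub could do, not the commenter's implementation and not Bazel's stub template; the function name `run_as_main` is hypothetical. Because the main file is executed in-process instead of being passed to the interpreter as a script, the interpreter never derives `sys.path[0]` from that file's location.

```python
import os
import sys
import tempfile
import types


def run_as_main(main_path):
    """Execute the file at main_path as if it were __main__ (hypothetical helper).

    The source is compiled and executed in a fresh module object that is
    temporarily registered as sys.modules["__main__"]. sys.path is left
    untouched, so the directory of main_path is not added to it.
    """
    with open(main_path, "rb") as f:
        code = compile(f.read(), main_path, "exec")

    module = types.ModuleType("__main__")
    module.__file__ = main_path

    saved = sys.modules.get("__main__")
    sys.modules["__main__"] = module
    try:
        exec(code, module.__dict__)
    finally:
        # Restore whatever was registered as __main__ before the call.
        if saved is not None:
            sys.modules["__main__"] = saved
        else:
            del sys.modules["__main__"]
    return module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "main.py")
        with open(target, "w") as f:
            f.write("import sys\nprint(__name__, %r in sys.path)\n" % tmp)
        run_as_main(target)
```

A real stub would also have to deal with `sys.argv`, exit codes, and tracebacks, which this sketch leaves out.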
I take it @jvolkman's PR fixed this.
@larsrc-google that PR was never merged (it wasn't mine; I just commented). I think #15701 fixes this, but only for Python 3.11+.
It appears I'm missing permissions to reopen the issue. Could someone else do that?
The tests started to fail after the repo-wide python 3.10 -> 3.11 update. This is caused by Bazel's py_binary rule setting the [`PYTHONSAFEPATH`][1] environment variable, which only has an effect for Python >= 3.11. Setting this variable avoids prepending the current working directory and the script's directory. The current test code relied on this behavior and thus failed with:

```
Traceback (most recent call last):
  File "/build/.cache/bazel/_bazel_build/8bcfff1c77854f2a2b07d1413b0fc106/execroot/our_workspace/bazel-out/k8-fastbuild/bin/python/bin.runfiles/our_workspace/python/bin.py", line 6, in <module>
    from lib import foo
ModuleNotFoundError: No module named 'lib'
```

See also [bazelbuild/bazel#7091][2]

[1]: https://docs.python.org/3.11/using/cmdline.html#envvar-PYTHONSAFEPATH
[2]: bazelbuild/bazel#7091
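The effect of `PYTHONSAFEPATH` described in this commit message can be checked with a small experiment. This is a minimal sketch under stated assumptions: the file name `bin.py` and the helper `script_dir_on_sys_path` are hypothetical, and the variable is expected to change anything only on Python 3.11 or newer.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical probe: prints True if the script's own directory is on sys.path.
PROBE_SOURCE = (
    "import os, sys\n"
    "here = os.path.dirname(os.path.realpath(__file__))\n"
    "print(any(p and os.path.realpath(p) == here for p in sys.path))\n"
)


def script_dir_on_sys_path(safe_path):
    """Run the probe as a script, optionally with PYTHONSAFEPATH=1 set."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "bin.py")
        with open(script, "w") as f:
            f.write(PROBE_SOURCE)

        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)
        if safe_path:
            env["PYTHONSAFEPATH"] = "1"  # ignored by interpreters older than 3.11

        result = subprocess.run(
            [sys.executable, script],
            env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("default:        script dir on sys.path =", script_dir_on_sys_path(False))
    print("PYTHONSAFEPATH: script dir on sys.path =", script_dir_on_sys_path(True))
```

Code that imports a sibling module by its bare name, like `from lib import foo` in the traceback above, depends on the script directory being on `sys.path`, so it stops working once the variable takes effect.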
Description of the problem / feature request:
Python (2.7.15 in this case) populates the first entry of `sys.path` by resolving the first argument it is given to its real path (including resolving symlinks) and then using the directory containing that real path. This interferes with Bazel's sandboxing of Python files via symlinks and the runfiles in general, since the actual directory from the original source code will be the first entry in `sys.path`.
More info on the method of populating `sys.path[0]` can be found in these discussions:
https://bugs.python.org/issue6386
https://bugs.python.org/issue17639
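The symlink-following behavior can be reproduced without Bazel. The sketch below is only an illustration: the directory names `source_tree` and `runfiles_tree` are hypothetical stand-ins for the workspace and the runfiles tree. It needs a platform that supports symlinks, and it uses the running interpreter instead of Python 2.7.

```python
import os
import subprocess
import sys
import tempfile


def first_sys_path_entry_via_symlink():
    """Run a script through a symlink and report where sys.path[0] points.

    Returns (real_dir, link_dir, reported), all as resolved absolute paths.
    """
    with tempfile.TemporaryDirectory() as tmp:
        real_dir = os.path.join(tmp, "source_tree")    # hypothetical: original sources
        link_dir = os.path.join(tmp, "runfiles_tree")  # hypothetical: symlink tree
        os.mkdir(real_dir)
        os.mkdir(link_dir)

        script = os.path.join(real_dir, "main.py")
        with open(script, "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")

        link = os.path.join(link_dir, "main.py")
        os.symlink(script, link)

        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)  # keep the default sys.path[0] behavior

        # The interpreter is handed the symlink, not the real file.
        result = subprocess.run(
            [sys.executable, link],
            env=env, capture_output=True, text=True, check=True,
        )
        reported = os.path.realpath(result.stdout.strip())
        return os.path.realpath(real_dir), os.path.realpath(link_dir), reported


if __name__ == "__main__":
    real_dir, link_dir, reported = first_sys_path_entry_via_symlink()
    print("sys.path[0] is the symlink's directory:", reported == link_dir)
    print("sys.path[0] is the real file's directory:", reported == real_dir)
```

Any module that sits next to the real file is therefore importable, whether or not it was declared as a dependency of the target.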
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Init a fresh git repo, `git apply` this patch, and run `bazel run py:test`.
Note that when `sys.path` is printed out, the first entry is in the original source code, and not the Bazel output directory. Also note that the `py/foo/baz.py` file isn't included in the target but can be imported. Manually trimming `sys.path` at the beginning of the `main.py` file yields the expected behavior.
One solution to this is to not symlink, but rather actually copy, the main file. There are likely other solutions (maybe something involving a `.pth` file?), but I wasn't able to determine whether it is possible to turn off this symlink-following behavior in Python (it seems not, given the conversations I linked to earlier).
What operating system are you running Bazel on?
macOS 10.14.2
What's the output of `bazel info release`?
`release 0.20.0-homebrew`
(though I also repro'd this with non-Brew 0.21.0 and Brew 0.17.2)
Have you found anything relevant by searching the web?
Python issues listed above, regarding `sys.path[0]`:
https://bugs.python.org/issue6386
https://bugs.python.org/issue17639
Bazel issue regarding symlinking to a runfiles:
#4022
Commit handling runfiles symlinks on Windows:
#6036