Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove 'site-packages' from install dir prefix #1766

Closed
wants to merge 1 commit into from

Conversation

mattem
Copy link
Collaborator

@mattem mattem commented Feb 16, 2024

Removes the site-packages path segment in unpacked external dependency installs.

This additional path segment causes issues when dealing with very large dependency trees, so much so that the value of PYTHONPATH exceeds the OS limit of what can be set.

This manifests due to the Python binary launcher adding workspace roots automatically to sys path, in addition to the site-packages segmented path, ie the PYTHONPATH var will end up with both (from printing the PYTHONPATH in examples/pip_parse in rules_python.

Before:

/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_certifi/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_chardet/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_idna/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_urllib3/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_requests/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_alabaster/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_babel/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_docutils/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_imagesize/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_zipp/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_importlib_metadata/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_markupsafe/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_jinja2/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_packaging/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_pygments/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_snowballstemmer/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_jsmath/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinx/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_serializinghtml/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_qthelp/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_htmlhelp/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_devhelp/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_applehelp/site-packages
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/_main
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_alabaster
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_babel
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_certifi
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_chardet
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_docutils
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_idna
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_imagesize
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_importlib_metadata
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_jinja2
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_markupsafe
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_packaging
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_pygments
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_requests
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_snowballstemmer
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinx
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_applehelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_devhelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_htmlhelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_jsmath
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_qthelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_serializinghtml
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_urllib3
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_zipp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~python~python_3_9_aarch64-apple-darwin

After:

/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_certifi
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_chardet
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_idna
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_urllib3
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_requests
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_alabaster
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_babel
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_docutils
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_imagesize
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_zipp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_importlib_metadata
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_markupsafe
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_jinja2
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_packaging
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_pygments
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_snowballstemmer
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_jsmath
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinx
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_serializinghtml
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_qthelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_htmlhelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_devhelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~pip~pypi_39_sphinxcontrib_applehelp
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/_main
/private/var/tmp/_bazel_matt/3cf6080de036ce65f48725921520350e/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/main.runfiles/rules_python~override~python~python_3_9_aarch64-apple-darwin

In large graphs, this increase in env var length results in a catastrophic failure:

[REDACTED] line 374, in _RunExecv
    os.execv(python_program, [python_program, main_filename] + args)

OSError: [Errno 7] Argument list too long: /[REDACTED]/python3_9_x86_64-unknown-linux-gnu/bin/python3'

While this behaviour is possible to toggle via --experimental_python_import_all_repositories, there are a large number of dependencies that expect these roots to be available (runfiles for example iirc).

@mattem mattem force-pushed the no_site_packages branch 3 times, most recently from d64838d to 643ea7a Compare February 27, 2024 16:49
@aignas
Copy link
Collaborator

aignas commented Feb 29, 2024

I am not sure if changing this won't have any unintended consequences. I've linked a few tickets here:

  1. People may depend on site-packages being present in the path. See the PR and the rules_oci example for splitting packages by layer in https://github.com/aspect-build/bazel-examples/blob/main/oci_python_image/py_layer.bzl#L14.
  2. The entries in the PYTHONPATH having the top level folder of site-packages is much better than <whatever bzlmod layout currently is> as the later might have extra characters that don't play nice with the PYTHONPATH spec. Maybe the solution could be to use rules_py way of constructing a venv but it sounds like a very invasive chaneg.

Probably there are more things, but these are the only that come to my head right now.

@arrdem
Copy link
Contributor

arrdem commented Feb 29, 2024

+1 to site-packages being a conventional stable path component, as in standard virtualenv and other interpreter deployments. I own machinery which relies on this and this would be a breaking change to me. I'd rather see work which either bypassed $PYTHONPATH for setup entirely (see the current draft of having a manifest and a bootstrap shim that manipulates sys.path directly) or which provided for "flattening" files from many PyPi site-package trees into a single site-package tree for all 3rdparty deps.

@rickeylev
Copy link
Contributor

rickeylev commented Mar 1, 2024

re: backwards compatibility concerns: Yeah, these are valid concerns. I'm fine with putting this behind an environment variable to control behavior. This at least removes Matt having to carry a patch indefinitely until we can fix it and he can upgrade to Bazel 7. Then the envvar can just be a no-op and we can drop it.

Unfortunately, due to support for Bazel 6.x, our options for addressing the problem of the long PYTHONPATH env var are pretty limited because we can't change the binary generation or bootstrapping in that version of Bazel. With Bazel 7, its a much more solvable problem, as we can now just modify the rules and bootstrap directly (patches welcome). Unfortunately, Matt's situation needs to support Bazel 6.x

The only other idea we've been able to come up with is to hide the deps/PyInfo from the binary by having a generated site.py-based target capture it and perform path manipulations. I don't think anyone has tried to implement it yet, though.

I own machinery which relies on this

Can you explain why it relies on it? There aren't any promises about the unpacked layout of the downloaded packages. The targets are the unit of abstraction.

see the current draft ...

What draft is this referring to?

@rickeylev
Copy link
Contributor

until he can fix it

lol, oops, typo. I meant "we" as in the rules_python owners. Patches welcome, too, of course.

@aignas
Copy link
Collaborator

aignas commented Mar 1, 2024

Thanks for the context, I'm fine with this behaviour being gated under a feature flag.

@aignas
Copy link
Collaborator

aignas commented Apr 6, 2024

@mattem, should this PR be continued?

@jvolkman
Copy link
Contributor

jvolkman commented Apr 30, 2024

Just adding a note that I think this will break anyone using the py_layers or py_oci_image macros from here.

@aignas aignas added the Can Close? Will close in 30 days if there is no new activity label Jun 22, 2024
@aignas
Copy link
Collaborator

aignas commented Jul 19, 2024

Feel free to reopen PR later. Closing to clean things up.

@aignas aignas closed this Jul 19, 2024
@aignas aignas deleted the no_site_packages branch July 19, 2024 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Can Close? Will close in 30 days if there is no new activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants