Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lots of repositories cause "Argument list too long" for py_binary #14640

Open
pauldraper opened this issue Jan 25, 2022 · 13 comments
Open

Lots of repositories cause "Argument list too long" for py_binary #14640

pauldraper opened this issue Jan 25, 2022 · 13 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-Python Native rules for Python type: bug

Comments

@pauldraper
Copy link
Contributor

pauldraper commented Jan 25, 2022

Problem

The Python wrapper is finding every repository in the runfiles tree, creating a huge PYTHONPATH variable, and then crashing when it exceeds the system's env var limits.

I encountered this on Ubuntu 20.04, Python 3.8.

OSError: [Errno 7] Argument list too long: '/root/.cache/bazel/_bazel_root/70833c7af31404a35dcc581d457bd519/execroot/rivethealth_rivet/bazel-out/k8-fastbuild/bin/api/test_it/bin.runfiles/bazel_tools/tools/python/py3wrapper.sh'

A large number (hundreds) of external repositories is quite possible with Maven and npm projects. In my case, I have utility tools written in Python used in the runtime tree of a Node.js binary (that pulls in npm repositories).

Solution

Ideally, py_binary would only add Python dependencies to the PYTHONPATH. Besides prevent a crash, it's also more efficient for loading modules.

Workarounds

import_all

An undocumented option --experimental_python_import_all_repositories=false prevents automatically adding these repositories.

However, it requires specifying the imports attribute, at least in some executions contexts, and of course the general Bazel ecosystem (e.g. rules_docker) doesn't do that.

usercustomize

Use usercustomize to change PYTHONPATH when execv is run.

~/.local/lib/python3.8/site-packages/usercustomize.py

import os

execv = os.execv

def _execv(*args, **kwargs):
    try:
        pythonpath = os.environ["PYTHONPATH"]
    except KeyError:
        pass
    # remove unnecessary entries from pythonpath
    # ...
    os.environ["PYTHONPATH"] = pythonpath
    return execv(*args, **kwargs)

os.execv = _execv
@gregestren gregestren added team-Rules-Python Native rules for Python untriaged labels Jan 31, 2022
@comius comius added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: bug and removed untriaged labels Feb 2, 2022
@hrfuller
Copy link

hrfuller commented Feb 7, 2022

This can also happen when the $TMPDIR location is long so that the absolute paths to the import locations are long. Even with only python repos on the PYTHONPATH you can exceed this limit.

@jbedorf
Copy link

jbedorf commented Dec 20, 2022

This can also happen when you are using the py_test rule when using hermetic Python for the same reason as in the previous post.

@arunkant
Copy link

So what would be solution for this?

@hrfuller
Copy link

The long term solution is to use the site packages import mechanism for external repos instead of PYTHONPATH.

@pauldraper
Copy link
Contributor Author

So what would be solution for this?

@arunkant I included two workarounds in my bug report: #14640 (comment)

The first is better than the second, unless you are using third-party Python targets, e.g. in rules_docker.

@arunkant
Copy link

Any workaround for rules_docker?

@arunkant
Copy link

The long term solution is to use the site packages import mechanism for external repos instead of PYTHONPATH.

How to do that? @hrfuller any docs or config I can use. We are migrating our codebase to monorepo and consolidate all python deps at one location and I don't want all that effort to be blocked by this issue

@pauldraper
Copy link
Contributor Author

pauldraper commented Mar 13, 2023

@arunkant in that case, see my second workaround: use usercustomize.py. Monkeypatch os.execv to remove unwanted entries from environ["PYTHONPATH"].

@arunkant
Copy link

Thanks @pauldraper. That looks reasonable workaround. I'll try it

@amir-f
Copy link

amir-f commented Sep 13, 2023

I hit this issue recently which took a while to debug. Any plan to have it fixed?

In fact if the py_binary target relative path to the root of the repo is lengthy then it's easier to hit this limit because each entry of PYTHONPATH becomes longer

@pauldraper
Copy link
Contributor Author

pauldraper commented Sep 13, 2023

The way to fix it is to have everyone make their rules compatible with --noexperimental_python_import_all_repositories

@AugustKarlstedt
Copy link

Any workaround for rules_docker?

@arunkant rules_docker merged this which fixes --noexperimental_python_import_all_repositories bazelbuild/rules_docker#2171

@lazcamus
Copy link

@arunkant rules_docker merged this which fixes --noexperimental_python_import_all_repositories bazelbuild/rules_docker#2171

If you're running the latest rules_docker release (v0.25.0 from June 2022), you'll have to patch it into your WORKSPACE:

http_archive(
    name = "io_bazel_rules_docker",
    patch_args = ["-p1"],
    patches = [
        "//build/patch:github.com_bazelbuild_rules_docker_pull_2171.patch",
    ],
    sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
    urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

You can grab the patch file with:

% gh pr diff https://github.com/bazelbuild/rules_docker/pull/2171 > github.com_bazelbuild_rules_docker_pull_2171.patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-Python Native rules for Python type: bug
Projects
None yet
Development

No branches or pull requests

9 participants