Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up resolving requirements through --use-first-matching-interpreter Pex flag #10442

Merged
merged 14 commits into from Aug 15, 2020

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Jul 24, 2020

By default, when you pass a loose interpreter constraint to Pex like >=3.6, Pex will resolve for every interpreter discovered, e.g. 3.6 and 3.7 and 3.8. This results in the Pex being compatible with more interpreters at runtime.

However, the vast majority of our Pexes are never actually distributed to end-users so we don't care about compatibility - we only need one interpreter to work. The only time we need that compatibility is when using awslambda and binary, because those Pexes get materialized and distributed to the end-user.

Closes #10380 and #8685.

Benchmark

Running ./pants --no-process-execution-use-local-cache --no-dynamic-ui --no-pantsd lint src/python/pants/util/strutil.py, which will force building 4 Pex files, and output the starting and completion times.

Before (s) After (s)
black.pex 9 5
flake8.pex 8 3
isort.pex 6 3
docformatter.pex 4 3
total (combined) 27 18
total (wall time) 9 5

[ci skip-rust-tests]

…eter` Pex flag

# Rust tests will be skipped. Delete if not intended.
[ci skip-rust-tests]
Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks great.

src/python/pants/backend/python/rules/pex.py Outdated Show resolved Hide resolved
# Rust tests will be skipped. Delete if not intended.
[ci skip-rust-tests]
@Eric-Arellano
Copy link
Contributor Author

The dozens of CI failures show that this isn't safe to land :/ It seems it's not deterministic the interpreter we use to build vs. the interpreter we use to run?

@jsirois
Copy link
Member

jsirois commented Jul 24, 2020

The dozens of CI failures show that this isn't safe to land :/ It seems it's not deterministic the interpreter we use to build vs. the interpreter we use to run?

Correct if the implementation of the flag over in Pex land does not also cause the interpreter constraints to be omitted from the produced PEX-INFO in favor of setting the hashbang appropriately or some other such means of restricting the runtime interpreter to the build time interpreter.

And just checked over there - this is in fact the issue. The feature from Pex is partial at the moment.

Instead of pushing that feature to completion over in Pex would you be receptive to instead ditching the Pex feature and supporting this over in Pants? In https://github.com/pantsbuild/pants/pull/9747/files#diff-15ee4bfd6c4feec32a8fd4f8fd2faa10R42 I already showed how to do this and that PR has other benefits besides.

If yes, I can take that PR back up to replace this one.

@Eric-Arellano
Copy link
Contributor Author

Instead of pushing that feature to completion over in Pex would you be receptive to instead ditching the Pex feature and supporting this over in Pants? In https://github.com/pantsbuild/pants/pull/9747/files#diff-15ee4bfd6c4feec32a8fd4f8fd2faa10R42 I already showed how to do this and that PR has other benefits besides.

My instinct is to prefer us using Pex as a CLI tool, rather than as a library. I've found it really helpful when debugging with users that we output the exact Pex command we're running. It helps us to first get things behaving properly with Pex, and then figure out how to correct Pants.

What are the benefits you mention if we use Pex as a library rather than binary?

@jsirois
Copy link
Member

jsirois commented Jul 24, 2020

What are the benefits you mention if we use Pex as a library rather than binary?

On closer inspection, the primary benefit in that PR aside from --single which you try to implement here and loose pexes (unzipped from the get-go which the Pex CLI does not officially support) was only relevant to recursive resolve (splitting download of distributions from building of wheels / PEXes). Since recursive resolves proved non-performant for the majority of use cases that's no longer a valid reason.

There are benefits though and those are primarily generic to any tool we use outside the Pex tool. Namely, we need not warp the tool towards Pants ends, instead we can compose the tools in our own wrapper. In the case of Pex we have a special relationship (we own it), so its easy to warp but still sometimes questionable to do so - aka: does this feature make sense from the Pex user's perspective Pants aside? Sometimes not so much.

Towards this Pants has two current needs that don't map well to Pex use cases; yet composing the APIs exposed could support cleanly:

  1. This case, build me a binary to run locally only.
  2. Given a requirements PEX (a global resolve in a monorepo with many binaries), strip out just the dependencies this binary needs.

I do agree with your instinct to be able to run tools Pants runs and iterate. I think though that it would be more useful to generally support all tools Pants uses - internal or external. I think this could be done with:

./pants execute-process <digest> <args...>

Where we teach Process executions to output all the arguments following ./pants execute-process. This would provide a generic repro / tweak / modify cycle agnostic to the tool's origin or implementation language looking down the road.

@jsirois
Copy link
Member

jsirois commented Jul 25, 2020

Re the above bit about generically debuggable process executions, given 1 simple new goal and 1 simple new rule you'd have:

  1. Rule wants to support debugability of its underlying (key) Process execution does:
$ git diff src/python/pants/backend/python/typecheck/mypy/rules.py
diff --git a/src/python/pants/backend/python/typecheck/mypy/rules.py b/src/python/pants/backend/python/typecheck/mypy/rules.py
index 7ea0e628e..8bffeab1e 100644
--- a/src/python/pants/backend/python/typecheck/mypy/rules.py
+++ b/src/python/pants/backend/python/typecheck/mypy/rules.py
@@ -20,6 +20,7 @@ from pants.backend.python.subsystems import python_native_code, subprocess_envir
 from pants.backend.python.subsystems.subprocess_environment import SubprocessEncodingEnvironment
 from pants.backend.python.target_types import PythonSources
 from pants.backend.python.typecheck.mypy.subsystem import MyPy
+from pants.core.goals.execute_process import DebuggableProcess
 from pants.core.goals.typecheck import TypecheckRequest, TypecheckResult, TypecheckResults
 from pants.core.util_rules import determine_source_files, strip_source_roots
 from pants.engine.addresses import Addresses
@@ -121,7 +122,7 @@ async def mypy_lint(
         env={"PEX_EXTRA_SYS_PATH": ":".join(prepared_sources.source_roots)},
         description=f"Run MyPy on {pluralize(len(prepared_sources.snapshot.files), 'file')}.",
     )
-    result = await Get(FallibleProcessResult, Process, process)
+    result = await Get(FallibleProcessResult, DebuggableProcess(process))
     return TypecheckResults(
         [TypecheckResult.from_fallible_process_result(result, typechecker_name="MyPy")]
     )
  1. To debug you can then say --epr-debug-show:
$ ./pants --changed-since=upstream/master --epr-debug-show typecheck
19:52:47.27 [INFO] Executing Run MyPy on 121 files. which can be reproduced using:
19:52:47.27 [INFO] ./pants execute-process --digest=e0622dfa64c87d03c5a01d29d5073fc2350de0357609ee8dc5491d1f3c9a5b6b:337 --env=PATH=/home/jsirois/dev/pantsbuild/jsirois-pants/build-support/virtualenvs/Linux/pants_dev_deps.py38.venv/bin:/home/jsirois/.rvm/gems/ruby-2.6.0/bin:/home/jsirois/.rvm/gems/ruby-2.6.0@global/bin:/home/jsirois/.rvm/rubies/ruby-2.6.0/bin:/home/jsirois/.cargo/bin:/home/jsirois/bin:/home/jsirois/.local/bin:/home/jsirois/.nvm/versions/node/v8.11.3/bin:/home/jsirois/.poetry/bin:/opt/google-cloud-sdk/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin:/home/jsirois/.bash.d/bin:/home/jsirois/go/bin:/home/jsirois/.gem/ruby/2.5.0/bin:/home/jsirois/.rvm/bin:/home/jsirois/.bash.d/bin --env=PEX_INHERIT_PATH=false --env=PEX_IGNORE_RCFILES=true --env=LANG=en_US.UTF-8 --env=LC_ALL= --env=PEX_EXTRA_SYS_PATH=src/python -- python mypy.pex --config-file=build-support/mypy/mypy.ini @__files.txt
𐄂 MyPy failed.
...

And the suggested repro works **:

$ ./pants execute-process --digest=e0622dfa64c87d03c5a01d29d5073fc2350de0357609ee8dc5491d1f3c9a5b6b:337 --env=PATH=/home/jsirois/dev/pantsbuild/jsirois-pants/build-support/virtualenvs/Linux/pants_dev_deps.py38.venv/bin:/home/jsirois/.rvm/gems/ruby-2.6.0/bin:/home/jsirois/.rvm/gems/ruby-2.6.0@global/bin:/home/jsirois/.rvm/rubies/ruby-2.6.0/bin:/home/jsirois/.cargo/bin:/home/jsirois/bin:/home/jsirois/.local/bin:/home/jsirois/.nvm/versions/node/v8.11.3/bin:/home/jsirois/.poetry/bin:/opt/google-cloud-sdk/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin:/home/jsirois/.bash.d/bin:/home/jsirois/go/bin:/home/jsirois/.gem/ruby/2.5.0/bin:/home/jsirois/.rvm/bin:/home/jsirois/.bash.d/bin --env=PEX_INHERIT_PATH=false --env=PEX_IGNORE_RCFILES=true --env=LANG=en_US.UTF-8 --env=LC_ALL= --env=PEX_EXTRA_SYS_PATH=src/python -- python mypy.pex --config-file=build-support/mypy/mypy.ini 
usage: mypy [-h] [-v] [-V] [more options; see below]
            [-m MODULE] [-p PACKAGE] [-c PROGRAM_TEXT] [files ...]
mypy: error: Missing target module, package, files, or command.
...
$ ls -l dist/execute-process/
total 64864
drwxr-xr-x 3 jsirois jsirois     4096 Jul 24 19:53 build-support
-rw-r--r-- 1 jsirois jsirois     4812 Jul 24 19:53 __files.txt
-rwxr-xr-x 1 jsirois jsirois 66404287 Jul 24 19:53 mypy.pex
drwxr-xr-x 3 jsirois jsirois     4096 Jul 24 19:53 src

** There is a technicality here where the @__files.txt doesn't pass through correctly since Pants options parsing - apparently buggily - eagerly sees the @__files.txt and tries to Fromfile the passthru arg. As such for the example run I left off that last arg.

Does that general thrust alleviate your concern?

@Eric-Arellano
Copy link
Contributor Author

Bummer, this still isn't working properly. Possibly due to platforms?

@jsirois
Copy link
Member

jsirois commented Jul 28, 2020

Sounds right. But saying first plus adding platforms doesn't make sense. Can the platforms just always be stripped for first requests?

@jsirois
Copy link
Member

jsirois commented Jul 28, 2020

Same for first + multiple --python args. The feature is super fragile and you can really only use interpreter constraints together with first usefully.

…ing-interp

# Conflicts:
#	src/python/pants/backend/awslambda/python/awslambda_python_rules.py
#	src/python/pants/backend/python/lint/bandit/rules.py
#	src/python/pants/backend/python/lint/black/rules.py
#	src/python/pants/backend/python/lint/docformatter/rules.py
#	src/python/pants/backend/python/lint/flake8/rules.py
#	src/python/pants/backend/python/lint/isort/rules.py
#	src/python/pants/backend/python/lint/pylint/rules.py
#	src/python/pants/backend/python/rules/coverage.py
#	src/python/pants/backend/python/rules/repl.py
#	src/python/pants/backend/python/rules/run_setup_py.py
#	src/python/pants/backend/python/typecheck/mypy/rules.py
…ing-interp

# Conflicts:
#	src/python/pants/backend/python/rules/download_pex_bin.py
#	src/python/pants/backend/python/rules/pex.py
John pointed out that this is necessary for the runtime to be safe.
Copy link
Sponsor Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hooray!

src/python/pants/backend/python/rules/pex.py Outdated Show resolved Hide resolved
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@jsirois
Copy link
Member

jsirois commented Aug 15, 2020

@Eric-Arellano this should finally be ready to go. Sorry that took so long to figure out!

…ing-interp

# Conflicts:
#	src/python/pants/backend/python/rules/pytest_runner.py
#	src/python/pants/backend/python/rules/repl.py
#	src/python/pants/backend/python/rules/run_python_binary.py
@coveralls
Copy link

coveralls commented Aug 15, 2020

Coverage Status

Coverage remained the same at 0.0% when pulling 2f9f621 on Eric-Arellano:first-matching-interp into 659ffd6 on pantsbuild:master.

@Eric-Arellano Eric-Arellano merged commit fa6909e into pantsbuild:master Aug 15, 2020
@Eric-Arellano Eric-Arellano deleted the first-matching-interp branch August 15, 2020 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Resolving requirements performance is still not acceptable
5 participants