Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[internal] Add experimental ./pants lock to generate a lockfile with pip-compile #12300

Merged
merged 5 commits into from
Jul 8, 2021

Conversation

Eric-Arellano
Copy link
Contributor

With this goal, you can run ./pants lock 3rdparty/python:: to create a lockfile using pip-compile. For example:

❯ ./pants lock 3rdparty/python:humbug
18:31:35.96 [INFO] Completed: Generate lockfile for 1 requirements: humbug==0.1.9
18:31:35.96 [INFO] Wrote lockfile to 3rdparty/python/lockfile.txt
#
# This file is autogenerated by pip-compile with python 3.9
# To update, run:
#
#    pip-compile --allow-unsafe --generate-hashes --output-file=3rdparty/python/lockfile.txt requirements.in
#
bugout==0.1.12 \
    --hash=sha256:0dc968e520c19c5db824acb35997ecc0b0227b1bf4a1f86a9c2a8cdc6d040a64 \
    --hash=sha256:999ad28771cdcfdc0e1e54aa5f5bf478d67b9f1a1cbc4b9df03ad98d46887903
    # via humbug
certifi==2021.5.30 \
    --hash=sha256:2bbf76fd432960138b3ef6dda3dde0544f27cbf8546c458e60baf371917ba9ee \
    --hash=sha256:50b1e4f8446b06f41be7dd6338db18e0990601dce795c2b1686458aa7e8fa7d8
    # via requests
chardet==4.0.0 \
    --hash=sha256:0d6f53a15db4120f2b08c94f11e7d93d2c911ee118b6b30a04ec3ee8310179fa \
    --hash=sha256:f864054d66fd9118f2e67044ac8981a54775ec5b67aed0441892edb553d21da5
    # via requests
humbug==0.1.9 \
    --hash=sha256:8771b54783f7f702cb23605df7a4d6ae81cda9c8167eccd09ed05aff5e56d5ab \
    --hash=sha256:e4c4019af50b1e461eb62346b0e732f22632e0654d7602aada82dcca8ff61dd7
    # via -r requirements.in
idna==2.10 \
    --hash=sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6 \
    --hash=sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0
    # via requests
...

This is still very experimental, and there are many unresolved questions. The intent with landing this is to have a foundation to start iterating on and answering design questions.

Why pip-compile

There are several known ways to generate a lockfile in Python:

  1. Create a virtual env, then use pip freeze.
  2. Poetry or Pyenv.
  3. pip-compile.
  4. We could teach Pex how to do it.

Pip-compile offers a robust out-of-the-box solution with several benefits:

  • Stick to pip ecosystem and the pip resolver, unlike Poetry and Pyenv.
  • Does not install wheels like pip freeze, only downloads and does the minimum work necessary.
  • Explains in the lockfile where each requirement comes from, which is very useful.

Lockfiles vs. constraints files

Note that this generates a "lockfile", not a "constraints" file, which we have confusingly been calling the same thing. Among other differences, a lockfile can include extra like requests[security], which choke with a constraints file.

Next steps

  • Teach Pants to consume a lockfile, rather than constraints file.
  • Fix the specs semantics for ./pants lock so that the lockfile is based on what gets used by your code, rather than what reqs are defined in your universe.
    • This will be important when we have multiple user lockfiles.

[ci skip-rust]

…h pip-compile

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
Comment on lines +56 to +58
# TODO: What should the input to `./pants lock` be when generating user lockfiles? Currently, it's
# the 3rd-party requirements that get used as input. Likely it should instead be all 3rd party
# reqs used by the transitive closure? That would better mirror generate_lockfile.sh.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure the comment's proposed semantics are more correct

EDIT: we should probably defer this discussion for the dedicated PR. I'm thinking this PR should mostly be about simply using pip-compile. That way, we can have a more robust discussion there where I can give the full context and design process.

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tend to agree.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

This probably doesn't need to be exposed as a goal at all (in this PR)...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's helpful for me to have a goal to run when iterating. But feature gated behind the experimental backend

Comment on lines +78 to +79
# TODO: Figure out which interpreter constraints to use...Among other reasons, this is
# tricky because python_requirement_library targets don't have interpreter constraints!
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will come from the transitive closure of the code using the 3rd party reqs, with a default to [python-setup].interpreter_constraints.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. With one permutation (templated copy á la lock-%p-%i) per platform (%s) and interpreter constraint (%i).

@Eric-Arellano
Copy link
Contributor Author

FYI @wilsonliam and @chrisjrn , who will likely work on some tickets related to lockfiles.

Also FYI @jsirois

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
src/python/pants/backend/experimental/python/lockfile.py Outdated Show resolved Hide resolved
Comment on lines +56 to +58
# TODO: What should the input to `./pants lock` be when generating user lockfiles? Currently, it's
# the 3rd-party requirements that get used as input. Likely it should instead be all 3rd party
# reqs used by the transitive closure? That would better mirror generate_lockfile.sh.
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tend to agree.

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@Eric-Arellano Eric-Arellano requested a review from benjyw July 8, 2021 16:46
Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines +33 to +34
# TODO: How should users indicate where to save the lockfile to when we have per-tool
# lockfiles and mmultiple user lockfiles?
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One approach would be to have a resolve(name=..) target type, with optional parameters for overriding some default lockfile templates. To use it, you would declare one in some directory, and then when it was actually consumed by a *_binary target, the lockfiles would be produced next to it at the templated paths (templated by platform * interpreter_constraint * etc).

A downside of that modeling though is that it means that whatever our "default" global resolve is would be a bit magical. We don't want folks to have to declare one of these just to get started with Pants. Not sure how to fix that mismatch... maybe with a setting instead...?: [resolves] resolve_mapping = "{'default': '3rdparty/python/lock%p-%i', 'custom_for_aws': '3rdparty/python/aws_lock%p-%i'}".

Comment on lines +56 to +58
# TODO: What should the input to `./pants lock` be when generating user lockfiles? Currently, it's
# the 3rd-party requirements that get used as input. Likely it should instead be all 3rd party
# reqs used by the transitive closure? That would better mirror generate_lockfile.sh.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

This probably doesn't need to be exposed as a goal at all (in this PR)...?

Comment on lines +78 to +79
# TODO: Figure out which interpreter constraints to use...Among other reasons, this is
# tricky because python_requirement_library targets don't have interpreter constraints!
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. With one permutation (templated copy á la lock-%p-%i) per platform (%s) and interpreter constraint (%i).

argv=[
"requirements.in",
"--generate-hashes",
f"--output-file={dest}",
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this output format something that PEX can consume...? Should PEX be delegating to pip_tools in order to put the fetched stuff in its cache...?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and speaking of caching: should this be configured to use a named_cache (until PEX is invoking it)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting on caching. The --output-file is a literal requirements.txt file.

I haven't investigated yet how pip-tools is doing the fetching and where it gets cached, but fwict it delegates to Pip. That makes sense to try to share that cache with Pex, and at a minimum to use a dedicated named_cache. I'll add a TODO in the next PR.

@Eric-Arellano Eric-Arellano merged commit 55b5172 into pantsbuild:main Jul 8, 2021
@Eric-Arellano Eric-Arellano deleted the gen-lockfile branch July 8, 2021 22:49
@stuhood
Copy link
Sponsor Member

stuhood commented Jul 8, 2021

Fix the specs semantics for ./pants lock so that the lockfile is based on what gets used by your code, rather than what reqs are defined in your universe.

  • This will be important when we have multiple user lockfiles.

I think that the lock goal should not have any particular spec semantics: there will need to be some underlying reusable @rules (which will be used by the lock goal, but also by any @(goal_)rules consuming the resolves) that do a dance like:

  • Figure out which requirements are reachable from the specs, and which resolves are reachable from the specs.
  • See whether any of the reachable requirements have changed since the lockfile(s) for the relevant resolves were generated; warn/error if so.
  • Return an in-memory representation of the whole resolve (in the case of Python, a PEX to subset)

The first two points are the hand-waviest. Figuring out how to do this without full runs of dependees is going to be tricky.

illicitonion added a commit to illicitonion/pants that referenced this pull request Jul 9, 2021
Internal changes omitted from the release notes:

* [internal] `./pants lock` uses transitive closure as inputs ([pantsbuild#12312](pantsbuild#12312))

* [internal] Add experimental `./pants lock` to generate a lockfile with pip-compile ([pantsbuild#12300](pantsbuild#12300))

* upgrade tokio and futures crates ([pantsbuild#12308](pantsbuild#12308))

* upgrade tonic, tower, and hyper crates ([pantsbuild#12294](pantsbuild#12294))

* Set up pants to use the latest version of humbug ([pantsbuild#12302](pantsbuild#12302))

* Refactor BUILD file parsing. ([pantsbuild#12279](pantsbuild#12279))

* Use get_type_hints() to get typed type hints. ([pantsbuild#12287](pantsbuild#12287))

* [internal] Fix parameters of convert_source_to_sources_test ([pantsbuild#12274](pantsbuild#12274))

* Fix assert arguments in a test ([pantsbuild#12273](pantsbuild#12273))

* Introduce a native pants client. ([pantsbuild#11922](pantsbuild#11922))

* Don't run CI in forks ([pantsbuild#12267](pantsbuild#12267))

* Change how we embed docsite links in help text. ([pantsbuild#12261](pantsbuild#12261))
illicitonion added a commit to illicitonion/pants that referenced this pull request Jul 9, 2021
Internal changes omitted from the release notes:

* [internal] `./pants lock` uses transitive closure as inputs ([pantsbuild#12312](pantsbuild#12312))

* [internal] Add experimental `./pants lock` to generate a lockfile with pip-compile ([pantsbuild#12300](pantsbuild#12300))

* upgrade tokio and futures crates ([pantsbuild#12308](pantsbuild#12308))

* upgrade tonic, tower, and hyper crates ([pantsbuild#12294](pantsbuild#12294))

* Set up pants to use the latest version of humbug ([pantsbuild#12302](pantsbuild#12302))

* Refactor BUILD file parsing. ([pantsbuild#12279](pantsbuild#12279))

* Use get_type_hints() to get typed type hints. ([pantsbuild#12287](pantsbuild#12287))

* [internal] Fix parameters of convert_source_to_sources_test ([pantsbuild#12274](pantsbuild#12274))

* Fix assert arguments in a test ([pantsbuild#12273](pantsbuild#12273))

* Introduce a native pants client. ([pantsbuild#11922](pantsbuild#11922))

* Don't run CI in forks ([pantsbuild#12267](pantsbuild#12267))

* Change how we embed docsite links in help text. ([pantsbuild#12261](pantsbuild#12261))
illicitonion added a commit to illicitonion/pants that referenced this pull request Jul 9, 2021
Internal changes omitted from the release notes:

* [internal] `./pants lock` uses transitive closure as inputs ([pantsbuild#12312](pantsbuild#12312))

* [internal] Add experimental `./pants lock` to generate a lockfile with pip-compile ([pantsbuild#12300](pantsbuild#12300))

* upgrade tokio and futures crates ([pantsbuild#12308](pantsbuild#12308))

* upgrade tonic, tower, and hyper crates ([pantsbuild#12294](pantsbuild#12294))

* Set up pants to use the latest version of humbug ([pantsbuild#12302](pantsbuild#12302))

* Refactor BUILD file parsing. ([pantsbuild#12279](pantsbuild#12279))

* Use get_type_hints() to get typed type hints. ([pantsbuild#12287](pantsbuild#12287))

* [internal] Fix parameters of convert_source_to_sources_test ([pantsbuild#12274](pantsbuild#12274))

* Fix assert arguments in a test ([pantsbuild#12273](pantsbuild#12273))

* Introduce a native pants client. ([pantsbuild#11922](pantsbuild#11922))

* Don't run CI in forks ([pantsbuild#12267](pantsbuild#12267))

* Change how we embed docsite links in help text. ([pantsbuild#12261](pantsbuild#12261))

* Add JavacSubsystem and JavacBinary, allowing user-configurable JVM selection ([pantsbuild#12283](pantsbuild#12283))

* Revert "Deprecate unused `--process-execution-local-enable-nailgun` (pantsbuild#11600)" ([pantsbuild#12257](pantsbuild#12257))
illicitonion added a commit that referenced this pull request Jul 9, 2021
Internal changes omitted from the release notes:

* [internal] `./pants lock` uses transitive closure as inputs ([#12312](#12312))

* [internal] Add experimental `./pants lock` to generate a lockfile with pip-compile ([#12300](#12300))

* upgrade tokio and futures crates ([#12308](#12308))

* upgrade tonic, tower, and hyper crates ([#12294](#12294))

* Set up pants to use the latest version of humbug ([#12302](#12302))

* Refactor BUILD file parsing. ([#12279](#12279))

* Use get_type_hints() to get typed type hints. ([#12287](#12287))

* [internal] Fix parameters of convert_source_to_sources_test ([#12274](#12274))

* Fix assert arguments in a test ([#12273](#12273))

* Introduce a native pants client. ([#11922](#11922))

* Don't run CI in forks ([#12267](#12267))

* Change how we embed docsite links in help text. ([#12261](#12261))

* Add JavacSubsystem and JavacBinary, allowing user-configurable JVM selection ([#12283](#12283))

* Revert "Deprecate unused `--process-execution-local-enable-nailgun` (#11600)" ([#12257](#12257))
Eric-Arellano added a commit that referenced this pull request Jul 14, 2021
This allows you to use the new lockfile format, generated by pip-tools via `./pants --tag=-lockfile_ignore lock ::` and #12300. 

A lockfile cannot be used at the same time as a constraints file. This makes the code easier to implement and means that we don't break any prior APIs. We will likely deprecate constraints when the dust settles.

There are several major deficiencies:

- Only `pex_from_targets.py` consumes this lockfile. This means that tool lockfiles will now have no constraints and no lockfile, for now.
- Does not handle requirements disjoint to the lockfile.
- Does not support multiple user lockfiles, which, for example, is necessary to properly handle `platforms` with `pex_binary` and `python_awslambda`: we need one lockfile per platform, as demonstrated in https://github.com/Eric-Arellano/lockfile-platforms-problem/tree/main/multiple_pex__stu_proposal.


### Lockfile vs. constraints file (and `[python-setup].resolve_all_constraints`)

We're currently using pip's `--constraints` file support, which allows you to specify constraints that may not actually be used. At the same time, we default to `[python-setup].resolve_all_constraints`, which _does_ first install the entire constraints file, and then uses Pex's repository PEX feature to extract the relevant subset. This is generally a performance optimization, but there are some times `--resolve-all-constraints` is not desirable:

1. It is not safe to first install the superset, and you can only install the proper subset. This especially can happen when `platforms` are used. See #12222.
     - We proactively disable `--resolve-all-constraints` when `platforms` are used.
2. User does not like the performance tradeoff, e.g. because they have a huge repository PEX so it's slow to access.

--

In contrast, this PR stops using `--constraints` and roughly always does `[python-setup].resolve_all_constraints` (we now run `pex -r requirements.txt --no-transitive` and use repository PEXes). Multiple user lockfiles will allow us to solve the above issues:

> 1. It is not safe to first install the superset, and you can only install the proper subset.

We'll have a distinct lockfile for each `platform`, which avoids this situation. See https://github.com/Eric-Arellano/lockfile-platforms-problem/tree/main/multiple_pex__stu_proposal for an example.

> 2. User does not like the performance tradeoff

They can use multiple lockfiles to work around this.

--

Always using `[python-setup].resolve_all_constraints` reduces complexity: less code to support, fewer concepts for users to learn.

Likewise, if we did still want to use `--constraints`, we would also need to upgrade Pex to use Pip 21+, which gained support for URL constraints. We [hacked around URL constraints before](#11907), but that isn't robust. However, Pip 21+ drops Python 2 and 3.5 support: we'd need to release Pex 3 w/o Py2 support, and upgrade Pants to have workarounds that allow Py2 to still be used. To avoid project creep, it's better to punt on Pex 3.

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants