exp: multi-arch whl_library in bzlmod #1625

Closed
wants to merge 95 commits into from

aignas
Collaborator

@aignas aignas commented Dec 18, 2023

This is a reasonably polished POC that uses the Bazel downloader to fetch wheels
and sets up multi-platform whl targets, which would allow us to set up the
dependency tree almost correctly.

In the general use case this yields great performance improvements when running
the tests, because fetching the wheels is much faster.

Ideas for improvements:

  • Set up auth for the wheel fetching.
  • Test with a private registry.
  • Add support for wheels located on the local file system.
  • Make the whl metadata fetching optional.

github-merge-queue bot pushed a commit that referenced this pull request Jan 25, 2024
With this PR we can deterministically parse the METADATA and generate a
`BUILD.bazel` file using the config settings introduced in #1555. If we
had a `requirements.txt` file that used only wheels, we could use the
host interpreter to parse the wheel metadata for all the target
platforms and use the version-aware toolchain at runtime. This
potentially unlocks more clever layouts of the `bzlmod` hub repos
explored in #1625, where we could have a single `whl_library` instance
for all versions within a single hub repo.

Work towards #1643.
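
As a rough illustration of what parsing the METADATA involves, here is a minimal Python sketch, not the code from the commit. It relies only on the fact that a wheel's `METADATA` file uses email-style headers, so the standard library `email.parser` can read it. The package names in `SAMPLE_METADATA` and the helper name `parse_requires_dist` are hypothetical, and splitting on the first `;` is a simplification that ignores direct-URL requirements.

```python
from email.parser import Parser

# Hypothetical METADATA content; real files carry many more headers.
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example-pkg
Version: 1.0.0
Requires-Dist: requests (>=2.0)
Requires-Dist: colorama ; sys_platform == "win32"
Requires-Dist: pytest ; extra == "test"
"""


def parse_requires_dist(metadata_text):
    """Return (requirement, marker) pairs from a wheel METADATA document."""
    message = Parser().parsestr(metadata_text)
    pairs = []
    # Requires-Dist may appear many times, so read every occurrence.
    for entry in message.get_all("Requires-Dist") or []:
        # Simplification: everything after the first ';' is the marker.
        requirement, _, marker = entry.partition(";")
        pairs.append((requirement.strip(), marker.strip()))
    return pairs


deps = parse_requires_dist(SAMPLE_METADATA)
for requirement, marker in deps:
    print(requirement, "|", marker or "<unconditional>")
```

The markers are kept as plain strings here; deciding which ones apply to a given target platform is a separate step that this sketch does not attempt.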
@aignas
Collaborator Author

aignas commented Feb 29, 2024

Given the latest developments and https://discuss.python.org/t/lock-files-again-but-this-time-w-sdists/46593, this is paused for some time.

@aignas
Collaborator Author

aignas commented Mar 4, 2024

Some notes on the implementation of how this could be done reasonably cleanly.

  1. The pip.parse tag class can create the following repos: pip hub, pip spoke repos, whl http_file repos and whl hub repo. The whl hub repo is common for all pip hub repos and the pip spoke repos use the whl files downloaded by http_file repos.
  2. The whl hub repo is loaded by rules_python so that we can very easily pass around the whl label references when constructing the pip spoke repos.
  3. Ideally the whl hub repo is set as a default in a configurable attribute.
  4. The whl repos are set up in the same way as in #1744 (feat(bzlmod): support wheel-only additions to the hub repo), but for multiple platforms.
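
To make the "multiple platforms" part concrete, the sketch below groups wheel files by platform tag using the wheel filename convention (`name-version[-build]-python-abi-platform.whl`). It is written in Python rather than Starlark so it can be run standalone, and it is only an assumption about how the spoke repos might bucket the downloaded files. The filenames and the helper names `parse_wheel_filename` and `group_by_platform` are hypothetical.

```python
from collections import defaultdict


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def group_by_platform(filenames):
    """Map each platform tag to the wheel files that support it."""
    groups = defaultdict(list)
    for filename in filenames:
        info = parse_wheel_filename(filename)
        # A compressed tag set lists several platforms separated by dots.
        for platform in info["platform_tag"].split("."):
            groups[platform].append(filename)
    return dict(groups)


# Hypothetical filenames for illustration only.
wheels = [
    "example_pkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "example_pkg-1.0.0-cp311-cp311-macosx_11_0_arm64.whl",
    "example_pkg-1.0.0-py3-none-any.whl",
]
groups = group_by_platform(wheels)
```

Each key in `groups` could then correspond to one branch of a platform-specific selection in the generated build file, with the `any` wheel as the fallback.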

If we needed to split the whl hub and pip hub generation, we would do the following:

  1. Have a pypi.index tag class that can parse the lock files and generate the necessary URL and label references to the subset of the PyPI world that we need to include. It would generate a single hub repo with many spokes. Isolated mode of using the extension would still work.
  2. The pip.parse then uses the hub repo created by pypi.index tag class. It could have an attribute `index_hub = attr.label()` where we could set it to `"@pypi_index//:BUILD.bazel"` after doing `use_repo(pypi_index, pypi_index = "repo_from_extension")`.

In the second option the user would use it as:

pypi = use_extension("@rules_python//python/extensions/pypi.bzl", "pypi")
# The following can be called multiple times
pypi.index_requirements(
    index_url = "foo",
    extra_index_urls = ["bar"],
    srcs = [
        "//:my_requirements.txt",
    ],
)

use_repo(pypi, "pypi_index")

pip = use_extension("@rules_python//python/extensions/pip.bzl", "pip")
pip.index(index="@pypi_index//:packages.json")  # contains the package and the labels to the metadata files for each package.

# Use stuff as previously.
pip.parse(
    hub_repo = "pip",
    src = "//:my_requirements.txt",
)

aignas added a commit to aignas/rules_python that referenced this pull request Mar 10, 2024
This is a variant of bazelbuild#1625 and was inspired by bazelbuild#1788. In bazelbuild#1625, we
attempt to parse the simple API HTML files in the same `pip.parse`
extension and it brings the following challenges:

* The `pip.parse` cannot be easily used in `isolated` mode and it may
  be difficult to implement the isolation if bazelbuild/bazel#20186
  moves forward.
* Splitting the `pypi_index` out of the `pip.parse` allows us to accept
  the location of the parsed simple API artifacts encoded as a bazel
  label.
* Separation of the logic allows us to very easily implement usage of
  the downloader for cross-platform wheels.
* The `whl` `METADATA` might not be exposed through older versions of
  Artifactory, so having the complexity hidden in this single extension
  allows us to not increase the complexity and scope of `pip.parse` too
  much.
* The repository structure can be reused for `pypi_install` extension
  from bazelbuild#1728.
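
For readers unfamiliar with the simple API HTML files mentioned above, the following Python sketch shows the general idea of extracting a file URL and its `sha256` from each anchor on a project page. It is not the `pypi_index` implementation, which would be written in Starlark. The sample HTML, the `files.example.invalid` host, the hash values, and the class name `SimpleIndexParser` are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

# Hypothetical project page in the simple repository API layout.
SAMPLE_HTML = """<!DOCTYPE html><html><body>
<a href="https://files.example.invalid/packages/example_pkg-1.0.0-py3-none-any.whl#sha256=deadbeef">example_pkg-1.0.0-py3-none-any.whl</a>
<a href="https://files.example.invalid/packages/example_pkg-1.0.0.tar.gz#sha256=cafef00d">example_pkg-1.0.0.tar.gz</a>
</body></html>"""


class SimpleIndexParser(HTMLParser):
    """Collect the filename, URL, and sha256 of every linked distribution."""

    def __init__(self):
        super().__init__()
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # The hash, when present, is carried in the URL fragment.
        url, fragment = urldefrag(href)
        sha256 = None
        if fragment.startswith("sha256="):
            sha256 = fragment[len("sha256="):]
        filename = url.rsplit("/", 1)[-1]
        self.files.append({"filename": filename, "url": url, "sha256": sha256})


parser = SimpleIndexParser()
parser.feed(SAMPLE_HTML)
files = parser.files
```

A URL and `sha256` pair per file is the kind of input a download rule needs, which is why keeping this parsing in its own extension keeps `pip.parse` smaller.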

TODO:
- [ ] Add unit tests for functions in `pypi_index.bzl` bzlmod extension if
  the design looks good.
- [ ] Changelog.

Out of scope of this PR:
- Further usage of the downloaded artifacts to implement something
  similar to bazelbuild#1625 or bazelbuild#1744. This needs bazelbuild#1750 and bazelbuild#1764.
- Making the lock file the same on all platforms - We would need
  to fully parse the requirements file.
- Support for different dependency versions in the `pip.parse` hub repos
  based on each platform - we would need to be able to interpret
  platform markers in some way, but `pypi_index` should be good already.
- Implementing the parsing of METADATA to detect dependency cycles.
- Support for `requirements` files that are not created via
  `pip-compile`.
- Support for other lock formats, though that would be reasonably
  trivial to add.

Open questions:
- Support for VCS dependencies in requirements files - We should
  probably handle them as `overrides` in the `pypi_index` extension and
  treat them in `pip.parse` just as an `sdist`, but I am not sure it
  would work without any issues.
@aignas
Collaborator Author

aignas commented Mar 10, 2024

Will re-implement this in separate PRs.

@aignas aignas closed this Mar 10, 2024
@aignas aignas deleted the exp/whl-minihub branch May 13, 2024 06:49
2 participants