feat: Download platform agnostic wheels using rctx.download #1788

Closed

michaelboulton

If we can determine that there's a platform agnostic wheel (dep-ver-py3-none-any.whl) then download it using rctx.download, allowing it to be cached.

This partially solves #1357, but it still doesn't handle dependencies that ship platform-specific binary wheels, such as https://pypi.org/project/psycopg2-binary/ .

(I don't know the ins and outs of pip package resolution, but it seems like there is no easy way to get pip just to print "what wheel should be downloaded for this platform" without actually downloading it, which is why this PR can only handle platform agnostic wheels)
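As a rough illustration of the check described above, whether a wheel is platform agnostic can be read from its filename alone. This sketch is in Python rather than the Starlark used in the PR, and `is_platform_agnostic_wheel` is a hypothetical helper, not a function from rules_python. It assumes the standard wheel filename layout: name, version, optional build tag, Python tag, ABI tag, platform tag.

```python
def is_platform_agnostic_wheel(filename):
    """Return True if the wheel has ABI tag 'none' and platform tag 'any'.

    Hypothetical helper for illustration only; not part of rules_python.
    """
    if not filename.endswith(".whl"):
        return False
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after
    # the version, gives either 5 or 6 dash-separated fields.
    if len(parts) not in (5, 6):
        return False
    abi_tag, platform_tag = parts[-2], parts[-1]
    return abi_tag == "none" and platform_tag == "any"
```

A filename alone does not say whether a better native wheel also exists for the same release, so a check like this is only one part of the decision.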


google-cla bot commented Mar 3, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

jvolkman (Contributor) left a comment

I'm pretty sure @aignas had a PR somewhere that used Bazel's downloader in some situations, but I think it was using the pypi simple API, not the JSON API. Not sure what happened to that?

by_hash_type[alg] = [digest]

rctx.download(
    url = "https://pypi.org/pypi/{}/{}/json".format(requirement_name, requirement_version),
Contributor

There are many people that use private or alternative indexes; hard-coding pypi.org wouldn't work in these scenarios.

Also, I'm not sure that the JSON api is widely supported outside of pypi. Artifactory is a common repository and it only supports the simple (HTML) API as far as I can tell.
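To make the simple (HTML) API route concrete, here is a minimal Python sketch of reading a project page from a configurable index instead of a hard-coded pypi.org URL. `SimpleIndexParser` and `project_page_url` are hypothetical names, and the sketch assumes the index serves the usual project page of `<a>` links whose `href` may carry a `#sha256=...` fragment. It only parses HTML that has already been fetched; it is not how rules_python does it.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


def project_page_url(index_url, project):
    """Build the project page URL for an arbitrary simple index (hypothetical helper)."""
    # Normalize the project name: runs of '-', '_' and '.' become a single '-'.
    normalized = re.sub(r"[-_.]+", "-", project).lower()
    return "{}/{}/".format(index_url.rstrip("/"), normalized)


class SimpleIndexParser(HTMLParser):
    """Collect (filename, url, sha256) tuples from a simple API project page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.files = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Links may be relative to the page; the hash, if any, is in the fragment.
            url, fragment = urldefrag(urljoin(self.page_url, self._href))
            sha256 = fragment[len("sha256="):] if fragment.startswith("sha256=") else None
            self.files.append(("".join(self._text).strip(), url, sha256))
            self._href = None
```

The URL and hash collected this way are the two inputs a cached download needs, which is why the simple API is enough for this use case even without the JSON API.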

Author

I was using the simple API and switched to the JSON API because it was far simpler to just parse the output for one specific version 🤷. I wasn't really aware that it isn't widely supported.
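For comparison, this is roughly what "parse the output for one specific version" looks like with the JSON API. The Python sketch below assumes the per-release response from pypi.org, where a top-level `urls` list holds entries with `filename`, `url`, `packagetype` and `digests` fields. `find_pure_wheel` is a hypothetical name, and other indexes may not serve this format at all.

```python
import json


def find_pure_wheel(release_json):
    """Return (url, sha256) of the first '-none-any.whl' wheel, or None.

    Hypothetical helper; expects the body of a per-release JSON API response.
    """
    release = json.loads(release_json)
    for entry in release.get("urls", []):
        if entry.get("packagetype") != "bdist_wheel":
            continue  # skip sdists and other artifact types
        if entry.get("filename", "").endswith("-none-any.whl"):
            return entry["url"], entry["digests"]["sha256"]
    return None
```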

Comment on lines +820 to +821
platform_agnostic_wheel = _try_finding_platform_agnostic_wheel(rctx, rctx.attr.requirement)
if platform_agnostic_wheel and not target_platforms:
Contributor

If my understanding is correct, this logic would prefer a pure-Python wheel over a native wheel, which is not what pip would do. Pip prefers a native wheel if one exists for the target platform, and this matters for a handful of packages that ship native extensions as a performance optimization for certain platforms and a pure-Python wheel for the rest (SQLAlchemy does this, for example).
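The preference described here can be sketched in Python as ranking candidate wheels against an ordered list of tags the target supports, most specific first, so that a native wheel wins over `py3-none-any` when both match. `wheel_tags` and `pick_wheel` are hypothetical helpers, the filenames in the usage note are made up, and this is a simplification of pip's selection rather than a copy of it.

```python
def wheel_tags(filename):
    """Expand a wheel filename into its set of (python, abi, platform) tags."""
    python_tags, abi_tags, platform_tags = filename[: -len(".whl")].split("-")[-3:]
    # Each field may hold several '.'-separated values (compressed tag sets).
    return {
        (py, abi, plat)
        for py in python_tags.split(".")
        for abi in abi_tags.split(".")
        for plat in platform_tags.split(".")
    }


def pick_wheel(filenames, supported_tags):
    """Pick the wheel whose best matching tag comes earliest in supported_tags.

    supported_tags is an assumed input: (python, abi, platform) triples for the
    target, ordered from most to least preferred.
    """
    rank = {tag: i for i, tag in enumerate(supported_tags)}
    best = None
    for name in filenames:
        indexes = [rank[t] for t in wheel_tags(name) if t in rank]
        if indexes and (best is None or min(indexes) < best[0]):
            best = (min(indexes), name)
    return best[1] if best else None
```

With a supported list that places `("cp311", "cp311", "manylinux_2_17_x86_64")` ahead of `("py3", "none", "any")`, a made-up `example_pkg-1.0-cp311-cp311-manylinux_2_17_x86_64.whl` is chosen over `example_pkg-1.0-py3-none-any.whl`; on a target where only the pure tag is supported, the pure wheel is chosen instead.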

aignas (Collaborator) commented Mar 4, 2024

Yeah, it is #1625. Right now I am hoping to get #1744 done first and then come back to this topic so that we can use the Simple API for all of those things.

michaelboulton (Author)

> Yeah, it is #1625. Right now I am hoping to get #1744 done first and then come back to this topic so that we can use the Simple API for all of those things.

I had a look at something like this but doing all the platform detection appropriately seemed like a big chunk of work (as I can see in those PRs) which was overkill for my simple use case. I'm glad someone is looking at it though, and as your work is far more fleshed out I'll close this PR.

(If anyone does want to use this patch, it worked for me just to speed up clean builds, but YMMV)

aignas added a commit to aignas/rules_python that referenced this pull request Mar 10, 2024
This is a variant of bazelbuild#1625 and was inspired by bazelbuild#1788. In bazelbuild#1625, we
attempt to parse the simple API HTML files in the same `pip.parse`
extension and it brings the following challenges:

* The `pip.parse` cannot easily be used in `isolated` mode and it may
  be difficult to implement the isolation if bazelbuild/bazel#20186
  moves forward.
* Splitting the `pypi_index` out of the `pip.parse` allows us to accept
  the location of the parsed simple API artifacts encoded as a bazel
  label.
* Separation of the logic allows us to very easily implement usage of
  the downloader for cross-platform wheels.
* The `whl` `METADATA` might not be exposed through older versions of
  Artifactory, so having the complexity hidden in this single extension
  allows us to not increase the complexity and scope of `pip.parse` too
  much.
* The repository structure can be reused for `pypi_install` extension
  from bazelbuild#1728.

TODO:
- [ ] Add unit tests for functions in `pypi_index.bzl` bzlmod extension if
  the design looks good.
- [ ] Changelog.

Out of scope of this PR:
- Further usage of the downloaded artifacts to implement something
  similar to bazelbuild#1625 or bazelbuild#1744. This needs bazelbuild#1750 and bazelbuild#1764.
- Making the lock file the same on all platforms - We would need
  to fully parse the requirements file.
- Support for different dependency versions in the `pip.parse` hub repos
  based on each platform - we would need to be able to interpret
  platform markers in some way, but `pypi_index` should be good already.
- Implementing the parsing of METADATA to detect dependency cycles.
- Support for `requirements` files that are not created via
  `pip-compile`.
- Support for other lock formats, though that would be reasonably
  trivial to add.

Open questions:
- Support for VCS dependencies in requirements files - We should
  probably handle them as `overrides` in the `pypi_index` extension and
  treat them in `pip.parse` just as an `sdist`, but I am not sure it
  would work without any issues.