Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to specify base_url in python.toolchain() when using bzlmod #1172

Closed
vadikmironov opened this issue Apr 16, 2023 · 17 comments
Closed
Labels
type: bzlmod bzlmod work
Milestone

Comments

@vadikmironov
Copy link

🐞 bug report

Affected Rule

The issue is caused by the rule: python:extensions.bzl

Is this a regression?

Yes, the previous version in which this bug was not present was: works fine without bzlmod

Description

While trying to prepare my test repo for bzlmod switch, I've noticed that it seems to be no longer possible to override base_url location for python toolchains. This is a super useful feature for someone working behind a corporate firewall and maintaining a mirror of python-build-standalone artefacts.

🔬 Minimal Reproduction

Following example is perfectly valid when using WORKSPACE.bazel:

load("@rules_python//python:repositories.bzl", "python_register_toolchains")
python_register_toolchains(
    name = "python3_8_13",
    python_version = "3.8",
    base_url = "https://github.com/indygreg/python-build-standalone/releases/download",
)

but when defining a toolchain in bzlmod, there is simply no parameter to pass base_url value:

python = use_extension("@rules_python//python:extensions.bzl", "python")
python.toolchain(
    name = "python3_8_13",
    python_version = "3.8",
)

🔥 Exception or Error



None

🌍 Your Environment

Operating System:

  
Linux
  

Output of bazel version:

  
Bazelisk version: v1.16.0
Build label: 6.1.1
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Mar 15 15:44:56 2023 (1678895096)
Build timestamp: 1678895096
Build timestamp as int: 1678895096
  

Rules_python version:

  
bazel_dep(name = "rules_python", version = "0.20.0")
  

Anything else relevant?

@chrislovecnm
Copy link
Collaborator

I have not tried but you need to use an override: https://docs.bazel.build/versions/5.0.0/skylark/lib/globals.html#archive_override

@vadikmironov
Copy link
Author

Thanks Chris. That would make module download itself working, but the issue I have is with toolchains. I can't quite see how with bzlmod on I can set the base_url for all toolchains to be downloaded from. As I said this is useful in case of corporate github caching proxy (like jfrog artifactory).

For that matter, I don't see a way to pass via python module extension my own toolchains while in WORKSPACE.bazel that was fairly simple with python_register_toolchains and tool_versions argument. It would be helpful to understand what is the right way to do this type of rules_python customization with bzlmod if it possible at all. I understand this is a work in progress just by looking at #1155 which addresses a different issue of setting hermetic interpreter for pip.parse which I funnily enough also hit when attempted to switch to bzlmod for the first trial.

@chrislovecnm
Copy link
Collaborator

So this line defines the toolchain

use_repo(python, "python3_x86_64-unknown-linux-gnu")

Then I believe you can use:

The use https://docs.bazel.build/versions/5.0.0/skylark/lib/globals.html#archive_override

To override where the repo is downloaded from. It is documented here https://docs.bazel.build/versions/5.0.0/bzlmod.html

So you would have the following

use_repo(python, "python3_x86_64-unknown-linux-gnu")
archive_override("python3_x86_64-unknown-linux-gnu", urls = ["https://my-bcr-repo.com/foo"])

I have not tested this, but this is what the docs say.

@vadikmironov
Copy link
Author

I see, so for windows and darwin builds I will need to repeat this again. Workspace way certainly looks nicer, but oh well. Thanks a lot Chris. Is there any chance that this may become slighly less verbose towards 6.2+?

@jbgcarnes
Copy link

@vladmos I am having a very similar issue myself. Have you gotten the archive override to work? The provided snippet does not work form me.

Using:

bazel_dep(name = "rules_python", version = "0.20.0")

python = use_extension("@rules_python//python:extensions.bzl", "python")
python.toolchain(
  name = "python3",
  python_version = "3.10.2",
  configure_coverage_tool = False,
)
use_repo(python, "python3_toolchains", "python3", "python3_x86_64-lunknown-linux-gnu")
archive_override(module_name = "python3_x86_64-unknown-linux-gnu", urls = ["internal-mirror"])

I get the following issues:

  1. if you keep it as specified:
    It will still attempt to use the old url and fail with error fetch of repository 'rules_python~0.20.0~python~python3_x86_64-unknown-linux-gnu'

So module_name is not python3_x86_64-unknown-linux-gnu, however
2. setting module name to the full qualified archive:
you cannot provide the full override ('rules_python~0.20.0~python~python3_x86_64-unknown-linux-gnu') as you'll get an invalid module name (`rules_python0.20.0python~python3_x86_64-unknown-linux-gnu')

@chrislovecnm
Copy link
Collaborator

We need to add a base_url argument to the "python" extension. Since I have been poking around this code a lot more, I now understand the code. Here is an outline of the work if someone wants to contribute.

We need to add either a **kwargs argument to here:

tag_classes = {

Or a base_url arg.

We then need to pass that argument into the python_register_toolchains call here:

python_register_toolchains(

This will have python_register_toolchains use the value that we set. Here:

base_url = kwargs.pop("base_url", DEFAULT_RELEASE_BASE_URL)

@chrislovecnm
Copy link
Collaborator

@vadikmironov & @jbgcarnes PTAL at #1216.

@chrislovecnm
Copy link
Collaborator

@jbgcarnes
Copy link

@chrislovecnm I do have the bazel-downloader setup for most things. The issue primarily here is that the SHA is not the same as the ones you have baked into rules_python, so I have to specify both the URL and the SHA. I would generally prefer #1216 . For our Go for instance, in rules_go: in the go_sdk extension you can specify the URL and sha: https://github.com/bazelbuild/rules_go/blob/c1880723e3097b788d07c8ff84ff97e58292e0e7/go/private/extensions.bzl

@vadikmironov
Copy link
Author

I am indifferent really between rewrite and explicit arguments and I was going to check it out and report after seeing the comment yesterday, but maybe there are other reasons why base url was added explicitly in register toolchains call?

@rickeylev
Copy link
Contributor

rickeylev commented May 15, 2023

Would single_version_override be a better option? Despite the name, it also allows you to just apply patches to the module. This would allow you to patch just the BASE_URL variable in rules_python.

Adding args to python.toolchain() isn't verboten, but they need to operate well when there's a graph of MODULE's interacting. Actually, there's probably two problems with adding such an arg:

  1. If a sub-module calls python.toolchain() without the overridden URL, then it won't
    try to use the base url you want, and subsequently fail to download entirely.
  2. If a sub-module calls python.toolchain() with an overridden URL, then that runtime
    will get downloaded and possibly used by arbitrary things. Which seems prone to breaking
    unintentional things. (I previously wrote it as potentially nefarios, but since realized
    a nefarious module would have easier and more direct ways to do the same without that ability)

A separate tag class might work better here that is only usable from the root module (i.e. mctx.is_root() == True). e.g.

python = use_extension(...)
python.override_downloads(
  base_url = ...,
)

(or similar; an entire TOOL_VERSIONS map could be passed in, for example)

@chrislovecnm chrislovecnm added the type: bzlmod bzlmod work label May 17, 2023
@vadikmironov
Copy link
Author

@vadikmironov and @jbgcarnes would you rather use https://blog.aspect.dev/configuring-bazels-downloader or #1216?

Chris, first of all many thanks for starting #1216 . I can confirm that bazel downloader totally solves this with a simple rewrite command. This ticket can be closed if @jbgcarnes has no further comments.

The point that @rickeylev brought up would essentially allow to override toolchain packages and their locations for all supported versions and I think that suggestion is really awesome. If python.override_downloads (or maybe override_toolchains would be a better name?) is there and accepts all three components that are used in python_register_toolchains such as base_url, TOOL_VERSIONS and MINOR_MAPPING, this would allow custom toolchains to be registered. I imagine there is a number of usecases for custom built python and this API would give the same toolset in bzlmod that was sort of present via python_register_toolchains call in the past.

Please do let me know what is the preference - close this item and start a new one for python.override_downloads or you would rather prefer to continue on this issue?

@chrislovecnm
Copy link
Collaborator

@rickeylev what do you think?

@rickeylev rickeylev added this to the Bzlmod GA milestone Aug 1, 2023
Copy link

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Jan 28, 2024
Copy link

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

@aignas
Copy link
Collaborator

aignas commented Aug 24, 2024

One extra thing that bzlmod users should be able to override is the TOOL_VERSIONS, the #2151 should fix this and this should close one more gap for feature parity between the bzlmod and WORKSPACE apis.

@aignas aignas removed the Can Close? Will close in 30 days if there is no new activity label Aug 24, 2024
@aignas
Copy link
Collaborator

aignas commented Sep 24, 2024

#2222 Has implemented this.

@aignas aignas closed this as completed Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment