FR: Store resolved repository attributes in the Bzlmod lockfile #19026
cc @SalmaSamy
This will require the repo rules used by the package manager ruleset to support resolved attributes, right? IIRC, currently only the
This is still true if we delegate the job to the repository rule, right? The resolved attributes are returned by the implementation function of the repo rule, which means the repo must be fetched at least once to get those resolved attributes.
Third-party repo rules can and do support this as well. Even if they don't yet, this is a well-defined existing mechanism to return "pinning" information.
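For reference, the existing mechanism looks roughly like this (a hypothetical sketch with made-up rule and attribute names; the real contract is just that the implementation function returns a dict of attribute values that make the fetch reproducible):

```starlark
# Hypothetical http_archive-like rule. On the first fetch, sha256 may be
# empty; the implementation returns the attributes that would make the fetch
# reproducible, which Bazel can record (today via
# --experimental_repository_resolved_file, per this FR in MODULE.bazel.lock).
def _pinned_archive_impl(ctx):
    download_info = ctx.download_and_extract(
        url = ctx.attr.urls,
        sha256 = ctx.attr.sha256,
    )
    return {
        "urls": ctx.attr.urls,
        "sha256": download_info.sha256,
    }

pinned_archive = repository_rule(
    implementation = _pinned_archive_impl,
    attrs = {
        "urls": attr.string_list(mandatory = True),
        "sha256": attr.string(default = ""),
    },
)
```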
Yes, this is similar to how module extension results are added to the lockfile as they are run instead of all at once. I would imagine this feature emitting the resolved attributes only when a repo rule is actually fetched. If someone wants to eagerly lock everything, they could just use the upcoming offline mode feature to force everything to be fetched.
Does this mean that if I only do dependency resolution, the lockfile will be generated without the resolved attributes info?
Yes, I think that would necessarily be the case with lazy fetches, and it would transparently improve the situation in "update" mode compared to now. In order to generate a classical lockfile for everything, fetching everything would be required, as is the case now with custom implementations such as r_j_e's Maven pinning.
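The "trust on first use" flow under discussion can be sketched in plain Python (illustrative only; the names and data structures are made up and are not Bazel's actual API):

```python
import hashlib

def fetch_with_tofu(name, fetch, lockfile):
    """Fetch an artifact, pinning its hash on first use.

    fetch: zero-argument callable returning the artifact bytes.
    lockfile: dict mapping artifact name -> sha256 hex digest.
    """
    data = fetch()
    digest = hashlib.sha256(data).hexdigest()
    pinned = lockfile.get(name)
    if pinned is None:
        # First use: trust the fetched content and record its hash lazily,
        # i.e. only for artifacts that were actually fetched.
        lockfile[name] = digest
    elif pinned != digest:
        raise ValueError("checksum mismatch for " + name)
    return data
```

Under this sketch, generating a "classical" lockfile for everything simply means running the fetch once for every artifact, which is what eager pinning schemes do today.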
I'm not sure this is as avoidable as the issue description suggests. I think there are two styles of repository rules in use:
I'm really excited by the idea of having a standard format, and a standard mechanism for repinning, but I'm not sure how practical a goal incremental fetching is. On repinning - one of the big issues we face in rules_jvm_external and rules_rust is a conflict between two desires:
Because when a repository rule runs we can't detect whether you're repinning, we need to do something like sniff an env var that we get users to manually set, so that we know we shouldn't detect that things are out of date and error. (rules_rust automatically repins if this env var is set; rules_jvm_external just doesn't error if it's set, but requires you to

A problem we run into here is that there may be a required order in which to do repins, and we may need to ignore bad repins (or get them done as a side effect). Let's say both Java and Rust need repinning - right now, there's no good way to express "Rust, ignore that you're out of date; I'm repinning Java, and I'll come back to you soon so you can repin yourself". It's fortunate that these rulesets don't depend on each other, but ideally we'd be repinning rulesets in topological order...

Another thing to note on repinning: rules_rust has different modes of repinning - you may be repinning to say:
We should work out how to support these :)
I agree that this is the interesting case. Even in this case, isn't there a difference between fetching the metadata for each dep eagerly and truly fetching all dep artifacts eagerly? I don't know Cargo that well, but for r_j_e, a module extension could potentially just download the POMs and sha256 files and then use that information to generate

The changes I propose in this issue wouldn't be relevant in this context. They would apply if there were no way to get the hash of a jar without downloading it first, which is the case in other situations (think http_archive).
With what I described above, r_j_e's
Could you explain in more detail why these different repinning modes need to exist?

Ideally (again, please give this a reality check ;-) ), we could split this into two different actions: resolving version requirements and manipulating version requirements. If users don't want this process to pick up newly released versions of unrelated deps, shouldn't they express this in their version requirements, e.g. by pinning to patch versions? Of course this can also be done by a tool that updates the version requirements with the result of the resolution - crucially, this would move the complexity of specifying exactly what a user wants to be kept at what version to the requirements instead of the lockfile or module extension logic.
This does sound pretty complex and I hope we never get to that point, but if we do, module extensions can depend on others, which would give you the capability to cleanly evaluate them in topological order.
I think there are three main reasons people repin:
Yes - I agree with this. Right now this is made much harder by the fact that these manipulations are done via running
From a theoretical perspective, yes, but in practice I think there are valid reasons to want to specify "Technically I'm compatible with newer versions than I happened to most recently lock, but I'd rather avoid them unless I need to". Before we had this functionality in rules_rust, we received several complaints about the unexpected updates (e.g. bazelbuild/rules_rust#1522, bazelbuild/rules_rust#1535 (comment), bazelbuild/rules_rust#1231). The main reasons I think people care about this are:
In most other cases (e.g. a breaking API change), I'd agree that version requirements should be updated, but I can see reasons for wanting to minimise churn nonetheless.
Description of the feature request:
The information about resolved attributes returned by repo rules and collected by `--experimental_repository_resolved_file` should be stored in the Bzlmod lockfile.

Repository rules already use this information to lazily (that is, when evaluated) return (often cryptographic) identifiers that make their results fully reproducible, which can also result in faster fetches (via Bazel's repository cache).
If `MODULE.bazel.lock` could be updated with this information incrementally, users would have strong "trust on first use" guarantees without having to manually provide cryptographic identifiers in an extension-dependent manner.

What underlying problem are you trying to solve with this feature?
Lockfiles for package managers usually serve two purposes:

1. recording the result of dependency resolution so that it does not have to be repeated and is reproducible;
2. recording cryptographic hashes of the resolved artifacts so that fetches can be verified (and cached).
With the Bzlmod lockfile available with Bazel 6.3.0, there is a general solution for 1. that module extensions can use instead of rolling their own one.
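As context for purpose 2., manually specified hashes for the fictitious `foo_deps` extension (a hypothetical sketch; all names made up) might look like:

```starlark
# Hypothetical MODULE.bazel fragment. Without lockfile support for resolved
# attributes, the sha256 value has to be provided and updated by hand.
foo_deps = use_extension("@rules_foo//:extensions.bzl", "foo_deps")
foo_deps.package(
    name = "some_lib",
    version = "1.2.3",
    sha256 = "...",  # manually maintained today
)
use_repo(foo_deps, "some_lib")
```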
However, 2. remains unsolved: When wrapping an external package manager for the fictitious `foo` language into a module extension `foo_deps`, in order to benefit from Bazel's repository cache, users either have to manually specify hashes of the resolved deps in their `MODULE.bazel` file or a file referenced from it (such as `go.sum` for Go), or rely on custom module extension/repo rule logic to generate such a file (such as `rules_jvm_external`'s concept of pinning). The latter usually requires eagerly fetching all repos once to generate the lockfile, which can be slow. Since this process differs between module extensions, it results in additional friction beyond just updating Starlark files.

Which operating system are you running Bazel on?
N/A
What is the output of `bazel info release`?

N/A
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.

No response
What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD`?

No response
Have you found anything relevant by searching the web?
All module extensions for languages that do not come with a well-established lockfile format currently need to come up with their own format. This includes existing rulesets such as `rules_jvm_external` as well as new developments (e.g. for PHP).

Any other information, logs, or outputs that you want to share?
No response