Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bzlmod: with 6.3.0 generated lockfile it changes when running projects on different platforms if using requirements_darwin and requirements_windows parameters #19154

Closed
aignas opened this issue Jul 25, 2023 · 22 comments
Assignees
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@aignas
Copy link

aignas commented Jul 25, 2023

🐞 bug report

Affected Rule

bzlmod pip.parse extension.

Is this a regression?

No, this is not a regression.

Description

rules_python repository rules result in a different dependency closure depending on which platform the rules are being run on. This means that the users may be unable to add the MODULE.bazel.lock to their version control if multiple platforms are being used.

Why this happens is because we at runtime select either the default or a platform specific requirements.txt file to be parsed by the bzlmod extension to create a hub repo for it. It will be different for different platforms and hence the differences in the lock file.

🔬 Minimal Reproduction

pip.parse(
    hub_name = "publish_deps",
    python_version = "3.10",
    requirements_darwin = "@rules_python//tools/publish:requirements_darwin.txt",
    requirements_lock = "@rules_python//tools/publish:requirements.txt",
    requirements_windows = "@rules_python//tools/publish:requirements_windows.txt",
)

Then creating a lock with bazel 6.3.0 on Linux and a different platform will yield to changes in the lock file.

🔥 Exception or Error

The lock file differs on Linux and Mac.

🌍 Your Environment

  • Linux Ubuntu 20.04 LTS
  • Latest MacOS on Intel Mac

Output of bazel version: 6.3.0

Rules_python version: 0.24.0

@aignas aignas changed the title bzlmod: with 6.3.0 generated lockfile it changes when running projects on different platforms if using requirements_darwin and requirements_windows parameters` bzlmod: with 6.3.0 generated lockfile it changes when running projects on different platforms if using requirements_darwin and requirements_windows parameters Jul 25, 2023
@Wyverald
Copy link
Member

/sub. also cc @meteorcloudy @SalmaSamy

@Wyverald
Copy link
Member

This is not necessarily something rules_python can solve. But then again, this is not necessarily something we can solve at all. @aherrmann will remember the lengthy discussion we had on conditional dependencies and how that might interact with lockfiles. If users need to produce different repos on different platforms, they will necessarily get different lockfiles; one way to deal with that might be to manually multiplex between different lockfiles themselves. We may also need to provide a flag to choose the lockfile path (so the user might say --lockfile_name=MODULE.bazel.lock.darwin).

@meteorcloudy
Copy link
Member

We may also need to provide a flag to choose the lockfile path (so the user might say --lockfile_name=MODULE.bazel.lock.darwin).

Yeah, combining with --enable_platform_specific_config, this could be a workaround for major platforms.

@aignas
Copy link
Author

aignas commented Aug 1, 2023

@Wyverald, @meteorcloudy, right now I don't see a way to switch the platform config per (os, arch) tuple (e.g. Having a different lock file for macos_x86_64 and macos_arm64 and linux_x86_64 and linux_arm64. What would be the idiomatic way to handle such cases?

@meteorcloudy
Copy link
Member

--enable_platform_specific_config doesn't work for different arches, we need to come up with a better solution for the lockfile to work with multiple platforms.

@meteorcloudy meteorcloudy transferred this issue from bazelbuild/rules_python Aug 3, 2023
@meteorcloudy meteorcloudy added area-Bzlmod Bzlmod-specific PRs, issues, and feature requests type: bug P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. labels Aug 3, 2023
@meteorcloudy
Copy link
Member

I transferred this issue from rules_python to bazel, since it's a general problem with Bzlmod lockfile.

@fmeum
Copy link
Collaborator

fmeum commented Aug 7, 2023

I think that this issue warrants taking a step back and trying to understand why module extensions may have host-dependent results. The major use cases I am aware of:

  1. Fetching remote SDKs: These module extensions typically inspect the host to determine which remote SDK they should fetch (e.g. rules_go's go_sdk extension).
  2. Resolving non-module dependencies: rules_python's pip extension is an example of such an extension that inspect the host in order to determine the set of external dependencies. I could see rules_js being affected by this as well, even if the host information only influences which particular variant of a package to fetch, not the overall structure of the dependency graph.
  3. The great unknown: Module extensions can inspect the host in arbitrary ways and that is a feature. For example, in Provide hooks for host platform autodetection #8766, a module extension could be the way for rulesets to hook into the generation of constraints for the auto-configured host platform.

From a purely conceptual point of view, I think that the way to achieve host-agnostic resolution results for 1. and 2. is to fully embrace toolchain resolution and configurable dependencies. Put harshly, every inspection of host properties in such extensions will result in incorrect cross-platform builds and is thus wrong. Instead, extensions should strive to resolve dependencies and register SDKs for all reasonable target/exec platforms (possibly within a configurable set) and let toolchain resolution and select() handle the choice at analysis time. I know that this can be highly non-trivial to pull off and don't fault anyone for not doing it this way, but ultimately the loading phase is not the right place to make platform decisions anymore.

Since the explicit purpose of extensions as in 3. is to inspect the host and encode the result in repo rule instantiations, I don't see how multiplexing could solve this general case without duplicating the entire detection logic outside Bazel, which would defeat the purpose of having the extensions in the first place.
Instead, I think that we may want to allow an extension to mark itself as providing host-dependent results, for example via a configure or local parameter on module_extension (similar to the parameters of the same name on repository_rule). If True, the extension results would not be added to the lockfile. Such extensions should aim to be very fast to evaluate (as they aren't cached persistently) and should only produce Starlark metadata, not binaries.

@aherrmann
Copy link
Contributor

As @Wyverald points out, there were lengthy discussions about this on the external dependencies overhaul proposal. Unfortunately, I don't seem to have access to the comment threads anymore. @Wyverald do they still exist?

From a purely conceptual point of view, I think that the way to achieve host-agnostic resolution results for 1. and 2. is to fully embrace toolchain resolution and configurable dependencies. Put harshly, every inspection of host properties in such extensions will result in incorrect cross-platform builds and is thus wrong.

In principle, I agree with this. In practice this can be very difficult to achieve. I believe the concrete examples that were discussed on the proposal mentioned above would all fall under category 2.

  • Many language specific package managers have a notion of platform specific dependencies, e.g. Haskell dependencies can depend on OS, arch, compiler version, and custom flags; similarly Rust dependencies can be platform specific or feature dependent. The extent to which these parameters can be configured when instructing the package manager to resolve dependencies is often too limited to generate all required target platform specific configurations on any one given host platform.
  • Some package managers only exist for specific platforms. E.g. when I worked on the Daml project we used Nix to provide system dependencies on MacOS and Linux, and Scoop to provide system dependencies on Windows. Scoop only exists for Windows, and Nix, at least practically right now, only exists for Unix platforms. So, there is no way to generate lock-files for both on one single host platform (ignoring emulation or the like).

@fmeum
Copy link
Collaborator

fmeum commented Aug 8, 2023

@aherrmann Do you know how large multi-platform Rust projects deal with this problem? Given that dependencies can combine platform and feature constraints in essentially arbitrary ways, the lockfile problem doesn't seem to be any easier to solve for them than it is for us.

Given these complexities, it may help to consider the different purposes served by typical package manager lockfiles in isolation:

  1. Lockfiles encode how the package manager resolved the transitive dependency requirements into a consistent set of all dependencies.
  2. Lockfiles provide cryptographic identifiers for the artifacts corresponding to the resulting set of dependencies.

Of these, 1. is apparently highly platform-dependent, whereas 2. can potentially be fulfilled by a go.sum type file to which new information is only ever added. Translated into Bzlmod terms, with sufficiently unique naming, 1. would correspond to resolving module extension tags into the names of repositories to generate (such as my_dep_linux_amd64) and 2. would correspond to the map from repository names to all other attributes (such as the URL and SHA-256 hash of the my_dep_linux_amd64.so artifact).

It may just not be possible to solve 1. with the Bzlmod lockfile for general extensions without pushing the full multiplexing complexity onto the user. Maybe there is a way to ensure that 2. is still addressed by it even for such extensions, perhaps by splitting the file?

@SalmaSamy
Copy link
Contributor

We are working on adding new attributes to the module extension use_os and use_arch to mark it as os or architecture dependent. Then we will use them -if set to true- to store different versions of the same extension in the lockfile depending on the os and arch.

@SalmaSamy SalmaSamy self-assigned this Aug 23, 2023
copybara-service bot pushed a commit that referenced this issue Aug 23, 2023
Will be followed by using these two attributes in the lockfile to store platform dependent extensions

related issue: #19154

PiperOrigin-RevId: 559392081
Change-Id: Ib0cfa2df648effe785068fa4d03da02c39d1fdd8
@fmeum
Copy link
Collaborator

fmeum commented Aug 23, 2023

We are working on adding new attributes to the module extension use_os and use_arch to mark it as os or architecture dependent. Then we will use them -if set to true- to store different versions of the same extension in the lockfile depending on the os and arch.

I like that this kind of detection is built-in and doesn't require manual multiplexing.

We probably still have to support the case of a module extension that inspects the system more deeply than just os name and arch and thus necessarily needs to opt out of the lockfile. I wonder whether a custom Starlark function could allow this and also be a bit more flexible than fixed use_os/use_arch attributes (e.g. allow to collapse certain architectures or OSes into a single key).

For example, we could have module_extension accept a user-defined Starlark function via a lockfile_key parameter. This function would receive a single keyword parameter os with the repository_os object and has to return a string or None. If it returns a string (such as linux_amd64), this becomes the lockfile key, if it returns None, it won't be locked. Extensions that do not provide this parameter continue to be locked without a key.

The use case of use_os = True and use_arch = True would become:

foo = module_extension(
    ...,
    lockfile_key = lambda os: os.name + "_" + os.arch,
)

Crucially, if we see a need to extend this scheme in the feature, we can just pass in new optional keyword parameters.

@SalmaSamy @Wyverald Curious to hear your thoughts about this.

@Wyverald
Copy link
Member

I think the lockfile_key thing is a bit too generic but also potentially leads to misuse.

  1. The lambda has to accept exactly 1 repository_os object forever, unless the user is careful to use lambda os, **kwargs: X.
  2. Concatenating using _ seems innocuous enough, but technically we never promised that os.name or os.arch wouldn't contain _ themselves.
  3. It's somewhat unclear that returning None from this causes the extension to not be locked. If I'm familiar with repo rules, I'd just expect a local=True parameter to do that. If I'm not, something named never_lock would be much clearer.

@Wyverald
Copy link
Member

Another problem with the lambda approach is that we can't serialize lambdas, which means we can't know whether the lockfile_key attribute has changed for a given module extension, which in turn makes it impossible to know whether we can reuse entries in the lockfile. Consider the following (unlikely) scenario:

foo = module_extension(..., lockfile_key = lambda os: os.name)
# run Bazel on blahOS and arm64, `foo` is locked with "lockfile key" of "blahOS"
# author changes definition of `foo` to
foo = module_extension(..., lockfile_key = lambda os: os.arch)
# run Bazel on Windows, but on a weird architecture called "blahOS"
# we reuse the existing entry even though we shouldn't

@fmeum
Copy link
Collaborator

fmeum commented Aug 23, 2023

  1. The lambda has to accept exactly 1 repository_os object forever, unless the user is careful to use lambda os, **kwargs: X.

We could use the pattern that Ivo suggested for a redo of rule implementation functions: Pass in repository_os to the parameter named os if it exists, otherwise don't. In this way you don't have to make your lambdas future proof manually.

  1. Concatenating using _ seems innocuous enough, but technically we never promised that os.name or os.arch wouldn't contain _ themselves.

Yeah, that's a good point. We could accept returning a list or tuple.

  1. It's somewhat unclear that returning None from this causes the extension to not be locked. If I'm familiar with repo rules, I'd just expect a local=True parameter to do that. If I'm not, something named never_lock would be much clearer.

That's also a good point. I kind of read this as "no key means nothing to store this under", but I can see why others wouldn't.

My main gripe with the current approach is the fact that, with local, we will have at least three Boolean flags that aren't independent: local = True and use_os = False doesn't really make sense. If we identify more things we want to key on, we immediately need yet another flag. If we think that this list is unlikely to grow, then I agree that the current approach is probably better.

@fmeum
Copy link
Collaborator

fmeum commented Aug 23, 2023

Another problem with the lambda approach is that we can't serialize lambdas, which means we can't know whether the lockfile_key attribute has changed for a given module extension, which in turn makes it impossible to know whether we can reuse entries in the lockfile. Consider the following (unlikely) scenario:

foo = module_extension(..., lockfile_key = lambda os: os.name)
# run Bazel on blahOS and arm64, `foo` is locked with "lockfile key" of "blahOS"
# author changes definition of `foo` to
foo = module_extension(..., lockfile_key = lambda os: os.arch)
# run Bazel on Windows, but on a weird architecture called "blahOS"
# we reuse the existing entry even though we shouldn't

Convinced ;⁠-⁠) (although I guess we could hash the Starlark files containing the function and store the hash in the lockfile, but let's not go there)

@Wyverald
Copy link
Member

Wyverald commented Aug 23, 2023

The actual data to be modeled here is something like:

data Factor = Os | Arch  -- with more to be added if needed
data WhenToLock = Never | DependentOnFactors (Set Factor)

or in Java terms...

enum Factor { OS, ARCH }
Optional<Set<Factor>> whenToLock; // `empty()` means "never lock"

Accordingly, a Starlark API that keeps extensibility in mind could look like:

when_to_lock = None, # means never lock; or
when_to_lock = ["os", "arch"], # a list of strings; duplicates forbidden; possible values include "os", "arch", etc

But that's a little overkill IMO, especially since the "Factor" enum is unlikely to grow. Compared to this "principled" design, I'd rather go with the somewhat-dirty-but-simple "use_os + use_arch + local but local=True overrides everything else" option :)

@fmeum
Copy link
Collaborator

fmeum commented Aug 24, 2023

I agree that the enum solution feels clunky in Starlark, especially since when_to_lock = None means "never lock" whereas when_to_lock = [] means "lock without a key".

There are a few aspects of the use_os/use_arch that I would like to discuss:

  1. Their name suggests that a module extension should set use_os to True if it uses ctx.os.name anywhere in its implementation. However, it only has to set this to True if its evaluation result depends on the current OS name. For example, if rules_jvm_external needs to run a Java tool to resolve its Maven dependencies, it might need to inspect the host machine to determine how to run it (e.g. whether it needs to run java or java.exe), but the result of the resolution is independent of that. Maybe we can find a more descriptive name? key_by_os or depends_on_os came to my mind, but there is probably something better.
  2. repository_os.name and repository.arch are essentially free form fields with no normalization applied. When generating the key, are you planning to use OS.getCurrent() and CPU.getCurrent() instead?
  3. I see that the new attributes default to False. Have you considered using True as the default to make sure that users can be more certain that committing lockfiles won't break builds across machines? If we go with False, we should do a survey of the common extensions in the BCR and verify they set these flags correctly before the 6.4.0 release.

@SalmaSamy
Copy link
Contributor

@Wyverald @fmeum I created this mini_doc to discuss the implementation details for this.

@fmeum
Copy link
Collaborator

fmeum commented Aug 24, 2023

@Wyverald @fmeum I created this mini_doc to discuss the implementation details for this.

Thanks, that's very helpful! I left a few comments.

SalmaSamy added a commit that referenced this issue Aug 31, 2023
…ts use_os and use_arch to true

- Created new key for module extensions in lockfile containing the extenionId, os, and arch
- Store the key information in the extension event
- Update the adapter for the key

related issue: #19154

PiperOrigin-RevId: 559432346
Change-Id: Ib48d7590e6d35397471a4ff46c58226eecf339c2
@meteorcloudy
Copy link
Member

@bazel-io fork 6.4.0

SalmaSamy added a commit that referenced this issue Sep 12, 2023
… and arch

Updated the logic to save and fetch module extensions from the lockfile depending on their os and arch dependency.

fixes #19154

PiperOrigin-RevId: 564707355
Change-Id: Ib48d7590e6d35397471a4ff46c58226eecf339c2

# Conflicts:
#	scripts/bootstrap/compile.sh
#	src/test/py/bazel/bzlmod/bazel_lockfile_test.py
#	src/test/shell/bazel/bazel_determinism_test.sh
SalmaSamy added a commit that referenced this issue Sep 12, 2023
Will be followed by using these two attributes in the lockfile to store platform dependent extensions

related issue: #19154

PiperOrigin-RevId: 559392081
Change-Id: Ib0cfa2df648effe785068fa4d03da02c39d1fdd8

# Conflicts:
#	src/main/java/com/google/devtools/build/lib/starlarkbuildapi/repository/RepositoryModuleApi.java
#	src/main/java/com/google/devtools/build/skydoc/fakebuildapi/repository/FakeRepositoryModule.java
#	src/test/java/com/google/devtools/build/lib/bazel/bzlmod/StarlarkBazelModuleTest.java
SalmaSamy added a commit that referenced this issue Sep 12, 2023
Will be followed by using these two attributes in the lockfile to store platform dependent extensions

related issue: #19154

PiperOrigin-RevId: 559392081
Change-Id: Ib0cfa2df648effe785068fa4d03da02c39d1fdd8
SalmaSamy added a commit that referenced this issue Sep 12, 2023
… and arch

Updated the logic to save and fetch module extensions from the lockfile depending on their os and arch dependency.

fixes #19154

PiperOrigin-RevId: 564707355
Change-Id: Ib48d7590e6d35397471a4ff46c58226eecf339c2

# Conflicts:
#	scripts/bootstrap/compile.sh
#	src/test/py/bazel/bzlmod/bazel_lockfile_test.py
#	src/test/shell/bazel/bazel_determinism_test.sh
meteorcloudy added a commit that referenced this issue Sep 13, 2023
Store different extensions in the lockfile if they depend on os and arch

Includes:
- Old refactor commits for StarlarkRepositoryModule &
RepositoryModuleApi from
- Adding os and arch attributes to module extension 
- Updating the logic to save and fetch module extensions from the
lockfile depending on their os and arch dependency.

fixes #19154

PiperOrigin-RevId: 564707355
Change-Id: Ib48d7590e6d35397471a4ff46c58226eecf339c2

---------

Co-authored-by: Googler <wyv@google.com>
Co-authored-by: Yun Peng <pcloudy@google.com>
@Wyverald Wyverald added this to the 7.0.0 branch cut milestone Sep 19, 2023
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 6.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. Thanks!

aignas added a commit to aignas/rules_python that referenced this issue Sep 27, 2023
aignas added a commit to aignas/rules_python that referenced this issue Sep 27, 2023
github-merge-queue bot pushed a commit to bazelbuild/rules_python that referenced this issue Sep 28, 2023
This ensures that under bzlmod with `--lockfile-mode=update` we
would generate an entry per os/arch, which is needed because the
hermetic toolchain interpreter path is os/arch dependent.

Summary:
- add bazel_features dep
- mark the pip extension as arch/os dependent

Related: bazelbuild/bazel#19154

---------

Co-authored-by: Richard Levasseur <richardlev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
Archived in project
Development

No branches or pull requests

8 participants
@aignas @Wyverald @aherrmann @meteorcloudy @fmeum @SalmaSamy @iancha1992 and others