
Duplicate builds due to transitions of targets that don't depend on build settings #13281

Open
aherrmann opened this issue Mar 30, 2021 · 7 comments
Labels
not stale Issues or PRs that are inactive but not considered stale P2 We'll consider working on this in future. (Assignee optional) team-Configurability Issues for Configurability team type: feature request

Comments

@aherrmann
Contributor

Description of the problem:

Targets that don't depend on a user-defined build setting are effectively duplicated when they are depended upon by targets that both depend on the build setting and use a configuration transition.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

  • Check out aherrmann/bazel-configuration-rebuild@2be2a62

  • Follow the steps in the README:
    The target //:cat is independent of the user-defined build flag //:flag.
    The target //:flag_cat, on the other hand, depends on the value of the flag
    as well as on //:cat.

    The target //:cat is built only once, and cached under the same action cache
    key, if the configuration is set via a command-line flag.

    $ bazel clean --expunge; bazel build //:flag_cat --disk_cache=.cache/default --build_event_publish_all_actions --build_event_json_file=default.json
    
    $ bazel clean --expunge; bazel build //:flag_cat --//:flag=different_flag --disk_cache=.cache/different --build_event_publish_all_actions --build_event_json_file=different.json
    
    $ jq 'select(.id.actionCompleted.label == "//:cat")' <default.json | jq -s length
    1
    
    $ jq 'select(.id.actionCompleted.label == "//:cat")' <different.json | jq -s length
    1
    
    $ rg 'bin/cat.txt' .cache/*/ac
    .cache/default/ac/b9/b966d571c54c4aa37fc5ae7629f95caaa01ad79474fdc406c9e9a79341c950f6
    2:"bazel-out/k8-fastbuild/bin/cat.txtD
    
    .cache/different/ac/b9/b966d571c54c4aa37fc5ae7629f95caaa01ad79474fdc406c9e9a79341c950f6
    2:"bazel-out/k8-fastbuild/bin/cat.txtD
    

    The targets //:transition_default and //:transition_different use a
    configuration transition to set the value of the build setting //:flag to
    "default_flag" and "different_flag" respectively.

    The target //:cat is built twice and cached under different action cache keys
    when building these transition targets, even though //:cat does not depend on
    the build setting.

    $ bazel clean --expunge; bazel build //:transition_default //:transition_different --disk_cache=.cache/transition --build_event_publish_all_actions --build_event_json_file=transition.json
    
    $ jq 'select(.id.actionCompleted.label == "//:cat")' <transition.json | jq -s length
    2
    
    $ rg 'bin/cat.txt' .cache/transition/ac
    .cache/transition/ac/b9/b966d571c54c4aa37fc5ae7629f95caaa01ad79474fdc406c9e9a79341c950f6
    2:"bazel-out/k8-fastbuild/bin/cat.txtD
    
    .cache/transition/ac/8a/8a979615135482bd13afbdf6004d5c87c0a5f3427bcedc299f496eeff4946e29
    2:2bazel-out/k8-fastbuild-ST-1d73fa6ab729/bin/cat.txtD
    

    The CAS key is of course the same, as the contents of the output are the same.

  • Preferably, //:cat would only be built once in the transition case.
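
For reference, here is a minimal sketch of the kind of transition setup the linked repository demonstrates. The names and details below are assumptions for illustration, not the repository's actual code; it assumes //:flag is a string flag (e.g. bazel_skylib's string_flag):

```starlark
# defs.bzl -- illustrative sketch, not the linked repo's actual code.

def _set_flag_impl(settings, attr):
    # Pin //:flag to the value carried by this rule instance.
    return {"//:flag": attr.flag_value}

_set_flag = transition(
    implementation = _set_flag_impl,
    inputs = [],
    outputs = ["//:flag"],
)

def _transition_target_impl(ctx):
    # Simply forward the transitioned dependency's outputs.
    return [DefaultInfo(files = depset(ctx.files.dep))]

transition_target = rule(
    implementation = _transition_target_impl,
    attrs = {
        "dep": attr.label(cfg = _set_flag),
        "flag_value": attr.string(),
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
```

Under such a setup, //:transition_default and //:transition_different would be two transition_target instances with flag_value = "default_flag" and "different_flag". Everything below dep, including //:cat, then lands in a distinct configuration, which is what produces the duplicate actions above.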

What operating system are you running Bazel on?

Ubuntu 20.10

What's the output of bazel info release?

release 4.0.0

Have you found anything relevant by searching the web?

This seems to be the same issue as reported on the following mailing list thread: https://groups.google.com/g/bazel-discuss/c/zVEc7gzbyu0/m/5UcZ8aXOBQAJ .
This seems to fall under the following umbrella issue: #6526 .
I decided to create a dedicated issue nonetheless, since I didn't see the specific issue of dependencies that don't depend on the build setting being rebuilt and cached separately on the issue tracker.
As I understand it, #12568 suggests addressing this issue through the setting's visibility attribute. However, that seems like a coarse approximation in general: a setting with public visibility might still affect only a few targets.

Any other information, logs, or outputs that you want to share?

The use-case that prompted this issue is using user-defined build settings to define feature flags. We then want to test different feature-flag settings in the same bazel test //... run, so we define transitions to generate targets that pin the feature flags. This causes unrelated dependencies to be rebuilt, e.g. core dependencies like com_google_protobuf. Ideally, only targets that directly depend on the feature flag, or that transitively depend on a target that does, would be rebuilt.

@aiuto aiuto added Starlark configuration Starlark transitions, build_settings untriaged team-Configurability Issues for Configurability team and removed Starlark configuration Starlark transitions, build_settings labels Mar 31, 2021
@aiuto aiuto added this to the configurability:trimming milestone Apr 7, 2021
@aiuto aiuto added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Apr 7, 2021
@Warchant
Contributor

Exists in 4.1.0

@gregestren
Contributor

I switched this to a feature request, although I understand the experience you're expressing. I did that because this is a Bazel design challenge, which would take focused design improvements to resolve. It's not a bug in the current design: the current behavior is expected for the given design.

What you're describing falls into some form of "trimming". If you're specifically concerned about actions running on remote executors, the work I'm doing at #6526 may address that and I'd encourage you to follow that issue.

@aherrmann
Contributor Author

@gregestren Thank you for the update! Makes sense, I understand the reasoning.

What you're describing falls into some form of "trimming". If you're specifically concerned about actions running on remote executors, the work I'm doing at #6526 may address that and I'd encourage you to follow that issue.

Thanks for the reference. In this issue here I'm not concerned about remote execution but about the configuration aspect.

  • For example the community|enterprise feature switch described here.
  • Another use-case we've considered is the various configurations that targets can be built under in Haskell. E.g. the language runtime can be linked in various combinations of "ways", such as "vanilla", "threaded", "debug", .... Similarly, Haskell libraries can be built in configurations such as static, dynamic, debug, profiling, .... These can have complex implications on the dependency graph: profiling mode only supports static linking, but Template Haskell requires dynamically linked library dependencies (on Unix by default; there are alternatives, but they have downsides). So, to build a library that uses Template Haskell in profiling mode, we need to build its dependencies both "static with profiling" and "dynamic without profiling". Transitions could be an elegant way to express this, but in light of the current issue that doesn't seem feasible. An alternative is to have rules emit actions for all of these combinations and have dependents pick the variants they need in their configuration, but that produces a combinatorial number of actions and is not very elegant.
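
To make the combinatorial cost of that alternative concrete, here is a hypothetical count. The axes are drawn from the ways mentioned above and treated as fully independent, which real GHC constraints (e.g. profiling implying static linking) would actually prune:

```python
from itertools import product

# Hypothetical, illustrative build axes; real constraints rule some
# of these combinations out.
linking = ["static", "dynamic"]
rts_way = ["vanilla", "threaded", "debug"]
profiling = ["no-prof", "prof"]

# Every library would need one set of actions per combination.
variants = list(product(linking, rts_way, profiling))
print(len(variants))  # 2 * 3 * 2 = 12 candidate action sets per library
```

With a handful more axes this multiplies out quickly, which is why emitting all variants eagerly is unattractive compared to transition-driven configuration.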

@gregestren
Contributor

What's your primary concern about this behavior? I.e. build slowness? Memory usage?

The basic challenge is knowing where in the dep graph the dependency on the flag actually ends, and how to encode that in a way that remains correct (doesn't drop the flag prematurely), performant, and not burdensome for the user (manually tagging every edge in the graph that should keep the flag doesn't scale well).

Are there any alternative configurations where the part of the graph that needs the flag is its own distinct subgraph? i.e. isolate the part of the graph to propagate the flag by flag structure? I think that opens up more options.

As an aside, I'm surprised to read about collision at https://github.com/aherrmann/bazel-transitions-demo#gotchas. I didn't look super-thoroughly, but that should be hard to actually do in practice (i.e. collisions shouldn't really happen unless someone is egregiously abusing a rule). Is that specific to pkg_tar?

@aherrmann
Contributor Author

What's your primary concern about this behavior? I.e. build slowness? Memory usage?

Build slowness induced by unnecessary rebuilds of the same target. E.g. external dependencies such as com_google_protobuf are built multiple times even though they are unaffected by the feature flag. (See "Any other information, logs, or outputs that you want to share?" in the issue description)

Are there any alternative configurations where the part of the graph that needs the flag is its own distinct subgraph? i.e. isolate the part of the graph to propagate the flag by flag structure? I think that opens up more options.

I'm not sure I understand what this means. Do you have an example in mind of what this would look like?
Looking at the minimal example, I'd say //:cat is independent of the flag and everything else forms the sub-graph that depends on the flag. Nonetheless, //:cat is also rebuilt under transitions to different flag values.

As an aside, I'm surprised to read about collision at https://github.com/aherrmann/bazel-transitions-demo#gotchas. I didn't look super-thoroughly, but that should be hard to actually do in practice (i.e. collisions shouldn't really happen unless someone is egregiously abusing a rule). Is that specific to pkg_tar?

The issue occurs when one combines the same target under different configurations in rules or other mechanisms that operate on paths relative to the package or $(BINDIR) rather than execRoot. I.e. Bazel might distinguish the outputs as bazel-out/k8-fastbuild/bin/app/bin and bazel-out/k8-fastbuild-ST-391c2674bb99/bin/app/bin, but pkg_tar will map them both to /bin.
Similarly, paths in the runfiles tree are based on paths relative to $(BINDIR). aherrmann/bazel-transitions-demo@0344cb1 illustrates this issue. Running //app:runfiles produces

$ bazel run //app:runfiles
bin-ce
Using community edition
Data file: ENTERPRISE

bin-ee
Using enterprise edition
Data file: ENTERPRISE

bin
Using enterprise edition
Data file: ENTERPRISE

(i.e. there is a collision on both bin and data.txt)
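
The collapse can be sketched outside Bazel in a few lines. This is a hypothetical model of the path mapping, not pkg_tar's or the runfiles machinery's actual implementation:

```python
# Hypothetical model of the collision: tools that key outputs by their
# $(BINDIR)-relative path collapse configuration-distinct outputs.
outputs = [
    "bazel-out/k8-fastbuild/bin/app/bin",
    "bazel-out/k8-fastbuild-ST-391c2674bb99/bin/app/bin",
]

def bindir_relative(path):
    # Drop the "bazel-out/<config>/bin/" prefix, which is roughly what
    # rules operating on package-relative paths effectively do.
    return path.split("/bin/", 1)[1]

# Bazel keeps the two outputs distinct; the mapped set does not.
mapped = {bindir_relative(p) for p in outputs}
print(mapped)
```

Both execroot paths map to the single entry app/bin, so whichever file is written last (or first, depending on the tool) wins, matching the order sensitivity shown below.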

This seems to be sensitive to the order in which dependencies are declared: If I swap the deps

 sh_binary(
     name = "runfiles",
     srcs = ["runfiles.sh"],
-    data = ["bin-ce", "bin-ee"],
+    data = ["bin-ee", "bin-ce"],
     deps = ["@bazel_tools//tools/bash/runfiles"],
 )

then the output becomes

bin-ce
Using community edition
Data file: COMMUNITY

bin-ee
Using enterprise edition
Data file: COMMUNITY

bin
Using community edition
Data file: COMMUNITY

I guess whether this counts as egregious abuse depends on the use-case. We abandoned the transition-based approach because of this issue, so I can't provide a production example. But the following made-up example doesn't seem too absurd: following the community vs. enterprise version example above, we may want to write an integration test in which both enterprise and community clients participate in the same network. In that case both versions would be runfile dependencies of the integration test target, and at that point we would need to be careful not to trigger the issue above.

@github-actions

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Jun 13, 2023
@fmeum
Collaborator

fmeum commented Jun 13, 2023

@bazelbuild/triage not stale

@sgowroji sgowroji removed the stale Issues or PRs that are stale (no activity for 30 days) label Jun 13, 2023
@iancha1992 iancha1992 added the not stale Issues or PRs that are inactive but not considered stale label Jun 13, 2023