Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trim transitioned Go settings on non-Go dependencies #3108

Merged
merged 5 commits into from
May 9, 2022

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Apr 8, 2022

What type of PR is this?

Bug fix

What does this PR do? Why is it needed?

By resetting Go-specific settings changed by go_transition to their previous values when crossing a non-deps dependency edge, e.g. srcs or data, dependencies through those edges are not affected by the change in Go settings. This has two advantages:

  • Correctness: Previously, if a go_binary with linkmode = "c-archive" used another go_binary to generate srcs, that go_binary would also be built as a c-archive and thus fail to run during the build. This was observed in Genrules in 6.0.0-pre.20220112.2 occasionally try to run shared libraries built by go_binary bazel#14626.
  • Performance: With the new Bazel flag --experimental_output_directory_naming_scheme=diff_against_baseline, transitions can return to the top-level configuration since the Starlark settings that were affected by a transition at some point during the build are no longer tracked explicitly in a setting, but computed as a configuration diff. Resetting the configuration for non-deps dependencies thus prevents redundant rebuilds of non-Go deps caused by changes in Go settings, achieving a version of "configuration trimming" for Go.

Which issues(s) does this PR fix?

Fixes #2999
Fixes #3120
Fixes bazelbuild/bazel#14626

Other notes for review

This is based on #3005 and completes the work started there. #3005 can be reviewed and merged first if preferred.

@fmeum fmeum changed the title Trim transitioned Go settings on non-Go dependencies WIP: Trim transitioned Go settings on non-Go dependencies Apr 8, 2022
@fmeum fmeum force-pushed the reset-mode-transition branch 8 times, most recently from 23040f6 to 64429b2 Compare April 9, 2022 07:41
@fmeum fmeum changed the title WIP: Trim transitioned Go settings on non-Go dependencies Trim transitioned Go settings on non-Go dependencies Apr 9, 2022
@fmeum fmeum marked this pull request as ready for review April 9, 2022 07:47
@fmeum fmeum force-pushed the reset-mode-transition branch 2 times, most recently from 880eee1 to efd118d Compare April 11, 2022 08:59
@achew22 achew22 requested review from achew22 and linzhp May 4, 2022 17:24
@achew22
Copy link
Member

achew22 commented May 4, 2022

@linzhp can you try this out in your big repo and see if it works as well?

@linzhp
Copy link
Contributor

linzhp commented May 4, 2022

It passed in Uber, but let's merge #3005 first

@fmeum
Copy link
Collaborator Author

fmeum commented May 5, 2022

@linzhp I rebased onto master and renamed the transition to match the scheme you suggested in #3005.

go/private/rules/transition.bzl Outdated Show resolved Hide resolved
if not "//go/config:" in setting:
return None
name = setting.split(":")[1]
return filter_transition_label("@io_bazel_rules_go//go/private/rules:original_" + name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems filter_transition_label does nothing to a string like this, right? In addition, bazelbuild/bazel#10499 is already fixed. Do we still need filter_transition_label at all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It converts @io_bazel_rules_go//go/config:race to //go/config:race when io_bazel_rules_go is the main repo and returns it unchanged otherwise. It may not be necessary here since the "original_" settings are never reference on the command-line, but using it for some settings and not others seemed even more confusing to me - it would also affect the way transition inputs/outputs are declared below.

The bug is fixed, but the full fix is only available in Bazel 5+. Once the minimum version has been updated to 5.0.0, filter_transition_label can be dropped.

# Encoding as JSON makes it possible to embed settings of arbitrary
# types (currently bool, string and string_list) into a string
# setting.
settings[original_value_setting] = json.encode(original_settings[setting])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can we not using the original type like:

Suggested change
settings[original_value_setting] = json.encode(original_settings[setting])
settings[original_value_setting] = original_settings[setting]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extended the comment to explain why that wouldn't work. In short: We need a way to distinguish between "explicitly set to default" and "not explicitly set". The alternative would be to track an additional config.bool per memoized setting, but the embedding into JSON just seemed much simpler. It has the added benefit that all "original" settings can have the same type.

go/private/rules/transition.bzl Outdated Show resolved Hide resolved
Comment on lines 200 to 203
if not original_value_setting:
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if not original_value_setting:
continue
if not original_value_setting or value == original_settings[setting]:
continue

This would reduce the nesting for the logic below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transition outputs have to be set explicitly, even if to their default. If I simplify the if below in the way I believe you are hinting at, I get transition outputs [...] were not defined by transition function errors.

go/private/rules/transition.bzl Outdated Show resolved Hide resolved
go/private/rules/transition.bzl Outdated Show resolved Hide resolved
go/private/rules/transition.bzl Outdated Show resolved Hide resolved
name = setting.split(":")[1]
return filter_transition_label("@io_bazel_rules_go//go/private/rules:original_" + name)

_ORIGINAL_VALUE_SETTINGS = [_original_value_setting(setting) for setting in POTENTIALLY_TRANSITIONED_SETTINGS]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_ORIGINAL_VALUE_SETTINGS = [_original_value_setting(setting) for setting in POTENTIALLY_TRANSITIONED_SETTINGS]
_ORIGINAL_SETTING_KEYS = [_original_value_setting(setting) for setting in POTENTIALLY_TRANSITIONED_SETTINGS]

Also, because _original_value_setting may return None. You need to filter them out

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code expects this function to be "in sync" with what is now called TRANSITIONED_GO_SETTINGS anyway and if it is correctly synced, _ORIGINAL_SETTING_KEYS will never contain None. If the two get out of sync, returning None should raise the alarm more effectively than silently dropping elements, so I would see value in not filtering them out. What do you think?

go/private/rules/transition.bzl Outdated Show resolved Hide resolved
fmeum added 2 commits May 6, 2022 09:17
By resetting Go-specific settings changed by go_transition to their
previous values when crossing a non-deps dependency edge, e.g. srcs or
data, dependencies through those edges are not affected by the change in
Go settings. This has two advantages:

* Correctness: Previously, if a go_binary with linkmode = "c-archive"
  used another go_binary to generate srcs, that go_binary would also be
  built as a c-archive and thus fail to run during the build. This was
  observed in bazelbuild/bazel#14626.
* Performance: With the new Bazel flag
  --experimental_output_directory_naming_scheme=diff_against_baseline,
  transitions can return to the top-level configuration since the
  Starlark settings that were affected by a transition at some point
  during the build are no longer tracked explicitly in a setting, but
  computed as a configuration diff. Resetting the configuration for non-
  deps dependencies thus prevents redundant rebuilds of non-Go deps
  caused by changes in Go settings, achieving a version of
  "configuration trimming" for Go.
The test verifies that Go settings such as gotags and linkmode do not
affect dependencies through attributes other than deps.
@fmeum fmeum force-pushed the reset-mode-transition branch 2 times, most recently from 791114a to 461314d Compare May 6, 2022 09:08
# encoding, e.g. "\"\"" or "[]").
settings[original_key] = json.encode(original_settings[key])
else:
settings[original_key] = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you set this to None or not setting it at all, to indicate that it's not explicitly set?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't work: Since the setting have type string, they can only be set to a string, not None.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried None? According to the official doc:

settings is a dictionary {String:Object}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, it does indeed return an error with Bazel 4.2.1.

Comment on lines 416 to 417
original_key = _original_setting_key(key)
original_value = settings[original_key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
original_key = _original_setting_key(key)
original_value = settings[original_key]
original_key = _original_setting_key(key)
if not original_key or original_key not in settings:
continue

So you can skip original_key not explicitly set

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could implement that change, but could you explain why you think it's needed? If _original_setting_key(key) is None for key in TRANSITIONED_GO_SETTING_KEYS, then that's a bug in this file. I could possibly see adding a check with a fail, but a silent continue would - imo - just make these bugs harder to spot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, adding a fail with more debuggable message is better. Because _original_setting_key(key) can return None, it's better to check for that when None is not acceptable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit that pushed the conversion into the construction of a dict that contains the mapping and added a fail in the conversion function. What do you think?

go/private/rules/transition.bzl Outdated Show resolved Hide resolved
# encoding, e.g. "\"\"" or "[]").
settings[original_key] = json.encode(original_settings[key])
else:
settings[original_key] = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried None? According to the official doc:

settings is a dictionary {String:Object}

Comment on lines 416 to 417
original_key = _original_setting_key(key)
original_value = settings[original_key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, adding a fail with more debuggable message is better. Because _original_setting_key(key) can return None, it's better to check for that when None is not acceptable.

go/private/rules/transition.bzl Outdated Show resolved Hide resolved
non_go_transition = transition(
implementation = _non_go_transition_impl,
inputs = TRANSITIONED_GO_SETTING_KEYS + _SETTING_KEY_TO_ORIGINAL_SETTING_KEY.values(),
outputs = TRANSITIONED_GO_SETTING_KEYS + _SETTING_KEY_TO_ORIGINAL_SETTING_KEY.values(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
outputs = TRANSITIONED_GO_SETTING_KEYS + _SETTING_KEY_TO_ORIGINAL_SETTING_KEY.values(),
outputs = TRANSITIONED_GO_SETTING_KEYS,

can we do this and not setting new_settings[original_key] in the impl?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By not resetting all settings to their command-line defaults, non-Go dependencies of a transitioned go_binary would still differ from top-level targets in the configuration: They would have the top-level config values in their original_* settings, whereas top-level (untransitioned) targets do not. This would still solve the correctness issue linked in the PR description, but would not address the performance problem.

@fmeum
Copy link
Collaborator Author

fmeum commented May 9, 2022

Had to fix-up the latest commit's use of filter_transition_label.

tests/core/transition/hermeticity_test.go Show resolved Hide resolved
go/private/rules/transition.bzl Outdated Show resolved Hide resolved
tests/core/transition/hermeticity_test.go Show resolved Hide resolved
@linzhp linzhp self-assigned this May 9, 2022
Copy link
Contributor

@linzhp linzhp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested again on Uber's Go monorepo: passed

@linzhp linzhp merged commit fee2095 into bazelbuild:master May 9, 2022
pjjw added a commit to pjjw/rules_go that referenced this pull request Jun 3, 2022
…ild#3108)"

This causes trouble with libmongocrypt. To be investigated, but patching
this "correctness" fix out for now.

This reverts commit fee2095.
@ripatel-fd
Copy link

Builds are still duplicated for C dependencies as of v0.38.0.

A cmake rule in rules_foreign_cc is built twice when included in two Go targets, with and without gotags respectively.

cmake target 1: affected by starlark transition: []

cmake target 2: affected by starlark transition: [@io_bazel_rules_go//go/config:tags, @io_bazel_rules_go//go/private/rules:original_tags]

@fmeum fmeum deleted the reset-mode-transition branch January 24, 2023 15:34
@fmeum
Copy link
Collaborator Author

fmeum commented Jan 24, 2023

@ripatel-jump Are you sure that this is a regression in 0.38.0? I would think that this has been the behaviour for a long time. Do you happen to have a reproducer I can test this with?

ripatel-fd pushed a commit to ripatel-fd/rules_go that referenced this pull request Jan 24, 2023
bazelbuild#3108 introduced an
improvement that aims to remove Go-specific transitions on non-Go
dependencies.

This fix was incomplete. It did not cover the case where a setting
transitions from truthy to falsy. (e.g. `["bla"] => []`)

This further reduces duplicated non-Go targets, specifically reducing
the number of times Cgo C/C++ dependencies are rebuilt.
@ripatel-fd
Copy link

@fmeum Thanks for the quick response. I debugged a bit further and I believe the regression comes from this PR: #3116

This is still all very confusing to me because I cannot find a good way to debug the transitions that get applied on my targets.

It is evident from the source though that after #3116 go_library and go_test use different transition functions for cdeps and various other attrs. There is probably more to this though because I'm only ever using cdeps in go_library, and importing that go_library via go_test

@ripatel-fd
Copy link

Attached clean reproducer in bug report #3430

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants