Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build event streams generated by Bazel 6.3.0 and later no longer include the remote execution message in such a way that it's attached to the action that failed #19327

Closed
EdSchouten opened this issue Aug 24, 2023 · 8 comments
Labels
team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@EdSchouten
Copy link
Contributor

EdSchouten commented Aug 24, 2023

Description of the bug:

PR #18757 changed the way Bazel emits remote execution messages. They are now console events instead of being attached to stdout/stderr. This change on its own is all right, but it now means that if you look at the build event file they are only part of the "progress" messages. This is really annoying, as it means the messages don't show up anywhere near the place where the actual execution failure is described.

You could in theory extract the information from the progress message, but what's annoying about that is that you likely need to use some VT100 escape sequence parser to interpret that data in a meaningful way.

Furthermore, the event that is printed to the console is incomplete. Sure, it contains the target name, but it does not have the configuration ID either. This means that if the same target is built multiple times (e.g., once for the host, once for the target), it's not possible to figure out which it belongs to.

Which category does this issue belong to?

Remote Execution

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Build this target against a remote execution system that generates execution messages:

genrule(
    name = "foo",
    outs = ["bar"],
    cmd = "false",
)

Then run bazel build --build_event_json_file=${HOME}/x ...

Which operating system are you running Bazel on?

Any

What is the output of bazel info release?

6.3.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

#18757

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@iancha1992
Copy link
Member

@EdSchouten Could you please provide output or logs when you execute?

@EdSchouten
Copy link
Contributor Author

This is what I get in the console:

Starting local Bazel server and connecting to it...
INFO: Invocation ID: feb660ab-213e-4631-8863-ab082d206ce9
INFO: Analyzed target //:foo (5 packages loaded, 8 targets configured).
INFO: Found 1 target...
INFO: Remote execution message for Genrule //:foo: Action details (uncached result): https://REDACTED
ERROR: /Users/ed/bla/BUILD:1:8: Executing genrule //:foo failed: (Exit 1): bash failed: error executing command (from target //:foo) /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; false'
Target //:foo failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 5.033s, Critical Path: 0.74s
INFO: 2 processes: 2 internal.
INFO: Build Event Protocol files produced successfully.
FAILED: Build did NOT complete successfully

I can't share the full build event log, as it contains a lot of private data that I don't want to share (e.g., tokens). Here's the progress message that gets generated:

{"id":{"progress":{"opaqueCount":3}},"children":[{"progress":{"opaqueCount":4}},{"actionCompleted":{"primaryOutput":"bazel-out/darwin-fastbuild/bin/bar","label":"//:foo","configuration":{"id":"b4c2d3c9895aa5cb978bd161a6a622042d639e2ecb3d59cc6a7ef5fc46a7e07e"}}}],"progress":{"stderr":"\r\u001b[1A\u001b[K\u001b[32mAnalyzing:\u001b[0m target //:foo (1 packages loaded, 0 targets configured)\n\r\u001b[1A\u001b[K\u001b[32mINFO: \u001b[0mAnalyzed target //:foo (5 packages loaded, 8 targets configured).\n\u001b[0m checking cached actions\n\r\u001b[1A\u001b[K\u001b[32mINFO: \u001b[0mFound 1 target...\n\u001b[0m checking cached actions\n\r\u001b[1A\u001b[K\u001b[32m[0 / 2]\u001b[0m 2 actions, 0 running\n    [Prepa] BazelWorkspaceStatusAction stable-status.txt\n    [Prepa] Executing genrule //:foo\n\r\u001b[1A\u001b[K\r\u001b[1A\u001b[K\r\u001b[1A\u001b[K\u001b[32m[1 / 2]\u001b[0m Executing genrule //:foo; 0s remote\n\r\u001b[1A\u001b[K\u001b[32m[1 / 2]\u001b[0m [Sched] Executing genrule //:foo\n\r\u001b[1A\u001b[K\u001b[32mINFO: \u001b[0mRemote execution message for Genrule //:foo: Action details (uncached result): https://REDACTED\n\u001b[32m[1 / 2]\u001b[0m Executing genrule //:foo; 0s remote\n"}}

Notice that the URL is in there, but that it's not really possible to extract it in a meaningful/portable way.

@stagnation
Copy link

stagnation commented Aug 25, 2023

I can create a reproduction


I get different results for local/remote with bazel 6.3.0 here,
bb-deployment can easily launch Buildbarn with docker-compose, and then build itself.

A recipe/automation script here: https://github.com/stagnation/bb-deployments/blob/feature/reproduce-bazel-issue-19327/19327.sh

remote.json
image

local.json an additional target completed message
image

coeuvre added a commit to coeuvre/bazel that referenced this issue Aug 28, 2023
Since it makes it harder for BEP consumers to present the messages in a structured way.

Fixes bazelbuild#19327.

This reverts commit 6d089b3.
@coeuvre
Copy link
Member

coeuvre commented Aug 28, 2023

@bazel-io flag

@bazel-io bazel-io added the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Aug 28, 2023
@iancha1992
Copy link
Member

@bazel-io fork 6.4.0

@oquenchil
Copy link
Contributor

oquenchil commented Aug 29, 2023

Should be fixed by #19347.

coeuvre added a commit to coeuvre/bazel that referenced this issue Aug 31, 2023
Since it makes it harder for BEP consumers to present the messages in a structured way.

Fixes bazelbuild#19327.

This reverts commit 6d089b3.
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Sep 5, 2023
Since it makes it harder for BEP consumers to present the messages in a structured way.

Fixes bazelbuild#19327.

This reverts commit 6d089b3.

Closes bazelbuild#19347.

PiperOrigin-RevId: 562793125
Change-Id: I6ea41323c4ea98d416e4cfd9728d0d753aef9b5e
iancha1992 added a commit that referenced this issue Sep 6, 2023
Since it makes it harder for BEP consumers to present the messages in a
structured way.

Fixes #19327.

This reverts commit 6d089b3.

Closes #19347.

Commit
d3d167f

PiperOrigin-RevId: 562793125
Change-Id: I6ea41323c4ea98d416e4cfd9728d0d753aef9b5e

Co-authored-by: Chi Wang <chiwang@google.com>
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 6.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. Thanks!

@stagnation
Copy link

Looks good to me, the output differs a little bit. But there is now a targetCompleted message to follow
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants