Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect coverage for other languages #3287

Merged
merged 1 commit into from
Sep 11, 2022

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Sep 10, 2022

What type of PR is this?

Feature

What does this PR do? Why is it needed?

When using --combined_report=lcov, the coverage report for a go_test will also include coverage for data dependencies executed by the test, e.g. java_binary or cc_binary tools run in an integration test.

Note: This commit does not make it so that coverage is collected for go_binarys executed by go_tests - Go doesn't provide exit hooks, so this would be rather involved to implement.

When using `--combined_report=lcov`, the coverage report for a `go_test`
will also include coverage for data dependencies executed by the test,
e.g. `java_binary` or `cc_binary` tools run in an integration test.

Note: This commit does *not* make it so that coverage is collected for
`go_binary`s executed by `go_test`s - Go doesn't provide exit hooks, so
this would be rather involved to implement.
@fmeum fmeum marked this pull request as ready for review September 10, 2022 12:55
@linzhp linzhp merged commit d94e692 into bazelbuild:master Sep 11, 2022
@fmeum fmeum deleted the bazel-6293-cc-coverage branch September 11, 2022 10:26
@skevy
Copy link

skevy commented Sep 13, 2022

@linzhp @fmeum this caused a regression for us when running in RBE.

When running bazel coverage now with remote execution, I get errors like:

Cannot locate runfiles directory. (Set $JAVA_RUNFILES to inhibit searching.)

and

GCov does not exist at the given path: ''

While I can get the second error to go away by manually passing a GCOV path, the first error seems to be caused by this very old outstanding Bazel issue: bazelbuild/bazel#4685

Would it be possible to put this behavior behind some kind of flag? We don't actually need this behavior, and this is preventing us (and likely other remote build execution users) from upgrading to 0.35.0.

@fmeum
Copy link
Collaborator Author

fmeum commented Sep 13, 2022

@skevy Do any of the workarounds (e.g. --experimental_split_coverage_postprocessing --experimental_fetch_all_coverage_outputs) work for you? It would be great if we could get this behavior to stick, but if we can't get, I can hide it behind a flag and try to fix the upstream issue before making it the default again.

@skevy
Copy link

skevy commented Sep 13, 2022

@fmeum they do not. I'll continue working with our RBE partner on this but it's definitely a blocker for now (especially given how long that issue has been open).

@fmeum
Copy link
Collaborator Author

fmeum commented Sep 13, 2022

@skevy I know it's not going to be easy, but if you can provide me with a reproducer for this issue with the flags I mentioned above, I could actively work on fixing this upstream.

@fmeum
Copy link
Collaborator Author

fmeum commented Sep 13, 2022

@linzhp Should we introduce a flag that chooses between the two mergers?

@linzhp
Copy link
Contributor

linzhp commented Sep 14, 2022

Let's see if we can fix it first before considering supporting two mergers

@arjantop-cai
Copy link

We have the same problem and are blocked on this, want to upgrade to Go 1.19 which required 0.35 which fails with remote cache.

To reproduce any remote cache with --remote_download_minimal flag should work

arjantop-cai added a commit to cookieai-jar/rules_go that referenced this pull request Sep 30, 2022
@arjantop-cai
Copy link

@linzhp @fmeum Any progress on this? I attempted to create a fork but did not manage to get it to work (some relience on hardcoded bazelbuild/rules_go?) so we are pretty much stuck on this.

@fmeum
Copy link
Collaborator Author

fmeum commented Oct 14, 2022

Coverage collection for both java_test and cc_test should work with remote execution, assuming the required flags are used. This is verified in Bazel's integration tests here. rules_go uses the same mechanisms, so it shouldn't be any different.

Instead of regressing one feature to support another, I would first like to understand better what exactly doesn't work. @skevy @arjantop-cai Which set of flags make this fail? Are you using:

coverage --experimental_fetch_all_coverage_outputs
coverage --experimental_split_coverage_postprocessing

What about different values of --remote_download_outputs? I would expect at least --remote_download_output=all to work. It's quite possible that all we need to fix this problem is to simply document some flags (possibly involving --experimental_remote_download_regex) to add to the remote config. I would greatly appreciate any kind of concrete reproducer.

@arjantop-cai
Copy link

@fmeum None of the test cases actually cover minimal, I am pretty sure all works because that always works. But minimal works only with very specific config.
I will do more testing but currently the above error is what you get back. Previous versions have no problems with downloading (go only) coverage data.

@fmeum
Copy link
Collaborator Author

fmeum commented Oct 14, 2022

The following flags make the Bazel integration tests pass:

--experimental_fetch_all_coverage_outputs
--experimental_split_coverage_postprocessing
--remote_download_minimal
--experimental_remote_download_regex='(?:.*/coverage\.dat)|(?:.*/_coverage/.*)'

Could you try them?

@fmeum
Copy link
Collaborator Author

fmeum commented Oct 14, 2022

The behavior changed because we (rules_go) are now using Bazel's standard procedure for merging coverage across languages rather than rolling our own, but that procedure isn't fully set up for BwoB.
I just submitted an upstream fix that should make this work by default: bazelbuild/bazel#16475

@esprehn
Copy link

esprehn commented Dec 8, 2022

@skevy and I are setting --remote_download_output=all and this is still failing.

bazel coverage \
   --config=CI \
   --remote_download_outputs=all \
   --@io_bazel_rules_go//go/config:cover_format=go_cover \
   --instrumentation_filter="^//projects/" \
   --target_pattern_file="/tmp/affected_project_patterns.txt"

Of the flags we pass for CI these are the relevant ones:

build:remote --remote_timeout=3600
build:remote --jobs=1024
build:remote --remote_local_fallback=false
build:remote --define=EXECUTOR=remote
build:remote --disk_cache=
build:remote --experimental_inmemory_dotd_files
build:remote --experimental_inmemory_jdeps_files
build:remote --incompatible_strict_action_env=true
build:remote --spawn_strategy=remote
build:remote --remote_download_minimal
build:remote --remote_default_exec_properties="OSFamily=Linux"
build:remote --grpc_keepalive_time=30s # this is a workaround for bazel hanging in scheduling. There’s a known issue if the connection drops without a TCP RST.
# Remote workers currently require engflow-modified bazel binaries
build:remote --experimental_remote_mark_tool_inputs=true

ex output:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //projects/errors:errors_test
whoami: cannot find name for user ID 108
-----------------------------------------------------------------------------
PASS
coverage: 84.2% of statements
GCov does not exist at the given path: ''
/mnt/engflow/worker/work/2/exec/bazel-out/k8-opt-exec-20B0B623/bin/external/remote_coverage_tools/Main: Cannot locate runfiles directory. (Set $JAVA_RUNFILES to inhibit searching.)

It seems related to bazelbuild/bazel#4685 but even with remote_download_outputs=all it still fails.

@fmeum
Copy link
Collaborator Author

fmeum commented Dec 8, 2022

@esprehn You are missing --experimental_split_coverage_postprocessing and --experimental_fetch_all_coverage_outputs.

@esprehn
Copy link

esprehn commented Dec 8, 2022

Why is experimental_fetch_all_coverage_outputs required if remote_download_output=all is used?

Also am I to understand that coverage in bazel is just broken with RBE unless you pass that experimental postprocessing flag? :/

@fmeum
Copy link
Collaborator Author

fmeum commented Dec 8, 2022

The name of that flag isn't optimal, it doesn't have anything to do with BwoB and remote_download_output=all. I am trying to get these flags flipped, but there are some edge cases (think SandboxFS) that break in that case.

Coverage is broken in Bazel with RBE unless you pass the flag or do what Google does internally: Compile the Java coverage merger tool into a native binary, which doesn't use runfiles and thus works without the flag.

@esprehn
Copy link

esprehn commented Dec 8, 2022

Ah that makes sense, thanks for explaining! Looking forward to the defaults being flipped to working. :)

@esprehn
Copy link

esprehn commented Dec 14, 2022

@fmeum we've looked into this and the problem seems to come from the combination of --@io_bazel_rules_go//go/config:cover_format=go_cover and the changes in this PR.

Specifically passing:

--@io_bazel_rules_go//go/config:cover_format=go_cover
--experimental_split_coverage_postprocessing
--experimental_fetch_all_coverage_outputs
--remote_download_outputs=all

the coverage.dat is downloaded but is always zero bytes. If you remove --@io_bazel_rules_go//go/config:cover_format=go_cover then the coverage report appears but in lcov format.

If you remove either of the other two flags coverage crashes in remote mode.

Is there a way to preserve the go_cover format support?

@fmeum
Copy link
Collaborator Author

fmeum commented Dec 15, 2022

@esprehn What happens if you remove both of the --experimental flags? Does that also crash Bazel?

go_cover doesn't need any kind of coverage post-processing and the only thing changed by this PR is the tool we register to do coverage post-processing, so I am a bit surprised to see things break for --@io_bazel_rules_go//go/config:cover_format=go_cover - unless it's just the output generator deleting the content of the go cover file it doesn't understand.

Could you try what happens if you create this target in your own workspace and pass its label to Bazel via --coverage_output_generator=...? That should restore the behavior before this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants