
Allow targets that depend on test results #11240

Open
alanfalloon opened this issue Apr 28, 2020 · 7 comments
Labels
P4 (out of scope or no bandwidth to review a PR; no assignee) · team-Bazel (general Bazel product/strategy issues) · type: feature request

Comments

@alanfalloon (Contributor) commented Apr 28, 2020

Description of the feature request:

We want to write a rule that consumes test results. Depending on a test target only gives access to the test executable and whatever other providers we implement, but there is no way to depend on the results of running the test.
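Purely to illustrate the request (none of this exists today), what we would like to be able to write is roughly a BUILD target like the following, where the hypothetical `test_report` rule receives the tests' result files as ordinary action inputs:

```python
# Hypothetical sketch: a rule that could depend on test *results*. Neither the
# test_report rule nor any provider exposing test outputs exists in Bazel today.
test_report(
    name = "metrics_rollup",
    # We would like these edges to make the tests' logs and undeclared outputs
    # available as inputs to an ordinary, cacheable, remotely executable action.
    tests = [
        "//pkg:unit_tests",
        "//pkg:integration_tests",
    ],
)
```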

Feature requests: what underlying problem are you trying to solve with this feature?

We have custom reporting and metrics that we want to extract from completed tests.

  • The reporting is a roll-up of many tests, so it can't just be added to the end of the testing action itself.
  • It also depends on additional metadata generated during the build actions that produce the test executables, so we want to keep that metadata in sync with the outputs from running the tests.
  • The post-processing is somewhat expensive, so we would like to run it with remote execution and as a separate target.

What operating system are you running Bazel on?

macOS and Linux.

What's the output of bazel info release?

release 3.0.0

Have you found anything relevant by searching the web?

No.

I have seen the built-in bazel coverage support, which generates a post-test action entirely within the Java code, but it is not configurable enough to hijack for our needs. Side note: a design that provides generic support for post-test actions would remove the need for a baked-in, special-case coverage action and open the door to community-supported coverage for the diverse languages and targets the community supports -- as well as supporting generic post-test reporting like what we require.

The closest thing that I can come up with is a bazel run target that pulls in the metadata for the tests from providers and adds it as runfiles, then looks in the testlogs location to extract the data from the test run (see the sketch after this list). But this is extremely error-prone, since we either need to:

  • assume that the test data is already up to date with the tests themselves, or
  • run the tests explicitly from the bazel run script, which then needs to know all the correct options to pass along to the bazel test invocation for the user's setup.
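For concreteness, a minimal sketch of that workaround with hypothetical names; the script behind it has to either trust that the testlogs directory is fresh or invoke `bazel test` itself with the right options:

```python
# Sketch of the "bazel run" workaround (hypothetical names). The build-time
# metadata travels as runfiles; the actual test outputs have to be scraped
# from the location reported by `bazel info bazel-testlogs` at run time.
sh_binary(
    name = "test_report",
    srcs = ["generate_report.sh"],        # hypothetical reporting script
    data = ["//tests:metrics_metadata"],  # metadata produced by build actions
)
```

Nothing ties the files under bazel-testlogs to the metadata in the runfiles, which is exactly the staleness problem described above.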
@jin (Member) commented Apr 29, 2020

Test results and metadata (logs, undeclared outputs) are recorded in the build event output stream. The Build Event Protocol (BEP) is the API for observing events that happened during a build or test. Have you looked into using that instead?

https://docs.bazel.build/versions/master/build-event-protocol.html
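For what it's worth, a minimal sketch of consuming test results from a JSON BEP file (e.g. one produced with `--build_event_json_file=bep.json`); the field names follow the BEP proto's JSON form and are worth double-checking against the docs:

```python
# Sketch: list the output files recorded for each test in a JSON BEP file.
# Produce the file with, for example:
#   bazel test //... --build_event_json_file=bep.json
import json

with open("bep.json") as f:
    for line in f:
        event = json.loads(line)
        result = event.get("testResult")  # payload of testResult events
        if result:
            label = event["id"]["testResult"]["label"]
            for output in result.get("testActionOutput", []):
                # e.g. test.log, test.xml, and the undeclared outputs zip
                print(label, output.get("name"), output.get("uri"))
```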

@jin added the team-OSS (issues for the Bazel OSS team: installation, release process, Bazel packaging, website), untriaged, and type: feature request labels on Apr 29, 2020
@alanfalloon (Contributor, Author)

I was aware of the BEP, but I took a closer look just in case. However, I still don't see how it solves my problem.

It may be a more reliable way for a bazel run based script to find the test outputs than scraping the bazel info bazel-testlogs location, but it doesn't let me add any actions to the build graph that can consume the test outputs. I still need to do the processing outside of the bazel test operation -- which means that I can't effectively demand more than one test, or use remote execution, or cache the results, and I'm still stuck figuring out how to keep the results and metadata in sync on my own.

@allada (Contributor) commented Jul 2, 2020

We also have a similar request/need.

Our specific use case/problem:

  • We use bazel remote execution configured with builds without the bytes.
  • We use platform rules and mark rules with exec_compatible_with denoting the requirements of the test (e.g. GPU, CPU1, CPU8, DISPLAY, etc.).
  • We have some complex tests that run in sequence but take different inputs from different parts of our source tree, and the test results could be cached accordingly (i.e. most of the time certain parts of the test don't need to re-run).
  • These tests also sometimes have stages that require a GPU and others that only use the CPU.

Detailed example:
We have a DNN model that runs across a large set of data and takes 50% of the total time; this part of the test requires a GPU. After the model is done, some metadata for each frame is put into a JSON file, placed in the undeclared outputs, and bundled into outputs.zip.

A second section of code then runs; it needs this metadata from the model output as well as other data from the test inputs, and combines them together. This part only needs a CPU, but it is quite expensive and also uses ~50% of the total time.

Example: [diagram of the GPU stage feeding the CPU stage, attached in the original comment]

There are other examples where we could use this kind of ability. For instance, if a test takes a very long time to execute, we could potentially run two parts of the test independently of each other when they are not required to run in sequence (e.g. if semantic segmentation has no state between frames, you could run one video file sharded across N nodes and then combine the results to compute aggregated statistics across all frames). This example is almost achievable using bazel's sharding, but since sharding cannot collect all the results and process them at once, it isn't useful here.

One way we can make this work is to write custom bazel rules that run these steps as genrules or bzl rules and then combine them with a final test rule (sketched below). The problem with this approach is that we'll have to use --build_tag_filters, or else the expensive steps will run as part of ordinary build invocations, and we only want to run them when a test is requested on the target.
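A rough BUILD sketch of that workaround, with hypothetical target, file, and platform names:

```python
# Sketch of the workaround above (hypothetical names). The expensive stages are
# modeled as build actions, so they are cached and can be routed to executors
# matching their exec_compatible_with constraints; only the final check runs as
# a test. The drawback described above remains: these genrules are built
# whenever the test target is built, so --build_tag_filters (keyed on the tag
# below) is needed to keep them out of plain `bazel build` invocations.
genrule(
    name = "dnn_stage",
    srcs = ["//data:video_frames"],               # hypothetical input data
    outs = ["frame_metadata.json"],
    cmd = "$(location //tools:run_model) --out $@ $(SRCS)",
    tools = ["//tools:run_model"],                # hypothetical model runner
    exec_compatible_with = ["//platforms:gpu"],   # GPU-only stage
    tags = ["expensive_test_stage"],
)

genrule(
    name = "combine_stage",
    srcs = [":dnn_stage", "//data:ground_truth"], # hypothetical ground truth
    outs = ["combined_metrics.json"],
    cmd = "$(location //tools:combine) --out $@ $(SRCS)",
    tools = ["//tools:combine"],                  # hypothetical CPU-only combiner
    tags = ["expensive_test_stage"],
)

sh_test(
    name = "pipeline_test",
    srcs = ["check_metrics.sh"],                  # hypothetical assertion script
    data = [":combine_stage"],
)
```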

@philwo added the P4 (out of scope or no bandwidth to review a PR; no assignee), team-Bazel (general Bazel product/strategy issues), and type: feature request labels and removed the team-OSS, type: feature request, and untriaged labels on Dec 8, 2020
@github-actions (bot)

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions bot added the stale (no activity for 30 days) label on Apr 18, 2023
@brentleyjones (Contributor)

@bazelbuild/triage Not stale.

@sgowroji removed the stale (no activity for 30 days) label on Apr 19, 2023
@SoftwareApe

Is there any progress on this? In particular, caching the results of long-running tests would help us save a lot of compute resources in such a divide-and-conquer scenario.

@ilyapoz commented Feb 20, 2024

Another use case I have:
We have "tests with canonization": run some code and compare the output artifacts it produces to a saved copy. If they're equal, the test passes. If not, it fails, but you have an easy way to update the canonical data.
Right now this is implemented as a single test with two modes: check and canonize. The problem is that if the main part that produces the data is expensive, I want to cache it and have the second call simply copy the results from the cache.
So I would have two tests:

  1. checks that the code runs and produces some result
  2. checks that the results are equal to the expected ones, or updates them in canonization mode

Test 2 depends on test 1. So if test 1 succeeds but test 2 fails on the first run, I rerun in canonize mode: test 1 is cached, and test 2 updates the canonical results.
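For what it's worth, a rough sketch of how that split might look today, with hypothetical names: the expensive run becomes a cached build action and only the comparison stays in the test, so a re-run reuses the cached output instead of regenerating it.

```python
# Sketch with hypothetical names: stage 1 is an ordinary (cached) build action,
# stage 2 is a cheap comparison against the checked-in canonical file.
genrule(
    name = "produce_output",
    srcs = ["//pkg:inputs"],                     # hypothetical inputs
    outs = ["actual_output.txt"],
    cmd = "$(location //pkg:expensive_tool) $(SRCS) > $@",
    tools = ["//pkg:expensive_tool"],            # hypothetical expensive tool
)

sh_test(
    name = "canonization_test",
    srcs = ["compare_or_canonize.sh"],           # hypothetical diff script
    data = [
        ":produce_output",
        "canonical_output.txt",                  # checked-in canonical copy
    ],
)
```

Updating the canonical copy would likely still need a separate `bazel run` helper, since a test can't write back into the source tree.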
