Support test sharding #1041
Force-pushed from 9e05e13 to 910172b.
This is a very useful enhancement. I guess, being a couple of days away from 5.0 GA, it will be part of the 5.1 train. What do you think, @junit-team/junit-lambda?
Sorry, @sormuras, I didn't see these issues sooner. I have been bugging the Bazel team to look at the Launcher API for a while, but they have been busy with other things.
@kcooney Can you provide a few more details why the visitors need to see descriptors before filters are applied? Please note that some filters (see implementations of Alternatively, could Bazel inspect the |
Some changes would break backwards compatibility so we should consider them for 5.0 GA.
Tentatively slated for 5.0 GA in order to allow the team to assess this PR before GA (even though parts of it might get moved to 5.1).
This avoids inconsistencies between getParent() and getParent().get().getChildren()
Force-pushed from 910172b to 7f30f1d.
FYI: I realized that the only constructor for |
Yes, Bazel would add a custom PostDiscoveryFilter for test sharding. I will do my best. @iirina can perhaps add more details. For those who don't know, Bazel is a build tool, similar to Maven, but with advanced caching capabilities. For that caching to work, Bazel actions need to be deterministic. Internally, each action knows about all of its inputs, so Bazel always knows which artifacts to build and which to skip. Test sharding allows a test to be automatically split into multiple partitions ("shards") that run in separate processes. Here's an example of a sharded test:
Let's say that Let's assume that the test uses the round-robin strategy, so that the shard a test runs on is determined by this expression:
Imagine that the first time the test is run, all three shards get the same sequence of test methods, in this order:
Bazel will run these test methods on each shard:
Imagine that test
In this case, shard 2 will run To avoid this, the round-robin strategy sorts the tests before determining which test is on which shard. So the shard that fails will always have the same tests when it is retried. More generally, Bazel will pass the sharding strategy all of the tests, and the sharding strategy provides a filter. For Bazel to support sharding for JUnit 5 tests, it will need a way to see all tests before the filter runs. It's okay if some engines provide filtering during discovery, as long as they do that in a deterministic way. So, for example, the test engine couldn't decide to filter out all tests annotated as
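The sort-then-round-robin idea from the comment above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `RoundRobinSharding`, the method `testsForShard`, the test names, and the modulo expression are all hypothetical stand-ins, not the expression from the comment and not Bazel's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of a deterministic round-robin sharding strategy.
final class RoundRobinSharding {

    /** Returns the test IDs assigned to shardIndex (0-based) out of totalShards. */
    static List<String> testsForShard(Iterable<String> discoveredIds, int shardIndex, int totalShards) {
        if (totalShards <= 0 || shardIndex < 0 || shardIndex >= totalShards) {
            throw new IllegalArgumentException("invalid shard configuration");
        }
        // Sort (and de-duplicate) first, so the discovery order does not affect the result.
        TreeSet<String> sorted = new TreeSet<>();
        for (String id : discoveredIds) {
            sorted.add(id);
        }
        List<String> assigned = new ArrayList<>();
        int position = 0;
        for (String id : sorted) {
            // Assumed round-robin rule: position modulo shard count selects the shard.
            if (position % totalShards == shardIndex) {
                assigned.add(id);
            }
            position++;
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Two runs that discover the same tests in different orders.
        List<String> firstRun = List.of("testA", "testB", "testC", "testD", "testE");
        List<String> secondRun = List.of("testE", "testC", "testA", "testD", "testB");
        System.out.println(testsForShard(firstRun, 1, 3));  // [testB, testE]
        System.out.println(testsForShard(secondRun, 1, 3)); // [testB, testE]
    }
}
```

Because the assignment is computed from the sorted set of all discovered tests, a retried shard gets the same tests even when discovery order changes between runs. It also shows why the strategy needs to see every test before any filter has removed some of them.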
How do you get access to the |
Force-pushed from 7f30f1d to 529ef31.
That's correct. My idea was to first call |
One additional idea: To ensure that a shard isn't modified, could Bazel store the unique IDs of the contained tests and re-run them using |
Thanks for the response, @marcphilipp I think your
Bazel could not store the unique IDs for the contained tests, since it doesn't read the Java source files when determining whether a build action needs to be rerun (it just computes a hash of the files).
I think that would work, as long as the Thinking through this a bit in the next two paragraphs. If it's too long for you, feel free to ignore the rest. In fact, that might be better, since Bazel fails if the test is "over-sharded" so a shard is empty (so we don't waste resources starting a JVM only to find out there are no tests). Bazel checks if a shard is empty after the sharding filter runs but before command line filtering happens. Bazel could check for empty shards by looking at the I don't think dynamically-generated tests would be a problem, as long as it's deterministic. For example, in my previous comment, if test |
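The "over-sharded" check mentioned above can be illustrated with a small sketch. It assumes a round-robin assignment over a known number of discovered tests; `ShardSizeCheck`, `shardSize`, and `requireNonEmptyShard` are hypothetical names, and this is not how Bazel itself performs the check.

```java
// Hypothetical sketch: detect an empty shard before any command-line filtering runs.
final class ShardSizeCheck {

    /** Number of tests that a round-robin assignment gives to one shard (0-based index). */
    static int shardSize(int testCount, int shardIndex, int totalShards) {
        if (testCount < 0 || totalShards <= 0 || shardIndex < 0 || shardIndex >= totalShards) {
            throw new IllegalArgumentException("invalid shard configuration");
        }
        int base = testCount / totalShards;
        int remainder = testCount % totalShards;
        // The first `remainder` shards receive one extra test.
        return base + (shardIndex < remainder ? 1 : 0);
    }

    /** Fails fast when a shard would run no tests at all. */
    static void requireNonEmptyShard(int testCount, int shardIndex, int totalShards) {
        if (shardSize(testCount, shardIndex, totalShards) == 0) {
            throw new IllegalStateException("shard " + shardIndex + " of " + totalShards
                    + " is empty (" + testCount + " tests discovered)");
        }
    }

    public static void main(String[] args) {
        requireNonEmptyShard(5, 2, 3); // passes: the third shard gets one test
        try {
            requireNonEmptyShard(2, 2, 3); // two tests cannot fill three shards
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check only makes sense on the test count seen after the sharding filter's input is known but before user-supplied filters narrow the run, which matches the ordering described in the comment.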
I'm closing this PR because test sharding can be implemented by using the existing Launcher API as described above. The potential problem of |
Thanks, Marc. Great work on making the launcher API so flexible :-) I sent out a related pull request for test sharding. See #1055
FYI, I hacked together an extension to do this. It’s a bit unsatisfying because it filters with local information only, which is not great for balancing. I’d love an official built-in sharding filter!
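To make the "local information only" trade-off concrete, here is one way such a filter could decide, sketched with a hash of the test's ID. This is an assumption for illustration, not the code of the extension mentioned above; `HashSharding` and `belongsToShard` are hypothetical names.

```java
import java.util.List;

// Hypothetical sketch: each test decides its shard from its own ID, without seeing other tests.
final class HashSharding {

    /** True if the test with this ID belongs to shardIndex (0-based) out of totalShards. */
    static boolean belongsToShard(String uniqueId, int shardIndex, int totalShards) {
        if (totalShards <= 0 || shardIndex < 0 || shardIndex >= totalShards) {
            throw new IllegalArgumentException("invalid shard configuration");
        }
        // String.hashCode() is specified by the Java API, so the result is stable across JVMs.
        // floorMod keeps the bucket non-negative even for negative hash codes.
        return Math.floorMod(uniqueId.hashCode(), totalShards) == shardIndex;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("testA", "testB", "testC", "testD", "testE");
        int totalShards = 3;
        for (int shard = 0; shard < totalShards; shard++) {
            int count = 0;
            for (String id : ids) {
                if (belongsToShard(id, shard, totalShards)) {
                    count++;
                }
            }
            System.out.println("shard " + shard + " runs " + count + " test(s)");
        }
    }
}
```

Every test lands on exactly one shard and the decision is deterministic, but nothing guarantees that the shards end up with similar sizes. A strategy that sees all discovered tests first can balance them, which is the motivation for a built-in sharding filter.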
Interesting implementation of an Will bring it to our team discussion this week. Scope is Jupiter-only, in contrast to the issue discussed here, which targeted Platform, IIRC. PS: I'd love to update square/javapoet#677 as well. |
Overview
Add LauncherDiscoveryRequestBuilder.visitors()
This allows users of the launcher to learn about the discovered tests
before filters are applied. This is essential for build systems
like Bazel that want to "shard" (aka partition) a test run into
multiple processes while ensuring that the tests included in
each "shard" are consistent, even if test discovery does not
result in a deterministic ordering of tests (see https://goo.gl/Yj4fXL
for what Bazel does to shard JUnit 4 tests today).
This change also creates a wrapper to make a TestDescriptor
unmodifiable. This ensures that tests cannot be removed during
the visit phase, and cannot be added during the filtering stage.
Finally, FilterResult is made final, so that no one can override
the methods to make them inconsistent with one another.
I hereby agree to the terms of the JUnit Contributor License Agreement.
Definition of Done
@API annotations