Support test sharding #1041

kcooney · 2017-09-02T07:46:45Z

Overview

Add LauncherDiscoveryRequestBuilder.visitors()

This allows users of the launcher to learn about the discovered tests
before filters are applied. This is essential for build systems
like Bazel that want to "shard" (aka partition) a test run into
multiple processes while ensuring that the tests included in
each "shard" are consistent, even if test discovery does not
result in a deterministic ordering of tests (see https://goo.gl/Yj4fXL
for what Bazel does to shard JUnit 4 tests today).

This change also creates a wrapper to make a TestDescriptor
unmodifiable. This ensures that tests cannot be removed during
the visit phase, and cannot be added during the filtering stage.

Finally, FilterResult is made final, so that no one can override
the methods to make them inconsistent with one another.

I hereby agree to the terms of the JUnit Contributor License Agreement.

Definition of Done

There are no TODOs left in the code
Method preconditions are checked and documented in the method's Javadoc
Coding conventions (e.g. for logging) have been followed
Change is covered by automated tests
Public API has Javadoc and @API annotations
Change is documented in the User Guide and Release Notes
All continuous integration builds pass

sormuras · 2017-09-02T08:15:52Z

This is a very useful enhancement. I guess, being a couple of days away from 5.0 GA, it will be part of the 5.1 train. What do you think, @junit-team/junit-lambda?

kcooney · 2017-09-02T08:33:05Z

Sorry, @sormuras, I didn't see these issues sooner. I have been bugging the Bazel team to look at the Launcher API for a while, but they have been busy with other things.

marcphilipp · 2017-09-02T10:54:24Z

@kcooney Can you provide a few more details why the visitors need to see descriptors before filters are applied? Please note that some filters (see implementations of DiscoveryFilter) are already applied by test engines during discovery. Will Bazel add a custom PostDiscoveryFilter for sharding?

Alternatively, could Bazel inspect the TestPlan, i.e. the result of the discovery process (without any filters), and then run test execution with additional filters (TestIdentifier is already immutable)?

marcphilipp · 2017-09-02T11:02:40Z

> This is a very useful enhancement. I guess, being a couple of days away from 5.0 GA, it will be part of the 5.1 train. What do you think, @junit-team/junit-lambda?

Some changes would break backwards compatibility so we should consider them for 5.0 GA.

marcphilipp · 2017-09-02T11:06:20Z

Tentatively slated for 5.0 GA in order to allow the team to assess this PR before GA (even though parts of it might get moved to 5.1).

kcooney · 2017-09-02T17:46:08Z

FYI: I realized that the only constructor for FilterResult is private, so there is no need to make it final. I removed the commit.

kcooney · 2017-09-02T18:20:53Z

> Can you provide a few more details why the visitors need to see descriptors before filters are applied? Please note that some filters (see implementations of DiscoveryFilter) are already applied by test engines during discovery. Will Bazel add a custom PostDiscoveryFilter for sharding?

Yes, Bazel would add a custom PostDiscoveryFilter for test sharding.

I will do my best. @iirina can perhaps add more details.

For those that don't know, Bazel is a build tool, similar to Maven, but with advanced caching capabilities. In order to do that, Bazel actions need to be deterministic. Internally, each action knows about all of its inputs, so Bazel always knows what artifacts to build and what not to build.

Test sharding allows a test to be automatically split into multiple partitions ("shards") and run in separate processes. Here's an example of a sharded test:

java_test(
    name = "ServerTest",
    srcs = glob(["*.java"]),
    shard_count = 3,
    flaky = 1,
    deps = [
        ...
    ],
)

Let's say that ServerTest has six test methods: a, b, c, d, e, f.

Let's assume that the test uses the round-robin strategy, so the shard a test runs on is determined by this expression:

shard = testIndex % numShards;

Imagine that the first time the test is run, all three shards get the same sequence of test methods, in this order:

[a, b, c, d, e, f]

Bazel will run these test methods on each shard:

shard 0: [a, d]
shard 1: [b, e]
shard 2: [c, f]

Imagine that test f fails because the code under test is broken. The java_test rule is marked as flaky so Bazel will try shard 2 again. In this re-run, JUnit provides the test methods in this order:

[f, a, b, c, d, e]

In this case, shard 2 will run [b, e]. The failing test method is no longer on shard 2, and the test passes.

To avoid this, the round-robin strategy sorts the tests before determining which test is on which shard. So the shard that fails will always have the same tests when it is retried.
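
The effect of sorting before assignment can be sketched in plain Java. This is a minimal illustration, not Bazel's actual implementation: the class name SortedRoundRobin and the method assign are hypothetical, and test names are compared as plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Bazel or JUnit): round-robin sharding over sorted names. */
final class SortedRoundRobin {

    /** Maps each shard index to its tests, independent of the order of {@code discovered}. */
    static Map<Integer, List<String>> assign(List<String> discovered, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // Sort a copy so that a test's index no longer depends on discovery order.
        List<String> sorted = new ArrayList<>(discovered);
        Collections.sort(sorted);

        Map<Integer, List<String>> shards = new TreeMap<>();
        for (int shard = 0; shard < numShards; shard++) {
            shards.put(shard, new ArrayList<>());
        }
        // Same expression as in the comment: shard = testIndex % numShards.
        for (int testIndex = 0; testIndex < sorted.size(); testIndex++) {
            shards.get(testIndex % numShards).add(sorted.get(testIndex));
        }
        return shards;
    }

    public static void main(String[] args) {
        // The two discovery orders from the example above.
        List<String> firstRun = List.of("a", "b", "c", "d", "e", "f");
        List<String> retry = List.of("f", "a", "b", "c", "d", "e");
        System.out.println(assign(firstRun, 3));
        System.out.println(assign(retry, 3));
    }
}
```

Both calls print the same assignment, so test f stays on shard 2 when that shard is retried.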

More generally, Bazel will pass the sharding strategy all of the tests, and the sharding strategy provides a filter. For Bazel to support sharding for JUnit 5 tests, it will need a way to see all tests before the filter runs.

It's okay if some engines provide filtering during discovery, as long as they do that in a deterministic way. So, for example, the test engine couldn't decide to filter out all tests annotated as @Nightly unless the test run was happening between midnight and 6am. The developer could, of course, specify in the build rule to run tests annotated with @Nightly and then decide to only run those tests once a night.

> Alternatively, could Bazel inspect the TestPlan, i.e. the result of the discovery process (without any filters), and then run test execution with additional filters

How do you get access to the TestPlan before the filters are applied? testPlanExecutionStarted appears to be called after applyPostDiscoveryFilters() is called. Or are there opportunities after applyPostDiscoveryFilters() to do more filtering?

marcphilipp · 2017-09-02T18:28:20Z

> How do you get access to the TestPlan before the filters are applied? testPlanExecutionStarted appears to be called after applyPostDiscoveryFilters() is called. Or are there opportunities after applyPostDiscoveryFilters() to do more filtering?

That's correct. My idea was to first call Launcher.discover() with a LauncherDiscoveryRequest without any PostDiscoveryFilters. discover() returns the TestPlan which can be inspected by Bazel. If it wants to run shard 1, it can pass a new LauncherDiscoveryRequest to Launcher.execute() that includes its implementation of PostDiscoveryFilter. Does that make sense?
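
The two-phase flow described above can be modelled without any JUnit dependency. In this sketch the list of strings stands in for the unique IDs read from the unfiltered TestPlan, and the returned Predicate stands in for a custom PostDiscoveryFilter; the class name TwoPhaseSharding and the method shardFilter are hypothetical, and sorted round-robin is assumed as the sharding strategy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Dependency-free model of "discover first, then execute with a shard filter". */
final class TwoPhaseSharding {

    /**
     * Takes the IDs seen in phase 1 (discovery without post-discovery filters) and
     * returns the filter to apply in phase 2 (execution of one shard).
     */
    static Predicate<String> shardFilter(List<String> discoveredIds, int shardIndex, int numShards) {
        if (numShards <= 0 || shardIndex < 0 || shardIndex >= numShards) {
            throw new IllegalArgumentException("invalid shard " + shardIndex + " of " + numShards);
        }
        // Sort a copy so the result does not depend on discovery order.
        List<String> sorted = new ArrayList<>(discoveredIds);
        Collections.sort(sorted);

        // Every numShards-th ID, starting at shardIndex, belongs to this shard.
        Set<String> included = new HashSet<>();
        for (int i = shardIndex; i < sorted.size(); i += numShards) {
            included.add(sorted.get(i));
        }
        return included::contains;
    }

    public static void main(String[] args) {
        // Phase 1: IDs as they came out of discovery, in arbitrary order.
        List<String> discovered = List.of("f", "a", "b", "c", "d", "e");
        Predicate<String> shard1 = shardFilter(discovered, 1, 3);

        // Phase 2: the same IDs are offered again and only shard 1 survives.
        discovered.stream().filter(shard1).sorted().forEach(System.out::println);
    }
}
```

With the real Launcher API, the predicate's logic would live in a PostDiscoveryFilter implementation added to the second LauncherDiscoveryRequest; that wiring is omitted here.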

marcphilipp · 2017-09-02T18:34:06Z

One additional idea: To ensure that a shard isn't modified, could Bazel store the unique IDs of the contained tests and re-run them using UniqueIdSelectors?

kcooney · 2017-09-02T19:33:01Z

Thanks for the response, @marcphilipp

I think your TestPlan solution will work (see below). I am still a bit worried that a PostDiscoveryFilter can modify the tree in unpredictable ways. I didn't realize that TestIdentifier existed. Perhaps PostDiscoveryFilter should take that instead of a TestDescriptor, so we wouldn't need an UnmodifiableTestDescriptor. Although that would also be a breaking change, it is a small one (see https://github.com/kcooney/junit-lambda/tree/PostDiscoveryFilter).

> One additional idea: To ensure that a shard isn't modified, could Bazel store the unique IDs of the contained tests and re-run them using UniqueIdSelectors?

Bazel could not store the unique IDs for the contained tests, since it doesn't read the Java source files when determining whether a build action needs to be rerun (it just computes a hash of the files).

> That's correct. My idea was to first call Launcher.discover() with a LauncherDiscoveryRequest without any PostDiscoveryFilters. discover() returns the TestPlan which can be inspected by Bazel. If it wants to run shard 1, it can pass a new LauncherDiscoveryRequest to Launcher.execute()

I think that would work, as long as the TestPlan contained a superset of the tests executed by Launcher.execute().

Thinking through this a bit in the next two paragraphs. If it's too long for you, feel free to ignore the rest.

In fact, that might be better, since Bazel fails if the test is "over-sharded", i.e. if a shard is empty (so we don't waste resources starting a JVM only to find out there are no tests). Bazel checks whether a shard is empty after the sharding filter runs but before command-line filtering happens. Bazel could check for empty shards by looking at the TestPlan and then, when running the tests, filter on both the shard and the command-line arguments.

I don't think dynamically-generated tests would be a problem, as long as it's deterministic. For example, in my previous comment, if test b created test z during the test run, that's fine, since z will always be on the same shard as b.
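
The empty-shard check mentioned above reduces to simple arithmetic under the round-robin expression. The sketch below is an illustration only: the class name EmptyShardCheck and the method emptyShards are hypothetical, and testCount is assumed to be the number of tests found in the unfiltered TestPlan.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: detects "over-sharded" runs before any JVM is started per shard. */
final class EmptyShardCheck {

    /** Returns the indices of shards that would receive no tests under round-robin. */
    static List<Integer> emptyShards(int testCount, int numShards) {
        if (testCount < 0 || numShards <= 0) {
            throw new IllegalArgumentException("testCount must be >= 0 and numShards > 0");
        }
        // With shard = testIndex % numShards, shard s gets a test only if s < testCount,
        // so every shard from testCount up to numShards - 1 stays empty.
        List<Integer> empty = new ArrayList<>();
        for (int shard = testCount; shard < numShards; shard++) {
            empty.add(shard);
        }
        return empty;
    }

    public static void main(String[] args) {
        System.out.println(emptyShards(6, 3)); // six tests, three shards: []
        System.out.println(emptyShards(2, 3)); // two tests, three shards: [2]
    }
}
```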

marcphilipp · 2017-09-07T18:17:02Z

I'm closing this PR because test sharding can be implemented by using the existing Launcher API as described above. The potential problem of PostDiscoveryFilters modifying TestDescriptors persists and we might introduce an unmodifiable wrapper in a future release. For now, we've mitigated that edge case by documenting that PostDiscoveryFilters must not modify TestDescriptors in any way.

kcooney · 2017-09-10T00:24:45Z

Thanks, Marc. Great work on making the launcher API so flexible :-)

I sent out a pull request related to test sharding. See #1055

swankjesse · 2020-05-26T22:35:27Z

FYI, I hacked together an extension to do this. It’s a bit unsatisfying because it filters with local information only, which is not great for balancing.
https://gist.github.com/swankjesse/92c93842f9c3705ca976031e7d0e664a

I’d love an official built-in sharding filter!
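
One common way to filter with local information only is to hash each test's ID and take the result modulo the shard count. The sketch below shows that general technique under stated assumptions; it is not taken from the linked gist, and the class name LocalHashSharding and the method shardOf are hypothetical.

```java
import java.util.List;

/** Hypothetical sketch of sharding that looks at one test ID at a time. */
final class LocalHashSharding {

    /** Returns the shard for a single test without knowing any other test. */
    static int shardOf(String testId, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // String.hashCode() is specified by the Java API, so it is stable across JVMs.
        // floorMod keeps the result in [0, numShards) even for negative hash codes.
        return Math.floorMod(testId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 3;
        int[] counts = new int[numShards];
        for (String testId : List.of("a", "b", "c", "d", "e", "f")) {
            counts[shardOf(testId, numShards)]++;
        }
        // Nothing guarantees even counts; that is the balancing concern raised above.
        for (int shard = 0; shard < numShards; shard++) {
            System.out.println("shard " + shard + ": " + counts[shard] + " test(s)");
        }
    }
}
```

Because each decision ignores the full list of tests, the assignment is stable across runs but shard sizes can drift apart, unlike the sorted round-robin approach discussed earlier in the thread.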

sormuras · 2020-05-28T01:57:02Z

Interesting implementation of an ExecutionCondition, Jesse.

Will bring it to our team discussion this week.

Scope is Jupiter-only, in contrast to the issue discussed here, which targeted Platform, IIRC.

PS: I'd love to update square/javapoet#677 as well.

ghost assigned kcooney Sep 2, 2017

ghost added the status: in progress label Sep 2, 2017

kcooney force-pushed the support-test-sharding branch from 9e05e13 to 910172b September 2, 2017 08:06

sormuras added the theme: discovery, theme: execution, theme: programming model, and type: enhancement labels Sep 2, 2017

marcphilipp added this to the 5.0 GA milestone Sep 2, 2017

kcooney added 2 commits September 2, 2017 10:43

Do not allow TestDescriptors to be added to more than one parent.

ede58c2

This avoids inconsistencies between getParent() and getParent().get().getChildren()

Do not allow filters to modify the passed-in TestDescriptors.

7e2555e

kcooney force-pushed the support-test-sharding branch from 910172b to 7f30f1d September 2, 2017 17:44

kcooney force-pushed the support-test-sharding branch from 7f30f1d to 529ef31 September 2, 2017 18:21

marcphilipp closed this Sep 7, 2017

ghost removed the status: in progress label Sep 7, 2017

marcphilipp removed this from the 5.0 GA milestone Sep 7, 2017

kcooney deleted the support-test-sharding branch September 9, 2017 07:06
