Skip to content

Conversation

vinnybod
Copy link
Contributor

Adds test sharding support to the pytest runner.

I took some inspiration from https://github.com/caseyduquettesc/rules_python_pytest . In particular, I used it as a reference for which Bazel environment variables need to be pulled in the shim.

The pytest plugin is a derivative work of https://github.com/AdamGleave/pytest-shard . That dependency hasn't been updated in a long time, and the sharding was based off hashing which often results in unbalanced shards or errors from empty shards. I vendored the code into the repo, modified it to use a plain round robin strategy for selection, and converted the code to be class based so it can be used with pytest.main().

Test plan

I added an example to the examples directory and I tested it with my own repository here: https://github.com/vinnybod/bazel-examples/blob/python-test-sharding/test-shard-python/BUILD.bazel

@vinnybod vinnybod changed the title Test sharding pytest test sharding Jan 21, 2025
@alexeagle
Copy link
Member

I haven't had time to study this yet, note I had #401 which may or may not overlap.

@vinnybod
Copy link
Contributor Author

I did see #401 , but there hadn't been activity on it in a while. I think this could still be merged in the meantime.

If you do want to commit it with #401 instead, I think it would still be worth while to vendor in the pytest-shard plugin with the round-robin patch, always loading it, since its a better UX for developers.

@alexeagle
Copy link
Member

alexeagle commented Jan 23, 2025

round-robin is not the best strategy, because it tends to mean a shard has to do many different setup routines.

Take as example (invented from what I remember of JUnit, but I think it applies)

file 1
  test class A
    setup A
    test case i
    test case ii
  test class B
    setup B
    test case iii
    test case iv
file 2
  test class C
    setup C
    test case v
    test case vi

If this is meant to have --shard_count=2, you'd rather split the test cases [i, ii, iii] , [iv, v, vi] so that the first shard can skip setup C and the second shard can skip setup A.

IIUC, round robin means you'll have both shards having to run all three setups.

We did test sharding in Jasmine, https://github.com/aspect-build/rules_jasmine/blob/main/jasmine/private/runner.cjs#L32-L53 where we sort and then partition them

Copy link
Member

@alexeagle alexeagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for bringing this in

@vinnybod
Copy link
Contributor Author

@alexeagle I think that is less of a problem when you only have one test file per py_test target. But I suppose if we want to make this more optimal for use cases where developers are globbing a bunch of *_test.py into a single target, the change could help.

If I'm understanding your explanation correctly, instead of a round robin, we should just take the full list [i, ii, iii, iv, v, vi] and partition it by the number of shards, keeping the default ordering (n=3, [i, ii], [iii, iv], [v, vi]). Does that sound right?

I did look at rules_jvm's implementation and they use hashes, so they don't take into account any ordering.

@alexeagle
Copy link
Member

Yes that's exactly right. Obviously this makes a difference for tests with heavy fixtures, with a larger number of test cases and a larger shard count the more we can "group by fixture" the less duplicate work is performed. I think rules_jvm probably has this wrong too, but I'm pretty sure the google3 implementation gets it right.

@fzakaria
Copy link

my 2c:
The discussion about sharding strategy seems like an implementation detail that can be further refined;
The addition to allow sharding even round robin might still be a benefit to some -- the strategy can continously be refined once the support is enabled.

@alexeagle
Copy link
Member

The discussion about sharding strategy seems like an implementation detail that can be further refined

Sure, it's always possible to leave a TODO - but you have to appreciate that typically on OSS projects you have a contributor who's on the hook to land the change, and later is likely to disappear. As the maintainer you then end up with that tech debt on your own head. So we're usually a bit defensive about making it right at the beginning. I could trust you guys to come back to this optimization later though.

@vinnybod
Copy link
Contributor Author

Looks like a new change in https://github.com/aspect-build/rules_py/tree/main/examples/virtual_deps broke these changes. Investigating.

Copy link

aspect-workflows bot commented Jan 24, 2025

Test

6 test targets passed

Targets
//examples/multi_version:py_version_default_test [k8-fastbuild]                 1s
//examples/multi_version:py_version_test [k8-fastbuild-ST-494921797612]         1s
//examples/pytest:nested/pytest [k8-fastbuild]                                  2s
//examples/pytest:pytest_test [k8-fastbuild]                                    1s
//examples/pytest:sharding_test [k8-fastbuild]                                  2s
//examples/virtual_deps:pytest_test [k8-fastbuild]                              1s

Total test execution time was 9s. 23 tests (79.3%) were fully cached saving 41s.

@vinnybod vinnybod requested a review from alexeagle January 24, 2025 17:53
@alexeagle alexeagle merged commit a23ffaa into aspect-build:main Jan 28, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants