Add support for batched pytest execution #17385
Conversation
All the existing tests for the logic changed here are integration tests, so I'll need to wait until the entire batching implementation is done to check if the new logic works... but this at least makes the types line up.
Same as coverage: the tests for the changed logic are all integration tests, so we'll need to wait until the end to see if this works properly...
The custom partitioner still creates one partition per field-set, but the runner rule should now be capable of testing an arbitrary number of input field-sets. As part of this, I've defined the metadata type for `pytest` partitions and shifted some logic for calculating metadata fields from the runner rule into the partitioner rule.
If two `python_test` targets have the same value for `batch_compatibility_tag`, then they are eligible to be batched into the same `pytest` process. They might _not_ be batched in the same process if:

* There are "too many" compatible tests, as determined by the `[test].batch_size` config option
* Compatible tests have some incompatible Pants metadata (i.e. different `extra_env_vars` or `xdist_concurrency`)

Tests that don't set the new field are assumed to be incompatible with all others, and continue to run one-target-per-process.
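The grouping described above can be sketched roughly as follows. This is a minimal illustration, not the actual Pants rule: `FieldSetStub`, the key tuple, and the chunking loop are all hypothetical stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSetStub:
    """Hypothetical stand-in for a python_test field-set."""

    address: str
    batch_compatibility_tag: str | None
    extra_env_vars: tuple[str, ...]


def partition(field_sets: list[FieldSetStub], batch_size: int) -> list[list[FieldSetStub]]:
    partitions: dict[object, list[FieldSetStub]] = {}
    for fs in field_sets:
        if fs.batch_compatibility_tag is None:
            # Unset tag: incompatible with everything, so one target per partition.
            partitions[("solo", fs.address)] = [fs]
        else:
            # The tag plus the other relevant metadata must all match
            # for tests to share a pytest process.
            key = (fs.batch_compatibility_tag, fs.extra_env_vars)
            partitions.setdefault(key, []).append(fs)
    # Split any oversized group into chunks of at most batch_size.
    result = []
    for group in partitions.values():
        for i in range(0, len(group), batch_size):
            result.append(group[i : i + batch_size])
    return result
```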
Improve batching hit-rate when the vars are the same, but listed in a different order.
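One way to achieve that (a hedged sketch, not necessarily how this commit implements it) is to sort the env var list before using it in the partition key, so ordering differences alone can't defeat batching:

```python
def normalize_env_vars(extra_env_vars) -> tuple[str, ...]:
    """Sort so ('A=1', 'B=2') and ('B=2', 'A=1') yield the same partition key."""
    return tuple(sorted(extra_env_vars))
```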
Ensure that the expected results don't change in the presence of batching.
Looks great! Thanks a lot.
src/python/pants/backend/python/goals/coverage_py_integration_test.py
Only calculate it if needed.
Make sure coverage collection works in both cases
I think ideally the plugin wouldn't need to think about this, but for now the core `test` logic will raise an error if a partition spans multiple environments so we should do our best not to get into that situation...
🎉
Neat!
```python
if concurrency is None:
    contents = await Get(DigestContents, Digest, field_set_source_files.snapshot.digest)
    concurrency = _count_pytest_tests(contents)
xdist_concurrency = concurrency
```
```python
timeout_seconds: int | None = None
for field_set in request.field_sets:
    timeout = field_set.timeout.calculate_from_global_options(test_subsystem, pytest)
```
It probably would be good to update the help for the timeout field to mention this summing behavior
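The summing behavior in question: when targets are batched, the batch's overall timeout becomes (roughly) the sum of each member's effective timeout, so a single slow test can't starve the rest of the batch. A minimal sketch under assumed semantics (hypothetical helper; treating `None` as "no timeout" as the snippet above suggests):

```python
from __future__ import annotations


def batch_timeout(per_target_timeouts: list[int | None]) -> int | None:
    """Sum per-target timeouts; if any target is unbounded, so is the batch."""
    total = 0
    for timeout in per_target_timeouts:
        if timeout is None:
            return None  # one test without a timeout makes the whole batch unbounded
        total += timeout
    return total
```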
```
* Compatible tests have some incompatibility in Pants metadata (i.e. different
  `resolve`s or `extra_env_vars`).
```
Thoughts on adding logs in this situation?
Not opposed - in our repo we don't (yet) use the Pants features that cause this sort of incompatibility so I'm not sure how to judge the noisiness vs. usefulness
Maybe debug level to start
Closes #14941
It's sometimes safe to run multiple `python_test`s within a single `pytest` process, and doing so can give significant wins by allowing reuse of expensive test setup/teardown logic. To allow users to opt into this behavior we introduce a new `batch_compatibility_tag` field on `python_test`, with semantics:

* If unset, the test is incompatible with all others and is guaranteed to run in its own `pytest` process
* If set and different from the value on some other `python_test`, the test is explicitly incompatible with the other and is guaranteed to not run in the same `pytest` process
* If set and matching the value on some other `python_test`, the tests are explicitly compatible and may run in the same `pytest` process

Compatible tests may not end up in the same `pytest` batch if:

* There are "too many" of them, as determined by `[test].batch_size`
* They have some incompatibility in other Pants metadata (i.e. different `resolve` or `extra_env_vars`)

When tests with the same `batch_compatibility_tag` have incompatibilities in some other Pants metadata, the custom partitioning rule will split them into separate partitions. We'd previously discussed raising an error in this case when calling the field `batch_key`/`batch_tag`, but IMO that approach will have a worse UX - this way users can set a high-level `batch_compatibility_tag` using `__defaults__` and then have things continue to Just Work as they add/remove `extra_env_vars` and parameterize `resolve`s on lower-level targets.
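For illustration, such a repo-level default might look like this in a BUILD file (a sketch of the `__defaults__` usage; the tag value is made up, and the exact syntax should be checked against the Pants docs for your version):

```python
# BUILD at the root of a subtree
# Give every python_test below this directory the same compatibility tag,
# making them candidates for batching into shared pytest processes.
__defaults__({python_test: dict(batch_compatibility_tag="integration-tests")})
```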