
Feature Request: Run unit test multiple times #11354

Open
JarredAllen opened this issue Nov 9, 2022 · 10 comments
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-test S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@JarredAllen

Problem

In my code, there are some flaky unit tests that pass most, but not all, of the time. To check whether a change has fixed a test's flakiness, it would be convenient to run the test many times and see whether it ever fails, but as far as I can tell cargo doesn't have a way of doing this.

Proposed Solution

It'd be nice if there were a command-line flag to run tests repeatedly. I'm imagining a syntax like cargo test --repeat=100 testname, which would search for tests named "testname" (as cargo presently does) and then run each matching test 100 times, but I'm not too picky about the exact syntax.
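Until something like a `--repeat` flag exists, the effect can be approximated with a shell loop around `cargo test`. A minimal sketch, assuming POSIX sh; the flag does not exist in cargo, and the test name in the usage comment is a placeholder:

```shell
# repeat_test: run the given command N times and print how many runs failed.
repeat_test() {
    n="$1"; shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Discard output; count only the exit status of each run.
        "$@" > /dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage (hypothetical test name):
# echo "failed $(repeat_test 100 cargo test my_flaky_test --quiet) of 100 runs"
```

This counts failures rather than stopping early, which is what you want when estimating how flaky a test is.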

Notes

No response

@JarredAllen JarredAllen added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Nov 9, 2022

jswang commented Nov 9, 2022

This would be very helpful for my use cases as well!

ehuss (Contributor) commented Nov 9, 2022

This would definitely be useful, I have a macro in my editor for repeating a test. Something built-in would be nicer. However, it is not clear exactly how this should work. For example, it may be better for this to be implemented in the harness itself, in which case rust-lang/rust#65218 would be the issue for that.

@andrewgazelka

While this is in development, you can also use cargo nextest, which does this.


ImmanuelSegol commented Mar 4, 2023

@JarredAllen How do I get this working?

@andrewgazelka

> @JarredAllen How do I get this working?

This is a feature request. It still needs to be implemented.

@weihanglo weihanglo added the S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. label May 15, 2023
heisen-li (Contributor) commented Dec 22, 2023

This feature seems fairly important. I am currently learning the testing-related code; may I try to complete this issue? Is there anything I should pay special attention to, or are there other plans?

epage (Contributor) commented Dec 22, 2023

This is marked needs-design, meaning someone needs to put forward a more detailed proposal for what to do before we move forward with implementation.

In particular, we need to figure out which layer, or combination of layers, this belongs in:

  • if in libtest, t-libs-api is likely to defer that to custom test harnesses
  • if in cargo test, cargo would just re-invoke the selected tests itself
  • or some other design that mixes these

@heisen-li (Contributor)

Sorry for sharing my humble view: it seems best to make the modifications in libtest, since fine-grained control is not possible from cargo.

epage (Contributor) commented Dec 27, 2023

Our plans for cargo would allow fine control in the future.

We are looking at having cargo and libtest communicate through a greater knowledge of the CLI, including being able to enable JSON output and putting the responsibility for rendering on cargo.

This would allow cargo test to track individual tests and decide what to do with them, like re-running a failed test.

What would be good is to explore prior art to see if it has any effect on the design. For example, would people want to be able to annotate individual tests about retrying? If so, we'd either want retrying within the test harness, or that would be good feedback for the test runner/harness communication.


bjackman commented Apr 15, 2024

> For example, would people want to be able to annotate individual tests about retrying?

The way you say "retry" and "annotate" makes me think there are two separate use cases in people's minds here:

  1. I have discovered that a test is flaky. I am trying to debug/fix it, I need to run it 1000 times to reproduce the failure/evaluate my fix.
  2. Our tests are flaky, we will use retries to make our CI dashboard green.

For use case 1, I have found that a single global "repeat all selected tests N times" command-line flag works fine. I have seen various names for this; I think --runs-per-test is my favourite because it makes the "repeat all tests" behavior obvious. rust-lang/rust#65218 mentions a couple of examples of prior art for this; I have used both of those examples and they seem to work well. I think this is also what the OP suggested.
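The --runs-per-test behavior for reproducing a flaky failure can be emulated today with a small wrapper that stops at the first failure. A sketch in POSIX sh; the wrapped cargo invocation in the usage comment uses a hypothetical test name:

```shell
# run_until_failure: run the given command up to $1 times, stopping at the
# first failure -- useful when trying to reproduce an intermittent failure.
run_until_failure() {
    max="$1"; shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@" > /dev/null 2>&1; then
            echo "failed on run $i of $max"
            return 1
        fi
        i=$((i + 1))
    done
    echo "passed all $max runs"
}

# Usage (hypothetical test name):
# run_until_failure 1000 cargo test my_flaky_test --quiet
```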

For use case 2, there is the philosophical question of how opinionated Cargo wants to be about software engineering. I have worked on a project where the test tools have a global "retry any test that fails" flag. This probably sounds like a bad idea, and my experience has indeed been that, exactly as you would suspect, using it means your tests get flakier over time instead of less flaky. My impression of the Rust culture is that people would instinctively agree that this is a harmful feature for test tools to have.

A less toxic design IMO is to be able to annotate individual tests as "known flaky", and then leave it up to the person/tool running the tests to decide whether that means "don't bother running it at all" or "run it up to N times until it passes". Google's monorepo has a tag that works like that and it seems OK to me. This means you can maintain your nice green CI dashboard, but you have a ratchet where you at least notice if a formerly stable test becomes flaky. It also means that if you run a test on your WIP PR, and the test fails, and it doesn't have the "flaky" tag, you know you probably broke the test. Whereas with the global retry flag you have to go and look through your CI history and do some informal Bayesian analysis.
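The "run it up to N times until it passes" policy for tests tagged as known-flaky can be sketched as a wrapper (POSIX sh; how a runner would discover the flaky tag is hypothetical and not shown):

```shell
# retry_until_pass: run the given command up to $1 times, succeeding as soon
# as one attempt passes -- the semantics a test runner might apply only to
# tests annotated as known-flaky.
retry_until_pass() {
    max="$1"; shift
    i=1
    while [ "$i" -le "$max" ]; do
        if "$@" > /dev/null 2>&1; then
            echo "passed on attempt $i of $max"
            return 0
        fi
        i=$((i + 1))
    done
    echo "failed all $max attempts"
    return 1
}
```

Because only tagged tests get this treatment, an untagged test that fails still fails the run, which preserves the ratchet described above.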

Anyway, I think that this feature request was probably motivated by use case 1, and that use case seems much easier to solve, so it might make sense to focus on that.
