Allow a higher number of flaky test attempts#28635
Allow a higher number of flaky test attempts#28635dzbarsky wants to merge 1 commit intobazelbuild:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request increases the maximum number of flaky test attempts from 10 to 1000. While I understand the motivation to improve the testing of flaky test reporting, a 100x increase presents a significant risk of misuse, potentially leading to builds consuming excessive resources. My review includes a suggestion to use a more moderate, safer limit.
| public static class TestAttemptsConverter extends PerLabelOptions.PerLabelOptionsConverter { | ||
| private static final int MIN_VALUE = 1; | ||
| private static final int MAX_VALUE = 10; | ||
| private static final int MAX_VALUE = 1000; |
There was a problem hiding this comment.
Increasing the maximum number of attempts by 100x from 10 to 1000 is a very large increase. While this enables your specific use case, it introduces a significant risk of misuse. A user setting --flaky_test_attempts=1000 could cause builds with flaky tests to consume extreme amounts of CI resources and time, potentially appearing to hang for hours.
A more moderate limit, such as 100, would still be a 10x increase over the current value and would likely be sufficient for most practical purposes of testing flaky test infrastructure, while mitigating the risk of extreme resource consumption. A limit of 1000 seems excessive and potentially dangerous for general use.
| private static final int MAX_VALUE = 1000; | |
| private static final int MAX_VALUE = 100; |
There was a problem hiding this comment.
100 retries allows 1-(29/30)^100 which is around 3% FN/FP rate, I think that will still trigger often enough to be annoying
I'm interested in having an always-flaky test to integration test our build tooling's handling of flaky test reporting.
Currently, there is no great way for a test to know that's it is a flaky re-run. Previously I saw the test drop a marker file in /tmp, fail, and then succeed on the retry when it detects the marker file. However, this is incompatible with remote execution or hermetic /tmp.
It would be ideal to inject a flaky test attempt count via ENV var so we can deterministically know we are a retry (similar to how
--runs_per_testis injected), but I can see that behavior being a little worrisome.In the absence of that, we can fail the test with probability 7/8, which gives a ~15% false positive and false negative rate at 10 retries. Bumping the max retries to 1000 allows to fail the test with probability 199/200, which is roughly .5% false positive and false negative rate.
I know this is a bit of a weird use case, but hopefully this is a harmless-enough bump!