Skip to content

1.2.0 - Flakyness detection

Choose a tag to compare

@marvinhagemeister marvinhagemeister released this 18 Aug 10:32
· 551 commits to master since this release

This release adds a new --repeat-flaky COUNT command line flag to detect flaky tests. It works by repeating any failing test cases until they either pass or the limit is reached. When the result of all runs of a test is inconsistent, it will be marked as flaky.

Pseudo examples with --repeat-flaky 3:

Test A -> error // failed, run test again
Test A 2 -> error // failed, run test again
Test A 3 -> success // result changed, this test must be flaky

Test B -> error // failed, run test again
Test B 2 -> success // result changed, this test must be flaky. No need to run again

Test C -> error // failed, run test again
Test C 2 -> error // failed, run test again
Test C 3 -> error // result consistent -> we have a real error

Test D -> success // Success, nothing to do here

Note, that the runner status total number of tasks may increase when --repeat-flaky is test. It displays the number of pending tasks in relation to total one.

10/100 done, 1 failed, 1 expected to fail, 3 flaky
^     ^          ^           ^                 ^
|     |        tests       tests             tests
tasks tasks