Rerun failed tests automatically and detect flaky tests #4347

Closed
pihme opened this issue Apr 22, 2020 · 6 comments · Fixed by #4625
Labels
kind/toil Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.

Comments

Contributor

pihme commented Apr 22, 2020

Description

pihme added the kind/toil label Apr 22, 2020
pihme added this to the "CI Builds should reflect state" milestone Apr 22, 2020
Contributor Author

pihme commented Apr 22, 2020

Issue is postponed until prioritization of flaky tests is discussed.

pihme changed the title from "Rerun failed tests automatically" to "Rerun failed tests automatically and detect flaky tests" May 6, 2020
Contributor Author

pihme commented May 6, 2020

@menski I could use your help here. I googled my way to a PoC, but I am struggling to get it running on the Zeebe pipeline.

The PoC is here: https://github.com/pihme/jenkinsTestBed/tree/rerun-failed-tests

It works as follows:

  • runTest.sh runs the tests and then scans the log for flaky tests; if it finds any, it collects the relevant lines, extracts the test names, writes them to a file, and fails the build stage
  • The Jenkinsfile then looks for that file and, if it exists, writes its content into the build description
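The scan step described above might look roughly like the following. This is a minimal sketch, not the actual runTest.sh from the PoC: the function names and the file names passed in are hypothetical, and it assumes that test names appear in the Maven log as a log level followed by a fully qualified name, as in the log excerpts later in this thread. It uses the two-step grep and awk approach and, like the PoC, does not yet distinguish flaky tests from tests that fail on every run.

```bash
#!/bin/bash
# Hypothetical sketch of the scan step; not the actual runTest.sh from the PoC.

# Collect the names of rerun tests from a Maven log into an output file.
# Step 1 (grep): keep lines of the form "[LEVEL] fully.qualified.Class.method".
# Step 2 (awk): print the second whitespace-separated column (the test name).
collect_rerun_tests() {
  log_file=$1
  out_file=$2
  grep -E '^\[(WARNING|ERROR)\] [A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)+ *$' "$log_file" \
    | awk '{ print $2 }' > "$out_file"
  # Succeed only if at least one name was written.
  [ -s "$out_file" ]
}

# Fail the stage when tests had to be rerun, leaving the file behind so that
# a later pipeline step can pick it up.
scan_and_fail_on_reruns() {
  if collect_rerun_tests "$1" "$2"; then
    echo "Tests were rerun; see $2" >&2
    return 1
  fi
  # Nothing found: remove the empty file so later steps do not see it.
  rm -f "$2"
  return 0
}
```

A call such as `scan_and_fail_on_reruns test.txt flaky-tests.txt` would then return a non-zero status whenever the log lists rerun tests.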

In Zeebe the second step does not seem to work. Maybe that is because of the "container" tag, so the file I write isn't visible outside the container.

Besides that there is potential for improvement:

  • in the PoC I didn't capture the exit code of the original Maven run. This would be important to mark the stage as failed if it's an ordinary test failure
  • I do the extraction of the tests in two steps: first I use grep to look for the lines I am interested in, and then I use awk to extract a certain "column". I guess both steps could be combined into an elegant regex with capture groups
  • now that I think of it, the code will also report tests that always fail as flaky. It needs a better matcher:
[WARNING] Flakes: 
[WARNING] com.github.pihme.jenkinstestbed.module1.FlakyTest.shouldFailOnceInTwoInvocations
[ERROR]   Run 1: FlakyTest.shouldFailOnceInTwoInvocations:22 failed
[ERROR]   Run 2: FlakyTest.shouldFailOnceInTwoInvocations:22 failed
[INFO]   Run 3: PASS

vs a test that always fails:

[ERROR] com.github.pihme.jenkinstestbed.module1.FlakyTest.shouldFailNineInTenInvocations
[ERROR]   Run 1: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 2: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 3: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 4: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 5: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 6: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 7: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 8: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 9: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 10: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 11: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 12: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 13: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 14: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 15: FlakyTest.shouldFailNineInTenInvocations:15 failed
[ERROR]   Run 16: FlakyTest.shouldFailNineInTenInvocations:15 failed
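One way to build the stricter matcher, and to fold the grep and awk steps into a single pass, is sketched below. It assumes the log layout shown in the two excerpts above: flaky tests are listed as two-field `[WARNING]` lines after a `[WARNING] Flakes:` header, while tests that fail on every run are reported at `[ERROR]` level. The function name `extract_flaky_tests` is hypothetical, and the rules that end the section (a `Tests run:` summary line or an `[ERROR] Failures:` / `[ERROR] Errors:` header) are assumptions about the rest of the log, not something confirmed in this thread.

```bash
#!/bin/bash
# Hypothetical helper: print only the tests listed under the "Flakes:" header.
extract_flaky_tests() {
  awk '
    # The header line opens the flaky section.
    /^\[WARNING\] Flakes:/ { in_flakes = 1; next }
    # Assumption: a summary line closes the section.
    /Tests run:/ { in_flakes = 0 }
    # Assumption: another section header closes it as well.
    /^\[ERROR\] (Failures|Errors):/ { in_flakes = 0 }
    # Inside the section, a two-field [WARNING] line carries a test name.
    in_flakes && $1 == "[WARNING]" && NF == 2 { print $2 }
  ' "$1"
}
```

Because the match is tied to the `[WARNING]` level inside the flaky section, the `[ERROR]` entry for shouldFailNineInTenInvocations is not reported.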

I will continue working on this myself, but I would also appreciate some help here.

Contributor

menski commented May 20, 2020

@pihme is there a branch where I can see this applied to the Zeebe Pipeline, as you mentioned that the second step is failing there?

Contributor Author

pihme commented May 20, 2020

Contributor Author

pihme commented May 22, 2020

@menski: Made several improvements. It is now so close to something usable that the missing piece is all the more infuriating.

The missing piece is the capture of the exit code of a piped command. In my toy project this works with:

mvn test -Dsurefire.rerunFailingTestsCount=15 | tee test.txt

STATUS=${PIPESTATUS[0]}

In Zeebe it doesn't. It might have to do with the first line of the script: my toy project uses #!/bin/bash, whereas Zeebe uses just #!/bin/sh, which may or may not be bash. (oh, how I love Linux)
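For reference, the exit-code capture can be wrapped in a small function. This is a sketch that only works when the script really runs under bash, since `PIPESTATUS` is a bash array and is not available in a plain POSIX sh. The function name `run_and_tee` is hypothetical.

```bash
#!/bin/bash
# Run a command, mirror its output into a log file, and return the exit code
# of the command itself rather than that of tee.
# Requires bash: PIPESTATUS is not part of POSIX sh.
run_and_tee() {
  local log_file=$1
  shift
  "$@" 2>&1 | tee "$log_file"
  # PIPESTATUS[0] holds the exit code of the first command in the pipeline.
  return "${PIPESTATUS[0]}"
}
```

With this in place, `run_and_tee test.txt mvn test -Dsurefire.rerunFailingTestsCount=15` followed by `STATUS=$?` would give the Maven exit code while still writing the log to test.txt.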

Any help is welcome.

Contributor

menski commented May 25, 2020

@pihme regarding PIPESTATUS: this is a bash feature, and you are correct that the script has to run with #!/bin/bash to use it. But that is okay. We normally try to use sh as the minimum in case we have an image which does not provide bash at all, but for the Maven commands it's fine to switch to bash. Let me know if you need help here.

zeebe-bors bot added a commit that referenced this issue Jun 8, 2020
4625: chore(ci): rerun failed tests, scan log, report flaky tests r=menski a=pihme

## Description

* rerun failing tests
* extract flaky tests from log 
* collect flaky tests at the end and add them to description

## Related issues

closes #4347


Co-authored-by: pihme <pihme@users.noreply.github.com>
zeebe-bors bot closed this as completed in 93a464f Jun 8, 2020
github-merge-queue bot pushed a commit that referenced this issue Apr 16, 2024
5 participants