Improve awareness of frequently flaking tests #6500
Open
Labels
discussion (Discussing a topic with no specific actions yet), flaky test (Intermittent failures on CI), stability (Issue or feature related to cluster stability, e.g. deadlock)
Problem:

We have tests that fail rather frequently. The test report is a good resource for getting an overview, but you have to actively check it to identify those frequently failing tests. At the same time, #6452 has demonstrated that these flakes can point toward general issues that affect users. Had we looked sooner, we would have caught #6494 earlier as well.
Question:
How can we improve our awareness of and response time to tests that start flaking frequently?
Possible Solution:
One possible solution would be to implement a bot that files a new ticket for every test that flakes at least x times in the last y CI runs on main. The extreme case would be a ticket for any failing test on main. There also exist (paid) tools that allow tracking flaking tests in more detail.
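To make the "at least x times in the last y CI runs" rule concrete, here is a minimal sketch of the check such a bot could run. It is only an illustration: the function name `frequently_failing`, its parameters, and the input format (one collection of failed test names per CI run on main, ordered oldest to newest) are assumptions, and fetching CI results or filing tickets is left out entirely.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def frequently_failing(
    runs: Sequence[Iterable[str]],
    min_failures: int,
    window: int,
) -> Dict[str, int]:
    """Return tests that failed at least `min_failures` times (x)
    within the last `window` CI runs (y).

    `runs` is a hypothetical input format: one collection of failed
    test names per CI run, ordered from oldest to newest.
    """
    if window <= 0:
        return {}
    counts: Counter = Counter()
    for failed in runs[-window:]:
        # Count each test at most once per run, even if it is listed twice.
        counts.update(set(failed))
    return {test: n for test, n in counts.items() if n >= min_failures}


# Made-up data for illustration only.
example_runs = [
    {"test_a"},
    {"test_a", "test_b"},
    set(),
    {"test_a"},
    {"test_b"},
]
candidates = frequently_failing(example_runs, min_failures=2, window=4)
```

Setting `min_failures=1` and `window=1` corresponds to the extreme case of a ticket for any test failing on the latest run. A real bot would also need to avoid filing a duplicate ticket for a test that already has one open.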
cc @fjetter