Improve flaky test reporting in Sentry #31

Closed
agis opened this issue Sep 2, 2020 · 7 comments · Fixed by #33

agis commented Sep 2, 2020

Currently, flaky tests are all emitted with the same event title, "Flaky jobs detected". Thus, flaky job events from different CI builds all end up grouped under a single Sentry event. For instance, this is the sole flaky job event reported in one of our test suites:

[Screenshot: sentry_rspecq_grouping — the single "Flaky jobs detected" event as grouped in Sentry]

The problem with this approach is that it's hard to answer questions such as:

  1. when was this particular flaky test introduced?
  2. which file has the most flaky tests?
  3. which files currently contain flaky tests?

Also, it's impossible to set alerts (e.g. using code owners) based on the file in which flaky jobs occur, or to collaborate on a specific issue for a particular flaky job (since we can't resolve a specific flaky job).

We have to think of a better way to report flaky jobs, whether this involves changing the fingerprint of the events, submitting separate events per flaky job/file, changing the title of events, or a combination of these.

@agis agis added good first issue Good for newcomers sentry Issues related to the Sentry integration labels Sep 2, 2020
@kpelelis kpelelis self-assigned this Sep 4, 2020

kpelelis commented Sep 4, 2020

I will work on this one


kpelelis commented Sep 4, 2020

Since this topic is open-ended, we can discuss different ideas about it. To answer the questions, I think we need to leverage the way Sentry groups reports. In my opinion, it would be better if we could group reports per file and, each time a flaky job is raised, append it to that file's event list.

> 1. when was this particular flaky job introduced?

Check the timestamp of the first event.

> 2. which file has the most flaky jobs?

Sort by event count.

> 3. which files currently contain flaky jobs?

Unresolved events.


glampr commented Sep 4, 2020

It would be nice to also include instructions on how to reproduce the execution order that led to the error.

@agis agis changed the title Improve flaky job reporting in Sentry Improve flaky test reporting in Sentry Sep 4, 2020

agis commented Sep 4, 2020

> Since this topic is open-ended, we can discuss different ideas about it. To answer the questions, I think we need to leverage the way Sentry groups reports. In my opinion, it would be better if we could group reports per file and, each time a flaky job is raised, append it to that file's event list.
>
> 1. when was this particular flaky job introduced?
>
> Check the timestamp of the first event.
>
> 2. which file has the most flaky jobs?
>
> Sort by event count.
>
> 3. which files currently contain flaky jobs?
>
> Unresolved events.

@kpelelis "Reports per file" is the most straightforward thing to do, and I believe it provides some immediate benefits (everything you mentioned). Sentry uses what it calls a "fingerprint" to decide how to group events. Depending on the default behavior, we might need to explicitly change the fingerprint to achieve this, or changing the event title might be sufficient.
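To make the per-file grouping concrete, here is a minimal sketch of what overriding the fingerprint could look like. It assumes the sentry-raven SDK is used; `spec_file` and `flaky_examples` are illustrative placeholders, not actual RSpecQ internals:

```ruby
require "raven"

# Illustrative placeholders; in practice these would come from the flaky-job data.
spec_file = "./spec/models/user_spec.rb"
flaky_examples = ["User#save retries on deadlock"]

Raven.capture_message(
  "Flaky tests detected in #{spec_file}",
  level: "warning",
  # Overriding the fingerprint makes Sentry group all events for the same
  # spec file together, instead of under one global "Flaky jobs detected" issue.
  fingerprint: ["flaky-tests", spec_file],
  extra: { spec_file: spec_file, examples: flaky_examples }
)
```

Whether the explicit fingerprint override is needed, or a per-file event title alone is enough, depends on Sentry's default grouping behavior, as noted above.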

> It would be nice to also include instructions on how to reproduce the execution order that led to the error.

@glampr We could do that, but that could potentially be a list of 500 spec files that were executed in the same worker prior to the flaky one, and that would be a huge payload to submit to Sentry. We could perhaps submit the N (5-10) jobs that ran prior to the flaky one, as a best-effort approach. That said, this requires some effort and it's not straightforward to implement, so I suggest tracking it in a new issue.
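As a rough illustration of that best-effort idea (not actual RSpecQ code; the job list and the Raven call are assumptions), a worker could keep a small bounded history of the jobs it ran and attach only that to the event:

```ruby
require "raven"

RECENT_JOBS_LIMIT = 10
recent_jobs = []

# Stand-in for the worker loop: each "job" here is a spec file path.
["./spec/a_spec.rb", "./spec/b_spec.rb", "./spec/flaky_spec.rb"].each do |job|
  # ... execute the spec file ...
  recent_jobs << job
  recent_jobs.shift if recent_jobs.size > RECENT_JOBS_LIMIT
end

# When a flaky job is reported, include only the bounded history that
# preceded it, instead of the full (potentially huge) execution list.
Raven.capture_message(
  "Flaky tests detected in #{recent_jobs.last}",
  extra: { preceding_jobs: recent_jobs[0...-1] }
)
```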


glampr commented Sep 4, 2020

If I understand correctly, we don't need all possible combinations that lead to the error; in most cases one would be sufficient to reproduce the error locally and fix it.


agis commented Sep 7, 2020

@glampr RSpecQ workers run in a loop, continuously popping tests off the queue and executing them until the queue is empty. As a result, prior to encountering a flaky test, a worker might have executed a lot of other jobs (i.e. spec files), and you can't know which one (if any) caused the flakiness.

So we'd have to keep a list of all the jobs that the worker executed prior to executing the flaky one, during the current build.

One thing we could also do, though this too is not straightforward, is to emit the RSpec seed to Sentry. We could do these in future iterations.
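For reference, attaching the seed could be as small as adding it to the event's extra data. A minimal sketch, again assuming the sentry-raven SDK; the event title is an illustrative placeholder:

```ruby
require "rspec/core"
require "raven"

Raven.capture_message(
  "Flaky tests detected in ./spec/flaky_spec.rb",  # illustrative title
  extra: {
    # RSpec.configuration.seed is the seed used for example ordering;
    # it lets someone re-run locally with: bundle exec rspec --seed <seed>
    rspec_seed: RSpec.configuration.seed
  }
)
```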


glampr commented Sep 7, 2020

Makes sense, thanks @agis.
I will create a new issue to track this, per your suggestion.
