Improve flaky test reporting in Sentry #31

Closed
agis opened this issue Sep 2, 2020 · 7 comments · Fixed by #33

agis commented Sep 2, 2020

Currently, flaky tests are all emitted with the same event title, "Flaky jobs detected". Thus, flaky job events from different CI builds all end up grouped under a single Sentry event. For instance, this is the sole flaky job event reported in one of our test suites:

[Screenshot: sentry_rspecq_grouping — the single "Flaky jobs detected" event as grouped in Sentry]

The problem with this approach is that it's hard to answer questions such as:

  1. when was this particular flaky test introduced?
  2. which file has the most flaky tests?
  3. which files currently contain flaky tests?

Also, it's impossible to set alerts (e.g. using code owners) based on the file in which flaky jobs occur, or to collaborate on a specific issue for a particular flaky job (since we can't resolve a specific flaky job).

We have to think of a better way to report flaky jobs, whether this involves changing the fingerprint of the events, submitting separate events per flaky job/file, changing the title of events, or a combination of these.

@agis agis added good first issue Good for newcomers sentry Issues related to the Sentry integration labels Sep 2, 2020
@kpelelis kpelelis self-assigned this Sep 4, 2020

kpelelis commented Sep 4, 2020

I will work on this one


kpelelis commented Sep 4, 2020

Since this topic is open-ended, we can discuss different ideas about it. To answer the questions, I think we need to leverage the way Sentry groups reports. In my opinion, it would be better if we could group reports per file and, each time a flaky job is raised, append it to that file's event list.

> 1. when was this particular flaky job introduced?

Check the timestamp of the first event.

> 2. which file has the most flaky jobs?

Sort by event count.

> 3. which files currently contain flaky jobs?

Unresolved events.


glampr commented Sep 4, 2020

It would be nice to also include instructions on how to reproduce the execution order that led to the error.

@agis agis changed the title Improve flaky job reporting in Sentry Improve flaky test reporting in Sentry Sep 4, 2020

agis commented Sep 4, 2020

> Since this topic is open-ended, we can discuss different ideas about it. To answer the questions, I think we need to leverage the way Sentry groups reports. In my opinion, it would be better if we could group reports per file and, each time a flaky job is raised, append it to that file's event list.
>
> 1. when was this particular flaky job introduced?
>
> Check the timestamp of the first event.
>
> 2. which file has the most flaky jobs?
>
> Sort by event count.
>
> 3. which files currently contain flaky jobs?
>
> Unresolved events.

@kpelelis "Reports per file" is the most straightforward thing to do, and I believe it provides some immediate benefits (everything you mentioned). Sentry uses what it calls a "fingerprint" to decide how to group events. Depending on the default behavior, we might need to explicitly change the fingerprint to achieve this, or changing the event title might be sufficient.
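To make the per-file grouping concrete, here is a minimal sketch of what overriding the fingerprint could look like. It assumes the sentry-raven SDK is used; `spec_file` and `flaky_examples` are illustrative placeholders, not actual RSpecQ internals:

```ruby
require "raven"

# Illustrative placeholders; in practice these would come from the flaky-job data.
spec_file = "./spec/models/user_spec.rb"
flaky_examples = ["User#save retries on deadlock"]

Raven.capture_message(
  "Flaky tests detected in #{spec_file}",
  level: "warning",
  # Overriding the fingerprint makes Sentry group all events for the same
  # spec file together, instead of under one global "Flaky jobs detected" issue.
  fingerprint: ["flaky-tests", spec_file],
  extra: { spec_file: spec_file, examples: flaky_examples }
)
```

Whether the explicit fingerprint override is needed, or a per-file event title alone is enough, depends on Sentry's default grouping behavior, as noted above.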

> It would be nice to also include instructions on how to reproduce the execution order that led to the error.

@glampr We could do that, but that could potentially be a list of 500 spec files that were executed in the same worker prior to the flaky one, and that would be a huge payload to submit to Sentry. We could perhaps submit the N (5-10) jobs that ran prior to the flaky one, as a best-effort approach. That said, this requires some effort and it's not straightforward to implement, so I suggest tracking it in a new issue.
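As a rough illustration of that best-effort idea (not actual RSpecQ code; the job list and the Raven call are assumptions), a worker could keep a small bounded history of the jobs it ran and attach only that to the event:

```ruby
require "raven"

RECENT_JOBS_LIMIT = 10
recent_jobs = []

# Stand-in for the worker loop: each "job" here is a spec file path.
["./spec/a_spec.rb", "./spec/b_spec.rb", "./spec/flaky_spec.rb"].each do |job|
  # ... execute the spec file ...
  recent_jobs << job
  recent_jobs.shift if recent_jobs.size > RECENT_JOBS_LIMIT
end

# When a flaky job is reported, include only the bounded history that
# preceded it, instead of the full (potentially huge) execution list.
Raven.capture_message(
  "Flaky tests detected in #{recent_jobs.last}",
  extra: { preceding_jobs: recent_jobs[0...-1] }
)
```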


glampr commented Sep 4, 2020

If I understand correctly, we don't need all possible combinations that lead to the error; in most cases one would be sufficient to reproduce the error locally and fix it.


agis commented Sep 7, 2020

@glampr RSpecQ workers run in a loop, continuously popping tests off the queue and executing them until the queue is empty. As a result, prior to encountering a flaky test, a worker might have executed a lot of other jobs (i.e. spec files), and you can't know which one (if any) caused the flakiness.

So we'd have to keep a list of all the jobs that the worker executed prior to executing the flaky one, during the current build.

One thing we could also do, though this too is not straightforward, is to emit the RSpec seed to Sentry. We could do these in future iterations.
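For reference, attaching the seed could be as small as adding it to the event's extra data. A minimal sketch, again assuming the sentry-raven SDK; the event title is an illustrative placeholder:

```ruby
require "rspec/core"
require "raven"

Raven.capture_message(
  "Flaky tests detected in ./spec/flaky_spec.rb",  # illustrative title
  extra: {
    # RSpec.configuration.seed is the seed used for example ordering;
    # it lets someone re-run locally with: bundle exec rspec --seed <seed>
    rspec_seed: RSpec.configuration.seed
  }
)
```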


glampr commented Sep 7, 2020

Makes sense, thanks @agis.
I will create a new issue to track this, per your suggestion.
