Problem: some database CI tests are flaky #39

Closed
ledcoyote opened this issue Aug 25, 2021 · 7 comments

@ledcoyote

Problem

Several database tests that run in CI fail intermittently: they fail on one run, then pass on a re-run with no changes made. It is not yet known which specific tests are affected.

@weex
Member

weex commented Aug 26, 2021

I don't know if it's db-related, but I understand there's a cucumber test instability that has been reported upstream in diaspora#8279.

@ledcoyote
Author

Ah, I see. I think I should close this issue as duplicating diaspora#7373, which is much more detailed. Do you agree?

@weex
Member

weex commented Aug 26, 2021

Let's keep it open. Wherever it's solved, we'll make sure the solution gets to the other place.

@weex
Member

weex commented Sep 3, 2021

diaspora#7373 does have a list of tests near the top which fail randomly and that I have run into today as I test #45. So that's a +1 on the value part of this.

As for whether specific test flakiness is the right problem, my thinking is that tests of end-user behavior that are pretty easy for a dev to test locally might be more trouble than they are worth. Yesterday I learned that these cucumber tests require Google Chrome's binary, which I consider a problem in a free software project.

@weex weex added the critical label Sep 17, 2021
@weex
Member

weex commented Sep 17, 2021

This helps to see the most common recent failures.

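# List recent failed CI runs, then print the failing scenario from each run's log.
# Assumes the run ID is the 7th tab-separated column of `gh run list` output.
gh run list -w CI -R c4social/diaspora | grep completed | grep failure | cut -f7 | while read -r line ; do
    gh run view "$line" --log-failed | grep Failing -A 1 | cut -f3 | cut -d' ' -f2-
done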
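
To rank the failures by how often they occur, the lines printed by the loop above could be counted. This is only a rough sketch: `tally_failures` is a hypothetical helper, not something in the repository, and the sample input is placeholder text, not real CI output.

```ruby
# Hypothetical helper: count how often each failure line appears and
# return [name, count] pairs, most frequent first.
def tally_failures(lines)
  lines
    .map(&:strip)                # drop surrounding whitespace and newlines
    .reject(&:empty?)            # ignore blank lines
    .tally                       # Enumerable#tally, Ruby 2.7+
    .sort_by { |name, count| [-count, name] }
end

# Placeholder input standing in for the output of the shell loop.
sample = ["scenario A\n", "scenario B\n", "scenario A\n", "\n"]

tally_failures(sample).each do |name, count|
  puts format("%4d  %s", count, name)
end
```

In practice the shell output would be saved to a file and passed in with something like `File.readlines`.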

Reading https://collectiveidea.com/blog/archives/2015/05/26/fixing-intermittent-failing-tests, it seems many of these may be caused by race conditions and can be solved by making the step wait for the expected result to appear before moving on. For example, in the step below, an action like click_link is followed immediately by end, so the next step might expect a state change that hasn't finished yet.

When /^I select all aspects$/ do
  within('#aspects_list') do
    click_link "Select all"
  end
end
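
The "wait before moving on" idea can be sketched in plain Ruby. This is only an illustration, assuming nothing about how the fix was actually made: `wait_until` is a hypothetical helper, and the thread simulates a browser action whose effect arrives a little later. In a real Capybara step, matchers such as `have_css` or `have_content` already retry up to `Capybara.default_max_wait_time`, so ending the step with an expectation on the resulting page state would play the same role; which element to wait for is not stated in this thread.

```ruby
require "timeout"

# Hypothetical helper (not part of diaspora or Capybara): poll until the
# block returns a truthy value, or raise after `timeout` seconds.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise Timeout::Error, "condition not met within #{timeout}s"
    end
    sleep interval
  end
end

# Simulate an action whose effect lands asynchronously, like a click that
# triggers a request in the browser.
state = { selected: false }
worker = Thread.new do
  sleep 0.2
  state[:selected] = true
end

# Checking state[:selected] right away would race with the worker;
# waiting for the condition removes the race.
wait_until { state[:selected] }
worker.join
```

Polling with a deadline keeps the test fast when the state changes quickly and still fails loudly if it never changes.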

@weex
Member

weex commented Oct 26, 2021

With #79 applied, flakiness is no longer being seen locally. Closing this, but feel free to reopen if more is detected in CI or locally.

@weex weex closed this as completed Oct 26, 2021
@weex
Member

weex commented Oct 26, 2021

If this seems fixed after a couple weeks more of development, then we'll make a PR for upstream.
