Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to retry flaky Jasmine tests #19135

Merged
merged 2 commits into from Apr 18, 2019

Conversation

Projects
None yet
4 participants
@nathansobo
Copy link
Contributor

commented Apr 10, 2019

Our builds are currently plagued by a handful of timing-dependent tests that fail intermittently. The ideal solution would be to find and fix the source of indeterminacy in these tests, as I did in atom/settings-view#1123, but that can be time consuming. An even better solution would be to restructure these code paths and tests to not depend on timing in a critical way, but that would be even more difficult. Another alternative is just to delete the offending tests, but I'm reluctant to lose the coverage on functionality that they are trying to document.

So in the interests of reducing spurious build failures quickly without resorting to deleting tests, I'm introducing the ability to retry flaky tests as an experiment. Teammates have expressed concerns about the long-term impact of doing this and I myself am not totally sure it's a good idea, but I think it's worth trying. This feature essentially transforms flaky tests that cause massive delays in our build process into technical debt. We still to prioritize cleaning them up, but we can mitigate their impact on our build process in the short term. If we're still seeing a test be unreliable after retrying it, at that point I think we
should either delete it or make fixing it a top priority in that sprint.

So if you encounter a flaky test and you've spent some time trying to figure it out and just can't, you now have a new option. You can add the following to the top of the test body:

 it('fails 5% of the time on Windows x86 for unknown reasons and ruins our lives', async function () {
    this.RETRY_FLAKY_TEST_AND_SLOW_DOWN_THE_BUILD()

    // ... insert remainder of poorly-designed test case here ;-)
})

I've deliberately made using this option ugly and screamingly obvious to discourage its use. Sorry in advance if this ends up being a bad idea.

@nathansobo nathansobo requested review from smashwilson and rafeca Apr 10, 2019

@50Wliu

This comment has been minimized.

Copy link
Member

commented Apr 10, 2019

Will this work for tests that are using arrow functions? (e.g. in the test you changed, if it would still work with async () =>)

@nathansobo

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

@50Wliu I know it works for async test functions. I didn't test with => syntax, but I don't see any reason why it shouldn't work. It essentially just restores the spec's queue and re-enqueues it, so anything that works already in Jasmine should work unless I'm missing something.

@nathansobo

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

One thing I forgot to note. Don't use this in package specs until Atom 1.38.0 hits stable, because it won't be available in CI. Or at the very least feature detect it.

@rafeca

This comment has been minimized.

Copy link
Contributor

commented Apr 11, 2019

Will this work for tests that are using arrow functions? (e.g. in the test you changed, if it would still work with async () =>)

I don't think that this is going to work for tests defined with arrow functions, since the arrow function will bind the context of the spec to be the context of the describe callback, which based on Jasmine's code is going to be a jasmine.Suite instance.

Additionally, any spec that uses the async-spec-helpers.js helper methods that we have around the many Atom repos is not going to be able to use the retry logic, since the overriden it functions defined there pass an undefined context to the spec function (source).

Good thing is that after #18917 there's no need to use the it implementations from async-spec-helpers.js anymore (in fact I removed all of them from Atom core), and the new global it functions introduced in #18917 handle the context correctly.

@nathansobo

This comment has been minimized.

Copy link
Contributor Author

commented Apr 11, 2019

Of course. Feeling pretty dumb. Arrows bond the context and I depend on this. I did that so you could easily combine retires with fit etc. The async helpers situation is more concerning. But they are gone now?

@Arcanemagus

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2019

The async helpers situation is more concerning. But they are gone now?

After #18917 they are no longer needed as the functionality they included is now the default behavior of the built in Jasmine runner, however as mentioned only the bundled packages that have already been migrated into atom/atom have been updated, the rest would still potentially be using an existing separate implementation.

@nathansobo

This comment has been minimized.

Copy link
Contributor Author

commented Apr 18, 2019

@Arcanemagus It seems like if we wanted to use this feature in such a package, we could migrate the tests to use the new global versions of the Jasmine functions at that time.

I just encountered another instance of a flaky pathwatcher test that this PR would have avoided, so I'm going to merge it. We should use this tool sparingly and reevaluate going forward.

@nathansobo nathansobo merged commit d4d14a4 into master Apr 18, 2019

2 checks passed

Atom Pull Requests #20190410.4 succeeded
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details

@nathansobo nathansobo deleted the ns-jasmine-retries branch Apr 18, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.