This repository has been archived by the owner on Feb 24, 2021. It is now read-only.

Fix Flaky tests #226

Merged
merged 36 commits into from
Jul 17, 2020

Conversation

nomisRev
Member

@nomisRev nomisRev commented Jul 7, 2020

We've been seeing some flakiness over the past 2-3 weeks.
It has mostly stabilized, but not completely.

We seem to be seeing some flakiness due to stress-related increased CPU latency. This test suite runs fine on my machine and some of this flakiness is not reproducible locally.

Due to increased CPU latency, scheduling becomes less precise. This is especially noticeable in tests that use sleep, currentTimeMillis or nanoTime.

We're also running all Gradle modules in parallel, which means that the test suites of Arrow Fx, Arrow Fx Coroutines, Reactor & RxJava are all running at the same time.
Perhaps this is not ideal for test suites that use concurrency in almost every test.

Therefore I think we need to see where we can rewrite flaky sleep or scheduling tests to use latches instead of sleep. See #211 for such a fix.
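
To illustrate the latch idea, here is a minimal sketch that uses only the JDK's CountDownLatch and Kotlin's thread helper. The worker and the function names are hypothetical and are not taken from the Arrow test suite; the point is only that the test waits for an explicit signal instead of sleeping for a guessed duration.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicBoolean
import kotlin.concurrent.thread

// Hypothetical worker standing in for the code under test (not Arrow API).
fun startWorker(started: CountDownLatch, release: CountDownLatch, done: AtomicBoolean): Thread =
    thread {
        started.countDown() // signal that the worker is running
        release.await()     // park until the test lets it continue
        done.set(true)
    }

fun workerCompletesAfterRelease(): Boolean {
    val started = CountDownLatch(1)
    val release = CountDownLatch(1)
    val done = AtomicBoolean(false)
    val worker = startWorker(started, release, done)

    // Instead of Thread.sleep(100) and hoping the worker has started,
    // wait for its signal. The timeout is only a safety net against hanging.
    if (!started.await(10L, TimeUnit.SECONDS)) return false

    // The worker cannot pass `release` yet, so this check does not depend on timing.
    val notDoneBeforeRelease = !done.get()

    release.countDown()
    worker.join(10_000L)
    return notDoneBeforeRelease && done.get()
}

fun main() {
    println(workerCompletesAfterRelease())
}
```

Under CPU pressure this version may take longer to finish, but its result does not change, which is the property the sleep-based tests lack.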

This PR is meant for discussion, so we can try and think of a battle plan to tackle this.

2 suggestions I think could help us try and pinpoint this:

  • We could disable parallel Gradle execution, merge this PR, and keep an eye on the flakiness.
  • Collaborate, debug & experiment on the CI itself in this PR to see if we can make it stable.

@JorgeCastilloPrz
Member

Disabling parallel execution temporarily could be a good way to pinpoint the issue first. We disable it for a while and keep an eye on the flakiness. The aim would be to reduce the risk of chasing false positives that end up driving a big, potentially unnecessary refactor.

@rachelcarmena
Member

Thanks @nomisRev 🙌

I think we should follow the Kotest improvements. For instance, 4.1.1 was released. However, I cannot find the release notes (neither in the CHANGELOG nor in the GitHub releases).

Simon, you could include a new version variable for Kotest here to speed up those updates from this repository.

Member

@aballano aballano left a comment


LGTM

@nomisRev
Member Author

@rachelcarmena @aballano @danimontoya @raulraja
It seems the Arrow Fx Coroutines test suite is much more stable, but I can still see QueueTest failing intermittently.

I think we can fix the QueueTest by removing the assertion on elapsed time. The semantics will still be asserted, and we also have tests for waitFor that pass in less complex scenarios.
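
As a rough sketch of what dropping the elapsed-time assertion could look like, the example below uses the JDK's ArrayBlockingQueue purely as a stand-in; it is not Arrow's Queue, and the function name is hypothetical. The test checks only the outcome of a timed offer on a full queue, not how long the call took.

```kotlin
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.TimeUnit

// Stand-in bounded queue from the JDK, used only for illustration.
fun offerToFullQueueTimesOut(): Boolean {
    val queue = ArrayBlockingQueue<Int>(1)
    queue.put(1) // the queue is now full

    // A timing-sensitive test would wrap this call in System.nanoTime()
    // and assert that the elapsed time is close to 50 ms, which is fragile
    // on a loaded CI machine. Here only the semantics are asserted.
    val accepted = queue.offer(2, 50L, TimeUnit.MILLISECONDS)

    // The offer must be rejected and the original element must be untouched.
    return !accepted && queue.size == 1 && queue.peek() == 1
}

fun main() {
    println(offerToFullQueueTimesOut())
}
```

The trade-off is that the test no longer guards the precision of the timeout itself, so that property has to be covered by simpler, dedicated tests.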

It seems this was already flaky before the 0.10.5 release; I found this PR: Fix Queue flaky tests

@nomisRev
Member Author

nomisRev commented Jul 17, 2020

CI failed in 6ce9d18 on Dokka. This is a known issue, and we're waiting for the new version of Dokka.
Both build steps were green.

Execution failed for task ':arrow-core:dokka'.
> Guice provision errors:
  
  1) Error injecting constructor, java.lang.RuntimeException: Exception while loading package-list from ExternalDocumentationLinkImpl(url=https://kotlinlang.org/api/latest/jvm/stdlib/, packageListUrl=https://kotlinlang.org/api/latest/jvm/stdlib/package-list)
    at org.jetbrains.dokka.PackageListProvider.<init>(ExternalDocumentationLinkResolver.kt:35)
    at org.jetbrains.dokka.PackageListProvider.class(ExternalDocumentationLinkResolver.kt:35)
    while locating org.jetbrains.dokka.PackageListProvider
      for parameter 4 at org.jetbrains.dokka.ExternalDocumentationLinkResolver.<init>(ExternalDocumentationLinkResolver.kt:204)
    while locating org.jetbrains.dokka.ExternalDocumentationLinkResolver
      for parameter 4 at org.jetbrains.dokka.DeclarationLinkResolver.<init>(DeclarationLinkResolver.kt:10)
    while locating org.jetbrains.dokka.DeclarationLinkResolver
      for parameter 0 at org.jetbrains.dokka.PackageDocs.<init>(PackageDocs.kt:20)
    at org.jetbrains.dokka.PackageDocs.class(PackageDocs.kt:19)
    while locating org.jetbrains.dokka.PackageDocs

@nomisRev
Member Author

@JorgeCastilloPrz @rachelcarmena @aballano

The build & build integration were green for 3 commits in a row, with an increased iteration rate of 350 compared to 100 on master.
I'm going to merge now, but we should keep an eye out for anomalies.

@nomisRev nomisRev changed the title Flaky tests Fix Flaky tests Jul 17, 2020
@nomisRev nomisRev merged commit c25744f into master Jul 17, 2020
@nomisRev nomisRev deleted the sv-flaky-test-hunt branch July 17, 2020 11:55
5 participants