Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #2032 & #3714: stabilize AndroidX workers #3716

Merged
merged 3 commits into from
Aug 22, 2021

Conversation

BenHenning
Copy link
Sponsor Member

@BenHenning BenHenning commented Aug 22, 2021

Explanation

Fix #2032
Fix #3714

Stabilizes both PlatformParameterSyncUpWorker & LogUploadWorker by using a ListenableWorker instead of CoroutineWorker. In order to ensure the tests aren't flaky, we have to fully control the execution pipeline for WorkManager. We get halfway there by using a SynchronousExecutor when setting up the work manager, but CoroutineExecutor has a "hop" using Kotlin's default executor before doWork() is called. Even though we later use withContext to hop over to our own background executor, that little hop is on an unmanaged thread that we cannot synchronize against in the test (which results in flakes).

The fix is to use ListenableWorker directly and immediately kick-off a coroutine using our own dispatcher, avoiding any hops. With this setup, the test environment utilizing both SynchronousExecutor and TestCoroutineDispatchers results in a fully managed & deterministic environment. Fortunately, there should be more or less no change to the workers' production behavior since the new implementation added here is close to equivalent to CoroutineWorker (at least for doWorker; there are other aspects of CoroutineWorker that differ which we are ignoring--I think this is fine).

One note on correctness: an issue with both the new implementation of the workers & CoroutineWorker is neither anticipate the potential of the coroutine task indefinitely hanging. We really ought to introduce a withTimeout, but we can't in a way that won't reintroduce a potential flake in the test until #3715 is resolved. Given this isn't a change in existing prod behavior, I suspect that it's fine.

To verify that this fix works, I ran both workers' test suites 100 times locally and saw no failures (previously, I would see about ~20 failures for PlatformParameterSyncUpWorker and ~12 for LogUploadWorker).

Finally, to prevent this again in the future I introduced a new pattern check to prohibit using CoroutineWorker. I also added one for SettableFuture for good measure since it's challenging to use correctly, though I suspect this will rarely be hit since we prefer using coroutines over ListenableFutures.

Checklist

  • The PR title starts with "Fix #bugnum: ", followed by a short, clear summary of the changes. (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • The PR explanation includes the words "Fixes #bugnum: ..." (or "Fixes part of #bugnum" if the PR only partially fixes an issue).
  • The PR follows the style guide.
  • The PR does not contain any unnecessary auto-generated code from Android Studio.
  • The PR is made from a branch that's not called "develop".
  • The PR is made from a branch that is up-to-date with "develop".
  • The PR's branch is based on "develop" and not on any other branch.
  • The PR is assigned to an appropriate reviewer in both the Assignees and the Reviewers sections.

Stabilizes AndroidX worker classes PlatformParameterSyncUpWorker and
LogUploadWorker by synchronously kicking off a managed coroutine in the
test environment.
@BenHenning BenHenning marked this pull request as ready for review August 22, 2021 05:47
@vinitamurthi vinitamurthi removed their assignment Aug 22, 2021
@BenHenning
Copy link
Sponsor Member Author

Thanks @vinitamurthi.

Looks like Gradle tests are failing from a missing dep. Will push a fix momentarily; just need to verify it locally first.

Copy link
Contributor

@Arjupta Arjupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining this fix. Changes LGTM.

@Arjupta Arjupta assigned BenHenning and unassigned Arjupta Aug 22, 2021
@oppiabot oppiabot bot added the PR: LGTM label Aug 22, 2021
@oppiabot
Copy link

oppiabot bot commented Aug 22, 2021

Hi @BenHenning, this PR is ready to be merged. Please address any remaining comments prior to merging, and feel free to merge this PR once the CI checks pass and you're happy with it. Thanks!

@BenHenning
Copy link
Sponsor Member Author

@BenHenning
Copy link
Sponsor Member Author

Given the only failure is an unrelated CI flake, merging this.

@BenHenning BenHenning merged commit 48da897 into develop Aug 22, 2021
@BenHenning BenHenning deleted the stabilize-worker-tests branch August 22, 2021 09:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PlatformParameterSyncUpWorkerTest is very flaky LogUploadWorkerTests sometimes fail on github actions.
3 participants