Skip to content

Wear: use coroutines from GMS tasks#6638

Merged
TimoPtr merged 6 commits intomainfrom
fix/crash_on_timeout
Apr 1, 2026
Merged

Wear: use coroutines from GMS tasks#6638
TimoPtr merged 6 commits intomainfrom
fix/crash_on_timeout

Conversation

@TimoPtr
Copy link
Copy Markdown
Member

@TimoPtr TimoPtr commented Mar 27, 2026

Summary

While looking on the Play Store console, our biggest crash (94% of them)
image
is due to the fact that we are not handling proper ApiException in the OnboardingActivity. This PR solves this but also migrate to proper coroutines instead of using new threads. I also took the opportunity to also review the coroutine scopes in the listeners used to communicate with the mobile app.

Checklist

  • New or updated tests have been added to cover the changes following the testing guidelines.
  • The code follows the project's code style and best_practices.
  • The changes have been thoroughly tested, and edge cases have been considered.
  • Changes are backward compatible whenever feasible. Any breaking changes are documented in the changelog for users and/or in the code for developers depending on the relevance.

Any other notes

I also remove the Suppress annotation about VisibleForTest since it has been addressed in the upstream.

Copilot AI review requested due to automatic review settings March 27, 2026 09:06
}
}

private fun sendHomeAssistantInstance(nodeId: String) = runBlocking {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One less runBlocking

private fun findExistingInstances() {
private suspend fun findExistingInstances() = withContext(Dispatchers.IO) {
Timber.d("findExistingInstances")
Tasks.await(Wearable.getDataClient(this).getDataItems("wear://*/home_assistant_instance".toUri())).apply {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This most probably the cause of the crashes on the play console but I'm not 100% since the stacktrace is not mentioning anything from our side.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces Wear onboarding-related crashes by migrating GMS Task usage from manual threads/blocking calls to coroutine-based await() and by handling ApiException failures more explicitly.

Changes:

  • Replace Thread { ... } / blocking task usage in Wear onboarding with lifecycleScope coroutines and Task.await()
  • Introduce coroutine scopes for Wear listener services and cancel them in onDestroy()
  • Narrow some exception handling to ApiException when interacting with Wearable DataClient/MessageClient tasks

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

File Description
wear/src/main/kotlin/io/homeassistant/companion/android/phone/PhoneSettingsListener.kt Move service work onto a dedicated coroutine scope, add main-thread hop for startActivity, and adjust error handling for wearable data operations
wear/src/main/kotlin/io/homeassistant/companion/android/onboarding/OnboardingActivity.kt Replace background threads and Tasks.await with coroutine-based async work and ApiException handling
wear/src/main/kotlin/io/homeassistant/companion/android/onboarding/OnboardingPresenterImpl.kt Remove VisibleForTests suppression annotation/import
app/src/full/kotlin/io/homeassistant/companion/android/onboarding/WearOnboardingListener.kt Replace runBlocking with a service coroutine scope and Task.await() for sending instance data

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

@jpelgrom jpelgrom changed the title Wear: use coroutines fro GMS tasks Wear: use coroutines from GMS tasks Mar 29, 2026
@TimoPtr TimoPtr requested a review from jpelgrom March 30, 2026 08:31
Copy link
Copy Markdown
Member

@jpelgrom jpelgrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been unable to onboard a new watch with the changes in this PR. After switching the watch back to a build based on the main branch, onboarding completed immediately.

It appears to be cancelling something but adb isn't being very helpful - it switches between cancelled in the first or second (like this trace) network call or just abruptly ends after the first network call.

2026-03-30 12:14:04.496  7022-7051  n.android.debug         io....stant.companion.android.debug  I  Explicit concurrent mark compact GC freed 96KB AllocSpace bytes, 0(0B) LOS objects, 44% free, 7819KB/13MB, paused 1.133ms,10.483ms total 195.624ms
2026-03-30 12:14:06.215  7022-7022  OnboardingPresenterImpl io....stant.companion.android.debug  D  onDataChanged: [1]
2026-03-30 12:14:06.216  7022-7204  GoogleSignatureVerifier io....stant.companion.android.debug  I  package info is not set correctly
2026-03-30 12:14:06.223  7022-7749  PhoneSettingsListener   io....stant.companion.android.debug  D  onDataChanged 1
2026-03-30 12:14:06.261  7022-7750  OkHttp                  io....stant.companion.android.debug  I  --> POST https://ha.example.com/auth/token
2026-03-30 12:14:06.262  7022-7750  OkHttp                  io....stant.companion.android.debug  I  Content-Type: application/x-www-form-urlencoded
2026-03-30 12:14:06.263  7022-7750  OkHttp                  io....stant.companion.android.debug  I  Content-Length: 119
2026-03-30 12:14:06.264  7022-7750  OkHttp                  io....stant.companion.android.debug  I  
2026-03-30 12:14:06.267  7022-7750  OkHttp                  io....stant.companion.android.debug  I  grant_type=authorization_code&code=b7dc9956d25342ad9da44e1a9025c642&client_id=https%3A%2F%2Fhome-assistant.io%2Fandroid
2026-03-30 12:14:06.268  7022-7750  OkHttp                  io....stant.companion.android.debug  I  --> END POST (119-byte body)
2026-03-30 12:14:06.767  7022-7038  n.android.debug         io....stant.companion.android.debug  I  Waiting for a blocking GC ProfileSaver
2026-03-30 12:14:07.062  7022-7051  n.android.debug         io....stant.companion.android.debug  I  Explicit concurrent mark compact GC freed 721KB AllocSpace bytes, 0(0B) LOS objects, 43% free, 8119KB/13MB, paused 16.502ms,11.677ms total 459.981ms
2026-03-30 12:14:07.064  7022-7038  n.android.debug         io....stant.companion.android.debug  I  WaitForGcToComplete blocked ProfileSaver on Explicit for 297.619ms
2026-03-30 12:14:07.066  7022-7750  OkHttp                  io....stant.companion.android.debug  I  <-- 200 https://ha.example.com/auth/token (797ms)
2026-03-30 12:14:07.066  7022-7750  OkHttp                  io....stant.companion.android.debug  I  date: Mon, 30 Mar 2026 10:14:05 GMT
2026-03-30 12:14:07.066  7022-7750  OkHttp                  io....stant.companion.android.debug  I  content-type: application/json
2026-03-30 12:14:07.066  7022-7750  OkHttp                  io....stant.companion.android.debug  I  cache-control: no-store
2026-03-30 12:14:07.066  7022-7750  OkHttp                  io....stant.companion.android.debug  I  pragma: no-cache
2026-03-30 12:14:07.067  7022-7750  OkHttp                  io....stant.companion.android.debug  I  referrer-policy: no-referrer
2026-03-30 12:14:07.067  7022-7750  OkHttp                  io....stant.companion.android.debug  I  x-content-type-options: nosniff
2026-03-30 12:14:07.067  7022-7750  OkHttp                  io....stant.companion.android.debug  I  x-frame-options: SAMEORIGIN
2026-03-30 12:14:07.067  7022-7750  OkHttp                  io....stant.companion.android.debug  I  vary: accept-encoding
2026-03-30 12:14:07.072  7022-7750  OkHttp                  io....stant.companion.android.debug  I  
2026-03-30 12:14:07.072  7022-7750  OkHttp                  io....stant.companion.android.debug  I  {"access_token":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJlMDdkZTk2NDhlZTU0OTMwYTcxNWUwMmYxM2ZkOTZjZSIsImlhdCI6MTc3NDg2NTY0NSwiZXhwIjoxNzc0ODY3NDQ1fQ.2Srt2vP2Dzp_P_YLop6vL9hbsf9TQxDo00U8xLiwbHo","token_type":"Bearer","refresh_token":"d182024223edebfd0cf62c9db31f5462a2898167fbf2ef463bb56ac4fcee1289123ab138e40f7947d9b51066a4a8332b476dae0eb859492439db5f194a40d57b","expires_in":1800,"ha_auth_provider":"homeassistant"}
2026-03-30 12:14:07.072  7022-7750  OkHttp                  io....stant.companion.android.debug  I  <-- END HTTP (802ms, 424-byte body)
2026-03-30 12:14:07.358  7022-7726  NetworkCha...workChange io....stant.companion.android.debug  D  Register network callback
2026-03-30 12:14:07.428  7022-7761  ServerConn...iderImplKt io....stant.companion.android.debug  D  Using external URL
2026-03-30 12:14:07.455  7022-7727  NetworkCha...workChange io....stant.companion.android.debug  D  Unregister network callback
2026-03-30 12:14:07.553  7022-7090  NetworkCha...workChange io....stant.companion.android.debug  D  Register network callback
2026-03-30 12:14:07.588  7022-7725  ServerConn...iderImplKt io....stant.companion.android.debug  D  Using external URL
2026-03-30 12:14:07.608  7022-7726  NetworkCha...workChange io....stant.companion.android.debug  D  Unregister network callback
2026-03-30 12:14:07.781  7022-7022  LeakCanary              io....stant.companion.android.debug  D  Watching instance of io.homeassistant.companion.android.phone.PhoneSettingsListener (io.homeassistant.companion.android.phone.PhoneSettingsListener received Service#onDestroy() callback) with key 040429ac-fc72-4e7e-8f04-d23dcf24da04
2026-03-30 12:14:07.784  7022-7750  OkHttp                  io....stant.companion.android.debug  I  --> POST https://ha.example.com/api/mobile_app/registrations
2026-03-30 12:14:07.785  7022-7750  OkHttp                  io....stant.companion.android.debug  I  Content-Type: application/json; charset=UTF-8
2026-03-30 12:14:07.785  7022-7750  OkHttp                  io....stant.companion.android.debug  I  Content-Length: 567
2026-03-30 12:14:07.785  7022-7750  OkHttp                  io....stant.companion.android.debug  I  Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJlMDdkZTk2NDhlZTU0OTMwYTcxNWUwMmYxM2ZkOTZjZSIsImlhdCI6MTc3NDg2NTY0NSwiZXhwIjoxNzc0ODY3NDQ1fQ.2Srt2vP2Dzp_P_YLop6vL9hbsf9TQxDo00U8xLiwbHo
2026-03-30 12:14:07.786  7022-7750  OkHttp                  io....stant.companion.android.debug  I  
2026-03-30 12:14:07.788  7022-7750  OkHttp                  io....stant.companion.android.debug  I  {"app_id":"io.homeassistant.companion.android","app_name":"Home Assistant","app_version":"2026.3.7-SNAPSHOT (2)","device_name":"Galaxy Watch4 (QP9H)","manufacturer":"samsung","model":"SM-R860","os_name":"Android","os_version":"36","supports_encryption":false,"app_data":{"push_websocket_channel":false,"push_url":"https://europe-west1-ha-fcm-push.cloudfunctions.net/androidV1","push_token":"cNqJYWW3SW-PKkN7pfex7A:APA91bFB0j4xPpGSgx2B_MEkLjDA0HHVwX2BDU51v4VOUTpwFWeJn9cyq3vaYldq2apKbMt7UGO0PprPyMBdsngpSl5WWTgwePUehX2pm5SdoaEp1MJVEyc"},"device_id":"a38cb84a713f3361"}
2026-03-30 12:14:07.788  7022-7750  OkHttp                  io....stant.companion.android.debug  I  --> END POST (567-byte body)
2026-03-30 12:14:07.796  7022-7750  OkHttp                  io....stant.companion.android.debug  I  <-- HTTP FAILED: java.io.IOException: Canceled. https://ha.example.com/api/mobile_app/registrations (6ms)

Comment on lines +82 to +87
private val serviceScope: CoroutineScope = CoroutineScope(Dispatchers.IO + SupervisorJob())

override fun onDestroy() {
serviceScope.cancel()
super.onDestroy()
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be here that we cancel but it means that the service is not alive very long.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that we were leaking the service before.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpelgrom maybe we could use the WorkManager here ? 🤔 with the risk that it is not executed right away? Otherwise we might need to do this listening in the app itself and forbid it to go to sleep so we keep listening (I don't know if it's possible) until we are setup.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might have leaked before (and/or Samsung might be limiting more here than technically should happen), but it does work with the leaking code. Clean code but it not working is worse.

@jpelgrom maybe we could use the WorkManager here ? 🤔 with the risk that it is not executed right away?

This is in response to the user trying to sign in so "not executed right away" is undesirable. It also ends with launching an activity which cannot be done from a job. However early cancellation in the other functions is also bad, changes may not be saved. Maybe it is completing early because of launch - can we make it wait until done? (I didn't find it in a quick research, maybe also need experimentation/checking the code of the GMS listener service.)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was pretty sure I did the onboarding but on the emulator maybe it changes.
Do we really need a service? Is there something to listen within an activity so we have a better control on the lifecycle?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do it in the activity. However, onboarding on the phone takes a bit of time and watches very quickly turn off by default / if you put your wrist down so we need to be sure something is alive to send the onboarding data back to.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can keep the watch on on this activity? (I'm not sure it's possible)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, something to try :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end I decided to revert this part and document it. I kept the changes needed to properly catch the error so it doesn't crash the app. If you can double check.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double checked and can confirm it now works again while there is progress to move towards coroutines :). Good enough for now in my opinion, otherwise this PR scope increases a lot.

@TimoPtr TimoPtr requested a review from jpelgrom March 31, 2026 14:20
@TimoPtr TimoPtr merged commit cf43f35 into main Apr 1, 2026
39 of 40 checks passed
@TimoPtr TimoPtr deleted the fix/crash_on_timeout branch April 1, 2026 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants