[BUG] Subscriptions for Matter devices for user is unnecessarily delayed until next scheduled resub retry in some cases #32241

Open
ndyck14 opened this issue Feb 21, 2024 · 6 comments

ndyck14 commented Feb 21, 2024

Reproduction steps

Topology: 10x Nanoleaf A19s, 1x HPM

Steps:

  1. Power down 5x Nanoleaf devices (running 3.6.x with CASE sub resume)
  2. Power cycle HPM (unclear if strict precondition)
  3. Wait ~1.5 hours
  4. Power up 5x Nanoleaf A19s

Results: 4 devices successfully reconnect. 1 device is left uncontrollable for ~an hour:

  1. initial CASE attempt fails (for unknown reasons), reschedule to 1 hour based on how long the device has been gone
  2. device proactively re-establishes CASE 15 seconds later
  3. SDK still waits for an hour to do resub on active session
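
To put numbers on the timeline above, here is a minimal Python sketch. The two values come from the steps above (retry rescheduled one hour out, device back 15 seconds later); the function name and the model itself are illustrative assumptions, not SDK code.

```python
def stall_seconds(retry_delay_s: float, session_ready_after_s: float) -> float:
    """Time a usable CASE session sits idle before the scheduled resubscribe fires.

    Both arguments are measured from the moment the initial CASE attempt fails.
    """
    # If the session only comes up after the retry fires, no time is wasted.
    return max(0.0, retry_delay_s - session_ready_after_s)


# Values from the report: retry pushed out 1 hour, device reconnects after 15 s.
wasted = stall_seconds(retry_delay_s=3600, session_ready_after_s=15)
print(f"{wasted:.0f} s (~{wasted / 60:.0f} min) with a live session but no subscription")
```

In other words, nearly the whole retry interval is spent with a working session that nothing is using.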

Bug prevalence

With correct preconditions it will be 100%. Otherwise it seems to depend on how many devices are booted up at a time

GitHub hash of the SDK that was being used

tvOS 17.3

Platform

darwin

Platform Version(s)

No response

Anything else?

No response

ndyck14 added the bug and needs triage labels Feb 21, 2024
@bzbarsky-apple
Contributor

In particular, we only retrigger subscription on receiving a ReportData, not on CASE establishment from the other side or the other side sending any other IM message.

@jtung-apple

@jtung-apple
Contributor

I'm taking a look, but my first thought is that currently the logic for re-subscription is only triggered when the IME gets called on OnUnsolicitedReportData. I could look into the changes needed to have CASE establishment plumb through to trigger a re-subscribe, and we can discuss if that's what's needed here.

@ndyck14 Do you happen to be able to readily reproduce this? If so could you upload logs from both devices?

I'm wondering if there's something else wrong / a bug that's causing this, that should be fixed first.

@ndyck14
Author

ndyck14 commented Feb 22, 2024

Power cycle HPM (unclear if strict precondition)

I did this with the intent of clearing CASE resume contexts (guesswork) so as to test the worst case. Basically, my brief mental model of sub resume is that it's a best effort by the device. Otherwise CASE is re-established because we're looking for OTA, I think.

I can try to reproduce, but is there evidence to suggest that the onus should not be on the subscriber to ensure it's done as swiftly as possible? I guess in case the sub is already active, we don't want to double up? Is doubling up even possible?

@ndyck14
Author

ndyck14 commented Feb 22, 2024

Note that my test steps were directed, done after I had already observed this previously without logs installed, so I've seen this happen multiple times. I've also been tracking CASEs (pun intended) for a year or more where things take too long to reconnect (e.g. #25091, which Boris reported on my behalf).

@bzbarsky-apple
Contributor

We should consider triggering resubscribe on both CASE establishment (using the new session, not creating a new one), and on any IM message received, not just ReportData.
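
A rough sketch of what that could look like, as a toy Python model rather than SDK code. All names here (`Subscriber`, `on_case_established`, `on_im_message`, `on_retry_timer`) are hypothetical and do not correspond to actual SDK APIs. The idea is only that any sign of life from the device cancels the pending retry and resubscribes on the session that already exists, and that an already-active subscription is left alone so nothing doubles up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Subscriber:
    """Toy model of one device subscription (hypothetical, not the SDK's types)."""

    active: bool = False              # True once a subscription is established
    retry_at: Optional[float] = None  # time of the pending scheduled retry, if any
    log: List[Tuple[float, str, str]] = field(default_factory=list)

    def schedule_retry(self, now: float, delay: float) -> None:
        """Record that the next resubscribe attempt is due at now + delay."""
        self.retry_at = now + delay

    def _resubscribe(self, now: float, session: str, reason: str) -> None:
        if self.active:
            return  # already subscribed: do not double up
        self.retry_at = None  # cancel the pending scheduled retry
        self.active = True
        self.log.append((now, session, reason))

    # Per the thread, only report data triggers a resubscribe today.
    def on_report_data(self, now: float, session: str) -> None:
        self._resubscribe(now, session, "report-data")

    # Proposed: the device re-established CASE, so reuse that session.
    def on_case_established(self, now: float, session: str) -> None:
        self._resubscribe(now, session, "case-established")

    # Proposed: any other IM message from the device counts as well.
    def on_im_message(self, now: float, session: str) -> None:
        self._resubscribe(now, session, "im-message")

    def on_retry_timer(self, now: float) -> None:
        """Scheduled retry; only here would a brand-new session be set up."""
        if self.retry_at is not None and now >= self.retry_at:
            self._resubscribe(now, "new-session", "scheduled-retry")


# Timeline from the report: failed attempt at t=0, device back at t=15 s.
sub = Subscriber()
sub.schedule_retry(now=0, delay=3600)
sub.on_case_established(now=15, session="device-initiated")
sub.on_retry_timer(now=3600)  # no-op: the retry was cancelled at t=15
print(sub.log)
```

With the proposed triggers in place, the model resubscribes at t=15 on the device-initiated session, and the one-hour timer has nothing left to do when it fires.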

woody-apple self-assigned this Feb 26, 2024
@woody-apple
Contributor

Assigning to me.
