Reproduction steps
What is supposed to happen if the connection to a device is lost right at the moment a (wildcard) Read subscription request is being made?
We got multiple reports of Nanoleaf devices not being available in our ecosystem (Home Assistant), in Apple Home, or in both. Digging into the issue, I noticed that for these devices setting up the subscription just never returned. It was stuck there with no exception and no error, just waiting forever.
It took me quite a bit of wrestling, but I finally managed to reproduce the issue. The device communication seems to get disturbed with a "CHIP Error 0x00000021: End of TLV" during the transmission of the attributes from the Read request. It is then restored, but fails again, over and over. The call to do the read never returns (technically, it never completed), but there is also no timeout. I left it for hours and it never left this state.
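As a client-side stopgap for the missing timeout, the setup call could be bounded from the Python side. This is a minimal sketch, not SDK code: `start_subscription` is a hypothetical zero-argument coroutine function standing in for whatever wraps the real Read/subscribe call, and `SubscriptionSetupTimeout` is a made-up exception name. It also assumes the underlying call is awaitable and tolerates cancellation, which I have not verified for the Python bindings.

```python
import asyncio


class SubscriptionSetupTimeout(Exception):
    """Hypothetical error: the initial subscription did not complete in time."""


async def setup_with_timeout(start_subscription, timeout_s: float):
    """Await the initial subscription setup, but give up after timeout_s seconds.

    start_subscription is a hypothetical zero-argument coroutine function that
    wraps the real (wildcard) Read/subscribe call.
    """
    try:
        # wait_for cancels the pending coroutine when the timeout expires.
        return await asyncio.wait_for(start_subscription(), timeout=timeout_s)
    except asyncio.TimeoutError as err:
        raise SubscriptionSetupTimeout(
            f"subscription setup did not complete within {timeout_s}s"
        ) from err
```

Note that this only unblocks the caller. Whether cancelling the Python task also stops the retry loop inside the SDK is a separate question.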
So basically it's already triggering the auto-resubscribe logic while the initial subscription has not yet been set up. In fact, it's not even past the point where we assign the callback functions for the various events.
In my opinion we are dealing with some special case here. Look at the log I shared below: there's a very distinct pattern in the loop of retries. Is this a device issue, or perhaps a Thread-level issue? Fragmentation, maybe?
Also, maybe I'm wrong in this, but to me the subscription should not auto-resubscribe yet at this stage. It should throw an exception saying that setting up the subscription failed. It should only start doing auto-resubscribes once the initial read request has succeeded.
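To make the behavior I'd expect concrete, here is a small model of it. This is a sketch of the proposal only, not of how the SDK currently works. All names (`SubscriptionState`, `InitialSubscriptionError`, `on_established`, `on_error`) are hypothetical and do not correspond to SDK APIs.

```python
from dataclasses import dataclass


class InitialSubscriptionError(Exception):
    """Hypothetical error: the subscription failed before it was ever established."""


@dataclass
class SubscriptionState:
    """Tracks whether the initial subscription has completed at least once."""

    established: bool = False

    def on_established(self) -> None:
        # Called once the initial read/subscribe has fully succeeded.
        self.established = True

    def on_error(self, error: Exception) -> str:
        # Before the first success: surface the failure to the caller.
        if not self.established:
            raise InitialSubscriptionError(
                "setting up the subscription failed"
            ) from error
        # After the first success: recovering via auto-resubscribe is fine.
        return "resubscribe"
```

With this split, the caller that requested the subscription gets an exception it can handle (retry later, mark the node unavailable), and the silent retry loop is limited to subscriptions that were working at some point.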
Also, to complete the info: this issue seems to be triggered when the device has somewhat bad reception or picks a border router with a bad Wi-Fi connection. At least, that is my theory. I could reproduce it by wrapping a light bulb in tin foil to disturb its communication, and also when a somewhat distant Apple HomePod mini picked the wrong access point and so had a flaky Wi-Fi signal.
Bug prevalence
We now have a few reports from production setups.
GitHub hash of the SDK that was being used
v1.2.0.1 (181b0cb)
Platform
python
Platform Version(s)
No response
Anything else?
As reproducing this is so hard, and we have no idea whether only Nanoleaf devices are affected or whether it happens in combination with particular border routers, we decided to add some very visible logging to our project so we can track user reports of the issue in a more structured manner: home-assistant-libs/python-matter-server#623
I captured some logging. The log starts at the Read request (from the Python C bindings):
qrghqkYg.txt