lumi.ctrl_neutral2 switches still lost in v2.05.39 #798

Closed
ebaauw opened this issue Sep 23, 2018 · 11 comments

Comments

@ebaauw
Collaborator

ebaauw commented Sep 23, 2018

My lumi.ctrl_neutral2 end-device switches are still somewhat lost after restarting deCONZ v2.05.39:

$ ph get /lights/421
{
  "etag": "2b585d2c016bfd665ba27a8fdad28670",
  "hascolor": false,
  "manufacturername": "LUMI",
  "modelid": "lumi.ctrl_neutral2",
  "name": "Living Room Fan 1",
  "state": {
    "alert": "none",
    "on": false,
    "reachable": true
  },
  "swversion": "11-11-2016",
  "type": "Smart plug",
  "uniqueid": "00:15:8d:00:02:39:14:5e-02"
}
$ ph put /lights/421/state '{"on": true}'
ph put: error: 3 resource, /lights/421, not available

The switches are active on the network, and deCONZ is receiving and answering the requests for the Time cluster. However, that doesn't trigger the REST API plugin to make the resource available. Reading the Basic cluster attributes on endpoint 0x01 doesn't make a difference either. Reading the OnOff cluster attributes on endpoint 0x02 makes available the corresponding light resource. The light resource corresponding to endpoint 0x03 is still unavailable, though, until I read the OnOff cluster attributes on that endpoint as well.
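For reference, the `ph put` error above can be read as the REST API refusing the request while the resource is still flagged unavailable. A minimal sketch, assuming the API answers with a Hue-style JSON list containing an `error` object; the sample body is reconstructed from the `ph` output rather than captured from the gateway, and `describe_put_result` is a hypothetical helper:

```python
import json


def describe_put_result(body: str) -> str:
    """Summarise the JSON body returned for PUT /lights/<id>/state."""
    for item in json.loads(body):
        if "error" in item:
            err = item["error"]
            # Same layout as the message printed by ph: "<type> <description>"
            return f"error: {err['type']} {err['description']}"
    return "ok"


# Assumed response body for the failing call (not a captured trace).
sample = (
    '[{"error": {"type": 3, "address": "/lights/421", '
    '"description": "resource, /lights/421, not available"}}]'
)
print(describe_put_result(sample))
```

The GET still succeeds because the resource exists; only the state change is rejected until the endpoint has been heard from.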

Note that these switches are mains powered end-devices, but report Receiver on when idle as false in the Node info panel. I don't think that's correct, though: reading cluster attributes always seems to give an answer immediately. Also note that these are the newer models without a neutral wire connection, despite the name.

@manup
Member

manup commented Sep 23, 2018

Note that these switches are mains powered end-devices, but report Receiver on when idle as false in the Node info panel. I don't think that's correct, though: reading cluster attributes always seems to give an answer immediately. Also note that these are the newer models without a neutral wire connection, despite the name.

Can you please make a screenshot of the Node Info Panel, maybe the Node Descriptor wasn't fetched yet. Hopefully the correct rxOnWhenIdle flag is available; otherwise this will for sure cause problems in various cases.

The rxOnWhenIdle flag can be extracted from Device Announce and Node Descriptor, the former can be improved by putting the macCapabilities into Node Descriptor even if it is not fetched yet, I'm not sure this is currently done in every case.
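As an illustration of where the flag lives: both Device Announce and the Node Descriptor carry the IEEE 802.15.4 capability byte, in which bit 3 is "receiver on when idle". The sketch below decodes that byte. The names `MacCapabilities` and `decode_mac_capabilities` are made up for this example and are not deCONZ functions, and the two sample values are typical ones, not values read from the switches in this issue.

```python
from dataclasses import dataclass

# Bit masks of the IEEE 802.15.4 capability information field.
FULL_FUNCTION_DEVICE = 0x02
MAINS_POWERED = 0x04
RX_ON_WHEN_IDLE = 0x08


@dataclass
class MacCapabilities:
    full_function_device: bool
    mains_powered: bool
    rx_on_when_idle: bool


def decode_mac_capabilities(flags: int) -> MacCapabilities:
    """Extract the power-related bits from the capability byte."""
    return MacCapabilities(
        full_function_device=bool(flags & FULL_FUNCTION_DEVICE),
        mains_powered=bool(flags & MAINS_POWERED),
        rx_on_when_idle=bool(flags & RX_ON_WHEN_IDLE),
    )


# Illustrative values only: a sleepy end-device and a mains-powered router.
print(decode_mac_capabilities(0x80))
print(decode_mac_capabilities(0x8E))
```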

@ebaauw
Collaborator Author

ebaauw commented Sep 23, 2018

Can you please make a screenshot of the Node Info Panel

[Screenshot: Node Info panel]

There's definitely some improvement in v2.05.39. It does find more clusters than previous versions, notably the Groups, Scenes and 0x0010 in endpoints 0x02 and 0x03. Also the device types are reported correctly now.

maybe the Node Descriptor wasn't fetched yet

I doubt it, the other values seem correct. It's probably just another Xiaomi special... Have a look at the Basic cluster, which reports that it's on DC power.
[Screenshot: Basic cluster attributes]

The rxOnWhenIdle flag can be extracted from Device Announce and Node Descriptor, the former can be improved by putting the macCapabilities into Node Descriptor even if it is not fetched yet, I'm not sure this is currently done in every case.

Sorry, I have no clue what this means, let alone how to do this. Is it possible to whitelist the model (and version?) and force-set rxOnWhenIdle?

@manup
Member

manup commented Sep 23, 2018

There's definitely some improvement in v2.05.39. It does find more clusters than previous versions, notably the Groups, Scenes and 0x0010 in endpoints 0x02 and 0x03. Also the device types are reported correctly now.

Yes, deCONZ core now fetches incomplete Simple Descriptors as soon as possible when commands or mac data requests are received, which should bring a bit more insight into the Xiaomi world :)

Sorry, I have no clue what this means, let alone how to do this. Is it possible to whitelist the model (and version?) and force-set rxOnWhenIdle?

Yes, but sadly this will not solve all problems (though it might be good enough). With some fixes we can ensure that the device works as expected, but only if it is joined directly to the gateway.

The problem is that every parent node uses the rxOnWhenIdle flag to decide if a frame can be sent directly to the child or if it should be queued until the next mac data request is received from the child.

So depending on how often (if at all) the device polls the parent, there might be a delay when reading ZCL attributes or sending unicast commands to it.

@ebaauw
Collaborator Author

ebaauw commented Oct 5, 2018

(Continued from #806)

The REST API seems to be waiting for a message from the OnOff cluster before it makes the /lights resource available.

Or maybe just from the corresponding endpoint. The Basic server and Time client clusters are on endpoint 0x01; the OnOff clusters on endpoints 0x02 and 0x03. I'll try reading the Groups clusters (also on 0x02 and 0x03) after the next restart.

It's the endpoint; reading the Groups or Scenes cluster attributes does also activate the resource.
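Since availability follows the endpoint, it can help to map each /lights resource back to its endpoint via the `uniqueid`. A small sketch, assuming the `<mac>-<endpoint>` layout seen in the `/lights/421` output earlier in this issue; `split_uniqueid` is a hypothetical helper:

```python
def split_uniqueid(uniqueid: str) -> tuple[str, int]:
    """Split a uniqueid into the MAC address and the endpoint number."""
    parts = uniqueid.split("-")
    # parts[0] is the MAC address, parts[1] the endpoint in hexadecimal.
    return parts[0], int(parts[1], 16)


mac, endpoint = split_uniqueid("00:15:8d:00:02:39:14:5e-02")
print(mac, hex(endpoint))
```

Two resources with the same MAC but different endpoints (0x02 and 0x03 here) then need one read each before both become available.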

@ebaauw
Collaborator Author

ebaauw commented Oct 7, 2018

I updated the handling of the Xiaomi special attribute for the lumi.curtain (see PR #836) to update /lights resources as well. For the lumi.ctrl_neutral and lumi.ctrl_ln switches, state.on is now updated as well. I also set the node's rx() on receiving the special attribute. Effectively this works around the issue of the ctrl_neutral not being available after restart.

@ebaauw ebaauw closed this as completed Oct 11, 2018
@ebaauw
Collaborator Author

ebaauw commented Oct 22, 2018

With .42 the power descriptor can be read again! After reading it, the lumi.ctrl_neutral reports On When Idle and Mains in the power descriptor (as I suspected).

@manup
Member

manup commented Oct 23, 2018

Currently the rxOnWhenIdle info is only extracted from device announce and node descriptor, since they are the most reliable sources. We can consider using the power descriptor as a source for rxOnWhenIdle for the lumi. devices.

@ebaauw
Collaborator Author

ebaauw commented Oct 23, 2018

I want to check my other lumi devices first, but I’m having a hard time reading them. When I press the reset button, the device’s led blinks blue, and the indicator on the node in the GUI blinks blue as well. Still, reading the Power descriptor fails, and the node blinks red. On other devices of the same type the descriptor is read alright and the node only blinks blue. This smells suspiciously like routing issues, but I still need to check the logs and/or sniff the traffic to verify this.

I fear the routing issues might be far more widespread, but go largely unnoticed, because of the intermittent nature (route is potentially updated on each link status message?), attribute reporting, and use of group commands. That could also explain the increased number of issues on devices not wanting to pair: the requests to read the descriptors or the Basic cluster simply don’t arrive at the device. Looking forward to testing 0x26290500.

@manup
Member

manup commented Oct 23, 2018

In case of the Xiaomi sensors I found out that after the setup phase they poll only once per hour, before and after the special report is sent! All the other reports won't cause a mac data request (poll), and that's why requests that are not scheduled within the 7 seconds before the special report get lost.

Other end devices have different unique flavors of poll behavior :) Philips is easy because they poll very often.

Currently there is no logic to handle polling scheduling and it basically is Russian roulette to query end-devices. I'm going to put my collected data in the Wiki, and once we have a larger picture we can design something to deal more reliably with it, which will also prevent putting hopeless requests in the queues.

@ebaauw
Collaborator Author

ebaauw commented Oct 23, 2018

Do I understand correctly that the poll is a MAC unicast? How would deCONZ know when the end device is reachable? Does the parent send word to the coordinator when the end device is polling? Or does the parent cache the request to the end device for over an hour causing deCONZ to receive an ack, for a request that has long timed out and been removed from the queue?

I would hope the Xiaomi devices poll when the reset button is pressed? Or was I just lucky on the device for which the power descriptor could be read? A 14 in 3600 chance if I understand correctly (worse than 1:250).
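Spelling out that estimate: a quick check of the arithmetic, assuming two windows of about 7 seconds per hourly poll cycle (my reading of the numbers above, not a measured value):

```python
POLL_INTERVAL_S = 3600  # one poll cycle per hour
WINDOW_S = 14           # assumed: two windows of ~7 s per cycle

chance = WINDOW_S / POLL_INTERVAL_S
print(f"{chance:.4%}")           # prints 0.3889%
print(f"1 in {1 / chance:.0f}")  # prints 1 in 257
```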

@manup
Member

manup commented Oct 23, 2018

Do I understand correctly that the poll is a MAC unicast? How would deCONZ know when the end device is reachable?

Yes, it's a mac-level unicast to the parent, and only the parent will be aware of it. So if an end-device is connected not directly but via some router, deCONZ won't know when the end-device polls.

Currently deCONZ assumes the end-device is reachable when a report is received, but this is wrong in many cases.

Does the parent send word to the coordinator when the end device is polling?

Sadly no.

Or does the parent cache the request to the end device for over an hour causing deCONZ to receive an ack, for a request that has long timed out and been removed from the queue?

Requests to an end-device are only cached for ~7.5 seconds and then get dropped.

Some routers like Ikea send a mac transaction expired message to the gateway when the message gets dropped; Philips and innr routers just drop the message silently. But they send an APS success confirm right after the message was received and stored.

And this is the difficult part: messages to an end-device should be sent before it polls. In case of Xiaomi this is a few seconds before the special report is received.
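To make the timing problem concrete, here is a toy model of a parent that holds frames for a sleeping child and drops them after the ~7.5 seconds mentioned above. It is only a sketch under simplifying assumptions: `ParentRouter` and its methods are invented for illustration, time is passed in by hand, and no real router or deCONZ code works exactly like this.

```python
from dataclasses import dataclass, field

INDIRECT_TIMEOUT_S = 7.5  # hold time for queued frames, as stated in this thread


@dataclass
class ParentRouter:
    """Toy model of a parent queueing frames for one end-device."""
    pending: list = field(default_factory=list)    # (queued_at, frame) pairs
    delivered: list = field(default_factory=list)
    dropped: list = field(default_factory=list)

    def send(self, now: float, frame: str, rx_on_when_idle: bool) -> None:
        if rx_on_when_idle:
            self.delivered.append(frame)       # child is listening: send directly
        else:
            self.pending.append((now, frame))  # hold until the child polls

    def expire(self, now: float) -> None:
        """Drop every queued frame that is older than the hold time."""
        keep = []
        for queued_at, frame in self.pending:
            if now - queued_at > INDIRECT_TIMEOUT_S:
                self.dropped.append(frame)
            else:
                keep.append((queued_at, frame))
        self.pending = keep

    def on_data_request(self, now: float) -> None:
        """The child polls: hand over whatever has not expired yet."""
        self.expire(now)
        self.delivered.extend(frame for _, frame in self.pending)
        self.pending = []


parent = ParentRouter()
parent.send(0.0, "read Power descriptor", rx_on_when_idle=False)
parent.send(100.0, "read Basic cluster", rx_on_when_idle=False)
parent.on_data_request(105.0)  # the child polls 5 s after the second request
print(parent.delivered)
print(parent.dropped)
```

Only the request queued shortly before the poll is delivered; the earlier one expired long before the child asked for data.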

Almost all devices do repeated fast polling in the setup phase for a few seconds/minutes. This is the right time to do all the configuration work, and this can be improved in deCONZ.

I would hope the Xiaomi devices poll when the reset button is pressed? Or was I just lucky on the device for which the power descriptor could be read? A 14 in 3600 chance if I understand correctly (worse than 1:250).

Maybe, I haven't checked that, but the sniffer will tell.
