-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let device finish All-Link cleanup process #84
Conversation
Sending commands to a device while it is performing the All-Link cleanup process will at best abort that cleanup. In the case of battery powered devices, they may competely ignore the requests Changes will wait for All-Link cleanup report (success or failure) before sending queued commands to battery powered devices. Resolves pyinsteon#83
Maybe I am not understanding this PR but it would seem to me the idea would be to run the queued commands if the current topic is not a clean up topic. The current code has the queued command run only when it is a clean-up topic. So I would think the Do you have a real-world example of this or is this problem only theoretical? The problem statement feels inconsistent with the solution. Battery-powered devices can only receive a command when they are awake, which means the device had to wake up first which would produce some kind of message like an on/off command. The initial message would never be a clean-up message. Once the original message fires, this will trigger the queued commands to process which would now run until complete. The cleanup commands would be a result of the original command rather than an original command. What would seem more appropriate here is to allow a sleep time between the wakeup time and the running of the queued commands. |
That is not the intent. The intent is to never run the queued commands when I know the device is trying to run it's cleanup process. The library should never send anything if it knows a device is in the cleanup process, not even a message for another device, since the PLM transmitted message will abort the device cleanup unfinished. I quoted that from the Insteon Developers Guide in the Issue I opened for this PR. The problem here is the library doesn't track the device state machine so I have no way to know if the device is or is not in the clean up process. A simple analog for that process ending is to rely on the final message, an all-link cleanup report broadcast. It is possible to miss this broadcast but then the next time the device is triggered gives another roll of the dice.
The motion I reference in the Issue is a real world example. When I first switched to your new library every motion event was banging on this motion sensor and it never once responded. After the fix I moved on to other issues with this motion sensor. I would also add that there is no point in retrying the 2e, 0x00, 0xff command after 2-3 seconds. The device is asleep again by then. If the first try doesn't work, retries will not either.
You are correct about the initial message, however the device sends multiple messages during the cleanup process and remains awake the whole time. When a device is triggered (e.g. motion sensed, button pushed, etc.) it sends a series of messages in the all-link cleanup process. This is what makes Insteon both "instant on" and reliable. 1st it sends an all-link broadcast which will activate every reponder that hears the broadcast. 2nd it sends an all-link direct message to every device for which it is a controller. 3rd it finally sends an all-link clean up report. There is an example of this cleanup process in the whitepaper on page 49-50. Unfortunately they omitted the example of the cleanup report but the text on page 50 does describe it. The PR waits for that last message in the process. The device is still awake at that point. This is a reliable way to talk to a battery powered device. All you need to do is trigger the device, wait for the all-link cleanup process to finish, then send it some commands. On the rare occasion where that last broadcast is lost, then simply try triggering again.
A delay is a possible fix and in fact that is exactly how I got past this issue initially while learning the library. However the required time is non-deterministic, exactly how long to wait? 2.6 seconds worked for me initially. Wait just 100-200ms too long and the device is asleep again. The ALDB was empty so the device only sent the initial and final broadcast taking about 2.6 seconds. However once the library added the default links (with a controller entry for the PLM), device now needed to send the all-link direct command to the PLM; and the timer needed to be extended. Link more devices to this motion and more time is needed. Also the all-link cleanup messages are acked by a responder. So the device might retransmit adding even more variability to the process. The correct solution is to wait until the device declares that it has finished the process (i.e. the all-link cleanup report). |
Thanks for the explanation and my apologies for not reading the original error you posted.
I agree with this statement so I propose we address this. Here is my recommendation:
Thoughts? |
Thinking about the third point, this could be in |
I really like your idea to create a state that tracks the cleanup process for each device. For point 2 though, the PLM will only see two or three messages below. The PLM never sees the direct cleanup messages to from a device to other linked devices.
I think the state should be set true when 1 occurs and set false when 3 occurs. I don't think 2 should have any influence on the state. We should then have a timeout that will set the state to false after a period of time to catch the case when the broadcast cleanup status report is missed. Note this can happen if there is any other traffic on the network because the device will abort the clean up process. |
Good call on the All-Link broadcast clean up. That would be the notification for the end of the cleanup process. Yes, the timeout is important in case we miss a message. Broadcast messages are notorious for being lost. They do not repeat like direct messages. The timeout can likely be set based on the number of links in the ALDB for the group. Another option for the state tracker is an |
How are you coming on this? I want to include it in release 1.1 which I am hoping is out in about 4 weeks. This will be a quality focused release with some feature enhancements. |
@mstovenour Any update on this? |
Sending commands to a device while it is performing the All-Link
cleanup process will at best abort that cleanup. In the case
of battery powered devices, they may competely ignore the requests
Changes will wait for All-Link cleanup report (success or failure)
before sending queued commands to battery powered devices.
Resolves #83