New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HAP over CoAP #52
HAP over CoAP #52
Conversation
Branch where HA work is being done: https://github.com/roysjosh/core/tree/hap-over-coap |
This is looking good. I'll try and find time to give it a quick review this weekend. The first thing that catches my eye is, can we share code for the pair-setup and pair-verify stuff with the existing pairing code? The existing code is already decoupled from TCP ("sans io") and has been used with BLE successfully, but can be refactored further if needed. |
It is very likely they can be combined. One of the few differences is the creation of a 3rd encryption context for events at the end of pair-verify. I'll find a way to make it work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the direction this is going in. I'll probably wait for you to de-dupe the crypto bits before I have another look.
In general try to follow Home Assistant's coding style and naming conventions and best practices (https://developers.home-assistant.io/docs/development_guidelines). Old code from homekit_python doesn't, though thats only until I have time to refactor it.
Go wild with dataclasses, enums and typing generally. A lot of the ones in aiohomekit were generated automatically and are rubbish. Feel free to fix as needed.
There is a lot of legacy cleanup in here to do still, so if something is annoying feel free to refactor it - especially if its an internal bit not exposed to Home Assistant (if it does impact Home Assistant, we'll need a PR to both). Any cleanups you want, can you put them in seperate PR's to keep this one clean.
Hmm I need to find a way to mark the bulb as unavailable when the CoAP comms time out + invalidate the subscriptions whenever the CoAP session needs to be recreated (pair-verify re-executed) as that generates new push event keys. |
I've never done anything with CoAP. Is there a maximum age for a CoAP session? Or is it based on the time since the last exchange? The super simple way would just be to use a That said, I would like to eventually be smarter about polling (it's pretty hard on the battery of my Eve Degree 😢). So alternatively you could start an asyncio task that will close the session after a timeout, and cancel it and reschedule it every time there is a succesful communication with the device? This is kind of annoying in asyncio because you'd need to spawn a coroutine, but then something needs to collect the results (or exceptions) of the coroutine, otherwise you leak expections and things can end up silently broken. I don't think we've got anything like that in aiohomekit atm that you can reuse, though. Or did you have something nicer in mind? |
I'll squash all of this at the end. For now I'll try to commit chunks of enhancements under brief but descriptive commits to help you see what has changed. I'll also comment on your open review items. Big item in this batch was the crypto de-dupe which somehow went without incident. Yay for good specs. |
I'm guessing the CoAP behavior is going to be implementation-specific where the spec isn't clear. Sessions are likely tied to IP & port. I haven't left my HA dev instance up more than a half of a day yet, plus the NL Thread network occasionally has a hiccup which tends to reset the comms so I'm not sure if these bulbs have a maximum session lifetime yet. I'm aiming to make the crypto bits robust to desynchronization or full context resets. It's a lot better than the first code but likely still more to go. |
Sounds good! |
Codecov Report
@@ Coverage Diff @@
## main #52 +/- ##
==========================================
- Coverage 86.48% 77.98% -8.50%
==========================================
Files 48 54 +6
Lines 3470 4302 +832
==========================================
+ Hits 3001 3355 +354
- Misses 469 947 +478
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@Jc2k okay, I think this is ready for another round of reviews, please. I've had my dev HA up for most of 3 days without any major hiccups. |
RE: polling, do you have any feel for if devices generally support events properly? iOS very rarely polled my fake bulb app's |
RE: Polling and events. Good question. We don't have any telemetry so I don't have any solid answers, here are some observations. Regardless of reliability, I know some characteristics are poll only. These days this is less of a problem but I think in R1 of the HAP spec maybe temperature was poll only (it has events these days for sure). I know we have some TCP HAP devices that refuse subscription requests, so we actually have a path that disables events if it detects this has happened. (It toasts the secure session, so we have to stop trying to even send the subscription request). This is probably a bug on our end but it's not been a big enough problem to look into further yet. I know we have some devices that either have really flaky WiFi or they are just on the edge of their owners range. So we'd get packet loss related failures often. This has caused lots of tickets which are mostly just more errors to silence. Connections dropping without a TCP RST was common. This includes from devices crashing. IIRC one of my devices seems to have a maximum TCP uptime of about 17 hours before the connection needs rotating. While not about events reliability exactly we have observed that the implementations vary wildly in quality. Some favourite bugs:
Which has made me very mistrustful of them. All that said, the events implementation on Hue has been really solid. I have a Hue Remote that i use in my office (i.e. multiple times daily for a year) and i've never noticed it drop an event. Just got an outdoor Friends of Hue switch that so far that is also pretty solid. I would like to rely on events more in the future. I have an Eve Degree and it eats batteries because of the polling. But I probably won't be just switching off polling completely given the vast range of devices we are dealing with. It would be good to slowly back off the poll rate if events are seen to be working for a connection maybe. But with some sort of ping to stop the TCP connection timing out. I don't think the |
RE: Review - i'll make it my next priority, though might be next week now. How easy would it be for you to pull some bits out into seperate PR's? E.g. the changes to tlv8./tlv.py would be a good candidate, they are quite self contained and we can easily add tests for them. (It's quite large, so if we can get the easy bits landed then there is less noise when im re-reviewing). This is what I do to help the Home Assistant reviewers with my larger changes upstream, its a bit of a PITA with all the rebasing but a big help. And i reckon things like the tlv changes on their own I can review sooner. |
I'd like to look at pulling out the TLV structs into I'd also like to see if we can reuse I'm sort of imagining |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in parallel, you could look at getting the parsed_address
change made in home assistant zeroconf. I'm not going to be the one reviewing that and you might as well get thee ball rolling.
if response.code != Code.CHANGED: | ||
logger.warn(f"CoAP POST returned unexpected code {response}") | ||
|
||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theres no i/o in the decrypt retry code. So can we pull it out of post
so thats its more testable, and add some tests for it.
RE: Tests, IP currently has a mock accessory that it uses. I hate this approach. It's slow and fails at random. Open to ideas on how you'd like to proceed there. I don't want to drown you in test writing, but it would be good to have some coverage of the happy path as I am in the midst of a (admittedly quite slow) cleanup exercise and could very easily break this. We could break the i/o out, so for def serialize_read_characteristics_request(ids: list[tuple[int, int]]) -> bytes:
pdu = b"".join(
[
hap_pdu(
HAP_PDU_OPCODES.HAP_CHARACTERISTIC_READ, idx, int(aid_iid[1]), b""
)
for (idx, aid_iid) in enumerate(ids)
]
)
def deserialize_read_characteristics_response(ids: list[tuple[int, int]], payload: bytes) -> dict[str, Any]:
offset = 0
results = dict()
while True:
pduControl, pduTid, pduStatus, pduBodyLen = struct.unpack(
"<BBBH", payload[offset : offset + 5]
)
aid_iid = ids[pduTid]
logger.debug(
"Read value for aid %d iid %d: %s, len %d"
% (
int(aid_iid[0]),
int(aid_iid[1]),
pduStatusMap[pduStatus],
pduBodyLen,
)
)
if pduStatus == 0:
# decode TLV to get byte value
value = (
decode_pdu_03(payload[offset + 5 : offset + 5 + pduBodyLen])
if pduBodyLen > 0
else b""
)
# find characteristic so we can get the data type
characteristic = self.info.find_characteristic_by_iid(int(aid_iid[1]))
# if we found it & have a non-empty value...
if characteristic is not None and pduBodyLen > 0:
# set the raw bytes
characteristic.raw_value = value
# and get the decoded value
value = characteristic.value
# add result to dict
results[aid_iid] = {
"value": value,
}
else:
results[aid_iid] = {
"description": pduStatusMap[pduStatus],
"status": pduStatus,
}
offset += 5 + pduBodyLen
if offset >= len(payload):
break
async def read_characteristics(self, ids):
pdu = serialize_read_characteristics_request(ids)
payload = await self.encCtx.post(pdu)
results = deserialize_read_characteristics_response(payload)
logger.debug(f"Read characteristics: {results!r}")
return results Both of the 2 helpers are "pure", they don't need to know about the connection classes or any of the state on them. They could even live in a different module. So they are super easy to test. We could also choose to mock |
9ea2a33
to
b6a4590
Compare
Rebased w/o changes, more commits on the way to address comments |
Awesome. Your work has inspired me to sort BLE out (had it working in homekit_controller aaages ago, but ripped it out when I did the asyncio port). So i'm expecting to be able to reuse bits of this there. |
(Sorry for close, child helping 😵💫 ) |
BLE would be great! One day we could set up devices completely in HA. |
I've got pairing and get/put/identify/list-pairings working already via the CLI. Just cleaning it up a bit and adding some tests. Not sure i'll add events straight away, there are 2 different event systems (connected and disconnected) and i need to have a background task constantly monitoring the advertisement data to pick up disconnected events. I think I want to clean up aiohomekit a lot more first. Hopefully we can find something in the Thread Control Point for managing the thread network. Draft is here: #60. |
Would it be OK with you if we made a I want to rework the discovery interface a bit, and I need to change all 3 backends to do it, so will be easier this way and you can still work on the other bits in parallel. |
I think that makes sense. I'm refactoring a few things still & then I'll work on some tests. |
Good one. It'll do it tomorrow then (hopefully). |
e16591b
to
998a502
Compare
Squashed this PR |
998a502
to
cc102f1
Compare
I'll open up tickets with a coap label in the 1.0.0 milestone and probably tag you as well where there is something I want to change or clarify. Feel free to raise things in 1.0.0 either to ask me or just to keep track of our progress. I'll open PR's and assign them to you where I touch coap stuff, at least while we are both working on this. My area of focus is going to be around zeroconf and BLE discovery for the time being. One thing I want to change is the weird lifecycle zeroconf has:
It's making that code a bit harder to reason about. Cheers for all the work you've done on this. Hopefully nearly there with it.. |
Awesome, thanks! It's been a fun project to work on. Looking forward to getting it over the line! |
WIP, looking for initial feedback