Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when trying to add device to Google Home #102

Closed
jasta opened this issue Sep 25, 2023 · 11 comments
Closed

Error when trying to add device to Google Home #102

jasta opened this issue Sep 25, 2023 · 11 comments

Comments

@jasta
Copy link
Contributor

jasta commented Sep 25, 2023

I got the onoff_light example working on my ESP32-C3-DevKitM-1 (after some tweaking to remove IPv6 support, see #100), but I'm having a couple of issues getting the device to appear and be controllable from Google Home.

First, trying to scan the QR code doesn't work and immediately produces an error as soon as I move my phone camera nearby to the QR code (see [1] below). If I switch to manually entering the pairing code things progress and I see a lot of log spamming from the device (which all looks successful generally and nothing appears to timeout).

Second, after seemingly trying to pair with the new device and bring it into Google Home I'm presented with [2] and I can't do anything to try again or proceed, only back out.

Full logs from boot-up attached here: https://gist.github.com/jasta/ed826fed780f9cc6109499b3fbe4f852

Is this expected? If no, anything I can try to do to debug and figure out what's wrong? I was able to successfully add a device using the main C++ matter repository and the all-clusters-app example and it seemingly worked well even to control the device. This was however a few months ago with a beta of Google Home.

[1]
[2]

EDIT: Apologies I just realized the screenshots are in Spanish. [1] is saying that it's not a matter QR code, and [2] is saying that the device isn't certified for use with Matter.

@jasta
Copy link
Contributor Author

jasta commented Sep 25, 2023

Oops, looks like I need to do something in the Google Home developer console to make this work: https://developers.home.google.com/matter/integration/create

After doing this however, I got a new, more vague error message ("An error occurred", roughly translated to English) in Google Home, and an error logged on the esp32 running rs-matter as well: https://gist.github.com/jasta/7718387af24bde6462d7da54c10f7163

Looks like the relevant error is:

I (98667) rs_matter::data_model::sdm::general_commissioning: Handling command ARM Fail Safe
E (98677) rs_matter::data_model::sdm::failsafe: Received Fail-Safe Arm with different session modes; current Pase, incoming Case(CaseDetails { fab_idx: 1, cat_ids: [65537, 0, 0] })

Not sure what any of this really means though?

@ssnover
Copy link
Contributor

ssnover commented Sep 26, 2023

It looks like Google Home controller might be arming a fail-safe expiration during the commissioning process (PASE) and then attempting to do so with CASE before completing commissioning? It's normal for a fail-safe to be armed during the commissioning process when the controller is talking to the device over a PASE session. However, setting it over a CASE session is usually done for NOC management.

Looking through your logs, it looks like Commissioning is never completed (I don't see a message saying "Handling command Commissioning Complete"). So it looks like the controller is emitting an Arm Fail Safe command when it shouldn't.

@jasta
Copy link
Contributor Author

jasta commented Sep 26, 2023

Thanks that's useful context but I'm not sure how we would go about fixing this. I'm interested in adding sufficient support to get a pet project off the ground but I'm not sure where to start fixing the commissioning flow.

Fwiw I'm also interested in adding BTP (for BLE commissioning) support, but I want to make sure the core flow works first.

@ssnover
Copy link
Contributor

ssnover commented Sep 26, 2023

To be clear, I'm not a maintainer of this crate, just an interested party! I suspect that more information is needed in the logs in order to debug. I've primarily been testing with a SmartThings hub and filing issues where they come up, but I can also try out my Google Home with some additional logging.

@ivmarkov
Copy link
Contributor

I got the onoff_light example working on my ESP32-C3-DevKitM-1 (after some tweaking to remove IPv6 support, see #100), but I'm having a couple of issues getting the device to appear and be controllable from Google Home.

You can't get the commissioning to complete - for Google Home specifically - without support for IPv6. You can do IPv4 only with chiptool though.

I'll comment more in #100.

@jasta
Copy link
Contributor Author

jasta commented Sep 30, 2023

Interesting, I dropped back to esp-idf 4.4.x and indeed I do not get the broadcast send error anymore (I haven't investigated why exactly but the error was when calling send_to here: https://github.com/project-chip/rs-matter/blob/main/rs-matter/src/transport/udp.rs#L146). I'll investigate why that would fail with ESP_IDF_VERSION="release/v5.0" soon.

But, more importantly, even with 4.4.x and IPv6 working I still can't connect to Google Home. First, the UX is different with IPv6 and the device is discovered automatically instead of making me force Google Home to try through Matter. But the error in the logs seems to be the same, but this time with some new errors spamming around Incorrect TLV list:

https://gist.github.com/jasta/a6644468ee77e0f5e10266576599064d

It looks like Matter itself is generating packets it's parser can't print? Note that this behaviour is observed before and after updating to the latest main branch (from maybe a week or two ago when I did the first test). The logs are from the latest main as of this writing.

@jasta
Copy link
Contributor Author

jasta commented Sep 30, 2023

Hmm, well this is odd. I just reran a test and while it still didn't work (i.e. adding the device failed with generic error screen from Google Home), this time I didn't receive any "Incorrect TLV list" errors. Something is starting to feel like a race condition either on the wire or in rs-matter code. @ivmarkov, do you have logs from a working session with Google Home that I can use for comparison?

I'm starting to think maybe #103 might be the answer as well. I noticed that we don't have structured logging of incoming or outgoing TLV structs so the logs are missing quite a bit of context if you suspect one particular message might not be getting handled (or sent) correctly.

@jasta
Copy link
Contributor Author

jasta commented Sep 30, 2023

Oh and also I tried disabling the error return in FailSafe::arm when the session modes differ but Google Home still presents an error immediately after issuing the second arm command:

E (51603) rs_matter::data_model::sdm::failsafe: IGNORING!!! Received Fail-Safe Arm with different session modes; current Pase, incoming Case(CaseDetails { fab_idx: 1, cat_ids: [65537, 0, 0] })
I (51623) rs_matter::transport::proto_hdr: [encode] ExId: 48006, Proto: 1, Opcode: 9, Flags: R|A|
I (51633) rs_matter::transport::packet: Sending packet IM:InvokeResponse: 
I (51643) rs_matter::tlv::parser: TLV list:
I (51643) rs_matter::tlv::parser: S8(64)
I (51653) rs_matter::tlv::parser: ---------
I (51733) rs_matter::transport::core: Transport: waiting for incoming packets
I (51733) rs_matter::transport::plain_hdr: [decode] flags: (empty), session type: Encrypted, sess_id: 2, ctr: 123089286
I (51743) rs_matter::transport::proto_hdr: [decode] ExId: 48006, Proto: 0, Opcode: 16, Flags: A|I| 
I (51753) rs_matter::transport::packet: Got packet SC:MRPStandAloneAck: 
I (51753) rs_matter::tlv::parser: TLV list:
I (51763) rs_matter::tlv::parser: ---------
I (51763) rs_matter::transport::core: Handler 0: Exchange completed

And then nothing else with the Google Home app in a "something went wrong" type of UI state.

@ivmarkov
Copy link
Contributor

ivmarkov commented Oct 2, 2023

Some updates:

  • I can reproduce your errors on Linux, so the problem is not esp32 specific
  • Either Google's Matter SDK was updated on their controllers, or our code had regressed due to recent changes (perhaps, the support for the new version of the Apple's Matter controllers?)

In any case, this needs debugging. I'll try to figure out what is going on in the next days.

UPDATE: I doubt there is any racing error though. It is all single threaded.

UPDATE 2: The "incorrect TLV list" might be a false alarm. I'll follow up on this as well.

ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Oct 2, 2023
ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Oct 6, 2023
ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Jan 2, 2024
ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Jan 8, 2024
ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Jan 13, 2024
ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Jan 18, 2024
ivmarkov added a commit to ivmarkov/rs-matter that referenced this issue Feb 10, 2024
@ivmarkov
Copy link
Contributor

@kedars I don't know what exactly had changed in my Google Controller, but I'm no longer seeing this. Shall we close?

@kedars
Copy link
Collaborator

kedars commented Feb 14, 2024

makes sense, can be reopened if we see it again. I suspect, maybe the unordered fix that Andrei did might have cleared this, since that's what changed in the last few weeks.

@kedars kedars closed this as completed Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants