Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.1] Connection failure error during DFU over Matter (nRF52840) #27460

Closed
KumhoElectricLJH opened this issue Jun 26, 2023 · 7 comments
Closed
Labels
bug Something isn't working needs triage

Comments

@KumhoElectricLJH
Copy link

Reproduction steps

Hi

I have a question because a connection failure error occurred during the DFU over Matter guide.

(Guide URL : developer.nordicsemi.com/.../nrfconnect_examples_software_update.html))

The PC using Ubuntu OS(with nRF52840 dongle) is configured as the OTA provider and the nRF52840DK board is configured as the OTA requester. (OTBR is configured as Docker)

I completed the installation of ota-provider-app and chip-tool on the Linux PC and commissioned the Linux PC as an OTA provider

And I sent Announce OTA Provider command using Chip tool, but an error appears in the nRF52840DK board log

Log >>

I: 1515622 [EM]>>> [E:60665r S:0 M:182234332] (U) Msg RX from 0:EBA082CFD0009BF0 [0000] --- Type 0000:30 (SecureChannel:CASE_Sigma1)
I: 1515634 [IN]CASE Server received Sigma1 message . Starting handshake. EC 0x20004e60
I: 1515641 [IN]CASE Server disabling CASE session setups
I: 1515647 [EM]<<< [E:60665r S:0 M:209788448 (Ack:182234332)] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:10 (SecureChannel:StandaloneAck)
I: 1515660 [IN] (U) Sending msg 209788448 to IP address 'UDP:[fd11:db8:1::1]:46385'
I: 1515668 [SC]Received Sigma1 msg
I: 1515676 [SC]CASE matched destination ID: fabricIndex 1, NodeID 0x0000000000000002
I: 1515885 [EM]<<< [E:60665r S:0 M:209788449 (Ack:182234332)] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:31 (SecureChannel:CASE_Sigma2)
I: 1515898 [IN] (U) Sending msg 209788449 to IP address 'UDP:[fd11:db8:1::1]:46385'
I: 1515907 [SC]Sent Sigma2 msg
I: 1516027 [EM]>>> [E:60665r S:0 M:182234333 (Ack:209788449)] (U) Msg RX from 0:EBA082CFD0009BF0 [0000] --- Type 0000:32 (SecureChannel:CASE_Sigma3)
I: 1516040 [EM]<<< [E:60665r S:0 M:209788450 (Ack:182234333)] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:10 (SecureChannel:StandaloneAck)
I: 1516053 IN Sending msg 209788450 to IP address 'UDP:[fd11:db8:1::1]:46385'
I: 1516062 [SC]Received Sigma3 msg
E: 1516070 [SC]The device does not support GetClock_RealTimeMS() API: 3. Falling back to Last Known Good UTC Time
E: 1516267 [DL]Long dispatch time: 187 ms, for event type 2
I: 1516368 [EM]<<< [E:60665r S:0 M:209788451 (Ack:182234333)] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:40 (SecureChannel:StatusReport)
I: 1516381 IN Sending msg 209788451 to IP address 'UDP:[fd11:db8:1::1]:46385'
I: 1516391 [SC]SecureSession[0x20003838]: Moving from state 'kEstablishing' --> 'kActive'
D: 1516400 [IN]SecureSession[0x20003838]: Activated - Type:2 LSID:38704
D: 1516406 [IN]New secure session activated for device <000000000001B669, 1>, LSID:38704 PSID:14355!
I: 1516419 [IN]CASE Session established to peer: <000000000001B669, 1>
I: 1516425 [IN]CASE Server enabling CASE session setups
D: 1516431 [IN]SecureSession[0x200038f0]: Allocated Type:2 LSID:38706
E: 1516437 [DL]Long dispatch time: 165 ms, for event type 2
I: 1516443 [EM]>>> [E:60666r S:38704 M:131944375] (S) Msg RX from 1:000000000001B669 [4661] --- Type 0001:08 (IM:InvokeCommandRequest)
I: 1516456 [SWU]OTA Requestor received AnnounceOTAProvider
D: 1516461 [SWU] FabricIndex: 1
D: 1516464 [SWU] ProviderNodeID: 0x0000000000000001
D: 1516469 [SWU] VendorID: 0x0
D: 1516472 [SWU] AnnouncementReason: 0
D: 1516475 [SWU] Endpoint: 0
I: 1516479 [EM]<<< [E:60666r S:38704 M:3219666 (Ack:131944375)] (S) Msg TX to 1:000000000001B669 [4661] --- Type 0001:09 (IM:InvokeCommandResponse)
I: 1516492 IN Sending msg 3219666 on secure session with LSID: 38704
I: 1516500 [EM]>>> [E:60665r S:0 M:182234334 (Ack:209788451)] (U) Msg RX from 0:EBA082CFD0009BF0 [0000] --- Type 0000:10 (SecureChannel:StandaloneAck)
I: 1516514 [SWU]Stopping the Periodic Query timer
I: 1516519 [SWU]Starting the watchdog timer, timeout: 21600 seconds
D: 1516525 [SWU]Establishing session to provider node ID 0x0000000000000001 on fabric index 1
D: 1516535 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 1 --> 2
I: 1516543 [DIS]Resolving DFAABB27E9334661:0000000000000001 ...
I: 1516551 [EM]>>> [E:60666r S:38704 M:131944376 (Ack:3219666)] (S) Msg RX from 1:000000000001B669 [4661] --- Type 0000:10 (SecureChannel:StandaloneAck)
I: 1516635 [DIS]Node ID resolved for DFAABB27E9334661:0000000000000001
D: 1516641 [DIS] Hostname: E0D55EB9F3140000
D: 1516645 [DIS] IP Address #1: fd11:db8:1::1
D: 1516649 [DIS] Port: 5540
D: 1516652 [DIS] Mrp Interval idle: 300 ms
D: 1516656 [DIS] Mrp Interval active: 2000 ms
D: 1516660 [DIS] TCP Supported: 1
D: 1516663 [DIS]Lookup clearing interface for non LL address
I: 1516668 [DIS]UDP:[fd11:db8:1::1]:5540: new best score: 4
I: 1516674 [DIS]Checking node lookup status after 131 ms
I: 1516679 [DIS]Keeping DNSSD lookup active
I: 1516743 [DIS]Checking node lookup status after 200 ms
D: 1516748 [DIS]OperationalSessionSetup[1:0000000000000001]: Updating device address to UDP:[fd11:db8:1::1]:5540 while in state 2
D: 1516760 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 2 --> 3
D: 1516767 [IN]SecureSession[0x200039a8]: Allocated Type:2 LSID:38707
I: 1516774 [SC]Initiating session on local FabricIndex 1 from 0x0000000000000002 -> 0x0000000000000001
I: 1516924 [EM]<<< [E:31162i S:0 M:209788452] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:30 (SecureChannel:CASE_Sigma1)
I: 1516936 IN Sending msg 209788452 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1516944 [SC]Sent Sigma1 msg
D: 1516947 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 3 --> 4
I: 1517852 IN Sending msg 209788452 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1518686 IN Sending msg 209788452 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1519741 IN Sending msg 209788452 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1521217 IN Sending msg 209788452 to IP address 'UDP:[fd11:db8:1::1]:5540'
E: 1523365 [EM]Failed to Send CHIP MessageCounter:209788452 on exchange 31162i sendCount: 4 max retries: 4
E: 1525648 [SC]CASESession timed out while waiting for a response from the peer. Current state was 1
D: 1525657 [IN]SecureSession[0x200039a8]: Released - Type:2 LSID:38707
D: 1525664 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 4 --> 2
D: 1525671 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 2 --> 1
E: 1525679 [SWU]Failed to connect to node 0x0000000000000001: 32
I: 1525686 [SWU]Unknown idle state reason so set the periodic timer for a next attempt
I: 1525694 [SWU]Stopping the watchdog timer
I: 1525698 [SWU]Starting the periodic query timer, timeout: 86400 seconds

When you send the Query command from the nRF52840DK board, the error log appears as shown below.

Log >>
matter ota query
uart:~$ I: 1682151 [SWU]Triggered immediate OTA query for fabric: 0x1
I: 1682159 [SWU]Stopping the Periodic Query timer
I: 1682163 [SWU]Starting the watchdog timer, timeout: 21600 seconds
D: 1682169 [SWU]Establishing session to provider node ID 0x0000000000000001 on fabric index 1
D: 1682177 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 1 --> 2
I: 1682185 [DIS]Resolving DFAABB27E9334661:0000000000000001 ...
I: 1682284 [DIS]Node ID resolved for DFAABB27E9334661:0000000000000001
D: 1682290 [DIS] Hostname: E0D55EB9F3140000
D: 1682294 [DIS] IP Address #1: fd11:db8:1::1
D: 1682298 [DIS] Port: 5540
D: 1682301 [DIS] Mrp Interval idle: 300 ms
D: 1682305 [DIS] Mrp Interval active: 2000 ms
D: 1682309 [DIS] TCP Supported: 1
D: 1682312 [DIS]Lookup clearing interface for non LL address
I: 1682317 [DIS]UDP:[fd11:db8:1::1]:5540: new best score: 4
I: 1682323 [DIS]Checking node lookup status after 138 ms
I: 1682328 [DIS]Keeping DNSSD lookup active
I: 1682385 [DIS]Checking node lookup status after 200 ms
D: 1682390 [DIS]OperationalSessionSetup[1:0000000000000001]: Updating device address to UDP:[fd11:db8:1::1]:5540 while in state 2
D: 1682402 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 2 --> 3
D: 1682409 [IN]SecureSession[0x200039a8]: Allocated Type:2 LSID:38709
I: 1682416 [SC]Initiating session on local FabricIndex 1 from 0x0000000000000002 -> 0x0000000000000001
I: 1682567 [EM]<<< [E:31164i S:0 M:209788454] (U) Msg TX to 0:0000000000000000 [0000] --- Type 0000:30 (SecureChannel:CASE_Sigma1)
I: 1682578 IN Sending msg 209788454 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1682587 [SC]Sent Sigma1 msg
D: 1682590 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 3 --> 4
I: 1683417 IN Sending msg 209788454 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1684297 IN Sending msg 209788454 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1685441 IN Sending msg 209788454 to IP address 'UDP:[fd11:db8:1::1]:5540'
I: 1686890 IN Sending msg 209788454 to IP address 'UDP:[fd11:db8:1::1]:5540'
E: 1688790 [EM]Failed to Send CHIP MessageCounter:209788454 on exchange 31164i sendCount: 4 max retries: 4
E: 1691290 [SC]CASESession timed out while waiting for a response from the peer. Current state was 1
D: 1691299 [IN]SecureSession[0x200039a8]: Released - Type:2 LSID:38709
D: 1691305 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 4 --> 2
D: 1691313 [DIS]OperationalSessionSetup[1:0000000000000001]: State change 2 --> 1
E: 1691320 [SWU]Failed to connect to node 0x0000000000000001: 32
I: 1691327 [SWU]Unknown idle state reason so set the periodic timer for a next attempt
I: 1691334 [SWU]Stopping the watchdog timer
I: 1691338 [SWU]Starting the periodic query timer, timeout: 86400 seconds

How can I solve this problem?

Bug prevalence

etc

GitHub hash of the SDK that was being used

Platform

nrf connect

Platform Version(s)

No response

Type

Platform Issue

Anything else?

No response

@KumhoElectricLJH KumhoElectricLJH added bug Something isn't working needs triage labels Jun 26, 2023
@bzbarsky-apple
Copy link
Contributor

@KumhoElectricLJH can you please attach, not paste, the following logs:

  1. Log from chip-tool showing all commands (especially including the commands that commissioned the two devices).
  2. Log from the ota-provider-app involved, including the exact command used to start it.
  3. Corresponding log from the OTA requestor involved.

Because what the above is showing is that the ota-provider-app never responded to the CASE Sigma1 message, which strongly suggests it never received it.

@chuckrui
Copy link

@bzbarsky-apple Hi,I encountered the same problem, here are the corresponding steps and log.

Command Step:
1.sudo ot-ctl dataset init new
2.sudo ot-ctl dataset networkkey 00112233445566778899aabbccddeeff
3.sudo ot-ctl dataset extpanid 1111111122222222
4.sudo ot-ctl dataset panid 0x1234
5.sudo ot-ctl dataset channel 15
6.sudo ot-ctl dataset commit active
7.sudo ot-ctl dataset ifconfig up
8.sudo ot-ctl thread start
9.sudo ot-ctl dataset active -x
OTBR.txt

10../out/chiptool/chip-tool pairing ble-thread 2 hex:000300000f02081111111122222222051000112233445566778899aabbccddeeff01021234 20202021 3841
11. ./out/provider/chip-ota-provider-app -f ../ota_fw/ota/3841-v4.ota
provider.txt

  1. ./out/chip-tool/chip-tool pairing onnetwork 1 20202021

  2. ./out/chiptool/chip-tool otasoftwareupdaterequestor write default-ota-providers '[{"fabricIndex": 1, "providerNodeID": 1, "endpoint": 0}]' 2 0
    default-otaproviders.txt

  3. ./out/chiptool/chip-tool accesscontrol write acl '[{"fabricIndex": 1, "privilege": 5, "authMode": 2, "subjects": [112233], "targets": null}, {"fabricIndex": 1, "privilege": 3, "authMode": 2, "subjects": null, "targets": null}]' 1 0
    write_acl.txt

  4. ./out/chiptool/chip-tool otasoftwareupdaterequestor announce-ota-provider 1 0 0 0 2 0

zephyr log:
zephyr.txt

@bzbarsky-apple
Copy link
Contributor

@chuckrui Thank you for attaching the logs. From https://github.com/project-chip/connectedhomeip/files/13631594/zephyr.txt:

00> I: 120563 [DIS]Node ID resolved for BBAD7B63CB241A3E:0000000000000001
00> D: 120563 [DIS]  Hostname: F46ADD6CD60F0000
00> D: 120564 [DIS]  IP Address #1: ::
00> D: 120564 [DIS]  Port: 5540
00> D: 120564 [DIS]  Mrp Interval idle: 300 ms
00> D: 120564 [DIS]  Mrp Interval active: 2000 ms
00> D: 120565 [DIS]  TCP Supported: 1

That is ... not a useful IP address, so it gets ignored. And then there are no more results. So operational discovery fails.

Looking at https://github.com/project-chip/connectedhomeip/files/13631583/provider.txt, I can't make sense of it. I has logs from the provider, but also from chip-tool... It's not even clear to me whether the ota-provider app is running by the end of that log.

Looking at https://github.com/project-chip/connectedhomeip/files/13631551/write_acl.txt, when chip-tool is resolving the OTA provider it gets:

[1702273164.961475][685609:685611] CHIP:DIS: Node ID resolved for 090C6C5295576406:0000000000000001
[1702273164.961478][685609:685611] CHIP:DIS: 	Hostname: F46ADD6CD60F0000
[1702273164.961482][685609:685611] CHIP:DIS: 	IP Address #1: 172.17.0.1
[1702273164.961484][685609:685611] CHIP:DIS: 	Port: 5540
[1702273164.961488][685609:685611] CHIP:DIS: 	Mrp Interval idle: 300 ms
[1702273164.961490][685609:685611] CHIP:DIS: 	Mrp Interval active: 2000 ms
[1702273164.961493][685609:685611] CHIP:DIS: 	TCP Supported: 1
[1702273164.961496][685609:685611] CHIP:DIS: Lookup clearing interface for non LL address
[1702273164.961525][685609:685611] CHIP:DIS: UDP:172.17.0.1%docker0:5540: new best score: 2
[1702273164.961528][685609:685611] CHIP:DIS: Checking node lookup status after 1 ms
[1702273164.961531][685609:685611] CHIP:DIS: Keeping DNSSD lookup active
[1702273165.161244][685609:685611] CHIP:DIS: Checking node lookup status after 200 ms
[1702273165.161298][685609:685611] CHIP:DIS: OperationalSessionSetup[1:0000000000000001]: Updating device address to UDP:172.17.0.1:5540 while in state 2
[1702273165.161310][685609:685611] CHIP:DIS: OperationalSessionSetup[1:0000000000000001]: State change 2 --> 3

so it only finds an IPv4 address. That's not going to work for resolution from the Thread side, where an ipv6 address is needed.

So that's the first thing I would check: using something that actually shows you everything that's being advertised, is the ota-provider advertising any sort of IPv6 address?

@chuckrui
Copy link

@bzbarsky-apple Thanks for your help :)
I noticed that it seems my router does not have IPV6 enabled.
But when i opened it that show the bus error.Its show as below log.

Is there a problem with my ota firmware?
bus error.txt

@bzbarsky-apple
Copy link
Contributor

You shouldn't need IPv6 enabled in the router, if you're using link-local addresses, for what it's worth.

For the rest, I have no idea. The log says:

00> E: Failed to get device for area '"mcuboot_secondary"'

which seems like a possible problem, but I don't know where that problem would lie.

@L1amaGod
Copy link

L1amaGod commented Jan 2, 2024

I have encountered the same problem. Did you figure it out?

@KumhoElectricLJH
Copy link
Author

Currently, our team is temporarily suspending the development of Matter.
It is difficult to check this issue continuously, so we will close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage
Projects
None yet
Development

No branches or pull requests

4 participants