Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discard all pending transactions when starting OTW update, do additional checks not to communicate with the controller #5360

Closed
2 of 3 tasks
alexruffell opened this issue Jan 25, 2023 · 33 comments · Fixed by #5368
Assignees
Labels
bug Something isn't working
Projects

Comments

@alexruffell
Copy link

alexruffell commented Jan 25, 2023

Checklist

  • I am not using Home Assistant. Or: a developer has told me to come here.
  • I have checked the troubleshooting section and my problem is not described there.
  • I have read the changelog and my problem is not mentioned there.

Deploy method

Docker

Z-Wave JS UI version

8.8.0

ZwaveJS version

10.4.0

Describe the bug

I initiated an upgrade of my Aeotec S7 using the new OTW functionality and soon after starting the update I got these errors:

image

Driver: Failed to recover from bootloader. Please flash a new firmware...

Problem is that neither HA or Zwave PC Controller see the stick as connected. It sees the COM port but not the Zwave stick so I cannot flash a new firmware or even restore the NVM I backed up just before testing this.

image

To Reproduce

Update to 7.19.0 using Zwave JS UI OTW. The file I used was BRD4206A as it is the same board type as the one on Aeotec's site.

Expected behavior

n/a

Additional context

No response

@alexruffell alexruffell added the bug Something isn't working label Jan 25, 2023
@Daniel-dev22
Copy link

Daniel-dev22 commented Jan 25, 2023

What firmware did you use? There's one for zooz and one for aeotec.

@alexruffell
Copy link
Author

I used: zwave_ncp_serial_api_controller_BRD4206A.gbl version 7.19.0

@alexruffell
Copy link
Author

@Daniel-dev22 Is there any tool that I can use to flash the stick that doesn't expect to find a zwave controller? Zwave PC Controller won't let me even try and the same goes with ZWave JS UI. I searched Simplicity Studio but did not see anything... but I am also not sure where to look.

@robertsLando robertsLando transferred this issue from zwave-js/zwave-js-ui Jan 25, 2023
@zwave-js-bot zwave-js-bot added this to Needs triage in Triage Jan 25, 2023
@AlCalzone
Copy link
Member

I waited for this to happen 🙈
Will make a small utility later today or tomorrow.

@robertsLando
Copy link
Member

robertsLando commented Jan 25, 2023

Try to enable boot loader only mode under zwave settings and then trigger the fwupdate

2023-01-25_18-11

@alexruffell
Copy link
Author

alexruffell commented Jan 25, 2023

@robertsLando I can't get to the OTW Update function as nothing shows up in the list of devices:
image

After remembering to save the changed bootloader setting:

image

@robertsLando
Copy link
Member

Shit 😅 the only way would be to do it using a driver function, the file should be placed on store dir and the function should read the file from there... @AlCalzone could you provide that function code? Otherwise I will do it tomorrow, not on my pc right now

@alexruffell
Copy link
Author

alexruffell commented Jan 25, 2023

@robertsLando That may be an issue as well because ZWave JS UI is installed as an add-on to Home Assistant so I don't believe I have any access to the docker it is installed in.

I ordered a new s7 arriving tomorrow and am hoping I can restore the NVM to the new one to get back up and running. I'll need to research the exact process to avoid messing that one up too. Then I can set up some other system to restore this one if at all possible.

I also purchased a Zooz 800 series controller. If easy to migrate I might just do that.

@AlCalzone
Copy link
Member

You can upload a file using the built in store browser, but I have no idea how to access it from a driver function.

@alexruffell
Copy link
Author

@AlCalzone - Is there a flasher tool I can use in a Linux VM or even better in Windows?

@AlCalzone
Copy link
Member

In an hour or so. Will ping you when it is done

@alexruffell
Copy link
Author

In an hour or so. Will ping you when it is done

I restored the NVM to a Zooz 700 series stick I had and my network is back up. I will likely migrate t the 800 series stick when it arrives this weekend since it was insanely easy (Thank you devs!!!). When the tool is ready, I will try to restore the s7 stick, but there is no rush. Either way, thank you!

@AlCalzone
Copy link
Member

Unfortunately, migrating to 800 series won't be as easy, since the NVM backup/restore doesn't work there (yet?).
As for the flasher utility, this is now live:
https://github.com/zwave-js/node-zwave-js/tree/master/packages/flash#usage

For example:

npx @zwave-js/flash /dev/ttyUSB0 /path/to/firmware.otz

All you need is a somewhat up to date version of Node.js (prefer LTS): https://nodejs.org/en/

@alexruffell
Copy link
Author

alexruffell commented Jan 25, 2023

I will need to learn how to use node.js :(

image

Edit: I saw my typo and repeated the command but I clearly don't understand how to use node.js :)

Edit2: I think I figured it out:
image

@robertsLando
Copy link
Member

@alexruffell YOu don't need to know how to use nodejs but just to install it and run the command @AlCalzone provided to you with the path to the file

@AlCalzone
Copy link
Member

AlCalzone commented Jan 26, 2023

yeah, Node.js just needs to be installed to have the npx executable and to be able to execute the flasher utility (behind the scenes).

That command is meant to be run in the terminal directly, no manually executing node involved.

@robertsLando
Copy link
Member

Zwave-JS UI 8.8.1 will have the action on main advanced actions dialog so the problem would be fixed. Releasing it now...

@alexruffell
Copy link
Author

alexruffell commented Jan 26, 2023

@AlCalzone As seen in the screenshot from yesterday, I figured out that I first needed to install the zwave-js driver using npm and then run the npx command from the powershell and not within the node.js window. I don't know if my syntax is correct but the tool says "Flashing firmware..." and then stops. I left it all night as I had to leave my computer and it did not do anything further. I tried again this morning with the same result. I'll research this a bit more to verify syntax.

@AlCalzone In the syntax you suggested in a post above the firmware has extension otz however the files I have have gbl. Is that ok? The flashing tool gives no feedback on the process and just hangs on "Flashing firmware" no matter what I try. I also tried the previous firmware version file I used when I upgraded to 7.18.3.

image

@AlCalzone
Copy link
Member

You didn't have to install zwave-js separately. Simply run the npx ... command.

To debug what's going on, you can add --verbose to the end of the command. This will show the driver logs in the console.

@alexruffell
Copy link
Author

alexruffell commented Jan 26, 2023

Ok, it is not recovering from bootloader:

image

@AlCalzone This could be due to something I did when trying to recover the controller using Silabs tools. In one of them you can edit basic settings on the serial chip and while I was not trying to do so, I may have reset them. Today I should get a new one and I can see what those settings are and correct them on this controller to see if it fixes the issue... unless you know why it may be failing for some other reason.

image

@AlCalzone
Copy link
Member

Hmm so it's seeing the bootloader menu but apparently the flashing process does not start. Unfortunately that communication is hidden in the logs. I will see if I can show that too in verbose mode.

@AlCalzone
Copy link
Member

AlCalzone commented Jan 26, 2023

Ok new version is out. Please use this command to make sure you get the newest version:

npx @zwave-js/flash@latest ... --verbose

(replace ... with your port and filename).

Now we should see what is happening.

@alexruffell
Copy link
Author

image

@AlCalzone
Copy link
Member

damn, that's not much...
let me try to replicate this situation - I already ended up there once

@AlCalzone
Copy link
Member

Ok, can reproduce :)
Next release should solve this.

@AlCalzone
Copy link
Member

Try again in 20 minutes or so when the release is done.

@alexruffell
Copy link
Author

alexruffell commented Jan 26, 2023

image

image

@AlCalzone You did it! The s7 is now recognized as a zwave controller in Zwave PC Controller 5 and it shows me the node info etc. I have not tested it on HA as I have already moved to a different controller I had but I could plug it into a test instance of HA if you need me to do further verification. I am really tempted to migrate to the 800 series Zooz coordinator I purchased when this one failed (arriving Saturday). Even if just a "yes" or "no" do I have anything to gain doing it now?

Thanks a lot for all your help!

@AlCalzone
Copy link
Member

Honestly I'm not sure what you gain with 800 series right now. If you really want to, you'll have to migrate the old fashioned way by excluding devices on the old controller and including them on the new one.

@alexruffell
Copy link
Author

alexruffell commented Jan 26, 2023

I am always looking to improve range and network bandwidth to eliminate issues that have plagued me for years. I am just not clear on whether a subset of 800 series improvements can benefit older series as well, or whether all improvements require the nodes to be 800 series as well. I'll do my research... Thanks again!

Triage automation moved this from Needs triage to Closed Jan 26, 2023
@alexruffell
Copy link
Author

@AlCalzone - Regarding the cause of this issue, could it be that HA was flooding the controller with zwave traffic during the OTW update? I recall seeing a timeout error before the bootloader one. If so, then would there be some mechanism to stop HA from sending traffic to the controller during an OTW update other than manually disabling the websocket?

@AlCalzone
Copy link
Member

The zwave-js driver pauses the send queue when doing OTW updates, since those messages can't be sent anyways while in the bootloader.

Maybe Z-UI did something that caused an error to be thrown and killed the process, but I guess we won't know what unless you reproduce it while making log files.
The way I got my Stick stuck in the bootloader was by pulling it out during the update.

@alexruffell
Copy link
Author

alexruffell commented Jan 26, 2023

@AlCalzone Turns out I was saving logs to file...

[redacted]

While I can't make sense of it, I did find the spot where it failed the OTW. Hopefully it helps you troubleshoot what happened. Please remove the file, or let me know to do so, if it has info that should not be on here.

BTW - is having it log to file bogging things down?

@AlCalzone
Copy link
Member

Uhh that is indeed helpful. I'll leave the gist here for my later reference, then delete your file:

Firmware update starts in the middle of an API command

2023-01-25T15:23:19.912Z DRIVER » [Node 120] [REQ] [SendDataBridge]
                                  │ source node id:   1
                                  │ transmit options: 0x24
                                  │ callback id:      0
                                  └─[Security2CCMessageEncapsulation]
                                    │ sequence number: 17
                                    └─[SupervisionCCReport]
                                        session id:          63
                                        more updates follow: false
                                        status:              Success
                                        duration:            0s
2023-01-25T15:23:19.919Z SERIAL « [ACK]                                                                   (0x06)
2023-01-25T15:23:19.919Z SERIAL « 0x010401a90152                                                       (6 bytes)
2023-01-25T15:23:19.920Z SERIAL » [ACK]                                                                   (0x06)
2023-01-25T15:23:19.920Z DRIVER « [RES] [SendDataBridge]
                                    was sent: true
2023-01-25T15:23:25.943Z SERIAL « 0x010c00a80001a103250300009a44                                      (14 bytes)
2023-01-25T15:23:25.944Z SERIAL » [ACK]                                                                   (0x06)
2023-01-25T15:23:25.945Z CNTRLR   [Node 161] [~] [Binary Switch] currentValue: false => false       [Endpoint 0]
2023-01-25T15:23:25.945Z DRIVER « [Node 161] [REQ] [BridgeApplicationCommand]
                                  │ RSSI: -102 dBm
                                  └─[BinarySwitchCCReport]
                                      current value: false
2023-01-25T15:23:29.982Z SERIAL « 0x010c00a80001a103250300009947                                      (14 bytes)
2023-01-25T15:23:29.983Z SERIAL » [ACK]                                                                   (0x06)
2023-01-25T15:23:29.985Z CNTRLR   [Node 161] [~] [Binary Switch] currentValue: false => false       [Endpoint 0]
2023-01-25T15:23:29.986Z DRIVER « [Node 161] [REQ] [BridgeApplicationCommand]
                                  │ RSSI: -103 dBm
                                  └─[BinarySwitchCCReport]
                                      current value: false
2023-01-25T15:23:34.058Z SERIAL « 0x010c00a80001a103250300009846                                      (14 bytes)
2023-01-25T15:23:34.059Z SERIAL » [ACK]                                                                   (0x06)
2023-01-25T15:23:34.059Z CNTRLR   [Node 161] [~] [Binary Switch] currentValue: false => false       [Endpoint 0]
2023-01-25T15:23:34.060Z DRIVER « [Node 161] [REQ] [BridgeApplicationCommand]
                                  │ RSSI: -104 dBm
                                  └─[BinarySwitchCCReport]
                                      current value: false
2023-01-25T15:23:41.376Z CNTRLR   Entering bootloader...
2023-01-25T15:23:41.376Z CNTRLR   Performing soft reset...
2023-01-25T15:23:41.379Z SERIAL » 0x01030008f4                                                         (5 bytes)
2023-01-25T15:23:41.380Z DRIVER » [REQ] [SoftReset]
2023-01-25T15:23:41.383Z SERIAL « [ACK]                                                                   (0x06)
2023-01-25T15:23:41.385Z CNTRLR   Waiting for the controller to reconnect...
2023-01-25T15:23:42.632Z SERIAL « 0x0119000a03000102010f5e86727322855970565a989f6c557400f3            (27 bytes)
2023-01-25T15:23:42.633Z SERIAL » [ACK]                                                                   (0x06)
2023-01-25T15:23:42.633Z DRIVER « [REQ] [SerialAPIStarted]
                                    wake up reason:        WatchdogReset
                                    watchdog enabled:      false
                                    generic device class:  0x02
                                    specific device class: 0x01
                                    always listening:      false
                                    supports Long Range:   false
2023-01-25T15:23:42.635Z CNTRLR   reconnected and restarted
2023-01-25T15:23:42.635Z CNTRLR   Enabling Smart Start listening mode...
2023-01-25T15:23:42.637Z SERIAL » 0x01030027db                                                         (5 bytes)
2023-01-25T15:23:42.638Z SERIAL » 0x0105004a4900f9                                                     (7 bytes)
2023-01-25T15:23:42.638Z DRIVER » [REQ] [AddNodeToNetwork]
                                    action: Enable Smart Start listening mode
2023-01-25T15:23:42.651Z SERIAL « 0x06                                                                 (1 bytes)
2023-01-25T15:23:42.653Z CNTRLR   [BOOTLOADER] version 1.7.1
2023-01-25T15:23:42.653Z CNTRLR   Entered bootloader
2023-01-25T15:23:42.653Z CNTRLR   Beginning firmware upload
2023-01-25T15:23:42.654Z SERIAL » 0x31                                                                 (1 bytes)
2023-01-25T15:23:42.658Z CNTRLR   [BOOTLOADER] begin upload
2023-01-25T15:23:42.663Z SERIAL « 0x43                                                                 (1 bytes)
2023-01-25T15:23:42.663Z CNTRLR   [BOOTLOADER] awaiting input...
2023-01-25T15:23:42.664Z SERIAL » 0x0101feeb17a603080000000000000301010000fa0606fa100000005b740200fc (133 bytes)
                                  8a2392405a567d8f5e7e1ef90707f924000000434541a19c1c3c1e26ca4509425877c1d2e181bd
                                  dc51c7038c8d75f3b6f3473dc156f3e8f90707f937740200edb2f82883a63d625f94eb641e8180
                                  3b2723e49087a544f24fb2a85541adab3e3b164aedf5a9
2023-01-25T15:23:42.708Z SERIAL « 0x06                                           
...

API command times out:
2023-01-25T15:23:43.640Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2023-01-25T15:23:43.641Z SERIAL « 0x06                                                                 (1 bytes)
2023-01-25T15:23:43.641Z SERIAL » 0x012ed1c71b84076ccaf7509a24e8d4e3d312e6429c36eea329a5f54579751d63 (133 bytes)
                                  32299fff5f67e057cce725f2564c848ca850aa8beb4c566ddd713d5fb4a61dd9178df180866c7d
                                  a2072ee64dd18c5abea6a66eb026a46d307c9cb9288afe97a623e639bef2bce70848f21fec3877
                                  85171f447f9ddb1a654643173aa4cff7935591503103b9

...
retry messes up the update:
2023-01-25T15:23:43.738Z SERIAL » 0x0133ccc0061acdeeceb2bb696b717d3f0b6f73cb7a985feeab8f74107432ef75 (133 bytes)
                                  499da8cbc2441682ed82589871b3567a1f5d6cb16b9f4b27b0d7c2e1644dc69349e718aa228521
                                  0c14eb4b9143f75a7eccbcb04c1fa58f30fc019a574af6e0fc9561307760532f3ea2dc4f541f9c
                                  048d179e9d8114c4de7be85f91d6fd258fee7c982844a3
2023-01-25T15:23:43.742Z SERIAL » 0x0105004a4900f9                                                     (7 bytes)
2023-01-25T15:23:43.742Z DRIVER » [REQ] [AddNodeToNetwork]
                                    action: Enable Smart Start listening mode
2023-01-25T15:23:43.756Z SERIAL « 0x06                                                                 (1 bytes)
2023-01-25T15:23:43.757Z SERIAL » 0x0134cbfd07d4dbcf0c32854f6a4f49180e71b698fe1407b93cda18c951b93c33 (133 bytes)
                                  db2f19a0317bc91b9c9feeef8e4557a3311e883597b1f0ea40320a63c70c0bca51b1c1d5d6cf94
                                  9f730f634c2e9e0e6f05b462d2df727f90c4745e3ed3bfb7a659dd316bf4ab2e03ab8cf9089d68
                                  5ff84caa3402ffefadc774a2f9b9283b22c7a1bdde3b33
2023-01-25T15:23:43.772Z SERIAL « 0x15                                                                 (1 bytes)
2023-01-25T15:23:43.772Z SERIAL « 0x18                                                                 (1 bytes)
2023-01-25T15:23:43.772Z SERIAL « 0x18                                                                 (1 bytes)
2023-01-25T15:23:43.772Z SERIAL « 0x18                                                                 (1 bytes)
2023-01-25T15:23:43.772Z SERIAL » 0x0134cbfd07d4dbcf0c32854f6a4f49180e71b698fe1407b93cda18c951b93c33 (133 bytes)
                                  db2f19a0317bc91b9c9feeef8e4557a3311e883597b1f0ea40320a63c70c0bca51b1c1d5d6cf94
                                  9f730f634c2e9e0e6f05b462d2df727f90c4745e3ed3bfb7a659dd316bf4ab2e03ab8cf9089d68
                                  5ff84caa3402ffefadc774a2f9b9283b22c7a1bdde3b33
2023-01-25T15:23:43.785Z CNTRLR   [BOOTLOADER] Serial upload aborted
2023-01-25T15:23:43.785Z CNTRLR   [BOOTLOADER] XModem block error 0x21
2023-01-25T15:23:44.743Z CNTRLR   Failed to execute controller command after 2/3 attempts. Scheduling next try i
                                  n 1100 ms.
2023-01-25T15:23:44.773Z CNTRLR   OTW update failed: The bootloader did not acknowledge the start of transfer.
2023-01-25T15:23:44.773Z CNTRLR   Leaving bootloader...
2023-01-25T15:23:44.774Z SERIAL » 0x32                                                                 (1 bytes)
2023-01-25T15:23:44.774Z CNTRLR   Restarting driver after OTW update...
2023-01-25T15:23:44.774Z DRIVER   destroying driver instance...
2023-01-25T15:23:44.833Z DRIVER   driver instance destroyed

@AlCalzone AlCalzone reopened this Jan 26, 2023
Triage automation moved this from Closed to Needs triage Jan 26, 2023
@AlCalzone AlCalzone changed the title Updating Aeotec S7 to 7.19.0 using Zwave JS UI disabled my controller Discard all pending transactions when starting OTW update, do additional checks not to communicate with the controller Jan 26, 2023
@AlCalzone AlCalzone self-assigned this Jan 26, 2023
@AlCalzone AlCalzone moved this from Needs triage to High priority in Triage Jan 26, 2023
Triage automation moved this from High priority to Closed Jan 27, 2023
@higs76 higs76 mentioned this issue Aug 4, 2023
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Triage
Closed
Development

Successfully merging a pull request may close this issue.

4 participants