(Zooz 700 series) controller shows no neighbors #3810

Closed
3 of 11 tasks
Eriner opened this issue Nov 30, 2021 · 35 comments

@Eriner

Eriner commented Nov 30, 2021

Is your problem within Home Assistant (Core or Z-Wave JS Integration)?

NO, my problem is NOT within Home Assistant or the ZWave JS integration

Is your problem within ZWaveJS2MQTT?

NO, my problem is NOT within ZWaveJS2MQTT

Checklist

Describe the bug

Issue is similar to #2335: flat graph and controller has no neighbors.

Device information

"manufacturer": "Silicon Labs",
"manufacturerId": 0,
"label": "ZST10-700",
"description": "700 Series-based Controller",

How are you using node-zwave-js?

  • zwavejs2mqtt Docker image (latest)
  • zwavejs2mqtt Docker image (dev)
  • zwavejs2mqtt Docker manually built (please specify branches)
  • ioBroker.zwave2 adapter (please specify version)
  • HomeAssistant zwave_js integration (please specify version)
  • pkg
  • node-red-contrib-zwave-js (please specify version, double click node to find out)
  • Manually built from GitHub (please specify branch)
  • Other (please describe)

Which branches or versions?

Homeassistant zwavejs2mqtt (addon docker wrapper thing).

zwavejs 8.8.2
zwavejs2mqtt 6.0.2

Did you change anything?

yes (please describe)

If yes, what did you change?

Previously was using an older aeotec 500 stick, no issues.

Excluded devices, added new Zooz 700 series stick, reincluded devices starting with the powered devices in closest physical proximity. I understand that RF doesn't behave in a straight line, but I did the best I could. This went fine.

Some point down the line, when adding the battery powered devices, the controller seems to have "lost" its neighbors. The network seems to (mostly?) function, but I've stopped adding devices pending investigation of this issue.

I've tried enabling and disabling soft reset. I've tried healing the controller. I've tried re-interviewing the controller. I've tried soft-resetting the controller. I've tried healing the nodes in proximity to the controller manually (rather than batched "heal all"). I've tried different USB ports. I've tried a 6ft extension cable (which my Aeotec stick never used or needed).

Log attached includes one false-start (I fatfingered) and then a full start.

Did this work before?

Yes (please describe)

If yes, where did it work?

No problems when I was using the Aeotec 500 series stick.

Attach Driver Logfile

zwavejs_2021-11-30.log

@zwave-js-bot zwave-js-bot added this to Needs triage in Triage Nov 30, 2021
@Eriner
Author

Eriner commented Nov 30, 2021

Here is the controller's JSON. As mentioned above, the controller shows no neighbors. 😢

{
  "id": 1,
  "name": "",
  "loc": "",
  "values": [],
  "groups": [],
  "neighbors": [],
  "ready": true,
  "available": true,
  "hassDevices": {},
  "failed": false,
  "inited": true,
  "hexId": "0x0000-0x0004-0x0004",
  "dbLink": "https://devices.zwave-js.io/?jumpTo=0x0000:0x0004:0x0004:7.15",
  "manufacturerId": 0,
  "productId": 4,
  "productType": 4,
  "deviceConfig": {
    "filename": "/data/db/devices/0x0000/700_series_controller.json",
    "isEmbedded": true,
    "manufacturer": "Silicon Labs",
    "manufacturerId": 0,
    "label": "ZST10-700",
    "description": "700 Series-based Controller",
    "devices": [
      {
        "productType": 4,
        "productId": 4
      }
    ],
    "firmwareVersion": {
      "min": "0.0",
      "max": "255.255"
    }
  },
  "productLabel": "ZST10-700",
  "productDescription": "700 Series-based Controller",
  "manufacturer": "Silicon Labs",
  "firmwareVersion": "7.15",
  "protocolVersion": 3,
  "nodeType": 0,
  "endpointsCount": 0,
  "endpointIndizes": [],
  "isSecure": false,
  "security": "None",
  "supportsSecurity": false,
  "supportsBeaming": true,
  "isControllerNode": true,
  "isListening": true,
  "isFrequentListening": false,
  "isRouting": true,
  "keepAwake": false,
  "maxDataRate": 100000,
  "deviceClass": {
    "basic": 2,
    "generic": 2,
    "specific": 1
  },
  "deviceId": "0-4-4",
  "status": "Alive",
  "interviewStage": "Complete",
  "lastActive": 1638267648970,
  "isBatteryPowered": false,
  "statistics": {
    "messagesTX": 10,
    "messagesRX": 130,
    "messagesDroppedRX": 0,
    "NAK": 0,
    "CAN": 2,
    "timeoutACK": 0,
    "timeoutResponse": 0,
    "timeoutCallback": 0,
    "messagesDroppedTX": 0
  },
  "_name": "NodeID_1"
}

@Eriner
Author

Eriner commented Nov 30, 2021

The devices that can see the controller at the end of the node list here are all battery powered (and thus won't forward messages):

2021-11-30 06:14:31.074 INFO ZWAVE: Success zwave api call refreshNeighbors {
'1': [ [length]: 0 ],
'4': [ 73, 74, 77, [length]: 3 ],
'6': [
11, 12, 14,
15, 16, 18,
28, 32, 36,
39, 43, 48,
49, 51, 56,
57, 59, 71,
73, 74, 77,
83, [length]: 22
],
'11': [ 6, 73, [length]: 2 ],
'12': [ 6, 73, 74, 77, 81, [length]: 5 ],
'14': [ 6, 15, 19, 32, 35, 39, 59, 71, 73, 74, 77, [length]: 11 ],
'15': [ 6, 14, 19, 28, 32, 39, 59, 71, 73, 74, 77, [length]: 11 ],
'16': [ 6, 39, 73, 74, 77, [length]: 5 ],
'18': [ 6, 28, 71, 73, 74, 77, 81, [length]: 7 ],
'19': [ 14, 15, 31, 32, 68, 69, 71, 73, 74, 77, 81, [length]: 11 ],
'27': [ 74, 77, 81, [length]: 3 ],
'28': [ 6, 15, 18, 32, 62, 63, 73, 74, 81, [length]: 9 ],
'29': [ 39, 74, 77, 81, [length]: 4 ],
'31': [ 19, 74, 77, 81, [length]: 4 ],
'32': [
6, 14,
15, 19,
28, 39,
60, 63,
71, 73,
74, 77,
79, 81,
83, [length]: 15
],
'36': [ 6, 39, 73, 74, [length]: 4 ],
'39': [
1, 6, 11,
14, 15, 16,
29, 32, 36,
43, 48, 49,
51, 56, 57,
59, 71, 73,
74, 77, 81,
[length]: 21
],
'43': [ 6, 34, 39, 71, 73, 74, 77, 83, [length]: 8 ],
'48': [ 6, 39, 73, [length]: 3 ],
'49': [ 6, 34, 39, 53, 71, 73, 74, 77, [length]: 8 ],
'51': [ 6, 39, 53, 73, [length]: 4 ],
'56': [ 6, 39, 71, 73, 83, [length]: 5 ],
'57': [ 34, 55, 71, 73, 74, 77, [length]: 6 ],
'59': [ 14, 15, 35, 71, 73, 74, 77, 81, [length]: 8 ],
'60': [ 32, [length]: 1 ],
'61': [ [length]: 0 ],
'62': [ 28, [length]: 1 ],
'63': [ 28, 32, [length]: 2 ],
'66': [ [length]: 0 ],
'67': [ 1, [length]: 1 ],
'68': [ 19, [length]: 1 ],
'69': [ 19, [length]: 1 ],
'70': [ 1, 14, [length]: 2 ],
'71': [ 1, 14, 15, 18, 35, 39, 43, 49, 56, 57, 59, 72, 73, [length]: 13 ],
'72': [ 1, 71, [length]: 2 ],
'73': [
6, 11, 12,
14, 15, 16,
18, 19, 28,
32, 35, 36,
39, 43, 48,
49, 51, 56,
57, 59, 71,
74, [length]: 22
],
'74': [
4, 6, 12,
14, 15, 16,
18, 19, 27,
28, 29, 31,
32, 35, 36,
39, 49, 57,
59, 71, 73,
[length]: 21
],
'77': [
4, 12,
14, 15,
16, 18,
19, 27,
29, 31,
32, 43,
49, 57,
59, 74,
78, 79,
[length]: 18
],
'78': [ 15, 74, 77, [length]: 3 ],
'79': [ 32, 71, 74, 77, [length]: 4 ],
'81': [
12, 14,
15, 18,
19, 27,
28, 29,
31, 32,
39, 49,
57, 59,
77, 84,
[length]: 16
],
'82': [ 73, [length]: 1 ],
'83': [ 6, 14, 15, 32, 43, 49, 56, 57, 59, 71, 73, 74, [length]: 12 ],
'84': [ 71, 73, 74, 77, 81, [length]: 5 ],
'85': [ 71, 74, [length]: 2 ],
'86': [ 71, 73, 74, 77, 81, [length]: 5 ]
}
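
One way to read the dump above is to look for one-sided links: nodes that list the controller (node 1) as a neighbor while the controller lists nobody. The following is a minimal Python sketch over an excerpt of the map copied from the log. The helper names `nodes_reporting` and `one_sided_links` are made up for illustration and are not part of Z-Wave JS or zwavejs2mqtt.

```python
# Excerpt of the refreshNeighbors result above (node id -> reported neighbors).
NEIGHBORS = {
    1: [],
    4: [73, 74, 77],
    39: [1, 6, 11, 14, 15, 16, 29, 32, 36, 43, 48, 49, 51, 56, 57, 59,
         71, 73, 74, 77, 81],
    67: [1],
    70: [1, 14],
    71: [1, 14, 15, 18, 35, 39, 43, 49, 56, 57, 59, 72, 73],
    72: [1, 71],
}


def nodes_reporting(neighbors: dict[int, list[int]], target: int) -> list[int]:
    """Return the nodes whose own neighbor list contains `target`."""
    return sorted(node for node, seen in neighbors.items() if target in seen)


def one_sided_links(neighbors: dict[int, list[int]]) -> list[tuple[int, int]]:
    """Return (a, b) pairs where a lists b, b is in the map, but b does not list a."""
    return sorted(
        (a, b)
        for a, seen in neighbors.items()
        for b in seen
        if b in neighbors and a not in neighbors[b]
    )


if __name__ == "__main__":
    print(nodes_reporting(NEIGHBORS, 1))
    print(one_sided_links(NEIGHBORS))
```

In this excerpt every one-sided link points at node 1, which matches the symptom described: other nodes hear the controller, but the controller reports an empty list.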

@AlCalzone
Member

If the controller doesn't "see" any neighbors (or rather won't tell us), I'm afraid there's not much we can do.
If you make an NVM backup and send it to me via e-mail, I could check whether it contains data other than what the controller reports to the driver.

Other than that, you'll have to talk to Zooz and/or Silicon Labs (who make the firmware).

@kars85

kars85 commented Nov 30, 2021

I have the same controller @Eriner - mine seems to be working fine. If you need me to be a control for any testing, let me know.

image

@Eriner
Author

Eriner commented Nov 30, 2021

That is unfortunate.

And I presume it's not helpful, but for the sake of completeness here is the log of the controller failing to heal:

2021-11-30T18:43:06.587Z CNTRLR [Node 001] Healing node...
2021-11-30T18:43:06.588Z CNTRLR [Node 001] healing node...
2021-11-30T18:43:06.588Z CNTRLR » [Node 001] refreshing neighbor list (attempt 1)...
2021-11-30T18:43:06.594Z DRIVER » [Node 001] [REQ] [RequestNodeNeighborUpdate]
callback id: 157
2021-11-30T18:43:51.471Z DRIVER « [REQ] [RequestNodeNeighborUpdate]
callback id: 157
update status: UpdateFailed
2021-11-30T18:43:51.476Z CNTRLR « [Node 001] refreshing neighbor list failed...
2021-11-30T18:43:51.476Z CNTRLR » [Node 001] refreshing neighbor list (attempt 2)...
2021-11-30T18:44:36.637Z DRIVER « [REQ] [RequestNodeNeighborUpdate]
callback id: 159
update status: UpdateFailed
2021-11-30T18:44:36.639Z CNTRLR « [Node 001] refreshing neighbor list failed...
2021-11-30T18:44:36.639Z CNTRLR » [Node 001] refreshing neighbor list (attempt 3)...
2021-11-30T18:44:36.642Z DRIVER » [Node 001] [REQ] [RequestNodeNeighborUpdate]
callback id: 160
2021-11-30T18:45:21.671Z DRIVER « [REQ] [RequestNodeNeighborUpdate]
callback id: 160
update status: UpdateFailed
2021-11-30T18:45:21.674Z CNTRLR « [Node 001] refreshing neighbor list failed...
2021-11-30T18:45:21.675Z CNTRLR » [Node 001] refreshing neighbor list (attempt 4)...
2021-11-30T18:45:21.677Z DRIVER » [Node 001] [REQ] [RequestNodeNeighborUpdate]
callback id: 161
2021-11-30T18:46:51.731Z DRIVER « [REQ] [RequestNodeNeighborUpdate]
callback id: 162
update status: UpdateFailed
2021-11-30T18:46:51.733Z CNTRLR « [Node 001] refreshing neighbor list failed...
2021-11-30T18:46:51.734Z CNTRLR [Node 001] failed to update the neighbor list after 5 attempts, healing failed
2021-11-30 13:46:51.734 INFO ZWAVE: Success zwave api call healNode false
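
The timestamps in the log above suggest that each failed attempt takes about 45 seconds from the outgoing request to the `UpdateFailed` callback. A small Python sketch for measuring that, assuming the log layout shown here (a `DRIVER` line followed by a `callback id:` line). Only the two attempts whose request and callback lines both appear with matching ids are copied in; the function name `callback_delays` is illustrative.

```python
import re
from datetime import datetime

LOG = """\
2021-11-30T18:43:06.594Z DRIVER » [Node 001] [REQ] [RequestNodeNeighborUpdate]
callback id: 157
2021-11-30T18:43:51.471Z DRIVER « [REQ] [RequestNodeNeighborUpdate]
callback id: 157
update status: UpdateFailed
2021-11-30T18:44:36.642Z DRIVER » [Node 001] [REQ] [RequestNodeNeighborUpdate]
callback id: 160
2021-11-30T18:45:21.671Z DRIVER « [REQ] [RequestNodeNeighborUpdate]
callback id: 160
update status: UpdateFailed
"""

LINE = re.compile(r"^(\S+Z) DRIVER ([»«]) .*\[RequestNodeNeighborUpdate\]")
CALLBACK = re.compile(r"^callback id:\s*(\d+)")


def callback_delays(log: str) -> dict[int, float]:
    """Map callback id -> seconds between the outgoing request and its callback."""
    sent: dict[int, datetime] = {}
    delays: dict[int, float] = {}
    pending = None  # (timestamp, direction) of the last DRIVER line seen
    for line in log.splitlines():
        m = LINE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%fZ")
            pending = (ts, m.group(2))
            continue
        c = CALLBACK.match(line.strip())
        if c and pending:
            ts, direction = pending
            cid = int(c.group(1))
            if direction == "»":      # outgoing request
                sent[cid] = ts
            elif cid in sent:         # incoming callback for a known request
                delays[cid] = (ts - sent[cid]).total_seconds()
            pending = None
    return delays


if __name__ == "__main__":
    for cid, seconds in callback_delays(LOG).items():
        print(f"callback {cid}: {seconds:.3f} s")
```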

@AlCalzone I'm going to futz with this a bit more. Will send the nvm backup file along this evening, thanks.

@AlCalzone
Member

UpdateFailed

hints at connectivity issues. Z-Wave sticks in particular are prone to interference by USB ports, especially by USB3 ports. We recommend putting the stick in a suitable location:

  • on a USB extension cord (this works wonders!)
  • away from other USB ports
  • away from metallic surfaces
  • and especially not in the back of a server rack

@Eriner
Author

Eriner commented Nov 30, 2021

As I mentioned in my original issue, I've tried a 6ft extension cord, ultimately moving the device into the middle of my basement. Just for kicks, I just pulled out a powered hub on an extension, and now have it half-way up the staircase, giving it verticality. Right above the staircase is an Inovelli switch (just behind the thin ply door).

However my observation is that the device seems to adequately perform. For example, I was able to upload firmware to one of my close-proximity nodes without issue, whereas the firmware updates fail from further-in-physical-proximity nodes, as expected.

And in all fairness, my previous Aeotec 500 series stick performed great in these specific conditions 😈 :

  • away from other USB ports
  • away from metallic surfaces
  • and especially not in the back of a server rack

I understand that the new 700 series devices don't have the same filters as the 500 series controllers.

@Eriner
Author

Eriner commented Nov 30, 2021

One other difference between this configuration and the previous configuration (with the Aeotec stick) is the whole reason I shelled out for the zooz stick: S2. All of my powered nodes are connected via Authenticated S2 except for one, node 4, which is not in close physical proximity to the controller (it's a zen15 - I moved it outside after setting up the inovelli switches).

@AlCalzone
Member

One other difference between this configuration and the previous configuration (with the Aeotec stick) is the whole reason I shelled out for the zooz stick: S2

I hate to tell you, but you didn't have to switch for that. S2 is handled by the host software (Z-Wave JS). The only thing that was missing until recently is SmartStart support, but Aeotec included that with the 1.02 firmware update.

whereas the firmware updates fail from further-in-physical-proximity nodes, as expected.

It is certainly not an ideal scenario, but firmware updates are not expected to fail on further away nodes.

@Eriner
Author

Eriner commented Nov 30, 2021

firmware updates are not expected to fail on further away nodes

Then this may indeed indicate a degraded network. My understanding was that firmware upgrades couldn't be relayed, and that the device had to be in close proximity to the controller for (direct) upgrades. Regardless, my controller can perform high-throughput firmware upgrades to some nodes that are also not listed as neighbors.

you didn't have to switch for that

IIRC, the stick I had was the older gen5 aeotec stick, not the gen5+. IDK if that would have mattered, but something I read must have made me think I needed something newer. Maybe a firmware update would have sufficed... regardless here I am.

@AlCalzone
Member

My understanding was that firmware upgrades couldn't be relayed

They use the same communication method as all the other commands. The only downside is that you're sending roughly 2000-3000 commands in quick succession, which can get problematic for the rest of the communication in the network while it is going on. Being closer means this is over quicker.

@Eriner
Author

Eriner commented Dec 1, 2021

Ah, good to know, thanks!

@kpine
Contributor

kpine commented Dec 1, 2021

@Eriner I was reading your last comment in the other issue... How many nodes are in your network? Do you think you have a busy network?

I came across a Reddit post from the HomeSeer developers, and they claim that the 700-series are broken for "networks that have a lot of traffic". They are currently not recommending 700-series yet, even their own G3 controller.

That latter statement is kind of hard to quantify (what is "a lot" of traffic?), but since you've said you didn't have any problems with the 500-series controller, I'm wondering if you're hitting that problem. Not sure how to even confirm if that's the case.

I personally have no issues with my 700 controller, but my network is pretty small (less than 20 nodes) and mostly idle.

@AlCalzone
Member

claim that the 700-series are broken for "networks that have a lot of traffic"

That's consistent with some of the things I've seen. Earlier firmwares failed to discover neighbors after a certain size (part of healing), maybe this is a different manifestation of the same problem.

@jmgiaever your network is pretty large too, right?

@Eriner
Author

Eriner commented Dec 1, 2021

I would think the amount of traffic would vary with connectivity - more devices that require relaying means more traffic.

To emphasize my plight, when using my aeotec stick in the back of my server rack, in my basement, connected without an extension cable straight into a USB3 port.... all but a few devices were neighbors of the controller and connected directly, no hops.

@AlCalzone After using multiple extensions, powered hubs, etc., I can get my zooz 700 controller to see the Inovelli switches that are now only ~15 ft away. But it can't see/neighbor with the switches that are 20ft away. Nothing about this seems right - as I've stated before, my aeotec stick worked great in a worse environment (no extensions, server rack, basement) and this zooz stick is just eating dirt.

Again, a difference here being that the inovelli switches (my powered relays) are now all using authenticated S2. My only other powered relay (the zen15) is connected insecurely.

To directly answer your "large or not" question, I have:

5x BE469ZP Schlage locks
5x DMSV1/DMWS1 Dome water leak sensors and shutoff
22x LZW30-SN Inovelli Red On/Off switches
5x LZW31-SN Inovelli Red Dimmer switches
5x Ecolink Motion Sensors
2x Ecolink Tilt Sensors
8x ZCOMBO smoke detectors
1x Zen15 Zooz power switch

53 total devices.

@AlCalzone
Member

difference here being that the inovelli switches (my powered relays) are now all using authenticated S2

That doesn't make a difference. Encryption happens one level higher than we're looking at here.

53 total devices

That's on the larger side already (definitely larger than the majority of Z-Wave JS users).

@Eriner
Author

Eriner commented Dec 1, 2021

That's on the larger side already (definitely larger than the majority of Z-Wave JS users).

Interesting. I'm a pretty early adopter, have grown the network over two moves 😅

Does the number of nodes really matter that much if the majority are powered, zwave plus nodes? With my aeotec stick everything was working great, zero problems ever.

@AlCalzone
Member

Does the number of nodes really matter that much if the majority are powered, zwave plus nodes?

It shouldn't, but apparently it does.

@Eriner
Author

Eriner commented Dec 1, 2021

~90 second video showing my layout that should explain why things are surely amiss somewhere...

https://noagendatube.com/videos/watch/f5a38945-8b33-4e6d-8a80-6123a78fee02

@kpine
Contributor

kpine commented Dec 1, 2021

If it were me, I would consider migrating back to the Aeotec, and see if the problem is solved. If you upgrade the firmware to 1.02 first, you'll get SmartStart support. At that point, there's not much difference in functionality between it and a 700-series controller. If that works, then you'll have a better user experience and it will probably prove that there is an issue with the 700-series. If it doesn't work, it was a big waste of time. It's no picnic to migrate a network of your size, so I feel a little bad even suggesting it. 😢 At least SmartStart makes it easier.

Otherwise, if it really is a problem with the controller, you'll have to wait for some firmware upgrade with no ETA. If you can tolerate the occasional reset of the controller, then maybe it's worth it to avoid the migration and tough it out for now.

Unless there are any new revelations, those seem like the options for now.

You could also try to remove some nodes and make the network smaller, and see if the problem goes away at some certain node count. Of course, you can't use those nodes unless you include them on the Aeotec in a separate network.

I've watched the product page for the HomeSeer stick bump release dates from October, to November, and now December 14th. Maybe wait until then to make a decision. If they go on sale, then perhaps the problem is fixed. If it's delayed again, I would assume they are waiting for a fix. You can signup for in-stock notifications. 😄

@Eriner
Author

Eriner commented Dec 1, 2021

At least SmartStart makes it easier.

Yeah, I have 37 smartstart entries. I assume there's a JSON file/section somewhere I can pull to back that up locally just in case... THAT part (getting those entries) isn't something I'd like to do again, as it involves removing switchplates.

you'll have to wait for some firmware upgrade with no ETA

I vaguely remember reading somewhere that zooz isn't/won't offer firmware upgrades for controllers, but that may or may not be accurate. Or it could have been for the 500 series controller. C'mon zooz, giving the same model number to your 500 and 700 series controllers? Seriously?

You can signup for in-stock notifications.

I was going to wait for the Aeotec stick, after I had such great success with their Gen5 stick. But a friend said "just buy the zooz one on amazon, it'll be fine..." and I didn't listen to my gut, caved, and here I am.

I'm going to stick around to see if @AlCalzone's revert of some things suddenly fix it. If not, assuming I can back-up the smartstart entries, I'll try the following (in order):

  • hard reset the Zooz stick, in case some coronal mass ejection caused NVM bitflips or something
  • switch back to the Aeotec stick, firmware upgrade, retry
  • buy a new zooz stick to see if I got a lemon
  • wait for an aeotec 700 stick to become available, try with that.

@AlCalzone
Member

AlCalzone commented Dec 1, 2021

I assume there's a JSON file/section somewhere I can pull to back that up locally just in case...

that's in the <homeid>.json file, at the top under the "controller" key. And I think zwavejs2mqtt lets you export/import it.
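
For a local copy that does not depend on the UI, a minimal Python sketch could pull the top-level "controller" object out of that file and write it to a separate backup. The function name and both paths are placeholders for illustration; the exact keys inside "controller" are not shown in this thread, so the sketch copies the whole object rather than picking individual entries.

```python
import json
from pathlib import Path


def backup_controller_section(cache_file: Path, backup_file: Path) -> dict:
    """Copy the top-level "controller" object of a cache file into its own file."""
    data = json.loads(cache_file.read_text(encoding="utf-8"))
    # Fall back to an empty object if the key is missing (assumption: the
    # section lives at the top level, as described above).
    controller = data.get("controller", {})
    backup_file.write_text(json.dumps(controller, indent=2), encoding="utf-8")
    return controller


# Usage (placeholder paths, adjust to your installation):
# backup_controller_section(Path("<homeid>.json"), Path("controller-backup.json"))
```

Stopping zwavejs2mqtt before copying is probably the safer choice, so the file is not rewritten while it is being read.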

zooz isn't/won't offer firmware upgrades for controllers

All 700-series controllers use the same firmware, which is available from Silabs.

see if @AlCalzone's revert of some things suddenly fix it.

You can try building your own container (or just run zwavejs2mqtt natively and update zwave-js there):
https://zwave-js.github.io/zwavejs2mqtt/#/development/custom-docker?id=building-a-container-using-dockerfilecontrib
Use the revert-send-queue-changes branch or install the zwave-js@8.8.3-0-pr-3830-852dbcf version.

@Eriner
Author

Eriner commented Dec 1, 2021

Great on all counts, thanks @AlCalzone.

I'll get that backup handled and see if that branch makes a difference.

@jmgiaever

@jmgiaever your network is pretty large too, right?

Yeah, 56 nodes and I have still maybe 20 to add, but I've waited to do that. Seems to me that the battery devices is causing most problems....

@kpine
Contributor

kpine commented Dec 1, 2021

I was going to wait for the Aeotec stick, after I had such great success with their Gen5 stick. But a friend said "just buy the zooz one on amazon, it'll be fine..." and I didn't listen to my gut, caved, and here I am.

I meant use the "in stock notification" as a signal that maybe the issue has been fixed. I'm not suggesting you buy a new stick. If HomeSeer starts selling their 700-series stick, then that would indicate they think the issue is fixed. Apparently it was for sale earlier, but not now.

I found the post in the HomeSeer forums, here is the start of the conversation from the developer. There is a lot of information in that and following posts, it's nice they are providing updates. Here is the latest status where the issue has been confirmed by SL.

After more discussion with Silicon Labs, the issue is that they now listen to the network and do not transmit until the network is clear. I don't have details on the algorithm they are using. However, there should be a max delay so delays longer than a few seconds should not happen, but it does. I have seen it wait up to 60 seconds on a busy network. Our plugin has a default transmit timeout of 8 seconds. If the transmit takes longer than 8 seconds we issue a transmit abort. However, the transmit abort is not working and the interface can go into a bad state, sometimes locking up totally. To stop this from happening you can set the transmit timeout to 60 seconds (config page in the plugin). While this is too long to wait it does stop the interface from locking up.

If your network is not busy, you most likely will not see any issues. A busy network is one where you have many other devices randomly transmitting, like power reporting devices, temp devices, motion sensors, etc. But only if they transmit often.

We are still working with SIL to get this resolved.

@Eriner Are you seeing command timeouts when your problem starts?

Does one of the Z-Wave JS timeout driver settings correspond to the HomeSeer transmit timeout?

I'm not sure what any of this has to do with the neighbor listings though... is that another symptom resulting from aborted commands?

@Eriner
Author

Eriner commented Dec 1, 2021

I'm not sure what any of this has to do with the neighbor listings though... is that another symptom resulting from aborted commands?

It could be related, not sure. The heal of the controller often simply times out. Sometimes it tries all 5 neighbor updates (actually tries), other times it tries once, then spams the log saying it tried 4 more times (but clearly didn't in ~20ms). Could be the stick is just locked.

Seems to me that the battery devices is causing most problems....

@jmgiaever what makes you say that? I set up all my powered devices first, without issue. Everything worked. Then I started adding battery devices and things became a problem. Have you had an identical experience?

@jmgiaever

I set up all my powered devices first, without issue. Everything worked. Then I started adding battery devices and things became a problem. Have you had an identical experience?

Same. I also had no big issues with healing until then, but now I have the same problem as you. It fails instantly until the z2m is restarted. Sometimes I have to unplug the stick too, but that might be just on a heavy load.

I do have many devices, but now barely a drop (or duplicate command) on a «normal» network. It happens every now and then during heal or interview, but I guess that is normal.

@AlCalzone
Member

It could be related, not sure. The heal of the controller often simply times out. Sometimes it tries all 5 neighbor updates (actually tries), other times it tries once, then spams the log saying it tried 4 more times (but clearly didn't in ~20ms). Could be the stick is just locked.

To issue the healing commands, the controller must be able to send. If it can't (or thinks it can't), it will call back with a failure.

I think both issues are pretty related.

@Eriner
Author

Eriner commented Dec 1, 2021

@jmgiaever on my old network I had all of my devices connected as insecure. For the 700 stick rebuild, I tried adding all of my battery devices as S0. It "worked" but their inclusion state always returned "added node XX as undefined", followed by a message "added node XX with security None" (or something like that).

I'm not sure if that ^ is related to the issue at hand at all or not, just trying to find more potential commonalities.

Edit: Oops, I tagged jamarzka but meant to tag @jmgiaever (which GitHub doesn't tab complete, for some reason). Sorry jamarzka!

@AlCalzone
Member

@Eriner That seems to me like the communication issues were already present directly after inclusion. The exchange of security info is pretty sensitive to dropped messages.

Side note: we don't recommend using S0 unless required by the device type.

@Eriner
Author

Eriner commented Dec 1, 2021

@AlCalzone interesting.

we don't recommend using S0 unless required by the device type.

These devices are all zwave plus without a DSK. I'm adding all of the devices as S0 because I want to force encryption. Is this an incorrect approach?

@AlCalzone
Member

It's your choice. S0 adds Encryption but triples the messages. If a network isn't 100% reliable, you'll see a higher rate of message failures or longer delays.
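
The tripling comes from the S0 exchange itself: before each encrypted command the sender requests a nonce and the receiver returns one, so a single command becomes three frames. A rough Python sketch of the effect on traffic, ignoring ACKs and retries; the report rate is a hypothetical figure, not a measurement from this network.

```python
def frames_for_commands(commands: int, s0: bool) -> int:
    """Frames needed for `commands` commands, ignoring ACKs and retries.

    With S0, each command costs Nonce Get + Nonce Report + encrypted command.
    """
    frames_per_command = 3 if s0 else 1
    return commands * frames_per_command


# Hypothetical load: 200 sensor reports per hour.
reports_per_hour = 200
print(frames_for_commands(reports_per_hour, s0=False))
print(frames_for_commands(reports_per_hour, s0=True))
```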

@Eriner
Author

Eriner commented Dec 1, 2021

Understood. Even if it triples the messages, given my large number of powered plus nodes (which are well distributed) I wouldn't think it would be much of an issue for my use-case. I can try adding them as default/insecure when I rebuild to see if it matters.

@Eriner
Author

Eriner commented Dec 6, 2021

I've followed what's been posted in #3842 and #2545, cheers @kars85 and @kpine.

Anyway, I updated to the latest release (with the revert) and no dice - nothing changed. Tried pulling the stick, healing, etc. Same symptoms of the stick becoming unresponsive.

I upgraded to 7.16.3... firmware upgrade worked, but the network remains borked. Tried manual healing of devices, no dice. :(

@AlCalzone
Member

Duplicate of #3906

@AlCalzone AlCalzone marked this as a duplicate of #3906 Dec 15, 2021
Triage automation moved this from Needs triage to Closed Dec 15, 2021

5 participants