This repository has been archived by the owner on Feb 24, 2021. It is now read-only.

[bug] Struggling to migrate from HA #172

Closed
2 tasks
Dinth opened this issue Dec 13, 2019 · 28 comments
Assignees
Labels
bug Something isn't working

Comments

@Dinth
Contributor

Dinth commented Dec 13, 2019

Version

Build/Run method

  • [ x ] Docker
  • PKG
  • Manually built (git clone - npm install - npm run build )

Zwave2Mqtt version: 2.0.6-dev
Openzwave Version: 1.6-962

Describe the bug
I decided to try moving my Z-Wave controller from Home Assistant to the Zwave2Mqtt Docker image, version 2.0.6-dev. I didn't want to migrate zwcfg, as one of the main reasons I wanted to change the controller is that half of my devices have no config or an incorrect config in OZW 1.4. I assumed that Zwave2Mqtt would detect the devices included in the mesh and fetch their latest OZW 1.6 configs anyway.

Unfortunately, half of my devices are not detected at all and just show "Unknown: type=0000, id=0000 (Unknown: id=0000)", sometimes with a correct "Device type" field and sometimes with an empty one.

It's completely random: for example, half of my radiator valves were detected correctly, but the other half were not. Some mains-powered devices were not detected either.

I'm also getting a lot of OZW errors in my log.

2019-12-13T18:24:27.278Z z2m:Zwave Notification from node 26: Notification - TimeOut (1)
OpenZWave Error, Node026, ERROR: Dropping command, expected response not received after 1 attempt(s)
2019-12-13T18:24:37.279Z z2m:Zwave Notification from node 27: Notification - TimeOut (1)
OpenZWave Error, Node027, ERROR: Dropping command, expected response not received after 1 attempt(s)
OpenZWave Error, Node028, ERROR: ZW_SEND_DATA could not be delivered to Z-Wave stack
2019-12-13T18:24:47.279Z z2m:Zwave Notification from node 28: Notification - TimeOut (1)
OpenZWave Error, Node028, ERROR: Dropping command, expected response not received after 1 attempt(s)
OpenZWave Error, Node029, ERROR: ZW_SEND_DATA could not be delivered to Z-Wave stack
2019-12-13T18:24:50.781Z z2m:Zwave Notification from node 29: Notification - NoOperation (2)
OpenZWave Warning, Node029, WARNING: Device is not a sleeping node.
2019-12-13T18:24:52.095Z z2m:Zwave Notification from node 31: Notification - NoOperation (2)
2019-12-13T18:24:52.236Z z2m:Zwave Notification from node 32: Notification - NoOperation (2)
2019-12-13T18:24:53.635Z z2m:Zwave Notification from node 34: Notification - NoOperation (2)
2019-12-13T18:24:53.786Z z2m:Zwave Notification from node 35: Notification - NoOperation (2)
2019-12-13T18:24:53.852Z z2m:Zwave Notification from node 36: Notification - NoOperation (2)
2019-12-13T18:24:53.936Z z2m:Zwave Notification from node 37: Notification - NoOperation (2)
2019-12-13T18:24:55.458Z z2m:Zwave Notification from node 42: Notification - NoOperation (2)
2019-12-13T18:24:55.516Z z2m:Zwave Notification from node 44: Notification - NoOperation (2)
2019-12-13T18:24:59.032Z z2m:Zwave Notification from node 48: Notification - NoOperation (2)
OpenZWave Warning, Node048, WARNING: Device is not a sleeping node.
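A quick way to see which nodes are timing out versus merely answering a no-op is to tally the notification lines per node. The following is only a minimal sketch: the regular expression is derived from the z2m lines shown in the log above, and the helper names (`tally_notifications`, `nodes_with`) are made up for illustration, not part of Zwave2Mqtt or OpenZWave.

```python
import re
from collections import Counter, defaultdict

# Matches z2m lines such as:
#   "... z2m:Zwave Notification from node 26: Notification - TimeOut (1)"
NOTIFICATION = re.compile(
    r"Notification from node (\d+): Notification - (\w+) \((\d+)\)"
)


def tally_notifications(lines):
    """Count notification types per node id (hypothetical helper)."""
    per_node = defaultdict(Counter)
    for line in lines:
        match = NOTIFICATION.search(line)
        if match:
            node_id = int(match.group(1))
            per_node[node_id][match.group(2)] += 1
    return per_node


def nodes_with(per_node, name):
    """Return the sorted node ids that reported the given notification."""
    return sorted(n for n, counts in per_node.items() if counts[name] > 0)


# Three lines copied from the log above.
sample = [
    "2019-12-13T18:24:27.278Z z2m:Zwave Notification from node 26: Notification - TimeOut (1)",
    "2019-12-13T18:24:37.279Z z2m:Zwave Notification from node 27: Notification - TimeOut (1)",
    "2019-12-13T18:24:50.781Z z2m:Zwave Notification from node 29: Notification - NoOperation (2)",
]
sample_tally = tally_notifications(sample)
print("TimeOut:", nodes_with(sample_tally, "TimeOut"))
print("NoOperation:", nodes_with(sample_tally, "NoOperation"))
```

Feeding the full log through `tally_notifications` instead of `sample` would show whether the timeouts cluster on particular nodes.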

Additional context
I'm not sure if this is relevant, but none of my nodes show up as Secure, although the majority of them were paired securely. That said, I believe the security key in my Zwave2Mqtt config is set correctly: changing it creates even more errors related to encoding.

Screenshot from 2019-12-13 18-14-11
Screenshot from 2019-12-13 18-14-33

@Dinth Dinth added the bug Something isn't working label Dec 13, 2019
@Dinth
Contributor Author

Dinth commented Dec 13, 2019

Here is a full log: https://paste.ubuntu.com/p/X535FtyJJB/

@robertsLando
Member

robertsLando commented Dec 14, 2019 via email

@Dinth
Contributor Author

Dinth commented Dec 14, 2019

Hey Daniel and thanks for a quick reply.

The secure flag does not work properly with all nodes; it relies on a command class that is not supported by all devices, so don't worry if you don't see it set to true.

I need to investigate this further, but Home Assistant was seeing most of my devices as secure.
Here is an excerpt from the zwcfg generated with HA (forked OZW 1.4):

<Node id="60" name="" location="" basic="4" generic="17" specific="1" roletype="5" devicetype="1536" nodetype="0" type="Multilevel Power Switch" listening="true" frequentListening="false" beaming="true" routing="true" max_baud_rate="40000" version="4" secured="true" query_stage="Complete">

After moving to Zwave2Mqtt, literally none of my nodes show as secure in Z2M, not even a door lock.
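To compare what each stack has recorded, the `secured` attribute can be read straight from the config file. This is a rough sketch only: the `<Driver>` wrapper element, the self-closing `Node` tags, and node 61 are assumptions made up to get a parseable sample, and the node 60 attributes are trimmed from the excerpt above. `secured_nodes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def secured_nodes(xml_text):
    """Map node id -> True if the Node element carries secured="true"."""
    root = ET.fromstring(xml_text)
    result = {}
    for elem in root.iter():
        # Drop any "{namespace}" prefix so the check works with or without one.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "Node":
            result[int(elem.get("id"))] = elem.get("secured") == "true"
    return result


# Node 60 is trimmed from the excerpt above; the <Driver> root and
# node 61 (without a secured attribute) are hypothetical.
SAMPLE = """<Driver>
  <Node id="60" type="Multilevel Power Switch" listening="true"
        secured="true" query_stage="Complete" />
  <Node id="61" type="Multilevel Power Switch" listening="true"
        query_stage="Complete" />
</Driver>"""

print(secured_nodes(SAMPLE))
```

Running the same function over the file written by HA and the one written by Z2M would show exactly which nodes lose the flag.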

About the error with the node configs: if you paired them using a different version of OpenZWave there could be a configuration conflict. Have you tried removing and re-adding a node to see if it is paired correctly?

I haven't moved zwcfg across, hoping that OZW will regenerate the configuration using the latest configs. I've done this before in my mesh through HA and all the devices were always detected correctly. Also, I'm not sure re-including all the nodes should be necessary. If I take the stick out of my Docker machine and plug it into a Windows computer running just the Aeotec Z-Wave tool, with no configs etc., it still sees and can talk to all the devices in my mesh, with the correct device types and models detected.

It's also worth noting that if I remove the zwcfg generated by Z2M, forcing it to regenerate the configs, it starts detecting devices correctly until at some point all the devices which haven't been detected yet suddenly become "Unknown node" and errors start showing up in the log.

Re-including all the nodes would be problematic, as many of my mains-powered devices are inaccessible.

@Fishwaldo
Member

Did you copy the network key over? If not that would be the reason.

@Dinth
Contributor Author

Dinth commented Dec 14, 2019

@Fishwaldo Yes, I have, and it is correct: changing it to an incorrect key actually generates a lot of additional errors regarding encryption.

I have started from scratch with the :latest Docker image instead of :2.0.6-dev and it works much better.

  • All the devices are still showing up as paired non-securely, which is weird. I stopped the Z2M container and restarted HA with the Z-Wave component just to be sure, and the majority of my devices show up as paired securely in HA.
  • The discovery/zwcfg building process still gets interrupted at some point, but devices which were not discovered don't get the "Unknown: type=0000; id=0000" label anymore; instead there's no label at all. Most importantly, running "Refresh node info" on those nodes now works and fetches the correct node info, so I can say that "It works!".

Unfortunately, the :latest version is still on OZW 1.4, which doesn't have configs for several of my devices.

@Dinth
Contributor Author

Dinth commented Dec 15, 2019

Just wanted to note something I just found out. When I'm generating the config in OZW 1.6, it detects a few nodes and then stops detecting the rest, even with Refresh Node Info. So I ran Refresh Node Info on a node which was detected correctly in the first run; after manually running Refresh Node Info, it's no longer detected correctly.

Original run, node detected correctly as EUR_SPIRITZ Wall Radiator Thermostat: https://paste.ubuntu.com/p/tRgsZWnvzp/
Running Refresh Node Info after the point in time when OZW 1.6 stops discovering devices correctly:
https://paste.ubuntu.com/p/nVSvqhmVgg/

@Dinth
Contributor Author

Dinth commented Dec 15, 2019

I have logged this issue upstream.
@robertsLando Could you please update Z2M 2.0.6-dev to OZW 1.6-992-g76e21d80? Apparently the discovery process has been changed recently and it would greatly help with troubleshooting.
Also, I've been advised that the OZW log shows that many of my nodes are paired with S0, so it's weird that Z2M doesn't see that.

@Dinth
Contributor Author

Dinth commented Dec 16, 2019

With the help of you and the people on the OZW GitHub I'm finally getting somewhere.
I can still see an issue with Z2M not detecting the Secure status correctly during the initial node interview process (regenerating the OZW cache file). If I manually run "Refresh Node Info" after that, the Secure status changes to Yes.
Here's the latest log: https://paste.ubuntu.com/p/xB3RZZ2bCC/.
The first part of it is the initial interview process, but at the end of the log I ran "Refresh Node Info" on two nodes and the Secure status was detected properly.

@robertsLando
Member

@Dinth Could #174 be a good option to add to fix your problems?

@Dinth
Contributor Author

Dinth commented Dec 16, 2019

Yes, that would be a good workaround. Alternatively, maybe upgrading OZW to the latest build would actually fix this, as apparently the interview process in OZW (which is obviously failing here) has recently been enhanced?

@robertsLando
Member

robertsLando commented Dec 16, 2019 via email

@Dinth
Contributor Author

Dinth commented Dec 18, 2019

That's fine @robertsLando, I will wait; I'm not really proficient at building my own Docker containers.
Thanks and no hurry!

@robertsLando
Member

robertsLando commented Dec 20, 2019

@Dinth @Fishwaldo Could someone point me to the latest stable commit of OpenZWave for a production env? The latest release on http://old.openzwave.com/downloads/ is outdated now.

@robertsLando
Member

@Dinth Could you try the latest commit?

@petergebruers

petergebruers commented Dec 21, 2019

Could someone point me to the latest stable commit of OpenZWave for a production env? The latest release on http://old.openzwave.com/downloads/ is outdated now.

Yeah, build server issues (Justin is migrating stuff) - edit: only with pull requests, the releases are fine. I saw your issue about the zipped releases being old. Sorry about the delay, it is a busy time of year and I am only a volunteer...

OpenZWave/open-zwave#2042

"[feat] Tag stable commits more frequently with OZW version"

I think it is worth trying current master (edit: December 9); it has bugfixes and improvements (commit 76e21d80a "Add Kwikset Convert").

Interesting commits on master are:

  • Don't request Instances if Multichannel is "after mark"
  • Refuse to add 255 as a NodeID to any Association. Issue #1883 and Serialize the Initilization Sequence, and Bail out if we have a invalid HomeID or Controller NodeID (fixes a rare bug)
  • Make flock(m_hSerialController... actually do something (#2026) - refuses to start if another app has flock-ed the controller.

BTW I develop on macOS Catalina and I also tested the build(s) on an RPi 2 on a 10-node network. I occasionally develop and test on Windows as well.

As Justin said... http://old.openzwave.com/downloads/ is your best source and it is not outdated.

@Dinth
Contributor Author

Dinth commented Jan 15, 2020

Hey. Thanks for all the help with this.
I was advised to rebuild my Z-Wave network from scratch, and so I did. I have replaced some older nodes with new ones, bought a few more, got a Fibaro HC to update the firmware on my Fibaro nodes, and started from scratch.
So far I have done only about 1/3 of my house, starting from the side where the controller is, so all the nodes are fairly close to each other and there are almost no walls between them. Unfortunately I'm already starting to have problems. I'm only getting about one "CAN received" error per day so far, so that shouldn't be the issue.
I was wondering if someone could have another look at my OZW log please; these Z-Wave problems are really killing me mentally.

The time it takes to properly find all the nodes (showing the "Product" field in Z2M) has greatly increased after adding the last 4 nodes. With a dozen nodes it took just a few seconds; after adding the last 4 it increased to several minutes. But this is a minor one, as in theory I shouldn't need to restart Z2M very often.
The time it takes nodes to respond when I use a switch varies greatly. Sometimes it's instant, sometimes it takes several seconds.
But the biggest issue is that the whole Z-Wave network completely stops working after a few hours (usually around 24).
In the pasted log it's visible on the 15th of Jan around 22:05. It seems that the network worked fine until then. At 22:08 I tried several times to switch off one of the lights (Node006) but it didn't work. At 22:09 I ran the network heal. I restarted the Z2M container at 22:30, and literally no new messages were received by OZW until that time. After restarting the Z2M container everything worked fine, so I don't think it's a hardware issue.
http://s000.tinyupload.com/index.php?file_id=26366815847530911540

@robertsLando
Member

robertsLando commented Jan 16, 2020

@Fishwaldo Is this a known issue? @Dinth With the latest version, 2.1.0, you can set an auto-heal task that runs daily at a specific time; could you give that a try?

I was wondering if someone could have another look at my OZW log please; these Z-Wave problems are really killing me mentally.

I know that feeling. I set up an entire building with 200+ Z-Wave devices and had to use 20 Raspberry Pis as gateways running Z2M. The main reason for this decision was the distance between devices, but also because I noticed that a network of 10+ devices was very slow and sometimes stopped responding altogether, like you said.

In my case I'm running OZW 1.4 on all the Raspberry Pis, so this may be fixed in 1.6 (?), but I will never update them, as it would take me at least 3-4 days, with the risk that things stop working altogether.

@Dinth
Contributor Author

Dinth commented Jan 16, 2020

Ah yes, I have already updated to 2.1.0 and enabled autoheal.
I will try to switch it off and see what happens.

@Fishwaldo
Member

Auto heal is possibly the worst thing to do. Please disable it and remove the option altogether.

You should only heal when adding/removing devices or physically moving a mains-powered device.

@Dinth
Contributor Author

Dinth commented Jan 17, 2020

Sorry, I have just noticed that I never actually saved my config after enabling autoheal, so it was never enabled. I will keep it off as per @Fishwaldo's advice. Autoheal shouldn't make much difference for me anyway, as according to the diagram below every node sees all the other nodes.
Screenshot from 2020-01-17 08-09-03

@robertsLando
Member

robertsLando commented Jan 17, 2020

@Dinth Love that mesh graph 😍. Dunno why, but on all my instances all devices only see the controller even if they are close (no battery devices). @Dinth Would you mind sending a PR to update the mesh graph screenshot? Or just add it here and I will do it on my own... Just send the same image with the entire interface.

@Fishwaldo
Member

@Dinth
Contributor Author

Dinth commented Jan 17, 2020

Thanks @Fishwaldo. Could you please say a little bit more about those extended statistics? That sounds like something that could help me troubleshoot my issues, and Z2M is already based on OZW 1.6, so those should be accessible somehow?

@Fishwaldo
Member

You have to have a fairly new stick (now 700 series and not the zstick S2). And you should see in the logs a message about TxStatus very often. The actual details are published as statistics for each node. (The route a packet took, speed, rssi levels etc).

See http://www.openzwave.com/dev/classOpenZWave_1_1Manager.html#a588a7e060c0aa312b00082d5c2683f73 and http://www.openzwave.com/dev/structOpenZWave_1_1Node_1_1NodeData.html for details.

I don’t know if Z2m exposes it, but it’s in your OZW logs.

@robertsLando
Member

robertsLando commented Jan 18, 2020 via email

@Dinth
Contributor Author

Dinth commented Jan 23, 2020

Thanks for all your help @robertsLando @Fishwaldo. I think this issue can be closed: I have completed the migration from HA, and after rebuilding my network it works decently. To carry on troubleshooting my network's stability problems I need to wait for Aeotec to release a new 700 series controller, and I have opened separate issues for the bugs I have encountered.
Thanks again!

@Dinth Dinth closed this as completed Jan 23, 2020
@darkbasic
Contributor

darkbasic commented Oct 25, 2020

Regarding nodes showing Unknown: type=0000, id=0000 (Unknown: id=0000), I think this is due to a communication error that occurred when OpenZWave queried the node. The solution is to stop zwave2mqtt, remove the node from ozwcache_xxx.xml and restart it to let it requery the node. Maybe "Refresh node info" could help as well, but I didn't try it.

Regarding secure nodes not showing as such, on the other hand, "Refresh node info" didn't help at all.
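Removing a node from the cache file by hand can also be scripted. The sketch below is an illustration under assumptions rather than a tested procedure: it assumes the cache stores each node as a `Node` element with an `id` attribute (as in the zwcfg excerpt earlier in this thread), the `<Driver>` sample is a hypothetical stand-in for a real cache file, and `remove_node` is a made-up helper name. Stop zwave2mqtt and work on a backup copy first; note also that ElementTree may rewrite namespace prefixes when serializing.

```python
import xml.etree.ElementTree as ET


def remove_node(xml_text, node_id):
    """Return (new_xml, removed_count) with the matching Node elements dropped."""
    root = ET.fromstring(xml_text)
    removed = 0
    # Snapshot the elements first so removal does not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            tag = child.tag.rsplit("}", 1)[-1]  # ignore any namespace prefix
            if tag == "Node" and child.get("id") == str(node_id):
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


# Hypothetical stand-in for the contents of a cache file.
SAMPLE_CACHE = '<Driver><Node id="26" /><Node id="27" /></Driver>'

new_xml, removed = remove_node(SAMPLE_CACHE, 26)
print(removed, new_xml)
```

After writing the result back and restarting zwave2mqtt, the removed node should be queried again from scratch, as described above.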

@robertsLando
Member

Hopefully this will no longer happen after the transition to the new lib.

5 participants