[question] ZwaveJS2MQTT looses connection to controller #2081

Closed
baahver opened this issue Dec 19, 2021 · 84 comments · Fixed by zwave-js/node-zwave-js#3954
Labels
question Further information is requested

Comments

@baahver

baahver commented Dec 19, 2021

zwavejs2mqtt: 6.1.1
zwave-js: 8.9.0-beta.1
running in docker on Ubuntu 21.04
I have had a strange problem since an earlier update of the Docker container.
ZWaveJS2MQTT loses its connection to the controller.
I cannot switch any device in the ZwaveJS2MQTT web interface, nor in Home Assistant.

I can't find anything in the Docker log, nor in the ZWaveJS2MQTT log. It just looks like everything is running.
When I open the Z-Wave settings in the web interface and save the page with the device /dev/serial/by-id/usb-0658_0200-if00
without changing anything, the connection is re-established.
Another way to solve this is: docker stop zwavejs2mqtt, docker start zwavejs2mqtt.

I'm not sure whether the connection to the controller is really lost, but that is the effect I see.
Does this sound familiar to anyone? How do I solve it?

@baahver baahver added the question Further information is requested label Dec 19, 2021
@baahver
Author

baahver commented Dec 19, 2021

I just found something about soft reset, which was implemented a short time ago. I also see the new setting: Soft Reset. It can be disabled. The info there:
> Soft Reset is required after some commands like changing the RF region or restoring an NVM backup. Because it may cause problems in Docker containers with certain Z-Wave sticks, this functionality may be disabled.

Is that the problem I have?
I just disabled it and will now wait to see what happens.

@geekofweek

I seem to be having the same issue and was about to open an issue myself. Devices just randomly stop responding, mostly some of the binary switches and locks. I haven't been able to find anything in the debug logs, and a restart of the container seems to solve it. I did disable soft reset, but just a bit ago I was having the issue again and noticed it was re-enabled. I've been really dumbfounded by this issue for the last week or so.

@geekofweek

@baahver What z stick do you have? I have the Aeotec Z-Stick 7, just looking for some sort of correlation.

@doug62

doug62 commented Dec 20, 2021

I have the EXACT same issues. I have an Aeotec Gen 5, and this started occurring around the time the update below might have been installed. Other notes:

I can most often see neighbors, but at some point they go away.
It works OK, perhaps slowly, after a server reboot, then slowly degrades and loses touch with devices.
A node will report a change to HA and HA will update it in the UI. However, HA can't control the device.
Direct device updates from Z-Wave JS DON'T update devices.
Devices don't respond to PINGS.
Yet devices report status to HA/UI/Z-Wave JS properly.
All/Most z-wave functionality
zwavejs2mqtt: 6.1.1
zwave-js: 8.9.0-beta.1
home id: 3741021741
home hex: 0xdefb762d

@geekofweek

@doug62 do you have soft reset enabled?

@doug62

doug62 commented Dec 20, 2021

@geekofweek - I have that enabled. The description for that feature is vague, so I'm not sure what it would achieve/affect. Thoughts?

@mkurilov

Same issue here with Aeotec Z-Stick 7. Ready to diagnose if someone can tell me what to try.

@baahver
Author

baahver commented Dec 20, 2021

> @baahver What z stick do you have? I have the Aeotec Z-Stick 7, just looking for some sort of correlation.

I have an Aeotec Gen 5. Disabling SoftReset seems to work: this morning it was still working.

@geekofweek

@doug62 agreed, what it does seems vague, but I know it can cause issues within a container. I disabled it to see if I still have the issue.

@baahver
Author

baahver commented Dec 20, 2021

> I did disable soft reset but just a bit ago I was having the issue again and notice it was re-enabled.

@geekofweek Did you forget to hit the save button? ;)

@jmgiaever
Contributor

Hi, yes there's an issue with the 700 series. There's a pinned post on it in the issue tracker of zwavejs.

@jmgiaever
Contributor

@baahver
Author

baahver commented Dec 20, 2021

Are you sure this is related to: AEON Labs | Z‐Stick Gen5 USB Controller | ZW090 | Z-Stick Gen5 too?
usb-0658_0200-if00
That is the stick I use and have trouble with.

@robertsLando
Member

I think it could be due to soft reset, try to disable soft reset from zwave options

@baahver
Author

baahver commented Dec 20, 2021

> I think it could be due to soft reset, try to disable soft reset from zwave options

That is what I said and have already done. So far, disabling SoftReset works.

@robertsLando
Member

Yeah, that was for the other user :) Some sticks simply don't like the soft reset on startup (even though it's defined in the specs), and disabling it doesn't cause many problems. To fix it, you should check whether you are using the serial by-id path for the controller and not a path like /dev/ttyACM0.
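
A quick way to act on the path advice above is a small shell sketch like the following. The two helper names (`is_stable_serial_path`, `resolve_serial_port`) are hypothetical and not part of zwavejs2mqtt; the sketch only assumes a Linux host with GNU `readlink`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for persistent /dev/serial/by-id/... paths.
is_stable_serial_path() {
  case "$1" in
    /dev/serial/by-id/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the real device node a path points to
# (for example /dev/ttyACM0). Fails if the path, or the target of a
# symlink, does not exist.
resolve_serial_port() {
  if [ ! -e "$1" ]; then
    echo "not found: $1" >&2
    return 1
  fi
  readlink -f -- "$1"
}
```

For example, `resolve_serial_port /dev/serial/by-id/usb-0658_0200-if00` should print the tty node the stick currently uses. The by-id name stays the same across replugs, while the ttyACM number may change.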

@baahver
Author

baahver commented Dec 20, 2021

I'm using /dev/serial/by-id/usb-0658_0200-if00
and even then I have these problems. Will there be a solution to this problem later on?

@robertsLando
Member

@baahver unfortunately I don't know. It could also be something Docker-related, as Docker does some things to map the devices into the container. The soft reset causes a temporary disconnection from the device, and this could be the reason why Docker loses the reference to it and cannot connect again.

@baahver
Author

baahver commented Dec 20, 2021

Would it be a solution to run zwavejs2mqtt directly as the packaged version? Like:

cd ~
mkdir zwavejs2mqtt
cd zwavejs2mqtt
# download latest version
curl -s https://api.github.com/repos/zwave-js/zwavejs2mqtt/releases/latest  \
| grep "browser_download_url.*zip" \
| cut -d : -f 2,3 \
| tr -d \" \
| wget -i -
unzip zwavejs2mqtt-v*.zip
./zwavejs2mqtt

I copied the contents of my earlier config to the store folder and started zwavejs2mqtt. Everything runs well.
But all the log output goes to the terminal.
Is there a way to start this via systemd in the background and log to a file?

@robertsLando
Member

robertsLando commented Dec 20, 2021

@baahver I usually use pm2, but you could also create a custom systemd service --> https://medium.com/@benmorel/creating-a-linux-service-with-systemd-611b5c8b91d6

Just use /path/to/app/zwavejs2mqtt as exec command

@baahver
Author

baahver commented Dec 20, 2021

I'm using the pkg version now, and will see whether the problems still exist when I enable SoftReset.
I start zwavejs2mqtt via systemd:

# /etc/systemd/system/zwavejs2mqtt.service
[Unit]
Description=zwavejs2mqtt server
After=network-online.target
# Start rate limiting is configured in [Unit]; 0 disables the limit.
StartLimitIntervalSec=0
StartLimitBurst=5

[Service]
Type=simple
User=baahver
StandardOutput=append:/home/baahver/bin/zwavejs2mqtt/zwavejs2mqtt.log
StandardError=append:/home/baahver/bin/zwavejs2mqtt/zwavejs2mqtt-error.log
# The store folder is inside this working directory.
# (systemd does not allow trailing comments on a setting line.)
WorkingDirectory=/home/baahver/bin/zwavejs2mqtt
ExecStart=/home/baahver/bin/zwavejs2mqtt/zwavejs2mqtt
Restart=always
RestartSec=1

[Install]
WantedBy=multi-user.target

I've set up logrotate for this log file.
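
The logrotate setup itself is not shown in the post. A minimal sketch of what it might look like follows; the helper name `write_logrotate_conf` and the retention settings (weekly, four rotations) are assumptions, not the author's actual configuration. `copytruncate` is used because systemd keeps the log file open through `StandardOutput=append:`, so the file is truncated in place instead of being moved away from the running service.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a logrotate config covering every *.log
# file in LOG_DIR ($1) to OUT_FILE ($2).
write_logrotate_conf() {
  log_dir="$1"
  out_file="$2"
  # Retention values below are illustrative assumptions.
  cat > "$out_file" <<EOF
$log_dir/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
EOF
}
```

One possible use is `write_logrotate_conf /home/baahver/bin/zwavejs2mqtt /tmp/zwavejs2mqtt`, then copying the result to `/etc/logrotate.d/zwavejs2mqtt` as root and dry-running it with `logrotate --debug /etc/logrotate.d/zwavejs2mqtt`.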

@geekofweek

geekofweek commented Dec 20, 2021

Had the same problem occur even with Soft Reset disabled.

Up until about 2 weeks ago it was pretty stable, never had this issue appear so it seems fairly recent that it started happening.

Problem only seems to impact a handful of devices. Motion sensors don't seem impacted, it's primarily some of the switches and door locks impacted.

I'm at the point of setting up a cron job to restart the container every few hours, as that seems to solve it.
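
The cron workaround mentioned above could look roughly like this. It is a stopgap sketch, not a fix: the helper name `cron_restart_entry` is hypothetical, the container name `zwavejs2mqtt` is taken from the first post, and the interval is whatever the user picks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a crontab line that restarts a container
# at minute 0 of every Nth hour, counted from midnight (N must be 1..23).
cron_restart_entry() {
  hours="$1"
  name="$2"
  case "$hours" in
    ''|*[!0-9]*)
      echo "hours must be an integer between 1 and 23" >&2
      return 1
      ;;
  esac
  if [ "$hours" -lt 1 ] || [ "$hours" -gt 23 ]; then
    echo "hours must be an integer between 1 and 23" >&2
    return 1
  fi
  printf '0 */%s * * * docker restart %s\n' "$hours" "$name"
}
```

To append the entry to the current user's crontab, something like `( crontab -l 2>/dev/null; cron_restart_entry 6 zwavejs2mqtt ) | crontab -` should work, provided that user is allowed to run `docker`.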

@Beinhard

I have exactly the same problem, and it has been driving me nuts. I migrated from the old legacy 1.4 and have fixed a few issues, but for this one I haven't a clue.

I have the Z-Stick Gen5; I started with firmware 1.0 but have upgraded to 1.2.

The soft reset doesn't make any difference, and the weird thing is that, when it is enabled, the log always states that the stick doesn't support it.

2021-12-19T23:25:01.238Z DRIVER Soft reset is enabled through config, but this stick does not support it.
2021-12-19T23:25:01.278Z CNTRLR querying version info...
2021-12-19T23:25:01.296Z CNTRLR received version info:
controller type: Static Controller
library version: Z-Wave 6.07
2021-12-19T23:25:01.297Z CNTRLR supported Z-Wave features:
· SmartStart

Running environment: VirtualBox and HA Operating System.
zwavejs2mqtt: 6.1.1
zwave-js: 8.9.0-beta.1

And as others say, it seems to only affect switches but motion sensor is working.
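
To check whether a given driver log contains the same soft-reset message, a small search like the one below may help. The helper names are hypothetical and the log path is whatever file the driver writes in your setup; only the message text is taken from the log excerpt above. If the message is present, the driver presumably skips the soft reset for this stick, which would explain why toggling the setting makes no difference.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given log file contains the
# "stick does not support soft reset" message quoted in this thread.
soft_reset_unsupported() {
  grep -qF 'Soft reset is enabled through config, but this stick does not support it' "$1"
}

# Hypothetical helper: print the matching lines, including timestamps.
soft_reset_messages() {
  grep -F 'Soft reset is enabled through config' "$1"
}
```

Usage would be along the lines of `soft_reset_unsupported /path/to/driver.log && echo "stick reports no soft reset support"`.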

@doug62

doug62 commented Dec 20, 2021

@Beinhard Can we clarify/revisit your statement "And as others say, it seems to only affect switches but motion sensor is working." - Can you try to manually change the state of a switch at the switch and then notice that it does update on HA? I'm noticing that devices can send to HA but that HA can't control them.

@jmgiaever
Contributor

Is it possible for anyone of you with a problem, to update to the release with beta.3 (Z2M v6.1.1) to see if the problem is still there?

@geekofweek

In my case I can't control them via the ZwavejstoMQTT dashboard either, I can toggle the switch but nothing happens. Same via HA, although HA I toggle the switch and it will just toggle it back to the same position it was in. If it was On it will toggle back On, if Off toggle back Off.

@jmgiaever I switched to the master branch a few mins ago and will test if that solves anything.

@jmgiaever
Contributor

jmgiaever commented Dec 20, 2021

It was a problem a few versions ago. I had the same. I needed to restart and sometimes unplug to get it to send commands.

@baahver
Author

baahver commented Dec 20, 2021

I was using:
zwavejs2mqtt: 6.1.1
zwave-js: 8.9.0-beta.1
As you can see at the start of the topic. But I have switched to the pkg version (same version). I'll see how that works.

@Beinhard

Do we have any verification that beta 3 fixes the issue?

As I'm using the HA add-on with zwave-js: 8.9.0-beta.1, it is difficult for me to test the master branch.

What I can say is that my setup had been working for around 26 hours before it broke, so it is pretty random when it happens.

@geekofweek

@Beinhard so far mine has been stable for 2 days since going to beta-3. I can't say 100% but it looks promising thus far.

@baahver
Author

baahver commented Dec 22, 2021

No, it is not stable, but it is much better than beta1. Mine was okay for almost 24 hours, but I just noticed that it was not working anymore. Is this the right log?
[zwavejs2mqtt_2021-12-22-bkup.log]
I'm on Ubuntu server 21.04
lsusb:
Bus 001 Device 015: ID 0658:0200 Sigma Designs, Inc. Aeotec Z-Stick Gen5 (ZW090) - UZB

zwavejs2mqtt: 6.1.1
zwave-js: 8.9.0-beta.3

SoftReset setting disabled in Z2M - Settings - Zwave. Because this setting was enabled earlier and the switches hung then too, it probably has nothing to do with this setting?

Home assistant:

Version | core-2021.12.2
Installation type | Home Assistant Core
Development | false
Supervisor | false
Docker | false
User | homeassistant
Virtual environment | true
Python version | 3.9.5
Operating system | Linux
Operating system version | 5.11.0-41-generic
CPU architecture | x86_64
Timezone | Europe/Amsterdam

These light switches are updated in Home Assistant when I push the 'real life' button, but I cannot control the light from HA, nor from zwavejs2mqtt.
When I push the save settings button in Z2M, all buttons come alive again. Luckily, the delayed button pushes are not all executed afterwards.

@AlCalzone
Member

> Is this the right log?

No. Check out the link in my previous comment.

@baahver
Author

baahver commented Dec 22, 2021

Oh, sorry about that. I'll have to wait for another hang!

@candrea77

candrea77 commented Dec 22, 2021

On my system with many nodes

  "productLabel": "ZW090",
  "productDescription": "Z‐Stick Gen5 USB Controller",
  "manufacturer": "AEON Labs",
  "firmwareVersion": "1.0",

8.9.0-beta3 --> KO
8.8.3 --> OK

Regards

@AlCalzone
Member

> Here is today's full log.
> It hung during the night, in the afternoon, and in the evening (about 19:30 to 20:00).

Ok I found something else that stalls the driver. I'm not sure yet why, but that shouldn't be too hard to reproduce: zwave-js/node-zwave-js#3951

@candrea77

> sure yet why, but that shouldn't be too hard to reprodu

@AlCalzone: if needed, and if you can tell me how to reproduce the issue, I can switch back to :master to confirm your findings.

@AlCalzone
Member

I think it is pretty hard to reproduce outside of a test environment. The driver must send a command and while it is being sent, the target node must wake up.

@baahver
Author

baahver commented Dec 23, 2021

@AlCalzone Last night Z-Wave hung again. Now I have a log file;
otherwise the conditions are the same as in my last post.

zwavejs_2021-12-23-copie.log

@AlCalzone
Member

v8.9.1 should fix the remaining issue

@baahver
Author

baahver commented Dec 23, 2021

@AlCalzone Thank you! I am looking forward to that.

@Beinhard

@AlCalzone great work, will verify when out (HA-OS)

@hughc-hub

About a week ago I started having the issues described here, with the latest versions of HA OS and Z-Wave JS UI, using an Aeotec Gen5. The network became erratic, with devices cycling between available and unavailable.
Every time I reload zwave2MQTT, the issue seems to stop for a few minutes or hours, but that doesn't solve it.
Then I decided to disable soft reset, and things seem to be stable again. I don't know why this is happening or what the implications are, but it seems to have improved my network.
Is there any new update that solves the real issue?
Thanks

@robertsLando
Member

Please make a driver log, loglevel debug and attach it here as a file (drag & drop into the text field).

@hughc-hub

> Please make a driver log, loglevel debug and attach it here as a file (drag & drop into the text field).

Hi,

I hope this helps. Let me know if it's correctly uploaded.
zwave-js-ui-store.zip

@candrea77

Hi @hughc-hub,
I want to let you know that I successfully moved from the Aeotec Gen 5 to the Aeotec Z-Stick 7.
This change solved some problems on my network: the network now responds more quickly and soft reset works as expected.

So please consider this as a possible solution (NVM backup + restore).
It is also a step forward in network availability.

@hughc-hub

hughc-hub commented Dec 7, 2023 via email

@candrea77

I've got this: 0x0000 0x0004-0x0004 (Silicon Labs - USB Controller - 700 Series)
Firmware version: 7.17.2 (updated using zwavejs)
NOTE: before the update (I don't remember which firmware version the stick shipped with) I wasn't able to do the restore.

@hughc-hub

hughc-hub commented Dec 7, 2023 via email

@candrea77

Mine is:
FW: v7.17.2
SDK: v7.17.2

So that is the best I can have with this stick.
https://aeotec.freshdesk.com/support/solutions/articles/6000263744

@hughc-hub

hughc-hub commented Dec 7, 2023 via email

@hughc-hub

> I've got this: 0x0000 0x0004-0x0004 (Silicon Labs - USB Controller - 700 Series) Firmware version: 7.17.2 (updated using zwavejs) NOTE: before the update (I don't remember which firmware version the stick shipped with) I wasn't able to do the restore.

I've just received my new v7 stick. I backed up the NVM with the Gen5, then plugged in the new Stick 7 and restored the NVM. I can see the network and switch some devices on/off, but it seems I'm having trouble gathering information the way I could before.
Is there another migration step that I forgot?

@AlCalzone
Member

Depending on where you're located, the 700 series range can be much worse than 500 series.

Allegedly, this is a firmware issue, so an update in the (hopefully not too distant) future should be able to fix that.

@hughc-hub

> Depending on where you're located, the 700 series range can be much worse than 500 series.
>
> Allegedly, this is a firmware issue, so an update in the (hopefully not too distant) future should be able to fix that.

I did a full restore and a complete reboot, and now it seems to work again, this time with the 700 series stick. I'll observe it for a few days, but I still don't know what caused the issue I described (see the log I uploaded), so I'm not sure whether this move is for the better (I hope so).

Do you have any idea about the root problem that could cause the instability?
Thank you so much!

@AlCalzone
Member

> Do you have any idea about the root problem that could cause the instability?

Seems like almost all Z-Wave controller firmwares have some issues. Z-Wave JS now tries to recover when it detects the controller hanging, but this sometimes requires soft-reset to be enabled, which doesn't play well with some combinations of hardware/OS/software.
If you started noticing the problem in the last few months, you can try disabling the "unresponsive controller recovery" in the Z-Wave JS UI settings. However, you'll likely have random dead nodes instead.

@hughc-hub

> Do you have any idea about the root problem that could cause the instability?
>
> Seems like almost all Z-Wave controller firmwares have some issues. Z-Wave JS now tries to recover when it detects the controller hanging, but this sometimes requires soft-reset to be enabled, which doesn't play well with some combinations of hardware/OS/software. If you started noticing the problem in the last few months, you can try disabling the "unresponsive controller recovery" in the Z-Wave JS UI settings. However, you'll likely have random dead nodes instead.

Thank you for your answer and the info. Sadly, I don't know why I started to have the issue.

I haven't added any new Z-Wave devices in recent months, and this issue appeared just two weeks ago. I don't know whether it was caused by an HA or Z-Wave JS update.

Honestly, I don't have enough knowledge to find a solution by reading my logs.

I'm now using firmware v7.17.2 on the Aeotec Stick 7. Chris from Aeotec told me to check the logs and try to re-route red lines to working routes if at all possible. But I don't think I'm doing that right, because every time I try to set a manual route, communication gets slower. I haven't found a good tutorial on how to do that properly (in case that turns out to be the root of my issue).
