Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

supervisord-watchdog (exit status 1; not expected) #114

Closed
trebolcinco opened this issue Jun 22, 2020 · 17 comments
Closed

supervisord-watchdog (exit status 1; not expected) #114

trebolcinco opened this issue Jun 22, 2020 · 17 comments

Comments

@trebolcinco
Copy link

This morning I started receiving this WARN: "supervisord-watchdog (exit status 1; not expected)". It was preceded by: INFO " supervisord-watchdog: event was rejected". I can access the admin panel but when I try to Open the Remote Daemon it says: "Connection Refused". I tried rebooting and restarting docker, neither helped.

@Fishwaldo
Copy link
Member

What version are you running?

@trebolcinco
Copy link
Author

trebolcinco commented Jun 22, 2020 via email

@mike240se
Copy link

I installed "latest" today and it was 3 days behind, I now am running 145 and this error/issue went away.

@FreelancerJay
Copy link

FreelancerJay commented Jun 28, 2020

I'm having the same warning in my logs too. I've only just tried to set this up with docker-compose last night, and it's all I get.
Here is my log from when I start the docker from the terminal.

Attaching to ozwd
ozwd | 2020-06-28 09:36:50,223 INFO Set uid to user 0 succeeded
ozwd | 2020-06-28 09:36:50,231 INFO supervisord started with pid 1
ozwd | 2020-06-28 09:36:51,243 INFO spawned: 'supervisord-watchdog' with pid 9
ozwd | 2020-06-28 09:36:51,254 INFO spawned: 'ozwdaemon' with pid 10
ozwd | 2020-06-28 09:36:51,265 INFO spawned: 'ozw-admin' with pid 11
ozwd | 2020-06-28 09:36:51,277 INFO spawned: 'websockify' with pid 12
ozwd | 2020-06-28 09:36:51,372 INFO exited: ozwdaemon (exit status 128; not expected)
ozwd | 2020-06-28 09:36:52,282 INFO success: supervisord-watchdog entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ozwd | 2020-06-28 09:36:52,285 INFO success: ozw-admin entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ozwd | 2020-06-28 09:36:52,285 INFO success: websockify entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ozwd | 2020-06-28 09:36:53,332 INFO spawned: 'ozwdaemon' with pid 13
ozwd | 2020-06-28 09:36:53,374 INFO exited: ozwdaemon (exit status 128; not expected)
ozwd | 2020-06-28 09:36:55,386 INFO spawned: 'ozwdaemon' with pid 14
ozwd | 2020-06-28 09:36:55,431 INFO exited: ozwdaemon (exit status 128; not expected)
ozwd | 2020-06-28 09:36:58,755 INFO spawned: 'ozwdaemon' with pid 15
ozwd | 2020-06-28 09:36:58,802 INFO exited: ozwdaemon (exit status 128; not expected)
ozwd | 2020-06-28 09:36:59,805 INFO gave up: ozwdaemon entered FATAL state, too many start retries too quickly
ozwd | 2020-06-28 09:37:00,823 WARN supervisord-watchdog: event was rejected
ozwd | 2020-06-28 09:37:00,920 INFO exited: supervisord-watchdog (exit status 1; not expected)
ozwd | 2020-06-28 09:37:01,929 INFO spawned: 'supervisord-watchdog' with pid 17
ozwd | 2020-06-28 09:37:03,427 INFO success: supervisord-watchdog entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
ozwd | 2020-06-28 09:37:03,438 WARN supervisord-watchdog: event was rejected
ozwd | 2020-06-28 09:37:03,537 INFO exited: supervisord-watchdog (exit status 1; not expected)
ozwd | 2020-06-28 09:37:04,547 INFO spawned: 'supervisord-watchdog' with pid 19
ozwd | 2020-06-28 09:37:06,058 INFO success: supervisord-watchdog entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

It seems to be starting everything, and after a couple seconds ozwdaemon crashes, is relaunched and crashes again, and repeats once more at which point docker gives up because of how quickly it was looping.
supervisord-watchdog seems to fail to parse, or log, or whatever it is meant to do with that info, and crashes. It relaunches, and stays up long enough that the following crash doesn't lead to docker giving up on the relaunches, because those last 4 lines repeat every few seconds from then on until I quit it with control-c.

When I vnc in, I also can't access the remote panel like @trebolcinco, it also just gives me "Connection Refused".
If I choose to "Start Local Client", I can control the Z-Wave Network find, and have successfully included a device this way, but of course nothing gets sent to my MQTT broker by this...

Running on a Raspberry Pi 3, fully up to date Raspbian (per apt-get update). HomeAssistant is on a second Raspberry Pi on the network, not on the same hardware.
I pulled the docker image that was released a few hours ago too, using the allinone-latest tag, so I believe it is version 149.

@trebolcinco, when you launch the docker in the terminal, do you get the ozwdaemon process exiting as well before the supervisord-watchdog starts quiting on you too? If not, my issue is probably different and I'll open a separate issue

@FreelancerJay
Copy link

I've just updated the docker to the latest -allinone image, 154.

Is there anything I can do to get more information to help identify a cause for this?

@bachya
Copy link

bachya commented Jul 22, 2020

I, too, am experiencing this stack trace with the allinone-amd64-153 and allinone-amd64-154 Docker images, as well. Incidentally, the allinone-build-153 and allinone-latest seem to start up fine.

@kpine
Copy link
Contributor

kpine commented Jul 22, 2020

If your architecture is amd64, then allinone-amd64-153 , allinone-build-153 and allinone-latest are the exact same image.

Providing all of log files would be more helpful.

@bachya
Copy link

bachya commented Jul 22, 2020

@kpine I mentioned I'm getting the same stack trace as @FreelancerJay when viewing the logs of the container – where would I find application logs?

@kpine
Copy link
Contributor

kpine commented Jul 22, 2020

There is a logs directory, default location would be in the config directory.

@kpine
Copy link
Contributor

kpine commented Jul 22, 2020

Also, allinone builds > 150 are known to be broken (at least use of ozwadmin, what else?) because of changes in the QT5 version. A fix is in the works.

@bachya
Copy link

bachya commented Jul 22, 2020

@kpine Here are my logs from a clean run of allinone-amd64-154:

ozwadmin.log
websocket.log
watchdog.log
ozwdaemon.log

@mike240se
Copy link

Also, allinone builds > 150 are known to be broken (at least use of ozwadmin, what else?) because of changes in the QT5 version. A fix is in the works.

to anyone not aware: using the ozw-admin client on your desktop is a work around for the time being.

@kpine
Copy link
Contributor

kpine commented Jul 22, 2020

@bachya Your ozwdaemon.log is showing problems with MQTT.

[20200722 20:11:08.501 UTC] [ozw.mqtt.publisher] [debug]: MQTT State Change "Disconnected" 
[20200722 20:11:08.501 UTC] [ozw.mqtt.publisher] [warning]: Exiting on Failure 
[20200722 20:11:08.501 UTC] [ozw.mqtt.publisher] [warning]: MQTT Client Disconnnected 
[20200722 20:11:08.501 UTC] [ozw.mqtt.publisher] [warning]: MQTT Client Error "Transport Invalid" 

If ozwd disconnects from the broker it will restart, which seems to be happening here. Not sure if it will help or not, but you could try increasing the log levels, via a container environment variable:

QT_LOGGING_RULES="*.debug=true;qt.remoteobjects.io.debug=false;qt.remoteobjects.debug=false;qt.remoteobjects.models.debug=false;ozw.mqtt.qt2js.debug=false"

@kpine
Copy link
Contributor

kpine commented Jul 22, 2020

Here's what I see, in comparison:

[20200722 13:54:32.712 PDT] [qt.mqtt.connection] [debug]: bool QMqttConnection::sendControlConnect()
[20200722 13:54:32.712 PDT] [qt.mqtt.connection.verbose] [debug]: bool QMqttConnection::writePacketToTransport(const QMqttControlPacket&)  DataSize: 78
[20200722 13:54:32.712 PDT] [qt.mqtt.connection.verbose] [debug]: void QMqttConnection::transportReadReady()
[20200722 13:54:32.712 PDT] [qt.mqtt.connection.verbose] [debug]: Received CONNACK
[20200722 13:54:32.712 PDT] [qt.mqtt.connection.verbose] [debug]: Finalize CONNACK
[20200722 13:54:32.712 PDT] [ozw.mqtt.publisher] [debug]: MQTT State Change "Connected"
[20200722 13:54:32.712 PDT] [ozw.mqtt.publisher] [info]: MQTT Client Connected

@bachya
Copy link

bachya commented Jul 22, 2020

@kpine What MQTT broker do you use? I use VerneMQ with several clients and it works great. That MQTT Client Error "Transport Invalid" is interesting... Will investigate.

EDIT: user error. 🤦 I wasn't using the name of my MQTT container as the hostname. Everything working now; I still think I'm encountering the QT5 issue over VNC (disconnects suddenly), so I'll wait for new versions to test again.

@kpine
Copy link
Contributor

kpine commented Jul 22, 2020

I'm using mosquitto.

EDIT: user error. 🤦 I wasn't using the name of my MQTT container as the hostname. Everything working now.

😆 Most of us have done that at some point... Some better logging of connection errors might be useful.

@Fishwaldo
Copy link
Member

ozwdaemon is now logging as much info as we can get about connection failures, but its not a lot on the MQTT side unfortunately. I've added some additional logging when ozwdaemon exits about its failure and a message to "read the log files" about why it failed that hopefully will help some what.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants