Error listening to events: Error: connect ECONNREFUSED /var/run/balena-engine.sock #227

Open
jellyfish-bot opened this issue Aug 13, 2020 · 2 comments

jellyfish-bot commented Aug 13, 2020

[thgreasi] The supervisor of a RPi1 on 2.48.0+rev1 stopped working at some point.
Running balena ps failed with:

Cannot connect to the balenaEngine daemon at unix:///var/run/balena-engine.sock. Is the balenaEngine daemon running?

Running journalctl -fu balena -a -n 100 gave:

[error]   Error listening to events: Error: connect ECONNREFUSED /var/run/balena-engine.sock
[error]         at PipeConnectWrap.afterConnect [as oncomplete] (net.js:1097:14) Error: connect ECONNREFUSED /var/run/balena-engine.sock
[error]       at PipeConnectWrap.afterConnect [as oncomplete] (net.js:1097:14)

while journalctl -f -n 100 -u resin-supervisor gave:

systemd[1]: resin-supervisor.service: Start-pre operation timed out. Terminating.
systemd[1]: resin-supervisor.service: Control process exited, code=killed, status=15/TERM
resin-supervisor[4974]: deactivating
systemd[1]: resin-supervisor.service: Control process exited, code=exited, status=3/NOTIMPLEMENTED
systemd[1]: resin-supervisor.service: Failed with result 'timeout'.
systemd[1]: Failed to start Balena supervisor.

Confirmed with stat /var/run/balena-engine.sock that it's indeed a socket and not a directory.
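
The combination reported above (stat shows a socket, yet connecting gives ECONNREFUSED) is what a socket file with no listener behind it looks like. A minimal sketch of that check in Python, not something taken from the support thread: the function name probe_unix_socket and its return labels are made up for illustration, and the path in the usage note is the one from the logs.

```python
import errno
import os
import socket
import stat


def probe_unix_socket(path, timeout=2.0):
    """Classify a Unix socket path (hypothetical helper, labels are illustrative).

    Returns one of: "missing", "not-a-socket", "stale", "listening",
    or "error:<ERRNO NAME>" for any other connection failure.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"

    # Same question that `stat` answers on the command line.
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
    except ConnectionRefusedError:
        # The file exists, but no process is accepting connections on it.
        return "stale"
    except OSError as exc:
        return "error:" + errno.errorcode.get(exc.errno, "unknown")
    finally:
        client.close()
    return "listening"
```

Calling probe_unix_socket("/var/run/balena-engine.sock") on the affected device would be expected to report "stale" if the diagnosis above holds, but that is an assumption, not something verified in the thread.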

@jellyfish-bot

[thgreasi] This issue has attached support thread https://jel.ly.fish/50a3f6f0-e0ae-49f7-ad89-34a517e26a8c

dt-rush commented Aug 18, 2020

Updates from the thread's status hashtag about the possible cause:

systemd tried to shut down the balena engine (probably in response to a user command), but the stop timed out. systemd then entered a state where it no longer considers the engine to be running (even though it is) and tries to start a new instance anyway. That start fails, but the engine's socket is probably replaced in the process, which leaves the whole device in an uncontrollable state. Asked the user for more details, but communicated that the best way out of this stalemate is a power cycle.
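
The suspected mismatch between what systemd believes and what is actually running could be checked roughly as follows. This is a sketch under assumptions: the unit name balena is taken from the journalctl command earlier in this issue, the list of engine PIDs is supplied by the caller (for example from pgrep), and the helper names and result labels are invented for illustration.

```python
import subprocess


def parse_show_output(text):
    """Parse KEY=VALUE lines as printed by `systemctl show`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def systemd_view(unit="balena"):
    """Ask systemd for its view of the unit (unit name is an assumption)."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=ActiveState,MainPID"],
        capture_output=True, text=True, check=False,
    )
    return parse_show_output(result.stdout)


def diagnose(props, engine_pids):
    """Compare systemd's view with the engine PIDs actually found.

    engine_pids is a caller-supplied list of PIDs; labels are illustrative.
    """
    active = props.get("ActiveState") == "active"
    main_pid = int(props.get("MainPID") or 0)
    # Engine processes that systemd does not track as the unit's main process.
    untracked = [pid for pid in engine_pids if pid != main_pid]

    if not active and untracked:
        return "orphaned-engine"      # systemd says stopped, engine still alive
    if active and untracked:
        return "extra-engine-process"  # more engine processes than systemd tracks
    if active:
        return "consistent-running"
    return "consistent-stopped"
```

A result of "orphaned-engine" would match the state described above, where an older engine process outlives systemd's record of it.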

At one point Florin was able to get the device working by restarting the balena services and killing the older stalled processes, but after a short while the device went bad again. Permission was given to reboot. A reboot was initiated, and the device appears to be functioning normally so far.
