You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
When a consul service is defined in a declarative nix-container, the systemd service is started as soon as the networking.target is reached but the ExecStart consul process fails very quickly without any useful info in log. A default 100ms RestartSec causes the unit to fail within less then a second as 5 failed restarts are done within that time.
When RestartSec is increased to 1sec, the first 4 processes fails but the fifth starts normally:
[root@consulagent:~]# journalctl -eb -u consul
Feb 18 21:47:14 consulagent systemd[1]: consul.service: Main process exited, code=exited, status=1/FAILURE
Feb 18 21:47:14 consulagent systemd[1]: consul.service: Failed with result 'exit-code'.
Feb 18 21:47:15 consulagent systemd[1]: consul.service: Service RestartSec=1s expired, scheduling restart.
Feb 18 21:47:15 consulagent systemd[1]: consul.service: Scheduled restart job, restart counter is at 1.
Feb 18 21:47:15 consulagent systemd[1]: Stopped consul.service.
Feb 18 21:47:15 consulagent systemd[1]: Starting consul.service...
Feb 18 21:47:15 consulagent systemd[1]: Started consul.service.
Feb 18 21:47:15 consulagent systemd[1]: consul.service: Main process exited, code=exited, status=1/FAILURE
Feb 18 21:47:15 consulagent systemd[1]: consul.service: Failed with result 'exit-code'.
Feb 18 21:47:16 consulagent systemd[1]: consul.service: Service RestartSec=1s expired, scheduling restart.
Feb 18 21:47:16 consulagent systemd[1]: consul.service: Scheduled restart job, restart counter is at 2.
Feb 18 21:47:16 consulagent systemd[1]: Stopped consul.service.
Feb 18 21:47:16 consulagent systemd[1]: Starting consul.service...
Feb 18 21:47:16 consulagent systemd[1]: Started consul.service.
Feb 18 21:47:16 consulagent systemd[1]: consul.service: Main process exited, code=exited, status=1/FAILURE
Feb 18 21:47:16 consulagent systemd[1]: consul.service: Failed with result 'exit-code'.
Feb 18 21:47:18 consulagent systemd[1]: consul.service: Service RestartSec=1s expired, scheduling restart.
Feb 18 21:47:18 consulagent systemd[1]: consul.service: Scheduled restart job, restart counter is at 3.
Feb 18 21:47:18 consulagent systemd[1]: Stopped consul.service.
Feb 18 21:47:18 consulagent systemd[1]: Starting consul.service...
Feb 18 21:47:18 consulagent systemd[1]: Started consul.service.
Feb 18 21:47:18 consulagent systemd[1]: consul.service: Main process exited, code=exited, status=1/FAILURE
Feb 18 21:47:18 consulagent systemd[1]: consul.service: Failed with result 'exit-code'.
Feb 18 21:47:19 consulagent systemd[1]: consul.service: Service RestartSec=1s expired, scheduling restart.
Feb 18 21:47:19 consulagent systemd[1]: consul.service: Scheduled restart job, restart counter is at 4.
Feb 18 21:47:19 consulagent systemd[1]: Stopped consul.service.
Feb 18 21:47:19 consulagent systemd[1]: Starting consul.service...
Feb 18 21:47:19 consulagent systemd[1]: Started consul.service.
Feb 18 21:47:19 consulagent consul[524]: ==> Starting Consul agent...
Feb 18 21:47:19 consulagent consul[524]: Version: 'v1.6.0'
Feb 18 21:47:19 consulagent consul[524]: Node ID: 'c3f0b9f3-d00a-274c-86fb-75c991d85002'
Feb 18 21:47:19 consulagent consul[524]: Node name: 'consulagent'
Feb 18 21:47:19 consulagent consul[524]: Datacenter: 'dc1' (Segment: '')
Feb 18 21:47:19 consulagent consul[524]: Server: false (Bootstrap: false)
Feb 18 21:47:19 consulagent consul[524]: Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: 8600)
Feb 18 21:47:19 consulagent consul[524]: Cluster Addr: 192.168.32.101 (LAN: 8301, WAN: 8302)
Feb 18 21:47:19 consulagent consul[524]: Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false
Feb 18 21:47:19 consulagent consul[524]: ==> Log data will now stream in as it occurs:
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] serf: EventMemberJoin: consulagent 192.168.32.101
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] agent: Started DNS server 127.0.0.1:8600 (udp)
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] agent: Started DNS server 127.0.0.1:8600 (tcp)
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] agent: Started HTTP server on 127.0.0.1:8500 (tcp)
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] agent: started state syncer
Feb 18 21:47:19 consulagent consul[524]: ==> Consul agent running!
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [WARN] manager: No servers available
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [ERR] agent: failed to sync remote state: No known Consul servers
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] serf: EventMemberJoin: nhost 192.168.32.1
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [WARN] memberlist: Refuting a suspect message (from: consulagent)
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] consul: adding server nhost (Addr: tcp/192.168.32.1:8300) (DC: dc1)
Feb 18 21:47:19 consulagent consul[524]: 2020/02/18 21:47:19 [INFO] consul: New leader elected: nhost
Feb 18 21:47:21 consulagent consul[524]: 2020/02/18 21:47:21 [INFO] agent: Synced node info
Feb 18 21:47:40 consulagent consul[524]: ==> Newer Consul version available: 1.7.0 (currently running: 1.6.0)
Feb 18 21:48:03 consulagent consul[524]: 2020/02/18 21:48:03 [INFO] agent: Synced service "consulagent"
Feb 18 21:49:19 consulagent consul[524]: 2020/02/18 21:49:19 [INFO] agent: Synced service "consulagent"
Feb 18 21:50:16 consulagent consul[524]: 2020/02/18 21:50:16 [INFO] agent: Synced service "consulagent"
Feb 18 21:51:18 consulagent consul[524]: 2020/02/18 21:51:18 [INFO] agent: Synced service "consulagent"
Feb 18 21:52:01 consulagent consul[524]: 2020/02/18 21:52:01 [INFO] agent: Synced service "consulagent"
Feb 18 21:53:13 consulagent consul[524]: 2020/02/18 21:53:13 [INFO] agent: Synced service "consulagent"
Feb 18 21:54:19 consulagent consul[524]: 2020/02/18 21:54:19 [INFO] agent: Synced service "consulagent"
Apparently there is some kind of circa 5-second race condition but so far I was unable to figure out where.
Hello, I'm a bot and I thank you in the name of the community for opening this issue.
To help our human contributors focus on the most-relevant reports, I check up on old issues to see if they're still relevant. This issue has had no activity for 180 days, and so I marked it as stale, but you can rest assured it will never be closed by a non-human.
The community would appreciate your effort in checking if the issue is still valid. If it isn't, please close it.
If the issue persists, and you'd like to remove the stale label, you simply need to leave a comment. Your comment can be as simple as "still important to me". If you'd like it to get more attention, you can ask for help by searching for maintainers and people that previously touched related code and @ mention them in a comment. You can use Git blame or GitHub's web interface on the relevant files to find them.
Describe the bug
When a consul service is defined in a declarative nix-container, the systemd service is started as soon as the networking.target is reached but the ExecStart consul process fails very quickly without any useful info in log. A default 100ms RestartSec causes the unit to fail within less then a second as 5 failed restarts are done within that time.
When RestartSec is increased to 1sec, the first 4 processes fails but the fifth starts normally:
Apparently there is some kind of circa 5-second race condition but so far I was unable to figure out where.
To Reproduce
Steps to reproduce the behavior:
1.
nixos-rebuild switch
nix-container root-login consulagent
systemctl status consul
Expected behavior
Service is active and running.
Additional context
I'm actually using
nixops
to deploy the container on a remote NixOS box but to my knowledge this shouldn't make a difference.Metadata
Output from the laptop where
nixops
is run. Note thatnixops
uses the user's stable-19.09 channel for the deployment.Maintainer information:
The text was updated successfully, but these errors were encountered: