upstart: docker started event fires before /var/run/docker.sock exists
#6647
Comments
ping @tianon?
The only way I know of that we could really fix this would be having Docker itself daemonize after it creates the socket, which currently isn't possible in Go. There might be some upstart-specific signal magic we could do, but I'm not fluent enough in upstart magic to say for sure there.
I see several ugly workarounds:
The latter workaround would probably be the most efficient if the retry code were included in the docker binary itself and active for client commands only. For now, I will experiment with the first workaround and submit a pull request to bflad/chef-docker.
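For illustration, a retry around client commands could look roughly like the sketch below. This is a shell wrapper, not code from the docker binary; the function name `retry` and its arguments (attempt count, delay in seconds) are hypothetical.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
# Returns 0 on the first success, 1 if every attempt failed.
# Hypothetical helper: name and arguments are illustrative only.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example use in a dependent job, assuming the daemon needs a moment
# to create its socket:
#   retry 10 1 docker ps
```

A wrapper like this only hides the race from the caller; it does not change when upstart emits the started event.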
The workaround of waiting for the socket to be available in 'pre-start' seems to do the trick. See below for a sample service with the workaround.
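The sample service is not preserved in this thread. As a rough sketch of the idea only, the wait can be expressed as a small polling loop. The function name `wait_for_path`, the one-second poll interval, and the 30-second default timeout are assumptions, not taken from the original sample.

```shell
#!/bin/sh
# wait_for_path PATH [TIMEOUT_SECONDS]
# Polls once per second until PATH exists.
# Returns 0 as soon as PATH appears, 1 if the timeout expires first.
# Hypothetical helper: name, interval and default timeout are assumptions.
wait_for_path() {
    path=$1
    remaining=${2:-30}
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            echo "Timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Inside an upstart job that has "start on started docker", the loop
# body would be placed in the pre-start stanza, for example:
#   pre-start script
#       while [ ! -e /var/run/docker.sock ]; do sleep 1; done
#   end script
```

The timeout is a design choice for this sketch: without it, a job whose dependency never comes up would block in pre-start indefinitely.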
We actually used to document the /cc @crosbymichael
Yes, I found the older documentation with that workaround. Even if the socket now listens earlier, it is still not early enough for upstart scripts (at least in those scripts we can work around the issue). Also, I encountered the bug quite a lot while trying to reproduce #6673 (by basically spamming
I took a deeper look into some Upstart documentation and still didn't see anything that might help. Please let me know if you need additional help testing, but I'm +1 for
/cc @SvenDowideit what do you think? (since you were involved in the previous removal of the
@tianon - isn't this an @alexlarsson type thing?
Naw, Upstart is an Ubuntu thing. ;)
👍 on this.
Fixes moby#6647: Other upstart jobs that depend on docker by specifying "start on started docker" would often start before the docker daemon was ready, so they'd fail with "Cannot connect to the Docker daemon" or "dial unix /var/run/docker.sock: no such file or directory".

This is because "docker -d" doesn't daemonize, it runs in the foreground, so upstart can't know when the daemon is ready to receive incoming connections. (Traditionally, a daemon will create all necessary sockets and then fork to signal that it's ready; according to @tianon this "isn't possible in Go"[1]. See also [2].)

Presumably this isn't a problem with systemd init with its socket activation. The SysV init scripts may or may not suffer from this problem but I have no motivation to fix them.

This commit adds a "post-start" stanza to the upstart configuration that waits for the socket to be available. Upstart won't emit the "started" event until the "post-start" script completes.[3]

Note that the system administrator might have specified a different path for the socket, or a tcp socket instead, by customising /etc/default/docker. In that case we don't try to figure out what the new socket is, but at least we don't wait in vain for /var/run/docker.sock to appear.

If the main script (`docker -d`) fails to start, the `initctl status $UPSTART_JOB | grep -q "stop/"` line ensures that we don't loop forever. I stole this idea from Steve Langasek.[4]

If for some reason we *still* end up in an infinite loop --I guess `docker -d` must have hung-- then at least we'll be able to see the "Waiting for /var/run/docker.sock" debug output in /var/log/upstart/docker.log.

I considered using inotifywait instead of sleep, but it isn't worth the complexity & the extra dependency.

[1] moby#6647 (comment)
[2] https://code.google.com/p/go/issues/detail?id=227
[3] http://upstart.ubuntu.com/cookbook/#post-start
[4] https://lists.ubuntu.com/archives/upstart-devel/2013-April/002492.html

Signed-off-by: David Röthlisberger <david@rothlis.net>
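The logic the commit message describes can be sketched as follows. This is not the committed stanza, only an approximation built from the description above: the function names `has_custom_host` and `wait_for_docker_socket`, the `-H`/`--host` check, and the one-second sleep are assumptions.

```shell
#!/bin/sh
# Returns 0 if the option string sets a custom listen address
# (assumed here to mean a -H or --host flag), 1 otherwise.
has_custom_host() {
    printf '%s' "$1" | grep -qE -e '(^|[[:space:]])(-H|--host)'
}

# Waits for the default socket to appear. Meant to run as the body of a
# post-start stanza, after /etc/default/docker has been sourced so that
# DOCKER_OPTS is set.
wait_for_docker_socket() {
    sock=/var/run/docker.sock
    # Custom socket or tcp address: do not wait in vain for the default path.
    if has_custom_host "${DOCKER_OPTS:-}"; then
        return 0
    fi
    while [ ! -e "$sock" ]; do
        # Stop looping if upstart reports the job as stopping or stopped,
        # for example because `docker -d` failed to start.
        if initctl status "${UPSTART_JOB:-docker}" | grep -q "stop/"; then
            return 1
        fi
        echo "Waiting for $sock"
        sleep 1
    done
}
```

Because upstart holds back the started event until post-start completes, jobs that use "start on started docker" would only run once the loop has seen the socket or has been skipped.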
I am running lxc-docker version 1.0.1 on ubuntu 14.04 amd64 host.
My upstart scripts randomly (but very often) fail to restart their containers after booting the docker host. The following message can be found in the upstart log:
This issue should have been fixed in #4168, which made docker listen on its socket as early as possible.
Is there a way to have the docker upstart script wait for the socket to appear before emitting the started event that the container services rely on?
Reproducing the issue
Boot a VM from the following Vagrantfile:
`vagrant ssh` and `wget -qO- https://get.docker.io/ubuntu/ | sudo bash -s`
Create the following upstart service, save it as /etc/init/docker-socket.conf:
Run `shutdown -r now`, log on the VM again and check the content of /var/log/upstart/docker-socket.log:
    vagrant@docker-ubuntu-1404:~$ sudo cat /var/log/upstart/docker-socker.log
    2014/06/24 17:06:56 PRE
    ls: cannot access /var/run/docker.sock: No such file or directory
    2014/06/24 17:06:56 NOW
    ls: cannot access /var/run/docker.sock: No such file or directory
If the docker socket is not present when this test upstart service runs, container services that `start on started docker` will fail too, because the docker client won't be able to communicate with the docker server listening on the socket.