Docker does not start - Unit docker.socket failed to load #25098

Closed
averri opened this issue Jul 26, 2016 · 56 comments

Comments

@averri

averri commented Jul 26, 2016

Uninstalled Docker using:

yum remove docker-engine.x86_64 docker-engine-selinux.noarch

Installed again using:

curl -fsSL https://get.docker.com/ | sh
chkconfig docker on
service docker start

The result of last command is:

Redirecting to /bin/systemctl start  docker.service
Failed to start docker.service: Unit docker.socket failed to load: No such file or directory.

Running on CentOS 7.1.

@kylejw1

kylejw1 commented Jul 27, 2016

I was able to get around this issue temporarily by copying /usr/lib/systemd/system/docker.socket from a different vm which was still running rc4.

@sempr

sempr commented Jul 27, 2016

@kylejw1 I did the same thing and it works.

@emopinata

I'm assuming the build process is similar for Fedora, since it's also missing in the package for 24.

@vdemeester
Member

/cc @crosbymichael

@thaJeztah
Member

This should be fixed on master, and will be included in 1.12-GA; to fix this issue, change the docker unit file as is done in this PR: #25094

@crosbymichael
Contributor

Yes, this was fixed via #25094. Thanks for the report.

@adrianovieira

For me this happened after updating from RC4 to RC5, even after I commented out/removed the line as in PR #25094.

I had to do what is described in #issuecomment-235450116, and then it worked.

@crosbymichael
Contributor

@adrianovieira did you run systemctl daemon-reload after changing the file?

@adrianovieira

yes!

@adrianovieira

Actually, I had only commented out that line. Now I have removed it and everything is OK.

@crosbymichael
Contributor

@adrianovieira nice, thanks for double checking

@adrianovieira

adrianovieira commented Jul 27, 2016

ok!

Maybe the update process could fix that, instead of it having to be removed by hand. What do you think?

e.g.: sed -i -e 's/Requires=docker.socket//g' /usr/lib/systemd/system/docker.service

:D
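
The one-liner above leaves an empty line behind and does not touch the After= line. A minimal sketch of a slightly more thorough variant follows. The function name strip_socket_dependency is hypothetical, and the unit file path is passed as an argument rather than assumed. It only rewrites the dependency lines, so an ExecStart that still passes -H fd:// is left as it is.

```shell
#!/bin/sh
# Hypothetical helper: remove the docker.socket dependency from a unit file.
# Usage: strip_socket_dependency /path/to/docker.service
strip_socket_dependency() {
    unit_file="$1"
    tmp_file="$(mktemp)" || return 1

    # 1. Delete a line that consists only of Requires=docker.socket.
    # 2. On After= lines, drop the docker.socket entry and keep the rest.
    if sed -e '/^Requires=docker\.socket[[:space:]]*$/d' \
           -e '/^After=/s/[[:space:]]*docker\.socket//' \
           "$unit_file" > "$tmp_file"; then
        # Write through the existing file so its owner and mode are kept.
        cat "$tmp_file" > "$unit_file"
        status=0
    else
        status=1
    fi
    rm -f "$tmp_file"
    return "$status"
}
```

Called as strip_socket_dependency /usr/lib/systemd/system/docker.service and followed by systemctl daemon-reload, this should have the same effect as the manual edit discussed above.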

@thaJeztah
Member

@adrianovieira if the file has not been modified, update should replace it automatically

@theonlydoo

theonlydoo commented Jul 29, 2016

I've got the same issue after this morning's release. I've tried to uninstall docker:

# curl -sSL get.docker.io | sh
+ sh -c 'sleep 3; dnf -y -q install docker-engine'
setsebool:  SELinux is disabled.
Re-declaration of boolean virt_sandbox_use_fusefs
Failed to create node
Bad boolean declaration at /var/lib/selinux/targeted/tmp/modules/100/virt/cil:148
/usr/sbin/semodule:  Failed!

If you would like to use Docker as a non-root user, you should now consider
adding your user to the "docker" group with something like:

  sudo usermod -aG docker your-user

Remember that you will have to log out and back in for this to take effect!

I'm on Fedora 24, with SELinux disabled.

@thaJeztah
Member

@theonlydoo was this on a fresh install or an upgrade? If it was an upgrade did you have any modifications made to your docker systemd unit file? (causing the installer to skip updating that file)

@theonlydoo

@thaJeztah First it was an upgrade, with a systemd unit file edited to enable the -s overlay flag, but I suspected that my modification had somehow fucked the upgrade, so I deleted everything: the systemd unit file and the /var/lib/docker/ path. Afterwards I tried the method shown earlier in #25098 (comment) and it wasn't successful either.

@crosbymichael
Contributor

@theonlydoo is your system still messed up? Could you run something like journalctl -xe and see if anything related comes up in the logs? Thanks

@shawnwhit

I just updated from rc4 to rc5 and got the same issue. I am running CentOS 7.

@thaJeztah
Member

@shawnwhit the "socket failed to load" is resolved in 1.12.0, we're looking into the SELinux issue; appears to be an update in SELinux upstream that causes issues

@asmialoski

I just updated from rc4 to rc5 and got the same issue. I am running RHEL7.
Workaround: I copied /usr/lib/systemd/system/docker.socket from another machine running RC4, and it works.

@thaJeztah
Member

@asmialoski see #25098 (comment), also 1.12.0 has been released and contains various bug fixes since RC5

@lcamilo15

lcamilo15 commented Aug 1, 2016

Fixed by just adding the Unit.

/usr/lib/systemd/system/docker.socket

[Unit]
Description=Docker Socket for the API
PartOf=docker.service

[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target

https://github.com/docker/docker/blob/master/contrib/init/systemd/docker.socket

@michelvocks

michelvocks commented Aug 1, 2016

I also had the same issue today after upgrading from 1.10.2 to 1.12 on Red Hat. The solution from @lcamilo15 fixed the issue.

Edit: Also now happening on fresh server: Red Hat Enterprise Linux Server release 7.2 (Maipo)

@theonlydoo

theonlydoo commented Aug 1, 2016

I upgraded to the latest version this morning; the package still doesn't provide the socket file that @lcamilo15 mentioned, and docker still won't start. @crosbymichael the issue reported in journalctl is exactly the one that @lcamilo15's workaround addresses. For me the immediate workaround was to roll back to the previous version of docker.

@vdemeester
Member

vdemeester commented Aug 1, 2016

@theonlydoo the socket file is no longer provided in the docker package (1.12 and later); the service file should look like this: https://github.com/docker/docker/blob/master/contrib/init/systemd/docker.service.rpm.

You can look at #24804 for the reason why it was removed.

However, there is another motivation for removing socket activation from
docker's systemd files and that is because when you have daemons running
with --restart always whenever you have a host reboot those daemons
will not be started again because the docker daemon is not started by
systemd until a request comes into the docker API.

The immediate workaround should be to remove the Requires=docker.socket line in the docker.service file if it's still there 👼.

@theonlydoo

@vdemeester so there is no planned "migration" handling on package upgrade for this?

@vdemeester
Member

@theonlydoo I'm not sure how Fedora/RPM-based distributions handle updates of these files (invoking @runcom here). I thought it would have updated/overwritten the docker.service file, as is done on Arch Linux for example, but I guess not.

@theonlydoo

# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/etc/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since lun. 2016-08-01 10:24:32 CEST; 12s ago
     Docs: https://docs.docker.com
  Process: 18984 ExecStart=/usr/bin/docker daemon -H fd:// -s overlay (code=exited, status=1/FAILURE)
 Main PID: 18984 (code=exited, status=1/FAILURE)

août 01 10:24:32 meow systemd[1]: Starting Docker Application Container Engine...
août 01 10:24:32 meow docker[18984]: time="2016-08-01T10:24:32.838119437+02:00" level=fatal msg="no sockets found via socket activation: make sure the service was started by systemd"
août 01 10:24:32 meow systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
août 01 10:24:32 meow systemd[1]: Failed to start Docker Application Container Engine.
août 01 10:24:32 meow systemd[1]: docker.service: Unit entered failed state.
août 01 10:24:32 meow systemd[1]: docker.service: Failed with result 'exit-code'.

Here is what I got when I restarted docker without the Requires line; of course I removed docker.socket from the After line too. Here is my systemd service file:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/docker daemon -H fd:// -s overlay
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes

[Install]
WantedBy=multi-user.target
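
The fatal message in the log above ("no sockets found via socket activation") is consistent with ExecStart still passing -H fd:// after the docker.socket dependency was removed: without a socket unit, systemd has no socket to hand to the daemon. As a sketch only, the flag could be stripped as shown below. The name without_fd_listener is hypothetical, and the file path is an argument; the input file itself is not modified.

```shell
#!/bin/sh
# Hypothetical helper: print a unit file with "-H fd://" removed from
# every ExecStart= line. The input file is left untouched so the result
# can be reviewed before it replaces anything.
# Usage: without_fd_listener /path/to/docker.service
without_fd_listener() {
    sed -e '/^ExecStart=/s|[[:space:]]*-H[[:space:]]*fd://||g' "$1"
}

# Example (not run here):
# without_fd_listener /etc/systemd/system/docker.service > /tmp/docker.service.new
```

Other flags on the line, such as -s overlay, are kept. With no -H flag at all, the daemon listens on its default Unix socket, /var/run/docker.sock.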

@abdeleon

abdeleon commented Aug 1, 2016

Fyi, I just used the old versions after running into the issues noted above.
yum install docker-engine-selinux-1.11.2-1.el7.centos.noarch
yum install docker-engine-1.11.2-1.el7.centos.x86_64

Works fine for me.
uname -r
3.10.0-327.22.2.el7.x86_64

@Illydth

Illydth commented Aug 1, 2016

Running into a (Potentially?) related issue.

Installing 1.12 over the top of 1.11.2 on RHEL7 (up to date) leads to the docker engine being unable to start.

Removing "-H fd://" from the run line does tend to fix the issue. Sounds like there's a break in the Require/After config between 1.11 and 1.12?

HOWEVER doing this leaves the daemon unable to be accessed by anyone but root and then only with DOCKER_HOST set. (Groups file doesn't work either).

I'll see about rolling out all related docker files and doing a clean install from yum.

NOTE: I don't THINK I've modified the systemd service file, but I'll check that also.

@Illydth

Illydth commented Aug 1, 2016

Update: So the base docker systemd files work just fine, however when I drop the following systemd drop in file in place I get:

[root@test docker.service.d]# docker ps
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

The Systemd drop in file (As per documentation here: https://docs.docker.com/engine/admin/systemd/) looks like the following:

[root@stldbldtst01 docker.service.d]# cat `pwd`
cat: /etc/systemd/system/docker.service.d: Is a directory
[root@stldbldtst01 docker.service.d]# cat ./docker-service.conf
[Service]
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=
ExecStart=/usr/bin/docker daemon $OPTIONS \
    $DOCKER_STORAGE_OPTIONS \
    $DOCKER_NETWORK_OPTIONS \
    $BLOCK_REGISTRY \
    $INSECURE_REGISTRY

Had to remove the "-H fd://" from the documented drop in file option or I'd get the daemon fail to start again.

Help? What's wrong with the systemd drop in file that's causing the system not to be accessible from "docker ps" at the command line?

The docker environment files referenced are as follows:

/etc/sysconfig/docker:
OPTIONS="-g /docker"
/etc/sysconfig/docker-storage
DOCKER_STORAGE_OPTIONS="-s btrfs"
/etc/sysconfig/docker-network
DOCKER_NETWORK_OPTIONS="-H tcp://0.0.0.0:2375 -H unix:///docker/docker.sock"

The final command line for dockerd comes out as follows:
dockerd -g /docker -s btrfs -H tcp://0.0.0.0:2375 -H unix:///docker/docker.sock

(NOTE: Yes, I am aware this is a totally open / unsecured installation; no, I'm not planning on going to production with this; yes, this is on an isolated sandbox.)

I've tracked the problem down to the Networking Options... is there a problem with 1.12's ability to use -H tcp://0.0.0.0:2375 -H unix:///docker/docker.sock?

Update:

So I've tracked this down: Apparently with the removal of "fd://" as an option on RHEL we're down to 3 methods to connect to docker:

  • Whatever method local users connect to docker with by default if you specify no networking options.
  • TCP://
  • UNIX://

The problem seems to be that turning on ANY Networking options (Such as TCP://) DISABLES whatever the local connection method is...if dockerd is loaded with -H tcp://0.0.0.0:2375, all users will get "Cannot connect to the Docker daemon. Is the docker daemon running on this host?" unless they have "DOCKER_HOST" set.

This was NOT the case when docker was capable of being used with the "fd://" option to use systemd sockets, but since this has been removed, significant functionality for RHEL has been impaired.

Is there a solution to this other than "too bad, you should have been using Ubuntu"?

@vdemeester
Member

@Illydth by default, the client looks for the Unix socket at /var/run/docker.sock; that's why with -H unix:///docker/docker.sock it won't work without setting DOCKER_HOST. If you use -H unix:///var/run/docker.sock it will work fine.

The breaking change here is that using fd:// doesn't work anymore without Requires=docker.socket (which is no longer there by default). I think we should update the documentation to reflect that. @crosbymichael @thaJeztah wdyt?
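
To illustrate the point about the client's default socket, here is a small sketch that builds a DOCKER_HOST value for a non-default socket path. The helper name docker_host_for is hypothetical; /docker/docker.sock is the path from the configuration quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: turn an absolute socket path into a DOCKER_HOST value.
# Usage: docker_host_for /docker/docker.sock
docker_host_for() {
    case "$1" in
        /*) printf 'unix://%s\n' "$1" ;;
        *)  echo "socket path must be absolute: $1" >&2
            return 1 ;;
    esac
}

# Example (not run here): use the non-default socket for a single command.
# DOCKER_HOST="$(docker_host_for /docker/docker.sock)" docker ps
```

Setting DOCKER_HOST this way only affects the client. The alternative described above, starting the daemon with -H unix:///var/run/docker.sock, makes the default client behaviour work without any environment variable.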

@thaJeztah
Member

@vdemeester the problem there is that we only removed socket activation for RPM based platforms (because it caused problems there), but not for other platforms

@dailyherold

Added a Redhat/Centos specific comment to #22847 (comment) so jumping over to this issue. If there is a need for documentation updates, I'd be glad to assist.

@EtienneK

EtienneK commented Aug 3, 2016

Getting this issue after trying to install docker on Centos 7 using docker-machine generic driver.

@thaJeztah
Member

@EtienneK looks like there's an open issue for that; docker/machine#3632

@alfredcs

alfredcs commented Aug 5, 2016

Try to create /usr/lib/systemd/system/docker.socket if missing and/or rm /var/lib/docker as well as /var/run/docker.

@Illydth

Illydth commented Aug 8, 2016

Top google hit on "docker systemd" gets you this page:

https://docs.docker.com/engine/admin/systemd/

Which explicitly states in its documentation to put "-H fd://" into your systemd drop-in file. This is why so many RHEL users broke with the 1.12 install: Docker's own docs state to put that option in the file, and the option was subsequently removed. To further compound the issue, because the file exists, it will NOT get overwritten/removed in the upgrade... meaning anyone who's followed those instructions in the past will have a broken docker after yum upgrade...

Big old warnings about "fd://" being removed for RHEL/RPM distributions should be ALL OVER that page.

Finding the documentation that this feature was removed was practically impossible as well.

@hholst80

hholst80 commented Sep 9, 2016

This is still an issue. Why is this bug closed?

@ozbillwang
Contributor

ozbillwang commented Sep 11, 2016

Yes, I saw this issue on CentOS 7; maybe it is related to the old version 1.11.1 having been removed from docker-main-repo today, which triggered this issue.

# yum list |grep docker-engine
docker-engine.x86_64                    1.12.1-1.el7.centos            @docker-main-repo
docker-engine-selinux.noarch            1.12.1-1.el7.centos            @docker-main-repo
docker-engine-debuginfo.x86_64          1.12.1-1.el7.centos            docker-main-repo

/usr/lib/systemd/system/docker.socket is missing from the installation.

# grep -r socket /etc/systemd/system/docker.service
After=network.target docker.socket
Requires=docker.socket

After manually creating the /usr/lib/systemd/system/docker.socket file, the problem is fixed.

# cat docker.socket
[Unit]
Description=Docker Socket for the API
PartOf=docker.service

[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target

# systemctl unmask docker.service
# systemctl unmask docker.socket
# service docker restart

The existing running instance (with docker-engine 1.11.1) does not have this issue; I only see it on new instances.

@kaiterramike

I just saw this on a fresh and fully updated CentOS 7 image while trying to docker-machine create -d generic on it. @lcamilo15's solution worked, and it seems to be playing nicely with docker-machine.

$ yum list installed | grep docker
docker-engine.x86_64                 1.12.1-1.el7.centos            @docker-main-repo
docker-engine-selinux.noarch         1.12.1-1.el7.centos            @docker-main-repo

@psakar

psakar commented Sep 12, 2016

@lcamilo15 solution works for FC 22

wget https://raw.githubusercontent.com/docker/docker/master/contrib/init/systemd/docker.socket -O /usr/lib/systemd/system/docker.socket
systemctl daemon-reload
systemctl start docker.socket
systemctl start docker

@dongalex

Hi,
If the docker daemon is started with -H fd://, I can't use docker -H fd:// version to get the version information. Can anybody tell me how to access the docker daemon in this case? I'm not familiar with "fd://".

Thanks a lot

@thaJeztah
Member

@dongalex if it's on the same host, simply docker version, or sudo docker version should work

@dongalex

@thaJeztah yes, I tried it and it works, thank you!

@sajfarahani

This is still an issue?
I installed Docker 1.12 and changed the service file to have docker listen on port 2375:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
#ExecStart=/usr/bin/docker daemon -H fd://
Environment="DOCKER_OPTS=-H tcp://0.0.0.0:2375 --exec-opt native.cgroupdriver=cgroupfs"
ExecStart=/usr/bin/docker daemon -H fd://  \$DOCKER_OPTS
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity

[Install]
WantedBy=multi-user.target

And added the docker.socket as well:

[Unit]
Description=Docker Socket for the API
PartOf=docker.service

[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target

However, it still does not allow me to start docker. Is there any way to fix this?

@thaJeztah
Member

@sajfarahani it's only an issue if you made modifications to the unit file, or used drop-in files that use -H fd://, and are running on a system that uses RPM's for installation (CentOS, Fedora, RHEL, Oracle Linux). Please see the "IMPORTANT" message at the top of the release notes; https://github.com/docker/docker/releases/tag/v1.12.5

Some things worth mentioning based on the info you provided

  • Please be aware that configuring the daemon to listen on 0.0.0.0:2375 makes the remote API accessible by anyone that can reach port 2375 of your docker host (which could be "the internet"). Anyone that has access to the remote API effectively has root access on your machine. See docker daemon attack surface. When exposing the API, always make sure to properly protect it; see Protect the docker daemon socket.
  • You should never modify the main systemd unit file directly. Doing so prevents updates to the unit file from being installed. If you need to customise settings in the unit file, always use an override ("drop in") file, as described in "Custom Docker daemon options".
  • For most configuration changes, it's easier to use a daemon.json daemon configuration file; the daemon configuration file does not depend on what init system you use (systemd, sysvinit, upstart), and allows you to reload certain settings without restarting the daemon. See daemon configuration file in the documentation.
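
As a sketch of the drop-in approach from the second bullet, the helper below writes an override that clears the inherited ExecStart and starts the daemon without -H fd://. The function name write_docker_dropin and the file name override.conf are hypothetical, and /usr/bin/dockerd is an assumption about where the daemon binary lives (the thread shows dockerd as the final command on 1.12); adjust it to match the packaged unit file.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in for docker.service.
# Usage: write_docker_dropin /etc/systemd/system/docker.service.d
write_docker_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    # The empty ExecStart= line clears the command inherited from the main
    # unit file; the line after it sets the replacement command.
    # Assumption: the daemon binary is /usr/bin/dockerd.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd
EOF
}
```

After writing the file, systemctl daemon-reload followed by systemctl restart docker would be needed for the change to take effect. The main unit file stays untouched, so package updates can still replace it.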

@subodhp

subodhp commented Jul 21, 2017

Just for the record, I was facing the same problem & it was fixed by just touching the file /usr/lib/systemd/system/docker.socket under RHEL7.3. Weird!

Thanks,
Subodh
thesubodh.com

@thaJeztah
Member

@subodhp systemd socket activation for RPM based installs has been removed since docker 1.12 (see https://github.com/moby/moby/releases/tag/v1.12.6), so if the unit file is expecting that, it's possible you either have changes in your unit file, or there's a drop-in file overriding the default config; read the instructions at the top of the release notes, which may help; https://github.com/moby/moby/releases/tag/v1.12.6
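
One way to check for the leftover configuration described above is to search the unit and drop-in directories for references to socket activation. This is a sketch, not an official procedure; find_socket_activation_refs is a hypothetical name, and the directories to search are passed as arguments.

```shell
#!/bin/sh
# Hypothetical helper: list files under the given directories that mention
# fd:// or docker.socket. Prints one path per line, or nothing if none match.
# Directories that do not exist are skipped.
find_socket_activation_refs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        grep -rlE 'fd://|docker\.socket' "$dir" 2>/dev/null
    done
    return 0
}

# Example (not run here):
# find_socket_activation_refs /etc/systemd/system /usr/lib/systemd/system
```

On systemd versions that provide it, systemctl cat docker.service shows the main unit file together with any drop-ins, which is another way to see what is overriding the default configuration.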

@subodhp

subodhp commented Jul 24, 2017

@thaJeztah Thanks, that provided the explanation for the behavior, I was able to reproduce the issue and rectify via the procedure.

Thanks,
Subodh Pachghare
thesubodh.com

@Oyunbold

systemctl start docker.socket
Job for docker.socket failed. See "systemctl status docker.socket" and "journalctl -xe" for details.

@ndpackersandmovers

just reboot the system through terminal
