Overlayfs does not work with unix domain sockets #12080

Closed
analytically opened this Issue Apr 4, 2015 · 56 comments

analytically commented Apr 4, 2015

This issue (Unix sockets not working) affects the container's root filesystem, including the /var and /tmp directories. It does not affect volumes of either type, so long as the volume's underlying filesystem is 'normal' for Linux and permits the creation of working sockets.

See https://github.com/analytically/docker-overlayfs-bug

dreamcat4 commented Apr 4, 2015

@analytically Can you please summarize what this bug is?

Author

analytically commented Apr 4, 2015

Unix domain sockets don't seem to work when using overlayfs. When using device mapper, it works fine.

dreamcat4 commented Apr 4, 2015

Ah, right. OK then. That is the newest Linux kernel filesystem, meant to replace device mapper and aufs in the future. Sorry, I was confused by the title, because some of us use the word 'overlay' to refer to something else entirely.

"overlayfs does not work with unix domain sockets"

Contributor

cpuguy83 commented Apr 6, 2015

Just doing a simple go program with a unix socket listener works for me.

Contributor

LK4D4 commented Apr 6, 2015

I can reproduce with image from @analytically

analytically changed the title from "Overlay doesn't work with unix domain sockets" to "Overlayfs does not work with unix domain sockets" Apr 7, 2015

jessfraz added this to the 1.7.0 milestone Apr 13, 2015

vbatts referenced this issue Apr 14, 2015

Closed

graphdriver: promote overlay driver to first #12354

Contributor

vbatts commented May 13, 2015

hahah, of course debugging is great.

bash-4.3# strace -o /log -fff supervisord -k -c /etc/supervisord.conf
2015-05-13 15:52:08,211 CRIT Supervisor running as root (no user in config file)
2015-05-13 15:52:08,338 INFO RPC interface 'supervisor' initialized
2015-05-13 15:52:08,339 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-05-13 15:52:08,341 INFO supervisord started with pid 22
2015-05-13 15:52:09,350 INFO spawned: 'test-unix-domain-socket' with pid 25
2015-05-13 15:52:10,354 INFO success: test-unix-domain-socket entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2015-05-13 15:52:10,436 INFO exited: test-unix-domain-socket (exit status 0; expected)
^C2015-05-13 15:53:24,850 WARN received SIGINT indicating exit request
bash-4.3# supervisord -k -c /etc/supervisord.conf
2015-05-13 15:53:42,591 CRIT Supervisor running as root (no user in config file)
2015-05-13 15:53:42,664 INFO RPC interface 'supervisor' initialized
2015-05-13 15:53:42,665 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-05-13 15:53:42,665 INFO supervisord started with pid 51
2015-05-13 15:53:43,671 INFO spawned: 'test-unix-domain-socket' with pid 54
2015-05-13 15:53:43,996 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:45,002 INFO spawned: 'test-unix-domain-socket' with pid 58
2015-05-13 15:53:45,326 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:47,333 INFO spawned: 'test-unix-domain-socket' with pid 62
2015-05-13 15:53:47,663 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:50,672 INFO spawned: 'test-unix-domain-socket' with pid 66
2015-05-13 15:53:51,007 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:52,009 INFO gave up: test-unix-domain-socket entered FATAL state, too many start retries too quickly
^C2015-05-13 15:53:56,395 WARN received SIGINT indicating exit request
bash-4.3# 

icecrime removed this from the 1.7.0 milestone May 28, 2015

stevenschlansker commented Jun 5, 2015

Why was this removed from 1.7.0? We are currently stuck in a situation where no storage backend is able to fulfill our needs, and this is one of the blockers for us for Overlay...

Member

thaJeztah commented Jun 5, 2015

@stevenbrichards we were really hoping to get overlay as the default graph driver in 1.7, but there are several issues around overlay remaining; some of them are caused by bugs in overlay itself so cannot be solved by docker, but need to be fixed in the kernel first.

Rest assured that improving overlay is top priority in docker, but getting it resolved before the 1.7 release just wasn't possible.

On a brighter note; starting with the 1.7 release, there will be an "experimental" docker release, with a nightly or weekly release (undecided yet). That release will be used to test upcoming features before they end up in the official release. So, once this is fixed, you'll be able to test it in that release.

Member

thaJeztah commented Jun 5, 2015

Oops, wrong autocomplete; meant to use @stevenschlansker (apologies)

thaJeztah added this to the 1.8.0 milestone Jun 5, 2015

Contributor

philips commented Jun 5, 2015

I believe the kernel issue here will be fixed in Linux 4.1.

Member

thaJeztah commented Jun 5, 2015

@philips thanks for the heads up; looking forward to that!

I added this to the 1.8 milestone; I must admit I'm not sure this is something to be fixed in Docker or overlay, but I'm keeping it open for now so that it's easier to find for people running into this.

Contributor

philips commented Jun 5, 2015

@thaJeztah we are looking forward to it too!

I don't think there is anything that the docker engine can do differently but I agree we should absolutely keep this tracking bug issue.

visualphoenix referenced this issue Jul 2, 2015

Closed

Drop support for RHEL6/CentOS6 #14365

Contributor

calavera commented Jul 15, 2015

@philips do you have a link to the kernel bug tracking where it talks about this issue? I just upgraded to 4.1.2 and it looks like this is still a problem.

calavera removed this from the 1.8.0 milestone Jul 20, 2015

Contributor

philips commented Jul 28, 2015

@calavera There is no tracking issue AFAIK. I pinged Miklos.

sandys commented Jul 29, 2015

@philips - is this bug relevant: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1214500

I'm hitting this problem as well. Is there a short-term solution, like switching to devicemapper or something? We use Unix sockets in our nginx and supervisord configurations.

stevenschlansker commented Jul 29, 2015

You can potentially place your UNIX sockets into a volume (or a bind-mount from the host), depending on the backing storage for your volumes / host storage.
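Whether a given socket path sits on the overlay root or on a volume can be checked from the mount table. The following is a minimal sketch, not part of the original thread: `fstype_for_path` is a hypothetical helper that takes text in the format of /proc/mounts and returns the filesystem type of the deepest mount point containing the path. It assumes mount points contain no escaped characters (such as `\040` for spaces).

```python
import os


def fstype_for_path(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path.

    mounts_text is expected to look like /proc/mounts:
    "<device> <mount point> <fstype> <options> <dump> <pass>" per line.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same mount point win,
            # since it is stacked on top of the earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

Inside a container, something like `fstype_for_path("/var/run/supervisor.sock", open("/proc/mounts").read())` would be expected to report an overlay type for paths on the root filesystem and the backing filesystem's type for paths inside a volume or bind mount.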

dreamcat4 commented Jul 29, 2015

Right then. So just to be clear:

This issue (Unix sockets not working) affects the container's root filesystem, including the /var and /tmp directories. It does not affect volumes of either type, so long as the volume's underlying filesystem is 'normal' for Linux and permits the creation of working sockets.

Is that a correct description of this issue? Please confirm Y/N.

@analytically please update your issue description ^^ at the top of the page accordingly, so that other users (such as myself) can clearly understand the limits of this problem in a few sentences.

Contributor

brauner commented Jun 16, 2016

Member

runcom commented Jun 16, 2016

@brauner, right that's the PR which includes the commit above, thanks for checking, we may want to close this once 4.7 lands

Contributor

brauner commented Jun 16, 2016

@runcom, yeah I know, just realized too late that you already posted. :)

Member

runcom commented Jul 26, 2016

Seems like this can be closed with 4.7

Member

dmcgowan commented Jul 26, 2016

Was looking in the wrong place; looks like it made it into 4.7-rc4
https://lkml.org/lkml/2016/6/20/5
torvalds/linux@30402c8

Agreed, going to close. Anyone coming across this issue after close please try upgrading to 4.7 first.
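As a rough first check before upgrading, the running kernel release can be compared against 4.7. This is a minimal sketch, not something posted in the thread: `kernel_at_least` is a hypothetical helper that only compares the leading major.minor numbers. Distribution kernels can carry backported fixes under an older version number, so a False result is a hint rather than proof that the fix is missing.

```python
import re


def kernel_at_least(release, minimum=(4, 7)):
    """Return True if a kernel release string is at least `minimum` (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    version = (int(match.group(1)), int(match.group(2)))
    return version >= minimum
```

On a live system the release string can be taken from `platform.release()` in the standard library, for example `kernel_at_least(platform.release())`.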

dmcgowan closed this Jul 26, 2016

moz-v2v-gh pushed a commit to mozilla/version-control-tools that referenced this issue Aug 19, 2016

ansible/supervisor: put domain socket on tmpfs volume
To work around a bug in the overlay filesystem where it doesn't support
UNIX domain sockets (moby/moby#12080).

MozReview-Commit-ID: CUTVjIrt2HD

--HG--
extra : histedit_source : 6053a5be4dbd0a9bba8a9939d95bbd9a86c50366

rockpapergoat added a commit to fasrc/container-samba that referenced this issue Sep 1, 2016

clebio added a commit to clebio/django-docker that referenced this issue Sep 25, 2016

ryan-lane commented Oct 22, 2016

For those not wanting to track things down for ubuntu 16.04, it looks like the relevant changes have been backported into 4.4.0-35.54: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1607404

abhikandoi2000 commented Jan 30, 2017

Anyone facing this issue on macOS Sierra (version 10.12.1) while using Docker Version 1.13.0 (15072)?

ChiragChhatbar commented Feb 11, 2017

I faced this issue in CentOS 7
Docker version 1.13.1, build 092cba3

FAKERINHEART commented Mar 3, 2018

@analytically This is because a hard link to a domain socket file does not work on overlayfs.
Supervisor/supervisor#1067
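The hard-link explanation would also account for why a plain socket listener worked in earlier tests while supervisord failed. The following is a minimal sketch of a probe for both cases, assuming a writable directory on the filesystem under test. The names `probe_socket_dir` and `_can_connect` are hypothetical and not taken from Supervisor or Docker. It binds a listening socket, connects to it directly, then hard-links the socket file and connects through the link.

```python
import os
import socket
import tempfile


def _can_connect(path):
    """Return True if a client can connect to the Unix socket at path."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return True
    except OSError:
        return False
    finally:
        client.close()


def probe_socket_dir(directory):
    """Test a direct connect and a connect through a hard link in directory."""
    real = os.path.join(directory, "real.sock")
    link = os.path.join(directory, "link.sock")
    results = {"direct": False, "hardlink": False}
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(real)
        server.listen(5)  # pending connections are queued, never accepted
        results["direct"] = _can_connect(real)
        try:
            os.link(real, link)  # second name for the same socket inode
        except OSError:
            return results  # filesystem refused the hard link
        results["hardlink"] = _can_connect(link)
        return results
    finally:
        server.close()
        for path in (real, link):
            if os.path.lexists(path):
                os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(probe_socket_dir(tmp))
```

If the comment above is right, an affected overlay root would be expected to report `direct` as True and `hardlink` as False, while a volume on a regular filesystem should report True for both.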

srinivassurishetty commented May 21, 2018

Issue seen on 3.10.0-327.36.3.el7.x86_64. Has the fix been backported to any 3.10-series kernel?

Contributor

cpuguy83 commented May 21, 2018

@srinivassurishetty 3.10.0-327 is quite old. Please make sure to update the kernel to the latest for the distro.

EricMountain-1A commented May 22, 2018

@srinivassurishetty that looks like RHEL numbering. If that's the case, then it is old even by RHEL standards (something like RHEL 7.2). This bug has been fixed in more recent RHEL. Make sure you use an up-to-date RHEL and an xfs volume that has ftype=1 (cf. the output of xfs_info <device>), especially if you are not using a dedicated volume for /var/lib/docker/overlay, as RHEL historically had ftype=0 for the root device.
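The ftype check can be scripted by scanning captured xfs_info output for the `ftype=` field. This is a minimal sketch: `xfs_ftype` is a hypothetical helper, and the sample line in the comment is an assumption about how xfs_info typically formats its "naming" row.

```python
import re


def xfs_ftype(xfs_info_output):
    """Return the ftype value (0 or 1) found in xfs_info output, or None."""
    # Assumed sample row:
    # "naming   =version 2   bsize=4096   ascii-ci=0 ftype=1"
    match = re.search(r"\bftype=(\d)", xfs_info_output)
    return int(match.group(1)) if match else None
```

A result of 0 or None on the filesystem backing /var/lib/docker would suggest checking the volume layout before relying on the overlay driver.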

srinivassurishetty commented May 22, 2018

Thank you @cpuguy83 and @EricMountain-1A for the comments.
Found the root cause.

@EricMountain-1A, as you pointed out, the problem is with overlayfs; the same kernel works with devicemapper.

Root cause:

By default, Ansible creates its SSH control sockets under the ~/.ansible/cp directory.
Socket creation does not work well on overlayfs (a Docker storage driver limitation; see the link below). So I changed the control socket directory path, and it now works fine across all storage drivers and kernels.
https://docs.docker.com/storage/storagedriver/overlayfs-driver/
Docker also recommends against using the overlay2 storage driver on kernels older than 3.10.0-514.

[ssh_connection]
control_path_dir=/vol/

blitzstern5 added a commit to GenesisKernel/quick-start that referenced this issue Jul 30, 2018
