Permission update on docker entrypoint takes a long time #3194

Closed
miguelpeixe opened this Issue May 21, 2017 · 61 comments

@miguelpeixe

miguelpeixe commented May 21, 2017

The current docker entrypoint approach to updating file ownership to the new custom UID/GID takes forever to finish, so the container times out under a reasonable production health check.

Why not just use chown -Rf mastodon:mastodon /mastodon/public/system instead of finding and filtering non-mastodon files?


  • I searched or browsed the repo’s other issues to ensure this is not a duplicate.
  • This bug happens on a tagged release and not on master (If you're a user, don't worry about this).
@Gargron

Member

Gargron commented May 21, 2017

@Wonderfall

Contributor

Wonderfall commented May 21, 2017

The goal is not to chown /mastodon/public/system; that would take a long time (believe me, I tried every possible combination). With -path path -prune -o, find does not even descend into that directory (-not -path path would exclude it from the results but still traverse it, which takes time). So the command should be quick: it updates permissions everywhere except /public/system, which likely contains a lot of data.

So if I understand correctly, this command takes a long time for you?
Can you run something like time docker run -ti --rm mastodon true?
Can you send me your docker info?
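
A minimal sketch of the difference between -prune and -not -path described above. The directory layout is a throwaway stand-in for /mastodon (an assumption for illustration), and the listings are sorted only to make them comparable.

```shell
#!/bin/sh
# Throwaway tree standing in for /mastodon (hypothetical layout).
root=$(mktemp -d)
mkdir -p "$root/app" "$root/public/system/media"
touch "$root/app/a.rb" "$root/public/system/media/1.png"

# -prune: find stops at public/system and never reads its contents.
pruned=$(find "$root" -path "$root/public/system" -prune -o -print | sort)

# -not -path: the same entries are filtered out of the result, but find
# still walks everything under public/system to test each path.
filtered=$(find "$root" -not -path "$root/public/system" \
                        -not -path "$root/public/system/*" -print | sort)

if [ "$pruned" = "$filtered" ]; then
  echo "same result, but only -prune skips the traversal"
fi
rm -rf "$root"
```

Both commands list the same entries; the cost differs because the second one still has to visit every file under public/system.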

@fmauNeko

Contributor

fmauNeko commented May 21, 2017

I'm having the same issue, here's some info:
docker info

Containers: 53
 Running: 53
 Paused: 0
 Stopped: 0
Images: 130
Server Version: 17.05.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 477
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.8.0-46-generic
Operating System: Ubuntu 17.04
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.66GiB
Name: concorde.dissidence.ovh
ID: UEKG:ZQYV:I6EF:VIMY:TV5W:HXRD:SEDE:GPHJ:LKUV:OFGB:NQDY:VVNM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: fmauneko
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

time docker run --rm -it gargron/mastodon:v1.4rc2 true

Creating mastodon user (UID : 991 and GID : 991)...
Updating permissions...
Executing process...
docker run --rm -it gargron/mastodon:v1.4rc2 true  0.01s user 0.01s system 0% cpu 1:29.87 total
@Wonderfall

Contributor

Wonderfall commented May 21, 2017

No problem with:

# docker info
Containers: 34
 Running: 34
 Paused: 0
 Stopped: 0
Images: 51
Server Version: 17.05.0-ce
Storage Driver: btrfs
 Build Version: Btrfs v4.7.3
 Library Version: 101
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.11.1
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.16GiB
Name: drogon
ID: TGPJ:KNGK:XV7N:LHP3:BXZG:AHLJ:QOR3:AL6F:ZHME:LHFZ:YBRP:MT4M
Docker Root Dir: /docker
Debug Mode (client): false
Debug Mode (server): false
Username: wonderfall
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: true
# time docker run -ti --rm mastodon true
Creating mastodon user (UID : 991 and GID : 991)...
Updating permissions...
Executing process...
docker run -ti --rm mastodon true  0.01s user 0.01s system 0% cpu 5.303 total
@fmauNeko

Contributor

fmauNeko commented May 21, 2017

The only difference I see is the storage driver, and indeed it's much faster on my desktop computer, which uses btrfs:
docker info

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 15
Server Version: 17.05.0-ce
Storage Driver: btrfs
 Build Version: Btrfs v4.10.2
 Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.10.13-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.55GiB
Name: izanami
ID: LY2X:EOSA:YURV:H2OK:MVIU:HWPB:DXCX:GTUJ:DDKJ:2CUC:IZZT:RNFA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

time docker run --rm -it gargron/mastodon:v1.4rc2 true

Creating mastodon user (UID : 991 and GID : 991)...
Updating permissions...
Executing process...
docker run --rm -it gargron/mastodon:v1.4rc2 true  0,02s user 0,02s system 0% cpu 12,941 total
@Wonderfall

Contributor

Wonderfall commented May 21, 2017

I tried on my Mac, which is using aufs, and it takes around 5 seconds. Perhaps that's because of the SSD. But aufs is clearly less performant than btrfs.

# time find /mastodon -path /mastodon/public/system -prune -o -not -user mastodon -not -group mastodon -print0 | xargs -0 chown -f mastodon:mastodon
real	0m 4.26s
user	0m 0.10s
sys	0m 0.45s

# I ran another container to try a "non-optimised" command
# time chown -R 991:991 *
real	0m 5.45s
user	0m 0.18s
sys	0m 1.55s
@fmauNeko

Contributor

fmauNeko commented May 21, 2017

Yeah, so I guess we can safely say that this is not an issue with Mastodon itself, but that it's linked to the choice of Docker storage driver.

@Wonderfall

Contributor

Wonderfall commented May 21, 2017

I agree with @fmauNeko; unfortunately, we can't do more here.
If you're curious, this article also explains why we should use an entrypoint rather than hardcode something in the Dockerfile: https://denibertovic.com/posts/handling-permissions-with-docker-volumes/ (I just found it, but it seems we had exactly the same idea! That's also what I do for all my images, which you can find at wonderfall/dockerfiles)

cc @xataz if you have an idea.

@miguelpeixe

miguelpeixe commented May 21, 2017

You are right: it took 9m20s to update permissions on my cloud server (overlay2 storage driver) and 7s on my local machine (aufs storage driver). I thought the find command would cost more, but that's not the case. Looks like a storage driver issue. I'll look into improving my cloud Docker setup.

@katarpilar

katarpilar commented May 21, 2017

I tried to change the storage driver from overlay2 to aufs on my Debian jessie VC1S Scaleway instance, but the Docker daemon failed to start: my kernel doesn't support aufs.
I will build my image without the chown command in the entrypoint script.
But I think it would be better to tell admins, in the release notes, to run the command before pulling new images than to force every admin on overlayfs to build their own images if they don't want to wait 30 minutes (yes, with 3 containers it takes time) for their containers to start.

@miguelpeixe

miguelpeixe commented May 21, 2017

It's weird to have such bad performance on overlay2, which should be better than aufs. I'm also running on Scaleway (VC1M).

@katarpilar

katarpilar commented May 21, 2017

OK, I can override the entrypoint script: https://docs.docker.com/compose/compose-file/#entrypoint
Maybe we should add this to the documentation?

@miguelpeixe

miguelpeixe commented May 21, 2017

@katarpilar I believe it would be best to figure out why we have such bad performance on our cloud setup and add the solution to the Troubleshoot docs. I assume a lot of admins use Scaleway services.

@miguelpeixe

miguelpeixe commented May 21, 2017

FYI, I just got a 13s run on another cloud service that uses aufs and has no SSD.

@Wonderfall

Contributor

Wonderfall commented May 21, 2017

That's the reason why I gave up on overlay2. Its performance is terrible, not with this command in particular, but in general.

That being said, aufs is still the standard storage driver for Docker, so I thought no one would complain (but I was wrong :sad: ). Note that overlayfs isn't really recommended for production environments, even though it has been seen as a potential successor to aufs:

As promising as OverlayFS is, it is still relatively young. Therefore caution should be taken before using it in production Docker environments.

Source : Docker documentation

Speaking of OverlayFS, I noticed this performance issue a while ago. I came up with a hack that (somehow) refreshes the files in the layers, which could basically fix your issue, but it stopped working after a Docker update.

@fmauNeko

Contributor

fmauNeko commented May 21, 2017

Well I have bad performance with aufs myself, so that's strange.

@oxynux

oxynux commented May 22, 2017

Same issue (overlay2):

# docker info
Containers: 14
 Running: 14
 Paused: 0
 Stopped: 0
Images: 61
Server Version: 17.05.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Kernel Version: 4.9.0-0.bpo.2-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.26GiB
Name: tortank
ID: JHJB:GWQI:ZGEE:WEZ4:RLH5:RVRR:UWPA:ZFAH:5MK4:LBAM:4DRA:ZZ36
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: oxynux
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: true
@malicioustoker

malicioustoker commented May 22, 2017

I am also seeing HORRIBLE load times for the permissions update when running locally on macOS: 9+ minutes.

@miguelpeixe

miguelpeixe commented May 22, 2017

@malicioustoker Is it the overlay2 storage driver? You can get that information with the docker info command.
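
One way to pull just the driver name out of that output, as a small sketch. The helper name storage_driver is made up for this example, and the sample text is copied from the docker info output posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` text on stdin and print the
# value of its "Storage Driver:" line.
storage_driver() {
  sed -n 's/^ *Storage Driver: *//p' | head -n 1
}

# Sample lines taken from the output posted in this thread.
sample='Images: 61
Server Version: 17.05.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs'

driver=$(printf '%s\n' "$sample" | storage_driver)
echo "$driver"

# Against a live daemon:
#   docker info | storage_driver
# or, using the built-in template option:
#   docker info --format '{{.Driver}}'
```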

@malicioustoker

malicioustoker commented May 22, 2017

@malicioustoker

malicioustoker commented May 22, 2017

How can I change it from overlay2 to something else? That's just the default option when Docker is installed on macOS.

@Wonderfall

Contributor

Wonderfall commented May 22, 2017

@malicioustoker Interesting, I still have aufs on my macOS machine (that said, I don't have the latest version yet).

But you can change your storage driver easily:

(screenshot, 2017-05-22 17:50:07)

I really think we should open an issue with Docker (moby/moby) rather than argue about this change, because what else can we do? overlay2 shouldn't have performance issues, while btrfs is as fast as it would be on a classic filesystem.

I do understand your frustration if it's taking too long (10 minutes??? Come on!), and I suffered from this bug for months before finally giving up on overlay2. The thing is, overlay2 will overtake aufs in the future (btrfs, zfs, and devicemapper, which shouldn't be used, remain as alternative options), so I'm concerned too.

@malicioustoker

malicioustoker commented May 22, 2017

@miguelpeixe

miguelpeixe commented May 22, 2017

For Scaleway users, this is how you change to aufs:

  1. Make sure your server bootscript is set to docker. If not, set it and reboot.
  2. Test whether the aufs driver is available by running sudo modprobe aufs. If it exits with no output, it's there.
  3. Edit /etc/docker/daemon.json (create if missing) and add the following:
{
  "storage-driver": "aufs"
}

Restart the Docker service and that's it.
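
The last step could be scripted roughly as follows. This is only a sketch: the function name set_storage_driver is made up, it overwrites the target file (so merge by hand if daemon.json already holds other settings), and the demo writes to a temporary file rather than /etc/docker/daemon.json.

```shell
#!/bin/sh
# Hypothetical helper: write a daemon.json containing only the storage driver.
# It overwrites the file, so any existing settings would be lost.
set_storage_driver() {
  file=$1
  driver=$2
  printf '{\n  "storage-driver": "%s"\n}\n' "$driver" > "$file"
}

# Demo against a temporary file instead of the real config.
tmp=$(mktemp)
set_storage_driver "$tmp" aufs
written=$(cat "$tmp")
echo "$written"
rm -f "$tmp"

# On a real host (as root), after checking that `modprobe aufs` succeeds:
#   set_storage_driver /etc/docker/daemon.json aufs
#   systemctl restart docker   # assumes a systemd host
```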

@malicioustoker

malicioustoker commented May 22, 2017

Changing to aufs fixed the issue - it now takes no more than 5 seconds to change permissions - thanks everyone!

@Wonderfall

Contributor

Wonderfall commented May 24, 2017

Can someone still using overlay2 try the --squash option?

docker build --squash -t mastodon .

For this to work you'll have to enable experimental features in Docker by putting this in /etc/docker/daemon.json:

{
    "experimental": true
}
@malicioustoker

malicioustoker commented May 24, 2017

@Wonderfall

Contributor

Wonderfall commented May 24, 2017

OverlayFS implements a union mount, it's supposed to be faster, and it's in the upstream Linux kernel. For these reasons it will overtake aufs once it's mature (and that has already happened on RHEL/CentOS).

That said, there are other alternatives:

  • Btrfs (the one I'm using, no problem)
  • ZFS
  • Devicemapper
@fmauNeko

Contributor

fmauNeko commented May 24, 2017

overlay2 already improves a lot on overlay, but OverlayFS is still a bit immature, which is why stable Docker still uses aufs as the default (Edge Docker uses overlay2 now, AFAIK).
Its only advantage for now is that it's already in the upstream kernel source.

If you need to set up a new server for Docker, though, use btrfs or zfs, as they are natively copy-on-write filesystems.

@xsteadfastx

xsteadfastx commented May 29, 2017

I'm on overlay2 too, and it takes over 30 minutes for me. I've spent all morning upgrading because the recreate and migrate commands each start the chown in the entrypoint.

@xsteadfastx

xsteadfastx commented May 29, 2017

And spinning up the containers starts 3 chown jobs... a full hour before Mastodon is even up.

@katarpilar

katarpilar commented May 29, 2017

@xsteadfastx you have 2 options:

  • switch to aufs
  • replace the entrypoint script through a volume directive, with the chown command commented out in the new script
    volumes:
    - /home/docker/entrypoint/entrypoint.sh:/usr/local/bin/run
@xsteadfastx

xsteadfastx commented May 29, 2017

@katarpilar I chose overlay2 because the Docker docs say I should... ;-)

I know I can override the entrypoint script, but maybe we can have a discussion about running this chown on every start of every container.

@fmauNeko

Contributor

fmauNeko commented May 29, 2017

The reason why the chown is done in the entrypoint and not in the Dockerfile: #3194 (comment)

@xsteadfastx

xsteadfastx commented May 29, 2017

@fmauNeko I know this, and I do the same in my own Docker images. But maybe this could be a task that you run as a command when it's needed, like the asset compilation or the DB migration.
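
A rough sketch of what an opt-out could look like in an entrypoint. SKIP_CHOWN and update_permissions are hypothetical names, not something the Mastodon image provides, and -exec ... + is used here in place of the xargs pipeline quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical opt-out: SKIP_CHOWN is NOT an option of the Mastodon image.
# Usage: update_permissions DIR OWNER GROUP
update_permissions() {
  dir=$1; owner=$2; group=$3
  if [ "${SKIP_CHOWN:-false}" = "true" ]; then
    echo "Skipping permission update"
    return 0
  fi
  echo "Updating permissions..."
  # Same filter as the command quoted in this thread: skip public/system and
  # only touch entries whose user and group both differ from the target.
  find "$dir" -path "$dir/public/system" -prune -o \
    -not -user "$owner" -not -group "$group" \
    -exec chown -f "$owner:$group" {} +
}

# Demo on a temporary directory owned by the current user, so nothing changes.
demo=$(mktemp -d)
touch "$demo/file"
skipped=$(SKIP_CHOWN=true update_permissions "$demo" "$(id -u)" "$(id -g)")
normal=$(update_permissions "$demo" "$(id -u)" "$(id -g)")
echo "$skipped"
echo "$normal"
rm -rf "$demo"
```

With a switch like this, an admin on a slow storage driver could fix ownership once and then start containers with the check disabled, at the cost of having to remember to re-run it after an image change.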

@fmauNeko

Contributor

fmauNeko commented May 29, 2017

@xsteadfastx Actually it's needed every time, because the chown changes the Mastodon source files, not the data volumes, which are explicitly skipped by the find command at https://github.com/tootsuite/mastodon/blob/master/docker_entrypoint.sh#L11.
This is needed because the Dockerfile's COPY at https://github.com/tootsuite/mastodon/blob/master/Dockerfile#L46 creates files with UID 0.

@miguelpeixe

This comment has been minimized.

Show comment
Hide comment
@miguelpeixe

miguelpeixe May 29, 2017

The way it's built in the entrypoint is common practice.

I'm not sure where Docker recommended overlay2 for production environments, but I've learned that this is not a correct statement. I think it's safe to say this is not a Mastodon problem; it's a storage driver problem. Use dd to compare I/O performance.
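
A rough sketch of such a dd check, assuming GNU dd on Linux (bs=1M and conv=fsync are GNU syntax). The scratch path and the 8 MiB size are arbitrary choices for illustration. Running it once on the host and once inside a container gives two numbers to compare; note that it measures sequential writes only, not the per-file metadata work that chown does.

```shell
# Write 8 MiB of zeros to a scratch file; dd prints the throughput on stderr.
# conv=fsync makes dd flush the data to disk before it reports.
target="$(mktemp -d)/dd-test.bin"
dd if=/dev/zero of="$target" bs=1M count=8 conv=fsync

# Record the size that was actually written, then clean up.
size=$(wc -c < "$target")
echo "wrote $size bytes"
rm -f "$target"
```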

@fmauNeko

fmauNeko May 29, 2017

Contributor

They recommend it because it's in the mainline kernel, and they made it the default driver on the edge versions. The stable versions are staying with aufs as a default.
But from my experience I'd say that the best for production environments performance-wise would be zfs and btrfs.

@xsteadfastx

xsteadfastx May 29, 2017

@Wonderfall

Wonderfall May 29, 2017

Contributor

The command was designed with this issue in mind: it won't execute chown where it's not needed. It's almost instant on every file system except overlay2 (find will be very slow too...).

As I suggested earlier, can someone using overlay2 try to squash the image during the build process? There will be no Docker cache, but the command can be much faster since we're using a single layer in the final image.

Otherwise, I suggest that someone open an issue at moby/moby. This shouldn't be a Mastodon issue. I believe it's a serious performance issue, and if something should change, it should change in overlay2. Maybe I'm missing something about the way it works, but in any case the end user doesn't care, and I don't see why they should be forced to use a buggy feature. I also don't see why we should revert a common and good practice because one of the several choices has a serious issue.

@xsteadfastx

xsteadfastx May 30, 2017

ok i will try to switch to aufs.

@xsteadfastx

xsteadfastx May 30, 2017

It looks like aufs is not possible on Ubuntu 16.04 LTS. Too bad. So I'm stuck with hours of chowning files.

@fmauNeko

fmauNeko May 30, 2017

Contributor

Do you have the linux-image-extra package installed for your kernel branch/version? If so, what does your docker info show?

@xsteadfastx

xsteadfastx May 30, 2017

@Wonderfall I tried squashing and it didn't help at all.

@Wonderfall

Wonderfall May 30, 2017

Contributor

@xsteadfastx No, you're not stuck. You can still use btrfs:

  • Make a new volume formatted with Btrfs.
  • Mount the volume somewhere else, like /docker.
  • Edit /etc/docker/daemon.json:
    • Change /var/lib/docker to /docker
    • Change the storage-driver value to "btrfs" (or add it)
  • Restart Docker.
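
The daemon.json edit from the steps above might look roughly like the following. This is a sketch, not a tested configuration: the "graph" key is an assumption that applies to Docker releases up to 17.04 (later releases name the option "data-root"), /docker is the example mount point from the list, and the file is written to a scratch directory for review instead of /etc/docker.

```shell
# Write a candidate daemon.json to a scratch directory so it can be
# reviewed before it is copied to /etc/docker/daemon.json by hand.
conf_dir=$(mktemp -d)
cat > "$conf_dir/daemon.json" <<'EOF'
{
  "graph": "/docker",
  "storage-driver": "btrfs"
}
EOF

# Show the result.
cat "$conf_dir/daemon.json"
```

Images and containers stored under the old root directory are not visible after the switch, so they have to be pulled or rebuilt.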

@xsteadfastx

xsteadfastx May 30, 2017

@fmauNeko yep... installed...

Containers: 21
 Running: 7
 Paused: 0
 Stopped: 14
Images: 165
Server Version: 17.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-78-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.736 GiB
Name: rorschach
ID: 46XI:SRHG:O332:EDNX:4Q6P:ZOEW:XRA2:EGIU:2ANA:I2AZ:JOJ7:NLUD
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@miguelpeixe

miguelpeixe May 30, 2017

@xsteadfastx try installing linux-image-extra-virtual, then run sudo modprobe aufs

@miguelpeixe

miguelpeixe May 30, 2017

@xsteadfastx also, don't forget to add this to /etc/docker/daemon.json:

{
  "storage-driver": "aufs"
}

@xsteadfastx

xsteadfastx May 30, 2017

@miguelpeixe there's no kernel support for aufs on Ubuntu 16.04
@Wonderfall I don't have a spare partition for btrfs... too bad, otherwise I would test it right away

@miguelpeixe

miguelpeixe May 30, 2017

Hum, that's weird, I'm using aufs on all my 16.04 servers, including my local setup.

@gc373

gc373 May 30, 2017

@xsteadfastx
I'm updating right now. Switching from overlay2 to aufs works. (Sorry, this is Ubuntu 16.10.)

$ sudo apt-get update
$ sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
$ sudo modprobe aufs
$ grep aufs /proc/filesystems

$ sudo nano /etc/docker/daemon.json

{
  "storage-driver": "aufs"
}

$ sudo service docker restart
$ docker info
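
To confirm the switch without reading the whole docker info dump, the driver line can be filtered out. A small sketch, assuming the plain-text output format shown earlier in this thread ("Storage Driver: ..."); the helper name storage_driver is made up for this example.

```shell
# Print the value of the "Storage Driver" line from `docker info`
# text output supplied on stdin.
storage_driver() {
  sed -n 's/^Storage Driver: //p'
}

# Only query the daemon when a docker client is installed.
if command -v docker >/dev/null 2>&1; then
  docker info 2>/dev/null | storage_driver
fi
```

After the restart this should print aufs instead of overlay2.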

@xsteadfastx

xsteadfastx May 30, 2017

OK, I have to say sorry... modprobe aufs after installing linux-image-extra-virtual did the trick. Sorry for this discussion about the Docker image; it works pretty well. I was just in a bad mood because it took hours to upgrade Mastodon.

thanks for all the help.

@George3d6

George3d6 Oct 15, 2017

There seems to be no "fix" for this issue yet, and some of us don't have aufs-compatible kernels or the luxury of creating a VM or attaching an extra partition with btrfs or zfs for the sake of prototyping.

So my question here would be, how harmful exactly is just doing:

chown mastodon:mastodon /mastodon/public/system

instead of:

find /mastodon -path /mastodon/public/system -prune -o -not -user mastodon -not -group mastodon -print0 | xargs -0 chown -f mastodon:mastodon

Would it suffice to add a check to the script that, when overlay or overlay2 is detected, prints a warning and just chowns the entire system directory? It seems like a good compromise, considering that the speed "bug" is in Docker (or overlay2, depending on how you think about it) and may not be fixed for a while, while Docker is slowly migrating to overlay and Linux distros are slowly removing aufs support from their default kernels.

My current fix is to run the Mastodon instance with the original chown (so as not to risk any issues) and to "manually" modify the script inside the image to chown only the directory when running any other tasks (e.g. creating admin users), which is much faster. But that is hardly the most convenient thing to do, since it requires a file edit every time I want to reboot my instance.
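
The detection part of such a check could be sketched as below. It is only an illustration: the function names are hypothetical, and it assumes that a container running on overlay or overlay2 reports its root mount with the filesystem type "overlay" in /proc/mounts. What the script should do after the warning is left open.

```shell
# Print the filesystem type of the mount at "/" from a mounts table
# (defaults to /proc/mounts; the last matching entry wins).
root_fs_type() {
  awk '$2 == "/" { fs = $3 } END { print fs }' "${1:-/proc/mounts}"
}

# Hypothetical guard for an entrypoint script: warn when the container
# root is an overlay mount before the ownership fix starts.
warn_if_overlay() {
  case "$(root_fs_type "$1")" in
    overlay*) echo "warning: overlay storage detected, chown may be slow" >&2 ;;
  esac
}

if [ -r /proc/mounts ]; then
  warn_if_overlay
fi
```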

@fmauNeko

fmauNeko Oct 15, 2017

Contributor

Well it won't be recursive, and I'm not sure chown -R would be faster than the current solution.

@shuaiscott

shuaiscott Oct 21, 2017

Just tried it on my 2 CPU, 8 GB RAM server and it took 49 mins... 😦

@pierreozoux

pierreozoux Feb 19, 2018

Contributor

What do you think about #6510?

For me it would be a nice workaround.

@malicioustoker

malicioustoker Feb 20, 2018

@Gargron

Member

Gargron commented Feb 20, 2018

@moritzheiber

moritzheiber Feb 20, 2018

Member

The solution here would be to use the user/group names instead of the variables.

I'll come up with a PR.

@moritzheiber

moritzheiber Feb 20, 2018

Member

@malicioustoker This should be fixed now.

@malicioustoker

malicioustoker Feb 20, 2018
