Docker removes hardlinks to files from other layers on commit #5632

Closed
kscherer opened this Issue May 6, 2014 · 11 comments

kscherer commented May 6, 2014

docker 0.10 on Ubuntu 14.04 x86_64 with devicemapper storage backend.

I first noticed this on a Fedora container, where the /usr/libexec/git-core directory ballooned from 14MB to 181MB (~25% of the image) because the hardlinks were missing. I made a simple reproducer Dockerfile:

FROM busybox
RUN touch file1 && cp -l file1 file2 && cp -l file1 file3 && ls -al file*
RUN cp -l file1 file4 && cp -l file1 file5 && ls -al file*

The output on my docker 0.10:

> docker build --rm=true --no-cache=true - < Dockerfile-hardlink
Uploading context 2.048 kB
Uploading context
Step 0 : FROM busybox
 ---> 2d8e5b282c81
Step 1 : RUN touch file1 && cp -l file1 file2 && cp -l file1 file3 && ls -al file*
 ---> Running in 684590e4167d
-rw-r--r--    3 root     root             0 May  6 17:23 file1
-rw-r--r--    3 root     root             0 May  6 17:23 file2
-rw-r--r--    3 root     root             0 May  6 17:23 file3
 ---> e512a3a3de04
Step 2 : RUN cp -l file1 file4 && cp -l file1 file5 && ls -al file*
 ---> Running in 2234d1924ab1
-rw-r--r--    3 root     root             0 May  6 17:23 file1
-rw-r--r--    1 root     root             0 May  6 17:23 file2
-rw-r--r--    1 root     root             0 May  6 17:23 file3
-rw-r--r--    3 root     root             0 May  6 17:23 file4
-rw-r--r--    3 root     root             0 May  6 17:23 file5
 ---> 76088dc1f68d

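Notice that the links between file1, file2, and file3 from the previous layer have been broken (file2 and file3 now show a link count of 1, and file1 shows 3 instead of 5), while the links created in the top "layer" are counted correctly.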
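
Link counts alone can be ambiguous, so comparing inode numbers is a more direct check. The following is a minimal sketch, assuming it is run inside the container or against an extracted root filesystem; `hardlink_groups` is a hypothetical helper name, not part of Docker.

```python
import os
import stat
from collections import defaultdict


def hardlink_groups(root):
    """Return {(device, inode): [paths]} for regular files under root
    that share an inode with at least one other name under root."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Skip symlinks and special files; only regular files count.
            if stat.S_ISREG(st.st_mode):
                groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only inodes that are reachable by more than one name.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Running this on the image after each build step should show whether file1 through file5 still form one group or have been split into separate inodes.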

kscherer commented May 7, 2014

Here is a workaround for the git hardlinks in /usr/libexec/git-core taking up a lot of space: replace the hard links with symlinks.

RUN yum install -y git && cd /usr/libexec/git-core && find . -samefile git -name 'git-*' -exec ln -sf git {} \; && cd /

This doesn't handle all hardlinks but the git-core dir is now 15MB instead of 182MB.
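
The same idea can be sketched outside of `find`. This is an illustrative Python version, not something from the thread: `replace_hardlinks_with_symlinks` is a hypothetical helper, and unlike the command above it does not filter on the `git-*` name pattern and only looks at one directory level.

```python
import os


def replace_hardlinks_with_symlinks(directory, target):
    """Replace every other name in `directory` that shares an inode with
    `target` by a relative symlink pointing at `target`.
    Returns the sorted list of names that were replaced."""
    target_stat = os.lstat(os.path.join(directory, target))
    target_key = (target_stat.st_dev, target_stat.st_ino)
    replaced = []
    for name in sorted(os.listdir(directory)):
        if name == target:
            continue
        path = os.path.join(directory, name)
        st = os.lstat(path)  # lstat: an existing symlink has its own inode
        if (st.st_dev, st.st_ino) == target_key:
            os.unlink(path)
            os.symlink(target, path)  # relative link, like `ln -sf git {}`
            replaced.append(name)
    return replaced
```

Because symlinks are stored as small entries of their own, they survive being committed into a separate layer, which is why this keeps the directory small.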

@unclejack unclejack changed the title from Devicemapper undoes hardlinks when using multiple layers to Docker removes hardlinks to files from other layers on commit Jul 22, 2014

Contributor

unclejack commented Jul 22, 2014

This problem affects all graph drivers (storage backends), not just devicemapper. I've updated the title to reflect this.

Contributor

vbatts commented Jul 28, 2014

@unclejack for hardlinks, should we scan for hardlinks during commit? This seems particularly expensive, especially since it would have to map parent layers to determine whether a hardlink points outside the scope of a given layer.
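
To make the scan concrete, here is a rough sketch of the cheap half of that check, under the assumption that a layer's changes are available as a plain directory on the same filesystem as the other links. `links_outside_layer` is a hypothetical name and this is not Docker's code. It flags files whose kernel link count is higher than the number of names found inside the layer. It does not attempt the expensive part mentioned above, which is locating the other names in the parent layers.

```python
import os
import stat
from collections import defaultdict


def links_outside_layer(layer_root):
    """Return sorted paths under layer_root whose inode has more links
    (st_nlink) than names found inside layer_root itself."""
    paths_by_inode = defaultdict(list)
    nlink_by_inode = {}
    for dirpath, _dirnames, filenames in os.walk(layer_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices, sockets, etc.
            key = (st.st_dev, st.st_ino)
            paths_by_inode[key].append(path)
            nlink_by_inode[key] = st.st_nlink
    flagged = []
    for key, paths in paths_by_inode.items():
        # More links than names seen here: at least one name lives elsewhere.
        if nlink_by_inode[key] > len(paths):
            flagged.extend(paths)
    return sorted(flagged)
```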

Contributor

unclejack commented Aug 11, 2014

@vbatts We need to do this because this used to work with aufs a long time ago.

Contributor

vbatts commented Aug 11, 2014

@unclejack did it work for btrfs too?

vbatts@jellyroll ~ (master *) $ cat Dockerfile.gh5632
FROM fedora:latest
RUN dd if=/dev/urandom of=file.img bs=1M count=10 && \
        ln file.img file1.img && \
        ln file.img file2.img && \
        ln file.img file3.img && \
        stat -c '%i' *.img
CMD stat -c '%i' *.img
vbatts@jellyroll ~ (master *) $ cat Dockerfile.gh5632 | docker build --no-cache -t gh5632 -
Sending build context to Docker daemon 2.048 kB
Sending build context to Docker daemon
Step 0 : FROM fedora:latest
 ---> 5d2c1c0f1c07
Step 1 : RUN dd if=/dev/urandom of=file.img bs=1M count=10 &&  ln file.img file1.img &&  ln file.img file2.img &&  ln file.img file3.img &&  stat -c '%i' *.img
 ---> Running in ef8bfd24a8d0
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.731466 s, 14.3 MB/s
15909
15909
15909
15909
 ---> 699c5de18f51
Removing intermediate container ef8bfd24a8d0
Step 2 : CMD stat -c '%i' *.img
 ---> Running in 81d3ee699763
 ---> c095631a74c1
Removing intermediate container 81d3ee699763
Successfully built c095631a74c1
vbatts@jellyroll ~ (master *) $ docker run gh5632
15902
15901
15903
15904
vbatts@jellyroll ~ (master *) $ docker info
Containers: 8
Images: 115
Storage Driver: btrfs
Execution Driver: native-0.2
Kernel Version: 3.14.5
Debug mode (server): true
Debug mode (client): false
Fds: 20
Goroutines: 25
EventsListeners: 0
Init SHA1: 8e1ac2e5492b0bdd50cbf4146cce9183543abc28
Init Path: /home/vbatts/src/docker/docker/bundles/1.1.2-dev/dynbinary/dockerinit
Sockets: [unix:///var/run/docker.sock]
WARNING: No swap limit support
Member

tianon commented Feb 3, 2015

Still seeing this on BTRFS with latest master. 😢

$ docker version
Client version: 1.4.1-dev
Client API version: 1.17
Go version (client): go1.4
Git commit (client): 662dffe-dirty
OS/Arch (client): linux/amd64
Server version: 1.4.1-dev
Server API version: 1.17
Go version (server): go1.4
Git commit (server): 662dffe-dirty
$ docker info
Containers: 10
Images: 4772
Storage Driver: btrfs
 Build Version: Btrfs v3.16.2
 Library Version: 101
Execution Driver: native-0.2
Kernel Version: 3.17.7-gentoo
Operating System: Gentoo/Linux
CPUs: 8
Total Memory: 31.14 GiB
Name: viper
ID: GXBZ:CEH6:TO43:QWSN:UFRL:3KR6:YL3H:TGJO:O3B5:RZX5:ZFEK:6CRA
Debug mode (server): true
Debug mode (client): false
Fds: 40
Goroutines: 44
EventsListeners: 0
Init SHA1: 552eb69161a183fa42112321783c176653590da1
Init Path: /usr/libexec/docker/dockerinit
Docker Root Dir: /mnt/docker
Username: tianon
Registry: [https://index.docker.io/v1/]
Member

tianon commented Feb 3, 2015

Same on AUFS.

$ docker version
Client version: 1.4.1-dev
Client API version: 1.17
Go version (client): go1.4.1
Git commit (client): 895f9a6-dirty
OS/Arch (client): linux/amd64
Server version: 1.4.1-dev
Server API version: 1.17
Go version (server): go1.4.1
Git commit (server): 382f187-dirty
$ docker info
Containers: 60
Images: 1995
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 2504
Execution Driver: native-0.2
Kernel Version: 3.16.5-gentoo
Operating System: Gentoo/Linux
CPUs: 8
Total Memory: 31.4 GiB
Name: nameless
ID: F3BT:IIX6:2JCR:LZHH:RJLQ:3DDC:CLU2:PNGL:CHYC:4HMO:P4WG:5EQJ
Debug mode (server): true
Debug mode (client): false
Fds: 57
Goroutines: 48
EventsListeners: 0
Init SHA1: 1cfe93dc2a115ce42e6adfaac77a9f47b44dc890
Init Path: /var/lib/docker/init/dockerinit-1.4.1-dev
Docker Root Dir: /var/lib/docker
Username: tianon
Registry: [https://index.docker.io/v1/]

Contributor

vbatts commented Feb 3, 2015

@tianon yeah. when the tar archive is created, if the linked file is not in that layer's tar archive ...
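
The tar behaviour can be observed with any tar implementation. The sketch below uses Python's standard `tarfile` module purely as an illustration (Docker's own archiver is written in Go and may differ in detail); `archive_entry_kinds` is a hypothetical helper. A hard link entry in a tar archive refers to another member of the same archive by name, so a linked file is only stored as a link when its partner is archived alongside it.

```python
import io
import os
import tarfile


def archive_entry_kinds(root, names):
    """Tar the given names (relative to root) into memory and report
    how each member was stored: as a hard link or as a regular file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.add(os.path.join(root, name), arcname=name)
    buf.seek(0)
    kinds = {}
    with tarfile.open(fileobj=buf, mode="r") as tar:
        for member in tar.getmembers():
            if member.islnk():
                kinds[member.name] = "hardlink -> " + member.linkname
            else:
                kinds[member.name] = "regular, %d bytes" % member.size
    return kinds
```

With two hardlinked names on disk, archiving both yields one regular entry and one link entry, while archiving only the second name yields a full regular copy. That matches the size growth reported at the top of this issue.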

Dieken commented Feb 22, 2015

I hit a simpler case where a hardlink breaks even within a single layer on Docker 1.3; that case seems to be fixed in Docker 1.5.

$ cat Dockerfile
FROM busybox
RUN ( touch /root/a && ls -hl /root && ln /root/a /root/b && ls -hl /root ) > /root/log
$ docker build --no-cache -t t .

(1) Hardlink breaks on Docker 1.3.1 + devicemapper + Linux 2.6.32 (RHEL-6.5):

$ docker run --rm -it t bash -c 'cat /root/log; echo; echo; ls -l /root'
total 0
-rw-r--r-- 1 root root 0 Feb 22 08:58 a
-rw-r--r-- 1 root root 0 Feb 22 08:58 log
total 4.0K
-rw-r--r-- 2 root root  0 Feb 22 08:58 a
-rw-r--r-- 2 root root  0 Feb 22 08:58 b
-rw-r--r-- 1 root root 90 Feb 22 08:58 log


total 4
-rw-r--r-- 1 root root   0 Feb 22 08:58 a
-rw-r--r-- 1 root root   0 Feb 22 08:58 b
-rw-r--r-- 1 root root 226 Feb 22 08:58 log

(2) Works on Docker 1.5 + devicemapper + RHEL 6.5:

$ sudo docker run --rm -it t bash -c 'cat /root/log; echo; echo; ls -l /root'
total 0
-rw-r--r-- 1 root root 0 Feb 22 09:09 a
-rw-r--r-- 1 root root 0 Feb 22 09:09 log
total 4.0K
-rw-r--r-- 2 root root  0 Feb 22 09:09 a
-rw-r--r-- 2 root root  0 Feb 22 09:09 b
-rw-r--r-- 1 root root 90 Feb 22 09:09 log


total 4
-rw-r--r-- 2 root root   0 Feb 22 09:09 a
-rw-r--r-- 2 root root   0 Feb 22 09:09 b
-rw-r--r-- 1 root root 226 Feb 22 09:09 log

(3) Works on Docker 1.5 (boot2docker 1.5):

$ docker run --rm -it t bash -c 'cat /root/log; echo; echo; ls -l /root'
total 0
-rw-r--r-- 1 root root 0 Feb 22 13:52 a
-rw-r--r-- 1 root root 0 Feb 22 13:52 log
total 4.0K
-rw-r--r-- 2 root root  0 Feb 22 13:52 a
-rw-r--r-- 2 root root  0 Feb 22 13:52 b
-rw-r--r-- 1 root root 90 Feb 22 13:52 log


total 4
-rw-r--r-- 2 root root   0 Feb 22 13:52 a
-rw-r--r-- 2 root root   0 Feb 22 13:52 b
-rw-r--r-- 1 root root 226 Feb 22 13:52 log

Dieken commented Feb 22, 2015

The issue reported by @kscherer still exists, but it behaves a little differently on my machine (docker 1.5 + RHEL-6.5 + devicemapper):

$ sudo docker build --no-cache -t t .
Sending build context to Docker daemon 2.048 kB
Sending build context to Docker daemon
Step 0 : FROM busybox
511136ea3c5a: Pull complete
df7546f9f060: Pull complete
ea13149945cb: Pull complete
4986bf8c1536: Pull complete
busybox:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.

Status: Downloaded newer image for busybox:latest
 ---> 4986bf8c1536
Step 1 : RUN touch file1 && cp -l file1 file2 && cp -l file1 file3 && ls -al file*
 ---> Running in 711fc327e4c4
-rw-r--r--    3 root     root             0 Feb 22 14:15 file1
-rw-r--r--    3 root     root             0 Feb 22 14:15 file2
-rw-r--r--    3 root     root             0 Feb 22 14:15 file3
 ---> 167823d07490
Removing intermediate container 711fc327e4c4
Step 2 : RUN cp -l file1 file4 && cp -l file1 file5 && ls -al file*
 ---> Running in 6e02cedf1ac9
-rw-r--r--    5 root     root             0 Feb 22 14:15 file1
-rw-r--r--    5 root     root             0 Feb 22 14:15 file2
-rw-r--r--    5 root     root             0 Feb 22 14:15 file3
-rw-r--r--    5 root     root             0 Feb 22 14:15 file4
-rw-r--r--    5 root     root             0 Feb 22 14:15 file5
 ---> e3f645a59194
Removing intermediate container 6e02cedf1ac9
Successfully built e3f645a59194

$ sudo docker run --rm -it t sh -c 'ls -l /file*'
-rw-r--r--    3 root     root             0 Feb 22 14:15 /file1
-rw-r--r--    3 root     root             0 Feb 22 14:15 /file2
-rw-r--r--    3 root     root             0 Feb 22 14:15 /file3
-rw-r--r--    2 root     root             0 Feb 22 14:15 /file4
-rw-r--r--    2 root     root             0 Feb 22 14:15 /file5

Contributor

LK4D4 commented Nov 29, 2016

I've tried it, and the behavior is now consistent between build and run.

@LK4D4 LK4D4 closed this Nov 29, 2016
