Proposal: Put the 3 bind mounted network files into a size-limited loopback filesystem #14956

Open
stefanberger opened this issue Jul 24, 2015 · 3 comments
Labels
area/networking area/runtime kind/enhancement Enhancements are not bugs or new features but can improve usability or performance.

Comments

@stefanberger
Contributor

Issue:

The docker daemon creates the three configuration files /etc/resolv.conf, /etc/hostname, and /etc/hosts in the filesystem of the host and then bind mounts them into the filesystem of the container. To leave control over these files, and over these aspects of network configuration, to the container user, they need to be writable. The problem with these bind-mounted files being writable is that a user can write large amounts of data into them and fill up the host's disk, which can lead to adverse side effects.

Proposed solution:

We propose a solution in which the docker daemon creates a loopback-mounted filesystem whose contents are held in a file, the size of which can be chosen via a command-line option.
In our current prototype code we experiment with the following command line:

docker run -ti --nwfiles-size 100 fedora /bin/bash

This command line causes the docker daemon to create a 100 kB file, format it with an ext4 filesystem, and mount it using a loopback device. The daemon then writes the contents of the three files mentioned above into files in this filesystem and bind-mounts them as usual.
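The steps described above can be sketched roughly as follows. This is not the prototype's code: the function names `create_backing_file` and `create_loopfs` and their arguments are hypothetical, and the sketch assumes that `dd`, `mkfs.ext4`, and `mount` are available and that `create_loopfs` is run as root.

```shell
# Hypothetical sketch; not the prototype's actual code.

# create_backing_file FILE SIZE_KB
# Allocate a zero-filled file of SIZE_KB KiB that will hold the filesystem.
create_backing_file() {
    dd if=/dev/zero of="$1" bs=1024 count="$2" 2>/dev/null
}

# create_loopfs FILE MOUNTPOINT SIZE_KB
# Needs root, mkfs.ext4, and loop device support on the host.
create_loopfs() {
    local fsfile=$1
    local mountpoint=$2
    local size_kb=$3

    create_backing_file "$fsfile" "$size_kb" || return 1
    # -F: proceed even though the target is a regular file, not a block device
    # -q: suppress the usual progress output
    mkfs.ext4 -q -F "$fsfile" || return 1
    mkdir -p "$mountpoint" || return 1
    # -o loop: attach the file to a free loop device and mount that device
    mount -o loop "$fsfile" "$mountpoint"
}
```

Because the filesystem can never grow beyond its backing file, anything written under the mountpoint is bounded by the size chosen at creation time.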

The result then is the following:

# docker run -ti --nwfiles-size 60 fedora /bin/bash
[root@7299fbc89997 /]# dd if=/dev/zero of=/etc/hosts count=1024 bs=1024
dd: error writing '/etc/hosts': No space left on device
35+0 records in
34+0 records out
34816 bytes (35 kB) copied, 0.000432129 s, 80.6 MB/s
[root@7299fbc89997 /]# ls -l /etc/hosts
-rw-r--r-- 1 root root 34816 Jul 24 15:29 /etc/hosts

The container user was only able to grow the file to 34816 bytes (34 KiB) before the filesystem ran out of space.

Implementation details:

The prototype code is located here:

https://github.com/stefanberger/docker/commits/loopbackfs

The Go code calls an external script named 'dockerloopfs', which takes the following parameters for setting up this filesystem:

# dockerloopfs help
Usage: dockerloopfs command options

Utility to create a loopback mounted filesystem of given size. The
filesystem will be mounted on a given mountpoint (directory) and the
filesystem contents will be held in a given file.

The following commands are support:

create      : To create the loopback mounted filesystem

delete      : To unmount the loopback mounted filesystem and delete its
              file

help        : Display this help screen


The following options can be provided:

--fsfile    : The name of the file for where the filesystem will be create in;
              this option is mandatory

--mountpoint: The directory where to mount the filesystem on;
              this option is mandatory

--size      : The size for the filesystem in kb; default is 100 kb;
              this option is optional

The script first checks whether the filesystem has already been mounted and, if so, returns immediately.
Otherwise it creates the file of the given size and mounts it. If a new loopback device needs to be created, it creates one using 'mknod'. Additional loopback devices are needed because typically only 8 are available, while many more containers can be created.
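A minimal sketch of these two checks is shown below. The function names `is_mounted` and `ensure_loop_device` are hypothetical and are not taken from the 'dockerloopfs' script. The sketch assumes a Linux host, where mounts are listed in /proc/mounts and loop devices are block devices with major number 7.

```shell
# Hypothetical sketch; not taken from the dockerloopfs script.

# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeed if MOUNTPOINT is listed as a mount target in MOUNTS_FILE
# (default: /proc/mounts). The second field of each line is the target.
# Paths containing spaces are escaped in /proc/mounts and are not handled here.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# ensure_loop_device N
# Create /dev/loopN if it does not exist yet (needs root).
# Loop devices use block major number 7; the minor number is N.
ensure_loop_device() {
    [ -b "/dev/loop$1" ] || mknod "/dev/loop$1" b 7 "$1"
}
```

A create command could call `is_mounted` first and return early on success, and call `ensure_loop_device` only when mounting fails for lack of a free device.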

It seems easier to implement this functionality in an external (bash) script that calls a sequence of tools than to have Go code call the same sequence of external tools.

A side effect of using many loopback devices may be that other software that counts on a free loopback device being available will now have to create additional loopback devices itself.

Contacts

Stefan Berger stefanb@us.ibm.com
Salman Baset sabaset@us.ibm.com

@phemmer
Contributor

phemmer commented Jul 24, 2015

I would propose something more along the lines of making those files real files inside the container, and bind mounting (or even symlinking) them outside the container.

The reason is that, during the discussion when the current implementation of these files was added, it was brought up that an atomic modification of the files is not possible.
The typical way of updating /etc/hosts safely is to load the file, change it in memory, write it to /etc/hosts.tmp (or other name in the same directory), and then do a rename to move it in place. The reason for this procedure is so that anything which reads /etc/hosts doesn't catch it mid-write, and get corrupted data.
Because rename doesn't work across filesystem boundaries, having these files bind mounted breaks this.
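The write-then-rename procedure can be sketched as below. The function name `atomic_replace` and the default mode of 644 are assumptions made for illustration. When the target is a bind-mounted file, the final `mv` is expected to fail or to fall back to a non-atomic copy, which is the problem described above.

```shell
# Hypothetical sketch of the write-to-temp-then-rename update pattern.

# atomic_replace TARGET [MODE]
# Read the new contents from stdin, write them to a temporary file in the
# same directory as TARGET, then rename the temporary file over TARGET.
atomic_replace() {
    local target=$1
    local mode=${2:-644}
    local tmpfile

    tmpfile=$(mktemp "$target.XXXXXX") || return 1
    cat > "$tmpfile" || { rm -f "$tmpfile"; return 1; }
    # mktemp creates the file with mode 600, so set the intended mode
    chmod "$mode" "$tmpfile" || { rm -f "$tmpfile"; return 1; }
    # Within one directory on one filesystem, mv is a single rename(2) call,
    # so readers see either the old file or the new one, never a partial write
    mv -f "$tmpfile" "$target" || { rm -f "$tmpfile"; return 1; }
}
```

For example, `printf '127.0.0.1 localhost\n' | atomic_replace /etc/hosts` would replace the file in one step on a regular filesystem.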

Then, by making them real files within the container, they're subject to the same space limitations as the container itself.

@salmanbaset

@phemmer it would be ideal to have these files inside the container. Then we would not have to worry about loopback devices.

However, I believe that having these files inside the container will break the --link option.

@stefanberger
Contributor Author

@phemmer I had an implementation like that as well; IIRC the concern was related to the layers of the container ( @estesp )

@jessfraz jessfraz added kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny and removed kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny kind/proposal labels Sep 8, 2015
@icecrime icecrime added area/runtime kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. and removed kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny labels Sep 10, 2016

7 participants