Data-only containers obsolete with docker 1.9.0? #17798

jonaskello · 2015-11-08T10:34:49Z

This is a question regarding best practice. I'm not sure this is the right place to ask, but here goes anyway:

So in docker 1.9.0 we can create named volumes. This means I could create a container with a named volume, then remove the container completely, and then re-create it again with the same named volume and the data would be retained. I think this was at least one (if not the only) purpose of data-only containers. So my question is if having data-only containers is still considered best-practice? Or can we now skip data-only containers completely and only use named volumes?
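The scenario described above can be sketched roughly as follows. This is a minimal illustration rather than an official recipe: the volume name `appdata`, the container name `app`, the mount path `/data`, and the image passed as the first argument are all hypothetical.

```shell
# Sketch: a named volume outlives the container that first used it.
# "appdata", "app" and "/data" are hypothetical names chosen for illustration.
recreate_with_same_volume() {
    image="$1"
    # First run: the named volume "appdata" is created if it does not exist yet.
    docker run -d --name app -v appdata:/data "$image" || return 1
    # Remove the container completely; the named volume is left in place.
    docker rm -f app || return 1
    # Re-create the container with the same named volume.
    docker run -d --name app -v appdata:/data "$image"
}

# Usage (hypothetical image name):
#   recreate_with_same_volume myimage
```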

thaJeztah · 2015-11-08T12:06:20Z

Yes, named volumes should be able to replace data-only containers in most (if not all) cases.

@cpuguy83 any ideas of cases where data-only containers still make sense?

We may have to further improve the docs around this

cpuguy83 · 2015-11-08T12:25:40Z

Yep, no reason I can see to use data-only containers.

jonaskello · 2015-11-08T15:50:49Z

Ok, so no more data-only containers :-) Thanks for verifying this!

runcom · 2015-11-08T16:05:38Z

Do we have docs regarding data-only containers? If so we could leave this open until we fix those
Ping @thaJeztah

trkoch · 2015-11-12T12:05:16Z

@runcom Yes, e.g. https://docs.docker.com/engine/userguide/dockervolumes/#creating-and-mounting-a-data-volume-container. I suggest referring to named volumes instead, to prevent future confusion.

duglin · 2015-11-12T12:12:47Z

Never really used data-only containers, but wouldn't there still be a need for them if you wanted a way to move data (not apps) between clouds? At that point the container becomes the portable-filesystem artifact.

thaJeztah · 2015-11-12T12:43:57Z

@duglin actually not, because the data-only container is only used to reference the volume through --volumes-from, so they're also not portable

duglin · 2015-11-12T13:15:10Z

But, if I used the VOLUME Dockerfile command and then pre-populate that volume during the build process, won't that data be available whenever/wherever I deploy that image?

trkoch · 2015-11-12T13:24:18Z

Maybe a remaining use case is to have pre-seeded volumes from a data-only container (see #14242 (comment) for subtle differences between anonymous and named volumes).

BrianAdams · 2015-12-22T21:43:30Z

Another use case that keeps data-only containers relevant is that you can use container affinity (--volumes-from=dependency) to make sure your container runs on the same node as the data container. At the moment there does not appear to be a filter for volume affinity.

quinncomendant · 2016-02-26T21:04:17Z

@runcom @trkoch @thaJeztah The section Creating and mounting a data volume container still says, "…it’s best to create a named Data Volume Container…". If this is no longer the case, can y'all make a ticket to get the docs updated? I'm certainly confused by this. ;P

cpuguy83 · 2016-02-26T21:05:58Z

Already done: #20465

quinncomendant · 2016-02-26T21:26:37Z

@cpuguy83 Great, thanks!

carsten-ulrich-saitow-ag · 2016-03-02T15:47:31Z

Hi, I use that feature for nginx, php-fpm. I have nginx and php-fpm in two different containers. As nginx has a link to php-fpm I can not use volumes_from inside the php-fpm container as that would create a circular reference. So I found the solution to use a container that only has the php source code and both the nginx and the php-fpm container use the volumes_from feature.

thaJeztah · 2016-03-02T15:59:56Z

@carsten-ulrich-saitow-ag volumes-from is still supported (as is the possibility of using data-only containers), it's just that in many cases you don't need a data-only container per-se, because volumes can now be managed on their own (without the need for a container to be attached to it).

Also, with the new networking, you can reach containers by name (or still provide a --link to set an alias). For example

docker network create myapp
docker run --net=myapp -v phpsource:/var/www --name php php-fpm
docker run --net=myapp -v phpsource:/usr/share/nginx/html --name web nginx

Both containers share the "phpsource" volume (which is propagated with the content of the php-fpm containers the first time it's run). The nginx container can connect to the php container using php as hostname

quinncomendant · 2016-03-02T16:03:16Z

@carsten-ulrich-saitow-ag And the named volume phpsource would be created (just once) using:

docker volume create --name phpsource

Oh, wait, @thaJeztah, when you say the volume is “propagated with the content of the php-fpm containers the first time it's run” do you mean it's not necessary to run docker volume create first?

thaJeztah · 2016-03-02T16:25:32Z

@quinncomendant correct, you can either create the volume first (docker volume create --name phpsource), or provide a name when starting the container (as in my example). If a volume with that name does not yet exist, it is created automatically.

It is important to do it in the right order, because the files inside the container are only copied to the volume by the first container that uses it (while the volume is still empty).

damnhandy · 2016-03-06T14:36:08Z

If you need to apply permissions to a named volume, you're kind of SOL at the moment. When you create a volume with docker volume create, it is owned by root. If your container process runs under a UID other than root (e.g. Jenkins), you're kind of stuck. PR #20262 should fix this, but that's not available today.

cpuguy83 · 2016-03-07T16:47:12Z

@damnhandy It is owned by whoever owns the data in the container at the path you mount it to (assuming it's the first time it's been mounted).

robvelor · 2016-03-16T22:13:37Z

What about a data only container when scaling in a swarm? For example scaling the same container to multiple host machines sharing the same volume? Is this the only use case or is scaling to the same host with the volume recommended?

thaJeztah · 2016-03-16T23:30:18Z

@robvelor the default ("local") volume driver is indeed local to a host (although swarm will create a volume with the same name on each host). You can, however, use a different driver/plugin; some plugins allow a volume to be shared or replicated on each host; you can find some plugins here; https://docs.docker.com/engine/extend/plugins/#finding-a-plugin

robvelor · 2016-03-16T23:36:56Z

@thaJeztah Yes, I forgot to add that I am using rexray to persist in the cloud (AWS-EBS) but the volume can only be mounted to one host machine, hence my question. Any thoughts on this? Maybe I need to use a different plugin to achieve multi-host volumes between containers.

thaJeztah · 2016-03-17T00:03:08Z

@robvelor hm, possibly yes; you could ask rex-ray what the options are with their plugin. Be aware that it may also depend on the application that's writing to the volume: does the app support concurrent processes writing to the "same" volume?

sergeyklay · 2016-03-17T23:03:23Z

@thaJeztah Sorry for the stupid question, but I haven't fully understood.

You said

docker run --net=myapp -v phpsource:/var/www --name php php-fpm
docker run --net=myapp -v phpsource:/usr/share/nginx/html --name web nginx

you can either create the volume first (docker volume create --name phpsource), or provide a name when starting the container (as in my example). If a volume with that name does not yet exist, it is created automatically.

But how do I work with these volumes on the host machine? For example, to develop an application and store it in the named volume.

Sorry if the question is too stupid, but I actually don't quite understand 😐

derqnaque · 2016-04-07T15:27:54Z

Stand-alone volumes usually have root,root access until changed by a container with root access. If your container runs software as a normal user (e.g. jenkins, ...) there is no way to change the permissions of a volume mounted at runtime to normal user access for the container. A data-container can be used to run a single chown CMD as root on the volume before it is used by the normal-user container. AFAIK there is no way to do this on a named-volume without a root container. (See also long discussions at e.g. #2259, #7198)

cpuguy83 · 2016-04-07T15:32:57Z

@derqnaque stand-alone volumes (as of 1.10) will work the same as anonymous volumes (e.g. docker run -v /foo), that is, they will inherit data/perms from the container image (the first one that attaches to it).
In 1.11 you can supply mount opts for the volume to use, so if the underlying filesystem you are mounting supports uid/gid, you can specify those, e.g. docker volume create --opt type=bindfs --opt o=uid=1000,gid=1000
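As a rough sketch of passing mount options to the default "local" driver, the function below creates a tmpfs-backed volume owned by a given uid/gid. tmpfs is used only because it is a filesystem that accepts uid and gid mount options; it lives in memory, so it is not suitable for data that must persist. The function name and arguments are hypothetical, and newer Docker releases also accept the volume name as a positional argument instead of --name.

```shell
# Sketch: create a volume whose mount options set the owning uid/gid.
# Assumes the default "local" driver on a Linux host; tmpfs is in-memory only.
create_owned_tmpfs_volume() {
    name="$1"; uid="$2"; gid="$3"
    docker volume create --driver local \
        --opt type=tmpfs --opt device=tmpfs \
        --opt "o=uid=$uid,gid=$gid" \
        --name "$name"
}

# Usage (hypothetical volume name):
#   create_owned_tmpfs_volume scratch 1000 1000
```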

cpuguy83 · 2016-04-07T15:33:55Z

Also in 1.11, you can supply nocopy when attaching a volume to make sure it doesn't copy data to the volume (and set perms, etc) if you are sharing that volume with multiple containers.
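A minimal sketch of the nocopy mode, reusing the network, volume, and container names from the nginx example earlier in this thread (the wrapper function name is hypothetical). With nocopy, the image's content at the mount path is not copied into the volume when the container attaches to it.

```shell
# Sketch: attach a named volume without copying image content into it.
# Requires Docker 1.11 or newer; names follow the earlier php-fpm/nginx example.
run_web_nocopy() {
    docker run -d --net=myapp --name web \
        -v phpsource:/usr/share/nginx/html:nocopy nginx
}
```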

derqnaque · 2016-04-07T16:04:58Z

@cpuguy83: The host volume might be something I don't have host access to (e.g. in a cloud setting where I am not the host). Mount options might work if I know the filesystem that is in use on the host. And I think e.g. ext4 does not provide the uid and gid options.

The data-container solution still seems a lot easier and more portable to me, since all I need to know is the chown command. Of course I still need the root access in the data container. But this is only run once and the normal user of the software-container does not get the root access.

The nocopy option is interesting. And I sure want to get rid of data-containers. Maybe using it on a named volume and naming it a run-once-to-prepare-the-volume-container solves the problem.
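The run-once preparation step described above might look roughly like the sketch below. The function name, the /vol mount path, and the use of the busybox image are assumptions made for illustration. Whether the ownership set here survives the first mount by the application container may depend on the Docker version and on whether the volume is still empty at that point, so it is worth verifying in your own setup.

```shell
# Sketch: a throwaway root container that chowns a named volume, so that a
# later container running as an unprivileged user can write to it.
prepare_volume_owner() {
    volume="$1"; uid="$2"; gid="$3"
    docker run --rm -v "$volume":/vol busybox chown -R "$uid:$gid" /vol
}

# Usage (hypothetical volume name and ids):
#   prepare_volume_owner jenkins_home 1000 1000
```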

cpuguy83 · 2016-04-07T16:08:21Z

@derqnaque A named volume will act exactly like a -v /foo volume as of 1.10 so there should be no added benefits in a data-only container other than grouping volumes together with --volumes-from

nioncode · 2016-06-10T20:10:11Z

Is there a way to create a named volume that stores the data of all VOLUMEs defined by the Dockerfile?
I don't really care which folders those are, the Dockerfile should already declare everything that is configuration (e.g. Jenkins' home) as VOLUME. I then want to capture everything outside of the application container, like I currently can do with a data-only container, so that I can upgrade the application independently from the data.

So, is there something like a -v all-data:$ALL_VOLUMES that can map to multiple folders?

thaJeztah · 2016-06-10T20:32:12Z

@nioncode no; you can't "nest" volumes, you'll have to assign a named volume for each folder, or (if you don't assign a named volume), docker creates "anonymous" / "unnamed" volumes for each volume that's declared in the Dockerfile
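Since each declared path needs its own named volume, one possible workaround is to generate the -v flags with a small helper. The sketch below is only an illustration: the function name and the naming scheme (a prefix followed by the path with slashes turned into dashes) are assumptions, not a Docker feature.

```shell
# Sketch: read container paths from stdin (one per line) and print one
# "-v <prefix><path-with-dashes>:<path>" flag per path. Empty lines are skipped.
volume_flags() {
    prefix="$1"
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        name="${prefix}$(printf '%s' "$path" | tr '/' '-')"
        printf '%s %s:%s\n' -v "$name" "$path"
    done
}

# Usage (hypothetical image "xyz"); the inspect template lists the paths
# declared with VOLUME in the image:
#   paths=$(docker inspect --format \
#       '{{range $path, $v := .Config.Volumes}}{{println $path}}{{end}}' xyz)
#   docker run -d $(printf '%s\n' "$paths" | volume_flags data-xyz) --name xyz xyz
```

The command substitution in the usage line is left unquoted on purpose so that the shell splits the generated flags into separate arguments; this assumes the declared paths contain no whitespace.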

lsgd · 2016-06-16T11:04:16Z

@thaJeztah Is it also possible to automatically create data volumes with mount points? (Or do I just mix up auto-creation things?)

Do you also know how to use data volume containers with docker-compose?
Or do I have to manually create them beforehand?

thaJeztah · 2016-06-16T14:06:18Z

@lukas-schulze if you mean bind-mounting a directory, that's a different thing: a bind-mounted directory (-v /some/path/on/host:/path/in/container) doesn't copy the data from inside the container to the volume, so may have a different effect.

If you want to easily access the volume data from your host, you could consider using a volume plugin for that https://docs.docker.com/engine/extend/plugins/#volume-plugins, for example, the "local persist" plugin allows you to specify a custom path where volumes are stored.
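Without a plugin, two other approaches are sometimes used to get at a named volume from the host; the sketch below shows both. The function names are hypothetical, and the reported mountpoint is only directly usable on a Linux host running the default "local" driver (on Docker Desktop or a remote daemon the path is inside the daemon's own environment).

```shell
# Sketch: print the path where the volume driver stores a named volume.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Sketch: copy a host directory (absolute path) into a named volume by
# bind-mounting it read-only next to the volume in a throwaway container.
copy_into_volume() {
    src_dir="$1"; volume="$2"
    docker run --rm -v "$src_dir":/src:ro -v "$volume":/dest busybox \
        cp -a /src/. /dest/
}

# Usage (hypothetical paths):
#   volume_mountpoint phpsource
#   copy_into_volume /home/me/app phpsource
```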

Define node_modules as a named volume so it will be automatically attached in one-off `docker-compose run` containers, allowing `npm install` to have an effect without the need to rebuild containers. This requires compose file version 2 syntax: https://docs.docker.com/compose/compose-file/#/version-2 (which I think requires Docker Engine 1.10.0 or newer). For persisting with named volumes, see:
- brikis98/docker-osx-dev#168 (comment)
- moby/moby#17798

arvenil · 2017-02-01T12:34:07Z

What's the best alternative to something like this:

docker create \
-v /var \
-v /bin \
-v /any/other/path \
--name data-xyz xyz /bin/true

docker run -d -p 80:80 --volumes-from data-xyz --name xyz xyz

In other words, data only container with multiple paths?

So far, best I could achieve is this:

docker run -d -p 80:80 \
-v data-xyz-var:/var \
-v data-xyz-bin:/bin \
-v data-xyz-any-other-path:/any/other/path \
--name xyz xyz

In other words for each directory I need separate volume?

thaJeztah · 2017-02-02T00:46:19Z

@arvenil as a replacement for "data-containers", that generally looks good to me. Without knowing your exact use case, however, a few things to consider:

- Do those paths belong in the same volume or not, i.e. is the data "tied" together or separate? If not tied together, having separate volumes for each directory may make sense.
- If these are actual examples, /var/ and /bin/ look very broad to use as a volume:
  - Are all paths inside them actually intended to have a lifecycle independent of the container? /var/ contains a lot of directories; most of those are probably not related to the actual data for your container (/var/tmp, /var/run ?). Try to limit the volume to the actual data you want to persist.
  - /bin looks odd as well (i.e. if your container's binary/executable is in there, how do you upgrade the binary? If it's not data, it may not have to be in a volume, which allows you to upgrade the container to update the binary).

arvenil · 2017-02-02T12:59:28Z

@thaJeztah hah, bad choice of examples on my side :) I shouldn't have used /var and /bin :) Sorry. I meant something more generic like /xyz /abc /qwe. I think what I'm looking for is a Group Volume, or a root / volume (yes, I'm making those names up). Right now, if I create a volume data-xyz:/some/path, it creates a volume/dir and puts all the contents of /some/path under it:

mkdir /xyz
cd /xyz
cp -r /some/path/. .

I would rather have a volume that keeps the paths, so I could have multiple paths in one volume

mkdir /xyz
cd /xyz
mkdir -p some/path
cp -r /some/path/. ./some/path

Maybe something like this: docker run -d -p 80:80 -v data-var/var:/var

nioncode · 2018-08-08T09:34:13Z

@thaJeztah Are there any plans to support grouping volumes or should we create volumes one by one for each declared VOLUME in a Dockerfile?

For example, you start with a simple Dockerfile that just has a single VOLUME /data/vol1, so I start the container with -v vol1:/data/vol1. If then the Dockerfile gets updated to have a second volume /data/vol2, I have to change my run command to -v vol1:/data/vol1 -v vol2:/data/vol2.

It would be nice to have a -v all-data:$ALL_VOLUMES that transparently captures all declared VOLUMEs of the Dockerfile and puts them at their absolute path inside the all-data volume.

jonaskello closed this as completed Nov 8, 2015

quinncomendant mentioned this issue Feb 26, 2016

Data Only Containers/No Run Containers docker/compose#942

Closed

Darep mentioned this issue May 11, 2016

named volumes in docker-compose.yml should not be synced brikis98/docker-osx-dev#168

Closed

geerlingguy mentioned this issue Aug 5, 2016

Don't use busybox for data-only images geerlingguy/docker-examples#2

Closed

aripalo mentioned this issue Sep 8, 2016

Define node_modules as named volume jdleesmiller/docker-chat-demo#6

Closed

qianlei90 mentioned this issue Mar 7, 2017

深入Docker数据管理 qianlei90/Blog#34

Open

tjesser-ucdavis-edu mentioned this issue Jun 19, 2017

Added DockerFile for docker containerization geodynamics/vq#191

Merged

baip mentioned this issue May 29, 2018

Docker: extract heredocs from multi-process init and update docker-compose files huginn/huginn#2298

Merged
