Volumes clobber underlying files #3482

Closed
LIV2 opened this issue Dec 12, 2016 · 7 comments
Labels
area/docker Support for the Docker operations area/storage Storage-related functionality kind/debt Problems that increase the cost of other work kind/investigation A scoped effort to learn the answers to a set of questions which may include prototyping priority/p0 source/customer Reported by a customer, directly or via an intermediary

@LIV2

LIV2 commented Dec 12, 2016

When volumes are mounted in a container, they clobber the underlying files instead of copying them into the volume as Docker does:
https://docs.docker.com/engine/tutorials/dockervolumes/

Volumes are initialized when a container is created. If the container’s base image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization. (Note that this does not apply when mounting a host directory.)
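
The quoted behaviour can be sketched with plain directories standing in for the image's mount point and the new volume. This is only an illustration, not Docker's or VIC's code: `init_volume` is a hypothetical helper, and treating "initialization" as "the volume is still empty" is an assumption.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: seed a new volume with the image's data at the mount point.
init_volume() {
    image_dir=$1   # directory in the image at the mount point
    volume_dir=$2  # root of the newly created volume
    # Assumption: "initialization" means the volume is still empty.
    # A freshly formatted filesystem may already contain entries such as
    # lost+found, so a real check would need to ignore those.
    if [ -z "$(ls -A "$volume_dir")" ]; then
        # Copy the directory contents (not the directory itself), keeping modes and times.
        cp -Rp "$image_dir"/. "$volume_dir"/
    fi
}
```

Calling `init_volume` a second time on the same volume leaves it alone, so files already on the volume are not replaced by the image's copies.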

Steps to reproduce:
Build and run a container from the following Dockerfile; /etc/example/nowyouseeme should exist.
This works correctly on Docker 1.10.3 but not on VIC.

FROM debian:jessie

RUN mkdir -p /etc/example/nowyouseeme

VOLUME ["/etc/example"]

ENTRYPOINT ["/bin/bash"]
@mdubya66
Contributor

@fdawg4l please triage

@mdubya66 mdubya66 added area/docker Support for the Docker operations area/storage Storage-related functionality labels Dec 12, 2016
@fdawg4l
Contributor

fdawg4l commented Dec 12, 2016

I remember we made a judgement call to get mounting of volumes working and punted on copying any data that already exists at the mount point before mounting. The original issue #1819 covered all three aspects (mounting, copying, remounting), but the fix covered only the first. Let's use this issue to track getting the copy working.

@LIV2 We'll get a fix going ASAP. In the meantime, your data isn't gone. You can umount the volume, run tar -czf </tmp/filename.tar.gz> -C <mountpoint> . to archive the files relative to the mount point, mount the volume again, then cd <mountpoint>; tar -xvzf </tmp/filename.tar.gz>. It's a totally unfriendly way around the problem and certainly not a long-term solution. But we're working on it, and if you need the data ASAP, that's one way to get it.
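
A minimal sketch of that round trip, assuming a POSIX shell and a tar that supports `-z`. The helper names are hypothetical. The archive is written relative to the mount point (`-C`) so that extraction does not nest the original path inside the volume.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: archive whatever is currently visible at the mount point.
# Run it while the volume is unmounted, so the image's own files are visible.
save_underlying() {
    mountpoint=$1
    archive=$2   # keep the archive outside the mount point
    # -C stores paths relative to the mount point.
    tar -czf "$archive" -C "$mountpoint" .
}

# Hypothetical helper: unpack the archive into the mount point.
# Run it after the volume is mounted again, so the files land on the volume.
restore_into_volume() {
    mountpoint=$1
    archive=$2
    tar -xzf "$archive" -C "$mountpoint"
}
```

In the container the sequence would be: umount the volume, `save_underlying <mountpoint> </tmp/filename.tar.gz>`, mount the volume again, then `restore_into_volume <mountpoint> </tmp/filename.tar.gz>`. The umount and mount steps need root and depend on the device, so they are left out of the sketch.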

@fdawg4l
Contributor

fdawg4l commented Dec 12, 2016

A few things to keep in mind.

Note that this does not apply when mounting a host directory.

In our model, volumes are actual devices, not just directories. So the implementation doesn't exactly map to our model as-is.

Some things we need to consider.

  • How do we intend to handle a reboot? Should the volume's data be refreshed from the underlying image's mount point every time you reboot, or only the first time we attach the volume to the container VM?
  • How do we intend to handle attach/detach/attach? If we copy the source data only the first time we attach a volume to a container, how do we handle reattach?

Say image:latest has files in /foo and intends to mount a volume at /foo. When we create a volume to attach at /foo, we copy the source files. But then image:latest moves ahead a few versions. The user detaches the volume, creates a new container from the new latest, and, again, the image wants the volume on /foo. Except in this latest image the source has updated files in /foo. Are we supposed to copy the source files into /foo's volume in this case? Is it up to the user to update the files? What if the user already modified the files in the volume while it was attached to the first container?

What if it doesn't involve an update at all? What if the volume is simply detached from container1 and attached to container2? Both use the same image, but how do we differentiate the above (upgrade) scenario from this one? Will we need to keep track of source file versions and only apply the updated (future) version?
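
One possible answer to the reattach question, sketched with plain directories: record on the volume that it has been seeded, and from which image, then skip the copy on every later attach. This is a hypothetical policy for discussion, not what VIC or Docker does; the marker file name, `seed_once`, and the image identifier argument are all assumptions.

```shell
#!/bin/sh
set -eu

# Hypothetical marker file name, used only for this illustration.
MARKER=.seeded-from

# Hypothetical helper: copy the image's files only on the first attach.
seed_once() {
    image_dir=$1   # image's directory at the mount point
    volume_dir=$2  # root of the attached volume
    image_id=$3    # identifier of the image that provides image_dir
    if [ -e "$volume_dir/$MARKER" ]; then
        # Reattach (same or newer image): leave the volume and any user edits untouched.
        return 0
    fi
    cp -Rp "$image_dir"/. "$volume_dir"/
    # Remember which image seeded the volume, in case versions need comparing later.
    printf '%s\n' "$image_id" > "$volume_dir/$MARKER"
}
```

Under this policy the upgrade scenario and the plain move between containers behave the same way: the volume keeps whatever it had. Applying newer source files would need an extra comparison against the recorded identifier, which the sketch deliberately leaves open.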

I think we stalled on the above which is why we never implemented the source file in volume workflow. It's trivial to implement, but the model isn't very clear from a user's perspective. @hickeng ?

@hickeng
Member

hickeng commented Dec 12, 2016

@fdawg4l we simply didn't do it because of time. There's quite an array of possible semantics that can be applied and no really good answer. Initially we'll just duplicate docker's behaviour as there is at least a subset who assume those semantics.

The primary question for longer term is what use case was this behaviour intended for - I think it was intended to address the scenario where an application doesn't allow for a clean separation of configuration and data files.

@fdawg4l
Contributor

fdawg4l commented Dec 12, 2016

Sounds good @hickeng. IIRC docker's semantics mean overwriting files every time the volume is attached to the container. That shouldn't be too difficult.

@fdawg4l
Contributor

fdawg4l commented Jan 17, 2017

We need a plan for how we want to address this functionality. One suggestion was to use overlayfs to implement copy-on-write (COW) on the volume. In any case, investigation is required and an interface needs to be defined.
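
For reference, a rough sketch of what the overlayfs suggestion could look like on Linux: the image's directory is the read-only lower layer, and the writable upper layer lives on the volume. `overlay_opts` is a hypothetical helper and the layout is an assumption, not a settled design.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: build the option string for an overlayfs mount.
overlay_opts() {
    lower=$1  # read-only layer: the image's directory at the mount point
    upper=$2  # writable layer: a directory on the volume
    work=$3   # overlayfs scratch directory; must be on the same filesystem as upper
    printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$lower" "$upper" "$work"
}
```

As root, the mount itself would be along the lines of `mount -t overlay overlay -o "$(overlay_opts <lower> <upper> <work>)" <mountpoint>`. Reads fall through to the image's files until a file is modified, at which point the changed copy is stored on the volume.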

@anchal-agrawal
Contributor

This issue was encountered in #3909.

@fdawg4l @matthewavery Can we add an estimate to this and tag it as kind/investigation if necessary so we can pull this into the backlog? Thanks!

@fdawg4l fdawg4l added the kind/investigation A scoped effort to learn the answers to a set of questions which may include prototyping label Feb 17, 2017
@hmahmood hmahmood added the kind/debt Problems that increase the cost of other work label Apr 12, 2017
@jakedsouza jakedsouza self-assigned this Apr 12, 2017
@mdubya66 mdubya66 added this to the Sprint 7 milestone Apr 12, 2017