
Experimental feedback - Volume plugins #13420

Closed

icecrime opened this issue May 22, 2015 · 47 comments

@icecrime
Contributor

This is a placeholder issue to collect feedback on the volume plugins experimental feature shipped as part of Docker 1.7.0.

@TomasTomecek
Contributor

I went through the docs and have to admit the plugin system looks pretty interesting; I haven't tried it myself, though. One note I could think of: I find the API between the plugin and the engine to be too simple. E.g., I would like to see the ID of the container that is asking for volume creation/mounting. I also assume that configuration is completely up to the plugin and there is no way to pass anything from the engine to the plugin (e.g. arguments for mounting).

Also, are you guys planning to do other plugin types? E.g.

  • container plugin — call plugin whenever container is created, started, stopped, killed
  • build plugin — call plugin when build is started/stopped/failed
  • image plugin — call plugin when image is pulled/pushed/imported/exported

That would make plugins very powerful, and I can imagine a lot of ongoing problems would get solved (e.g. secrets/mounts during build, layer/cache control).
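
To make that surface concrete, here is a minimal sketch in Go of a volume plugin's Mount handler, following the experimental JSON-over-HTTP protocol (the daemon POSTs a volume name to /VolumeDriver.Mount and the plugin answers with a host mountpoint). The ContainerID field, the path layout, and the TCP listener are illustrative assumptions; in particular, the daemon does not send a container ID today, which is exactly the extension being asked for above.

package main

import (
	"encoding/json"
	"net/http"
	"path/filepath"
)

// mountRequest mirrors the JSON body the daemon POSTs to /VolumeDriver.Mount.
// In the experimental API only Name is sent; ContainerID is a hypothetical
// field illustrating the extension discussed above.
type mountRequest struct {
	Name        string
	ContainerID string // hypothetical: the daemon does not send this today
}

// mountResponse is the JSON body the plugin answers with.
type mountResponse struct {
	Mountpoint string `json:"Mountpoint,omitempty"`
	Err        string `json:"Err,omitempty"`
}

func main() {
	http.HandleFunc("/VolumeDriver.Mount", func(w http.ResponseWriter, r *http.Request) {
		var req mountRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			json.NewEncoder(w).Encode(mountResponse{Err: err.Error()})
			return
		}
		// The plugin decides where the volume lives; the daemon only gets
		// the path back and bind-mounts it into the container.
		mp := filepath.Join("/var/lib/example-volumes", req.Name)
		json.NewEncoder(w).Encode(mountResponse{Mountpoint: mp})
	})
	// Simplified: a real plugin listens on a unix socket (or spec file)
	// under /run/docker/plugins/ so the daemon can discover it.
	http.ListenAndServe("127.0.0.1:9999", nil)
}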

@mattes

mattes commented May 25, 2015

Is someone working on anything rsync-related regarding volume drivers? For the client side, I started working on https://github.com/synack/docker-rsync.

@larsks
Contributor

larsks commented May 29, 2015

The suggested syntax (-v volumename:/data --volume-driver=flocker) looks to me like it would unnecessarily limit things to a single volume driver per container. What if, instead of --volume-driver, --volume were enhanced to accept an initial third parameter, the name of a volume driver? Assume that no volume driver means "local", so this would still work:

-v /host/path:/container/path

But assuming, say, a "blockdev" volume driver, you could do this:

-v blockdev:/dev/sda1:/container/path

Or a "tmpfs" volume driver (Hi, Dan), like this:

-v tmpfs::/container/path

And you could combine these:

docker run -v /host/path:/container/path -v blockdev:/dev/sda1:/data -v tmpfs::/run ...

@cpuguy83
Member

I agree; I don't like --volume-driver.
Even if we extend it to do something like -v /data --volume-driver blockdev:/data, it's still not nice.
I much prefer something like -v blockdev://<name>:/data:opt1,opt2,opt3... or potentially -v blockdev://<name>@/data:opt1,opt2,opt3.
I don't think there is really a problem with "overloading" this flag here, since it's a generally quite natural syntax with proto + path, and then opts like mount -o opt1,opt2,opt3.

@rhatdan
Contributor

rhatdan commented May 29, 2015

As I commented in my tmpfs volume patch, I don't like --volume-driver either; I'd rather use syntax like

docker run -ti -v tmpfs:/run -v tmpfs:/tmp fedora /bin/sh

"
Well I think that stinks, for several reasons,

  • It gives me no flexibility, I don't have any ability to have multiple volumes of different types.
  • It is not easy to understand. The -v syntax is well understood, and my change is intuitive, I belive, Similar individual mount commands
  • I don't believe the --volume-driver concept will tar up the contents of the underlying file system to be used in the new tmpfs, which is critical to using it for /run and I like the idea of using it for /var.
    "

@rhatdan
Contributor

rhatdan commented May 29, 2015

I like the idea of passing the mount options on the volume command also. With tmpfs the big one is size.

Other potential options would be noexec. I think nodev should be the default for all of these file systems.

@rhatdan
Contributor

rhatdan commented May 29, 2015

@larsks Not sure you need the tmpfs::/mount/point form. tmpfs:/mount/point would be easier for users and not hard to implement.

@larsks
Contributor

larsks commented May 29, 2015

@rhatdan I am proposing that the syntax is:

[<driver>:]<src>:<dest>

Where the interpretation of <src> is driver-dependent, and a missing <driver> means local:. With this model, tmpfs:/mount/point would need to be handled magically by the local driver, or there would need to be some sort of weird special-casing for -v that would act differently depending on whether the first component starts with a / or not.

I'm in favor of an explicit vs. implicit syntax.
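
A minimal sketch of how that explicit grammar could be parsed; the disambiguation rule (a leading / can only be a <src>, so the driver defaults to "local") follows the proposal above, and everything here is illustrative rather than shipped Docker behavior.

package main

import (
	"fmt"
	"strings"
)

// volumeSpec models the proposed [<driver>:]<src>:<dest> grammar;
// interpretation of Src is left to the driver, per the proposal above.
type volumeSpec struct {
	Driver string // "local" when omitted
	Src    string
	Dest   string
}

// parseVolume resolves the ambiguity explicitly: a spec whose first
// component starts with "/" can only be <src>:<dest>, so the driver
// defaults to "local"; otherwise the first component names the driver.
func parseVolume(s string) (volumeSpec, error) {
	parts := strings.SplitN(s, ":", 3)
	switch {
	case len(parts) == 3 && !strings.HasPrefix(parts[0], "/"):
		return volumeSpec{Driver: parts[0], Src: parts[1], Dest: parts[2]}, nil
	case len(parts) >= 2 && strings.HasPrefix(parts[0], "/"):
		return volumeSpec{Driver: "local", Src: parts[0], Dest: parts[1]}, nil
	}
	return volumeSpec{}, fmt.Errorf("invalid volume spec: %q", s)
}

func main() {
	for _, s := range []string{
		"/host/path:/container/path", // local driver, implicit
		"blockdev:/dev/sda1:/data",   // explicit driver
		"tmpfs::/run",                // driver with empty <src>
	} {
		v, err := parseVolume(s)
		fmt.Printf("%s -> %+v (err: %v)\n", s, v, err)
	}
}

Note that the tmpfs:/mount/point shorthand deliberately fails this grammar: supporting it would require exactly the special-casing being argued against above.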

@cpuguy83
Member

@larsks Right now it's [<name>|<src>:]<dest>, where <name> is used to name a volume, not a driver.

@icecrime
Contributor Author

Hello all, and thanks for the feedback! Let me try to answer some of the questions here.

@TomasTomecek:

  1. The API was deliberately kept minimal in this first version. We'd rather first hear from the community what use cases they can't achieve with the current design, and "poke more holes" as necessary. On your specific example of passing the container ID, I'd like to hear of a case where it's required (right now we provide a unique volume name, which I believe is sufficient).
  2. Driver-specific options are indeed very likely to be needed at some point; they just haven't been implemented so far.
  3. We might have more plugin types in the future. One of the reasons we marked the feature as "experimental" is that we still don't really know what community and user adoption will look like. As we gain more feedback and use cases, we'll refine the UX and potentially expand the scope. API plugins are definitely something we are thinking about (think Powerstrip), but our current API implementation might need a good rewrite before that.

@larsks @cpuguy83 @rhatdan:

You're right that the --volume-driver syntax is not ideal. There are several reasons behind it right now:

  1. The goal is clearly to have a top-level docker volume command to manage volumes, which will open the possibility of per-volume drivers¹ (e.g., docker volume create foo --driver=bar). This is, by the way, exactly the plan for network management too. We didn't want to make this a requirement for the first release of the feature, which is why we went for the simplest thing. Brian, I know you worked on this in the past; I think we're ready to do this for 1.8.0 if you're interested.
  2. Solomon was very clear that he doesn't want the driver name to appear as part of the volume specification. His reasoning is that we want to decouple the volume definition from the operational aspect (e.g., "my container needs a volume X" versus "volume X should be managed by driver Y"). He'd ideally want it to be possible to label volumes and leave it to a separate (ops-defined) configuration to associate labels with drivers.

I hope this clears up some of the questions.


¹ Truth is, there already is a hackish way to use multiple volume drivers for a single container via --volumes-from, but that's far from convenient.

@rhatdan
Contributor

rhatdan commented May 29, 2015

@cpuguy83 I am fine with -v tmpfs::/tmp

@icecrime I find the whole --volume-driver approach to be a huge failure on the ease-of-use scale. One of the huge advantages of Docker is its ease of use from the CLI, and I see this as a huge step backwards.

docker run --volume-from=tmpfs /tmp --volume-from=local /mnt fedora /usr/sbin/yuck

docker run -v tmpfs::/tmp -v /foo:/bar -v /mnt fedora /usr/bin/nice

@cpuguy83
Member

@icecrime I totally get not wanting to do much more than -v /foo in the syntax... however:
In general, I think most people will use the same volume driver for everything, and then other ones for special cases (maybe mount something from keywhiz/vault, a tmpfs mount, etc.).

So enable a user to change the default driver (at the daemon level), thus providing the desired minimal syntax for most cases, and then allow people to have finer control on -v for their special cases.

@icecrime
Contributor Author

@rhatdan But if you consider this as a transition path toward

docker volume create <name> --driver=<driver>
docker run -v <name> <image>

How would you feel about that? One of the reasons the feature is experimental is that we know this is not the final UX, so I'm glad we can discuss this now.

@rhatdan
Contributor

rhatdan commented May 29, 2015

From a usability point of view, it's not as easy as what we have. I have no problem with that, but I still prefer the shorthand.

Lots of different applications will be using tmpfs volumes, each vying for the same name tmpfs, or worse, each creating its own slightly different volume name.

It breaks our atomic command, since we currently expect there to be one command to start an application. If I use tmpfs, I now need two commands. If my volume creation tool fails, what do I do? Do I know whether it failed because the content already existed, versus other errors? Do I need to start doing docker volume list to figure out whether the volume already exists?

@rhatdan
Contributor

rhatdan commented May 29, 2015

Bottom line: I would want both. Yours is good for setting up advanced volumes, but it is harder for the general use case.

@TomasTomecek
Contributor

@icecrime I was thinking of implementing secrets during build.

@SvenDowideit
Contributor

In playing with https://github.com/SvenDowideit/docker-volumes-nfs, I think we need more meta-info.

Because the plugin doesn't know anything about the daemon, or the container that is making the request, there's no way to make a unique mount point per container, or to mount the volume in the actual Docker graphdriver dir (either matching the non-root partition, or the correct Docker daemon if there is more than one). And if the same thing is mounted into two containers, we're forced to add reference counting.

BUT.... it works :)

@cpuguy83
Member

@SvenDowideit Plz don't mount to the graphdriver dir :)
The unique mount point should be the id/name that the daemon sends.

@mauri
Contributor

mauri commented Jun 18, 2015

What happens if we want to use different volume drivers on the same container? For example, an NFS volume and a local volume. Would we use multiple --volume-driver flags before the actual -v?

+1 to @larsks proposal

@aisrael

aisrael commented Jul 1, 2015

Hi. We just wrote a (crude) implementation of a Docker volume plugin using Ceph RBD.

Quick feedback: when creating new volumes for Ceph, we need to pass in the desired volume size (among other things). Currently, we have to hard-code that or provide it as a configuration option, but then it applies to all volumes.

IOW, we might want to think about some way of passing driver-specific parameters (say, as part of docker run) on a per-volume basis.

@SvenDowideit
Contributor

@cpuguy83 atm, the daemon isn't sending a unique id/name, which is part of the problem :)

@ioggstream

Is snapshotting an intended feature?
I modified @SvenDowideit's plugin to support generic block devices (e.g. LVM), so a supported way of cloning volumes would be useful.

@cpuguy83
Member

cpuguy83 commented Jul 6, 2015

@ioggstream What is the intention of snapshotting?
I don't think Docker should be a data management tool; rather, it should be calling out to data management tools.
API endpoints should be focused around things happening with containers, and the driver makes the decision on what to do.
For instance:

  1. Fire up container on host A with volume named demo
    • docker calls out to the volume driver to create a new volume called demo
    • volume driver mounts new volume to host
    • host path is bind-mounted into the new container
  2. A new container is fired up on a different host with a volume named demo
    • docker calls out to the volume driver to create a new volume called demo
    • volume driver knows that it already has a volume called demo
    • volume driver makes decision on what to do here (is it copying, is it providing a CoW snapshot, some intelligent cloning, etc)
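
As a sketch of that scenario-2 decision, here is a hypothetical in-memory driver; whether it reuses the volume, clones it, or hands out a CoW snapshot is its own policy, invisible to Docker.

package main

import (
	"fmt"
	"sync"
)

// exampleDriver is an illustrative volume driver; the strategy it picks
// when a name already exists is entirely its own policy, as described above.
type exampleDriver struct {
	mu      sync.Mutex
	volumes map[string]string // volume name -> host path
}

// Create is invoked when the daemon asks for a volume by name.
// Docker itself stays out of the data-management decision.
func (d *exampleDriver) Create(name string) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if path, ok := d.volumes[name]; ok {
		// The driver already knows this volume: here it could return the
		// shared path, take a CoW snapshot, or clone it -- its choice.
		fmt.Printf("reusing existing volume %q at %s\n", name, path)
		return path, nil
	}
	path := "/var/lib/example-volumes/" + name // hypothetical layout
	d.volumes[name] = path
	return path, nil
}

func main() {
	d := &exampleDriver{volumes: map[string]string{}}
	d.Create("demo") // first host: provisions a new volume
	d.Create("demo") // second request: driver decides reuse/clone/CoW
}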

@ioggstream

@cpuguy83 A typical OpenStack use case is snapshotting a machine and its volumes for backup/scale-up.

In your 2nd scenario, I'd expect to pass some hints to trigger the cloning...

How would you implement the following use case with Docker?

  • given a container with a volume attached;
  • clone the container's volume (e.g. CoW);
  • attach the cloned volume to another container.

@childsb

childsb commented Jul 16, 2015

Will this support a containerized volume client? If I wanted to keep the Ceph RBD client stack (or any other Docker volume client) in a container and not on the host, would that be possible?

@chakri-nelluri

I prefer:
docker volume create --driver=<driver>
docker run -v <name>

Seems more in line with the network UI.

Any thoughts on the best place to specify runtime mount options?
For example: NFS rsize,wsize options.

@j-griffith

FWIW, I'd leave the management functionality (i.e. snapshots) out of the picture for now. Docker isn't and shouldn't be an infrastructure management platform; the plugin is just there to make things easier to consume.

@jvinod

jvinod commented Jul 24, 2015

As @SvenDowideit points out, passing the container_id as part of the mount/unmount request would help. If the same volume is mounted in more than one container, it gets a little messy for the plugin to keep track of refcounts across restarts and reboots.
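
For illustration, here is the kind of bookkeeping a plugin would do if the daemon did pass a container ID with each mount/unmount request (today it does not, hence the request above); the API shape is an assumption.

package main

import "sync"

// refCounter tracks which containers hold a given volume mounted, so the
// plugin only unmounts on the last release. The container ID key is the
// piece of information this thread is asking the daemon to send.
type refCounter struct {
	mu     sync.Mutex
	owners map[string]map[string]bool // volume name -> set of container IDs
}

func newRefCounter() *refCounter {
	return &refCounter{owners: map[string]map[string]bool{}}
}

// Acquire records a mount; returns true when this is the first reference
// and the backing volume actually needs to be mounted on the host.
func (rc *refCounter) Acquire(volume, containerID string) bool {
	rc.mu.Lock()
	defer rc.mu.Unlock()
	if rc.owners[volume] == nil {
		rc.owners[volume] = map[string]bool{}
	}
	first := len(rc.owners[volume]) == 0
	rc.owners[volume][containerID] = true
	return first
}

// Release drops a reference; returns true when no container holds the
// volume any more and the host mount can be torn down.
func (rc *refCounter) Release(volume, containerID string) bool {
	rc.mu.Lock()
	defer rc.mu.Unlock()
	delete(rc.owners[volume], containerID)
	return len(rc.owners[volume]) == 0
}

func main() {
	rc := newRefCounter()
	rc.Acquire("demo", "c1") // true: first mount, do the real work
	rc.Acquire("demo", "c2") // false: already mounted on the host
	rc.Release("demo", "c1") // false: c2 still holds it
	rc.Release("demo", "c2") // true: safe to unmount
}

Surviving daemon restarts and host reboots would additionally require persisting this map somewhere, which is exactly the messy part noted above.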

@SvenDowideit
Contributor

The compromise is to add a unique volume id for each volume-name+container pair - see #14737

@abhishek-kane

I agree with @cpuguy83's comment that generally people will use the same volume driver for a container.

@icecrime How about an option where the user can specify a config file to be used? Use case: say, for example, the user wants a specific size and layout for the volume. Each volume driver provider could have its own special format for the config file. This config option could also be used for plugins other than the volume plugin.

@mauri
Contributor

mauri commented Jul 29, 2015

I'm a bit worried about the "one volume driver per container should be enough" assumption. Is it driven by how difficult the implementation is, or what's the rationale?

@Patiljn

Patiljn commented Jul 29, 2015

I agree with @aisrael. The volume driver API only accepts a volume name. We are currently unable to specify other volume characteristics like volume size, IOPS, volume layout, pool name, snapshots, etc. How are we planning to add these storage requirements to the volume driver API?

@cpuguy83
Member

@mauri This is a temporary issue.
A top-level volume API/CLI should be part of Docker 1.9, which will make this a non-issue.

@Patiljn See above. The top-level API will allow setting volume-driver-specific options when creating a volume, as just a map of key/value pairs.
For instance: docker volume create --driver awesomevolumes --opt size=10G --opt iops=10000
The opts are passed directly down into the volume driver.
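
A sketch of how a driver might interpret such an opts map; the size and iops keys mirror the example above, and the parsing rules belong to a hypothetical driver, since Docker passes the map through without inspecting it.

package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCreateOpts consumes the --opt key/value pairs exactly as Docker
// hands them over: untouched strings the driver must interpret itself.
func parseCreateOpts(opts map[string]string) (sizeBytes int64, iops int, err error) {
	if s, ok := opts["size"]; ok {
		sizeBytes, err = parseSize(s)
		if err != nil {
			return 0, 0, err
		}
	}
	if s, ok := opts["iops"]; ok {
		iops, err = strconv.Atoi(s)
		if err != nil {
			return 0, 0, fmt.Errorf("invalid iops %q: %v", s, err)
		}
	}
	return sizeBytes, iops, nil
}

// parseSize handles the "10G"-style suffix from the example above.
func parseSize(s string) (int64, error) {
	mult := int64(1)
	if strings.HasSuffix(s, "G") {
		mult, s = 1<<30, strings.TrimSuffix(s, "G")
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %v", s, err)
	}
	return n * mult, nil
}

func main() {
	// Equivalent of: docker volume create --driver awesomevolumes --opt size=10G --opt iops=10000
	size, iops, err := parseCreateOpts(map[string]string{"size": "10G", "iops": "10000"})
	fmt.Println(size, iops, err)
}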

@Patiljn

Patiljn commented Jul 30, 2015

@cpuguy83 These opts are per-volume, right? E.g., if we want to create more than one volume with the same volume driver, we should be able to specify different opts for different volumes depending on our requirements.
One more thing: these opts should be specific to the volume driver, not generic ones. Just wanted to clarify this.

@ioggstream

@cpuguy83 so what about supporting snapshots like this?

--opt snapshot-from=other-volume-wwid

@cpuguy83
Member

@ioggstream So far these opts just get passed directly to the volume driver and are not parsed by Docker at all, so yeah, that's totally possible, but it must be implemented by the driver.

@ioggstream

"yeah that's totally possible, but must be implemented by the driver."

Seems reasonable to me.

Is there any orchestration tool already supporting something like that?

@cpuguy83
Member

@ioggstream Given that the API is not part of Docker yet? I'd say no.

@cpuguy83
Member

@ioggstream volume API was just merged, so 1.9 will include it, and hopefully orchestration systems will follow suit.

Also, I re-read my last comment and it probably seemed a bit rude... totally not intended to be! Sorry about that.

@ioggstream

Quoting @cpuguy83:

"volume API was just merged, so 1.9 will include it, and hopefully orchestration systems will follow suit."

Looking ...

"Also, I re-read my last comment and it probably seemed a bit rude... totally not intended to be! Sorry about that."

Don't worry :)

@h0tbird

h0tbird commented Sep 1, 2015

I have the habit of using the --rm flag when I run Docker containers, just to keep my host clean.
If I use this flag with volume drivers, a remove call is sent to the driver API. I don't like that:
I want to keep my host clean and my volumes reusable.

I am also happy to share with you my first Go program: docker-volume-rsync

@cooljiansir

Hi! I'm confused by the volume plugin:
What's the difference between a volume plugin and -v /host/path/:/docker/path/?
I found that "Docker volume plugins enable Docker deployments to be integrated with external storage systems, such as Amazon EBS...".
If there is a path /home/amazon/, which is an EBS volume, then I can just docker run -v /home/amazon/:/data/ -it ubuntu, so why do I have to use volume plugins? Thanks :-)

@ioggstream

Replying to "If there is a path /home/amazon/":

The plugin will:

  • e.g. mount the EBS store for you;
  • on a remote Docker server.

So you shouldn't have to care about how the EBS volume gets bound to your server.

@cooljiansir

So:
1. Using the driverA plugin
2. mount driverA /home/driverA and then docker run -v /home/driverA/:/data/ -it ubuntu
They are the same, right?
But it'll be more convenient if we use the plugin? :-)

@clintkitson
Contributor

@cooljiansir The process of making a volume available to a Docker host in EC2 has a handful of steps.

  1. Determine the instance ID of the host
  2. Attach the volume to this instance
  3. Find the device
  4. Format the device
  5. Mount the device

... and the reverse when the container stops. So you can think of the steps that a volume driver accomplishes as a workflow that simplifies preparing external volumes for use with Docker.
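
As a sketch, those five steps might look like the following in a driver's mount path; every helper here is a hypothetical stub standing in for EC2 API and OS calls, not any actual driver's implementation.

package main

import "fmt"

// mountEBSVolume sketches the five steps above as a driver's mount path.
// All helpers are hypothetical stubs; a real driver would call the EC2 API
// and shell out to blkid/mkfs/mount.
func mountEBSVolume(volumeID, mountpoint string) error {
	instanceID, err := currentInstanceID() // step 1: which host am I?
	if err != nil {
		return err
	}
	if err := attachVolume(volumeID, instanceID); err != nil { // step 2
		return err
	}
	device, err := waitForDevice(volumeID) // step 3: find the block device
	if err != nil {
		return err
	}
	if err := ensureFilesystem(device); err != nil { // step 4: format if blank
		return err
	}
	return mountDevice(device, mountpoint) // step 5
}

// Hypothetical stubs -- placeholders for EC2 metadata/API and OS calls.
func currentInstanceID() (string, error)       { return "i-example", nil }
func attachVolume(vol, instance string) error  { return nil }
func waitForDevice(vol string) (string, error) { return "/dev/xvdf", nil }
func ensureFilesystem(device string) error     { return nil }
func mountDevice(device, mountpoint string) error {
	fmt.Printf("mount %s %s\n", device, mountpoint)
	return nil
}

func main() {
	if err := mountEBSVolume("vol-example", "/var/lib/example-volumes/testing"); err != nil {
		fmt.Println("mount failed:", err)
	}
}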

For example, see the following example, where we use Docker Machine to fire up a host with a volume driver configured. We can then use a single command to enable the container to use the new volume. The same command applied to another host would swing the volume to that host.

dicey1:machine clintonkitson$ docker $(docker-machine config testt26) run -ti --volume-driver=rexray -v testing:/testing busybox
/ # df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/docker-202:1-16819084-3c8daea37802658c36e86cd6254e5e79f300cd830c14cc64a11030eb841d7130
                     103080888     62640  97758984   0% /
tmpfs                   507972         0    507972   0% /dev
shm                      65536         0     65536   0% /dev/shm
tmpfs                   507972         0    507972   0% /sys/fs/cgroup
/dev/scinia1           16382844     45080  15482520   0% /testing

@thaJeztah
Member

@icecrime I think we can close this now that volume plugins have moved out of experimental with 1.9?

@thaJeztah
Member

Closing per my comment above.
