The utility of creating a volume per node #1529
Comments
👍 for bringing this to light again. I don't see a use case for this either. If you use a volume driver to talk to a persistent datastore "outside" of the host itself, then you are creating multiple volumes on the storage end-point. We've found the only way around this is to use If it's the normal expected behavior by the machine, that's understandable. But I wouldn't consider it to be normal expected behavior from a user perspective. |
In theory the Different storage drivers may have workflows to achieve a volume creation. In the case of EC2, the workflow involves creating a volume and then assigning metadata. So the idempotency would be violated here since a volume operation may proceed on all hosts since the volume by name doesn't exist yet, but then when it comes time to change the metadata other hosts will fail after the first host succeeds. cc @cpuguy83 |
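The create-then-tag race described in the comment above can be sketched with a toy model. This is only an illustration under stated assumptions: `FakeBackend`, `create_untagged`, and `tag` are hypothetical names rather than any real EC2 or volume-driver API, and the rule that a name may be attached to only one volume is an assumption of the sketch.

```python
class NameConflict(Exception):
    """Raised when a name is already attached to another volume."""


class FakeBackend:
    """Hypothetical shared storage endpoint (not a real EC2 or driver API)."""

    def __init__(self):
        self.volumes = {}  # volume id -> name tag (None until tagged)
        self._next_id = 0

    def find_by_name(self, name):
        return [vid for vid, tag in self.volumes.items() if tag == name]

    def create_untagged(self):
        vid = f"vol-{self._next_id}"
        self._next_id += 1
        self.volumes[vid] = None
        return vid

    def tag(self, vid, name):
        # Assumed rule for this sketch: a name can belong to one volume only.
        if self.find_by_name(name):
            raise NameConflict(name)
        self.volumes[vid] = name


def create_on_all_hosts(backend, hosts, name):
    # Step 1: every host looks up the name, finds nothing, and creates a volume.
    pending = {}
    for host in hosts:
        if not backend.find_by_name(name):
            pending[host] = backend.create_untagged()
    # Step 2: every host tries to attach the name; only the first one can succeed.
    results = {}
    for host, vid in pending.items():
        try:
            backend.tag(vid, name)
            results[host] = "ok"
        except NameConflict:
            results[host] = "conflict"
    return results


backend = FakeBackend()
results = create_on_all_hosts(backend, ["node-1", "node-2", "node-3"], "mydata")
```

In this model the first host ends up with the named volume, the others fail at the metadata step, and their untagged volumes are left behind on the backend.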
The reasoning for this is to make sure a container started with btw, I'm inclined to disable implicit volume creation on |
Personally, I would prefer that a volume is NOT created unless specified through the docker volume create command. One spelling mistake and then you end up with a container and a volume that needs to be killed and re-created. |
@cpuguy83 But let's look just a bit beyond that initial It is a very common use case to separate your data volume from your service container. That way you can kill an existing service version X, upgrade that service to version X+1, and still connect to the exact same data volume. However, if there are N volumes and you cannot uniquely identify them, that makes it impossible to connect service version X+1 to the exact same data volume that service X was connected to. Also, I just confirmed that when I added a new node to a Swarm cluster that has a volume created with |
@everett-toews This is up to the volume driver being used. |
👍 This behavior was hugely confusing to me. I'm just coming up to speed on Swarm, but I still don't quite understand all the implications of |
Agree that this is confusing. Having the local driver just create a volume local to the requesting container makes sense. It would seem a reasonable limitation that you cannot schedule containers with local volumes on other nodes unless another volume with the same name has been created with some other workflow. Also agree that I would prefer an explicit volume create and a volume affinity flag instead of implicitly creating the volume during docker run. |
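One way to picture the explicit-create-plus-affinity idea from the comments above is the sketch below. It is purely hypothetical: `schedule_with_volume_affinity` and `VolumeNotFound` are invented for illustration and do not correspond to an existing Swarm flag or API. The assumed policy is that a container is placed on a node that already holds the named volume, and a missing (or misspelled) name is rejected rather than silently created.

```python
class VolumeNotFound(Exception):
    """Raised when no node holds the requested volume."""


def schedule_with_volume_affinity(volumes_by_node, volume):
    """Pick a node that already holds `volume`; never create it implicitly.

    `volumes_by_node` maps a node name to the set of local volume names.
    This is a hypothetical policy, not an existing Swarm feature.
    """
    candidates = sorted(
        node for node, vols in volumes_by_node.items() if volume in vols
    )
    if not candidates:
        raise VolumeNotFound(volume)
    return candidates[0]


cluster = {"node-1": set(), "node-2": {"mydata"}, "node-3": set()}

# The container follows the volume to the node where it was explicitly created.
chosen = schedule_with_volume_affinity(cluster, "mydata")

# A typo in the volume name is rejected instead of producing a stray volume.
try:
    schedule_with_volume_affinity(cluster, "mydtaa")
    typo_result = "created"
except VolumeNotFound:
    typo_result = "rejected"
```

Under this policy the spelling-mistake scenario mentioned earlier fails fast, and there is only ever one place where the data can live.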
As a longtime Docker user who tended not to use data containers because they felt clunky, I was pleased to see
and those two assumed constraints seemed fine to me. What I gather from the discussion above is that neither assumption is correct, but would enforcing those help pin down behavior? |
@itzg Yes, data containers are clunky, but the alternative (duplicated volumes) is a can of worms: you can't tell where your data is or what state of sync it is in. |
Closing due to lack of activity. Please reopen if you wish to continue discussing it. |
While researching issue #1528, it really made me wonder about the practical utility of Swarm creating a volume per node when doing a `docker volume create`. I realize that a volume per node is the expected behavior but what use case does that cover?
When I'm using Swarm, I want a place for my stuff. Doing a `docker volume create` seems ideal as it gives me a named volume where I can store persistent data (e.g. #1528). I can reference that volume later from other containers and not have to worry about orphaning the volume if any associated container gets removed. This seems to me like it's the primary use case for volumes.
But when Swarm creates a volume per node using the default driver, it seems to eliminate that use case. Now I have X volumes (depending on how many nodes I have) and I have no idea where my stuff is. It could be in any one of those volumes but I have no way of knowing which one much less being able to reference that one directly even if I did.
It seems to me we'll need more control over `docker volume create` in order to be able to effectively utilize volumes in Swarm.
Thoughts?
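The "where is my stuff" problem from the issue description can be illustrated with a small toy model. This is a sketch under assumptions: the `Node` class and the `volume_create` and `volume_ls` helpers are hypothetical stand-ins rather than Docker's API, and the `node/volume` labels are just this sketch's way of telling the copies apart.

```python
class Node:
    """Toy Swarm node with local-only volumes (hypothetical, not Docker's API)."""

    def __init__(self, name):
        self.name = name
        self.volumes = {}  # volume name -> dict of file name -> contents


def volume_create(nodes, volume):
    # Fan the create out to every node, as described in this issue.
    for node in nodes:
        node.volumes.setdefault(volume, {})


def volume_ls(nodes, volume):
    # List every per-node copy of the named volume.
    return [f"{node.name}/{volume}" for node in nodes if volume in node.volumes]


nodes = [Node("node-1"), Node("node-2"), Node("node-3")]
volume_create(nodes, "mydata")
copies = volume_ls(nodes, "mydata")

# A container scheduled on node-2 writes into that node's local copy.
nodes[1].volumes["mydata"]["db.sqlite"] = b"important"

# A later container lands on node-3 and sees an empty volume with the same name.
seen_on_node3 = nodes[2].volumes["mydata"]
```

One create call yields three independent volumes with the same name, and only the copy on the node where the first container happened to run contains the data.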