
[1.13-rc2] cannot use existing volumes if plugin is upgraded; volume is bound to a plugin ID #28913

Closed
srust opened this Issue Nov 29, 2016 · 10 comments

@srust
Contributor

srust commented Nov 29, 2016

A volume created through a volume plugin causes docker to cache the plugin ID that the volume was created / referenced with. If that plugin ID changes (the volume plugin is removed and re-added, or the volume plugin is upgraded), then none of the volumes associated with the previous version of the plugin can be used any longer.
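
For illustration (plugin name and IDs are the same placeholders used in the logs below, abbreviated), docker plugin inspect shows the plugin ID, and removing and re-creating the plugin issues a new one while existing volumes stay bound to the old one:

$ docker plugin inspect -f '{{.Id}}' my/plugin
54755b020a82...
$ docker plugin disable my/plugin && docker plugin rm my/plugin
$ docker plugin create my/plugin /tmp/pluginDir && docker plugin enable my/plugin
$ docker plugin inspect -f '{{.Id}}' my/plugin
6911578a118d...   # new ID; myvol is still bound to 54755b020a82...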

Steps to reproduce the issue:

  1. docker volume create --driver my/plugin myvol
  2. docker run -v myvol:/data alpine
  3. upgrade the plugin via plugin rm / plugin create, or plugin install of a new/different version
  4. docker run -v myvol:/data alpine -- WILL NOT WORK
  5. systemctl restart docker
  6. Now, after the restart, the cache is cleared and docker run -v myvol:/data alpine works again with the new plugin.

Describe the results you received:

From docker run step 2:

Nov 29 00:21:46 srust7.localnet dockerd[16909]: time="2016-11-29T00:21:46Z" level=info msg="2016-11-29T00:21:46.987 DEBUG    myvol - \"POST /VolumeDriver.Path 1.1\" 200 28 [583.58ms]" plugin=54755b020a828b6ae0f09ae28a59334aca1d50b067340a3637f01dbec8163612

From docker run step 4:

Nov 29 00:24:16 srust7.localnet dockerd[16909]: time="2016-11-29T00:24:16.148683177Z" level=warning msg="Unable to connect to plugin: /run/docker/54755b020a828b6ae0f09ae28a59334aca1d50b067340a3637f01dbec8163612/my/plugin.sock/VolumeDriver.Path: Post http://%2Frun%2Fdocker%2F54755b020a828b6ae0f09ae28a59334aca1d50b067340a3637f01dbec8163612%2Fmy%2Fplugin.sock/VolumeDriver.Path: dial unix /run/docker/54755b020a828b6ae0f09ae28a59334aca1d50b067340a3637f01dbec8163612/my/plugin.sock: connect: connection refused, retrying in 1s"

From docker run step 6, new plugin ID:

Nov 29 00:27:08 srust7.localnet dockerd[18281]: time="2016-11-29T00:27:08Z" level=info msg="2016-11-29T00:27:08.256 DEBUG    myvol - \"POST /VolumeDriver.Get 1.1\" 200 67 [1066.28ms]" plugin=6911578a118dfc21ba42c46eb69b7a4789a7be85664e4476c4489d98ab445613

Describe the results you expected:

The existing volumes created or referenced with previous versions of the plugin should have worked without issue. The plugin ID should be re-validated without requiring a docker restart; a docker restart should never be required to clear caches or otherwise recover from normal system operation. The next time a docker run referencing the existing volume was executed, it should have just worked.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:      1.13.0-rc2
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   1f9b3ef
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:             1.13.0-rc2
 API version:         1.25
 Minimum API version: 1.12
 Go version:          go1.7.3
 Git commit:          1f9b3ef
 Built:               
 OS/Arch:             linux/amd64
 Experimental:        false

Output of docker info:

Containers: 1
 Running: 0
 Paused: 0
 Stopped: 1
Images: 80
Server Version: 1.13.0-rc2
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: 
 Metadata file: 
 Data Space Used: 5.698 GB
 Data Space Total: 102 GB
 Data Space Available: 96.3 GB
 Metadata Space Used: 1.286 MB
 Metadata Space Total: 1.07 GB
 Metadata Space Available: 1.068 GB
 Thin Pool Minimum Free Space: 10.2 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2016-06-09)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 51371867a01c467f08af739783b8beafc154c4d7
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.4.27-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.67 GiB
Name: srust7.localnet
ID: MIUE:GKLN:L4OB:B2JU:OIPH:BTWO:L2P5:RQLP:F4DZ:GS5A:RHEH:PDCX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 20
 Goroutines: 26
 System Time: 2016-11-29T00:44:03.475716298Z
 EventsListeners: 0

Additional environment details (AWS, VirtualBox, physical, etc.):

@thaJeztah

Member

thaJeztah commented Nov 29, 2016

@anusha-ragunathan

Contributor

anusha-ragunathan commented Nov 29, 2016

@srust : The volume is indeed pinned to the plugin ID (aka the volume driver ID in this case). That is precisely why you cannot remove a plugin while a volume created from it is active. I'm surprised that you were able to remove the volume plugin while the volume was in use. Is the plugin used in step 1 a v1 plugin? If yes, then that's expected.

Also, with #28717, you cannot create a plugin whose name duplicates that of an existing plugin.

On master, I tried to reproduce the issue you are facing, and the daemon works and errors out as expected.

  1. Plugin upgrade via docker plugin rm followed by a plugin install/create:
  • docker plugin install tiborvass/no-remove
  • docker volume create -d tiborvass/no-remove --name myvol
  • docker plugin rm tiborvass/no-remove throws the expected error Error response from daemon: plugin tiborvass/no-remove:latest is in use
  2. Plugin upgrade via docker plugin create with the same name:tag as the existing plugin:
  • docker plugin create tiborvass/no-remove /tmp/pluginDir throws the expected error error response from daemon: tiborvass/no-remove:latest already exists
  3. Plugin upgrade via docker plugin create with the same name, but a tag that differs from the existing plugin:
  • docker plugin create tiborvass/no-remove:2.0 /tmp/pluginDir
  • docker plugin enable tiborvass/no-remove:2.0
  • docker volume create -d tiborvass/no-remove:2.0 // A new volume should be created.
  • docker run -v myvol:/data alpine ash
@srust

Contributor

srust commented Nov 29, 2016

Hi @anusha-ragunathan, thanks for the response.

What you say is true. If the plugin is enabled, it cannot be removed (without a --force). And if you attempt to create or install a plugin with an existing name it will fail. That is all correct behavior, as you have outlined.

The issue is what happens when a user actually does want to perform an upgrade, and how they might do it. I'll try to be specific so that I can get feedback on the exact use case.

Let's say a plugin my/plugin is enabled using the :latest version, and a volume is created.

  • docker plugin create my/plugin /tmp/pluginDir

  • docker plugin enable my/plugin

  • docker volume create -d my/plugin:latest --name myvol

  • docker volume ls

DRIVER                 VOLUME NAME
my/plugin:latest       myvol

Now, imagine a new plugin is released. This could be a bugfix, security update, specific feature enhancement, etc. A plugin could be upgraded for many valid reasons.

To upgrade the plugin, the user may perform the following steps:

  • docker plugin disable my/plugin
  • docker plugin rm my/plugin
  • docker plugin create my/plugin /tmp/pluginDir
  • docker plugin enable my/plugin

The new version of the plugin shows the volume just fine:

  • docker volume ls
DRIVER                 VOLUME NAME
my/plugin:latest       myvol

But it cannot use it, because docker has cached the old plugin ID (volume driver ID).

  • docker run -v myvol:/data alpine sh
    docker: Error response from daemon: error while mounting volume '': Post http://%2Frun%2Fdocker%2F3c8baa75ed16e84b93c02d58e33f286fec2b467667bf9c1b640ad8bcb3b0b82b%2Fmy%2Fplugin.sock/VolumeDriver.Mount: dial unix /run/docker/3c8baa75ed16e84b93c02d58e33f286fec2b467667bf9c1b640ad8bcb3b0b82b/my/plugin.sock: connect: no such file or directory.

(NOTE: In the above example, docker plugin install could be used instead of docker plugin create)

It is my assertion that:

  1. Volumes are long-lived. The plugin is an accessor, not a data store.
  2. Plugin upgrade is an important and common operation.
  3. Plugin upgrade can occur for various reasons, such as security fixes, bug fixes, and new features -- it is not just a plugin-development problem.
  4. Plugin upgrade should be possible without restarting the docker daemon just to clear caches, particularly since unrelated services can be running, across swarm environments, etc.
  5. docker plugin rm --force and docker plugin disable ; docker plugin rm are equivalent, and neither of them should strand volumes associated with old plugin versions.

It seems to me that on plugin rm, any volumes referencing the plugin should have their references removed. While that is not ideal from a performance perspective (walking the volume cache), and is not great if a volume is used while the plugin is down, it at least allows the docker daemon to remain internally consistent when the plugin ID changes. Another alternative would be to walk the volumes on plugin enable and update any stale references at that point.
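
For example, here is a minimal Go sketch of that name-based re-resolution idea; the types (Driver, driverRegistry, Volume) are illustrative only, not docker's actual internals:

package main

import "fmt"

// Driver is a stand-in for an enabled volume plugin endpoint.
type Driver struct {
	Name string
	ID   string // plugin ID; changes when the plugin is removed and re-created
}

// driverRegistry maps driver names to the currently enabled plugin.
type driverRegistry map[string]*Driver

// Volume caches a reference to the driver it was created with.
type Volume struct {
	Name       string
	DriverName string
	driver     *Driver // cached reference; may be stale after an upgrade
}

// resolveDriver looks the driver up by name and refreshes the cached
// reference whenever the plugin ID no longer matches the registry.
func (v *Volume) resolveDriver(reg driverRegistry) (*Driver, error) {
	cur, ok := reg[v.DriverName]
	if !ok {
		return nil, fmt.Errorf("volume %s: driver %s is not registered", v.Name, v.DriverName)
	}
	if v.driver == nil || v.driver.ID != cur.ID {
		v.driver = cur // plugin was upgraded or re-created: re-bind by name
	}
	return v.driver, nil
}

func main() {
	// myvol was created against the old plugin ID.
	vol := &Volume{
		Name:       "myvol",
		DriverName: "my/plugin",
		driver:     &Driver{Name: "my/plugin", ID: "54755b02"},
	}
	// The registry now holds the upgraded plugin under the same name.
	reg := driverRegistry{
		"my/plugin": {Name: "my/plugin", ID: "6911578a"},
	}
	d, err := vol.resolveDriver(reg)
	if err != nil {
		panic(err)
	}
	fmt.Println("mount via plugin ID:", d.ID) // prints the new ID, not the stale one
}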

Can you please comment on the above example and assertions and help me where there is any misunderstanding?

Thanks.

@cpuguy83

Contributor

cpuguy83 commented Nov 29, 2016

So, to me it sounds like we are missing a case here.
While we could "fix" the issue by doing some ID->Name comparisons with registered drivers and cached volumes, I feel this is a hacky solution.

As @srust mentioned, being able to upgrade a plugin is going to be a common case, and it is not necessarily desirable to remove a plugin in order to install the upgraded one.
I think in such a case we need a way to update the rootfs/config of the plugin without changing the plugin's ID.
Example:

$ docker plugin install cpuguy83/beststorageever:v1
$ docker plugin update cpuguy83/beststorageever:v1 cpuguy83/beststorageever:v2

This unties the plugin lifecycle from the actual process management.
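
For instance (hypothetical output, since plugin update does not exist yet), the plugin ID would survive the update, so existing volumes keep working:

$ docker plugin inspect -f '{{.Id}}' cpuguy83/beststorageever
abc123...
$ docker plugin update cpuguy83/beststorageever:v1 cpuguy83/beststorageever:v2
$ docker plugin inspect -f '{{.Id}}' cpuguy83/beststorageever
abc123...   # same ID, so volumes bound to it are unaffected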

@srust

Contributor

srust commented Nov 29, 2016

Perhaps a docker plugin pull cpuguy83/beststorageever:latest as well, coupled with a docker plugin restart cpuguy83/beststorageever:latest?

After all, the latest :latest could itself be an upgrade, just like with normal image management.

@cpuguy83

Contributor

cpuguy83 commented Nov 30, 2016

@srust Or docker plugin upgrade <name> to just have it re-pull the same tag.

I'd rather not mix container/image lifecycle with plugins.

@srust

Contributor

srust commented Nov 30, 2016

@cpuguy83 Yep that works!

@cpuguy83

Contributor

cpuguy83 commented Nov 30, 2016

@anusha-ragunathan @tiborvass SGTY?
update or upgrade?

Should this have a flag (e.g. --image, --repo) in case other options are added later? (I'm inclined to say no, since we have plugin set.)

@anusha-ragunathan

Contributor

anusha-ragunathan commented Nov 30, 2016

Having a docker plugin upgrade sounds necessary. It would be useful when a plugin with the same tag (usually latest) needs an upgrade, and it would retain the plugin ID while updating the rootfs and config.json.
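
For illustration, the flow being proposed might look like the following (the exact command name and semantics are still being discussed above; the plugin is assumed to be disabled during the upgrade):

$ docker plugin disable my/plugin
$ docker plugin upgrade my/plugin my/plugin:2.0   # keeps the plugin ID, swaps rootfs and config.json
$ docker plugin enable my/plugin
$ docker run -v myvol:/data alpine ash            # existing volume keeps working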

@cpuguy83

Contributor

cpuguy83 commented Dec 14, 2016

plugin upgrade here: #29414

@thaJeztah modified the milestones: 1.13.1, 1.13.0 on Feb 4, 2017

@thaJeztah removed the priority/P1 label on Feb 10, 2017

@thaJeztah removed this from Must have in 1.13-rcX on Feb 10, 2017
