docker cp to attached volume in container causes an unmount to be issued #93

Closed
j-griffith opened this issue Aug 25, 2017 · 5 comments

Comments

@j-griffith
Contributor

j-griffith commented Aug 25, 2017

If you try to use docker cp to transfer data from the node to a mounted volume in a container, Docker sends an Unmount after the Mount is detected as already existing.

To reproduce:
docker volume create -d netapp --name=foobar
docker run -it --volume-driver netapp --volume foobar:/data --name crumbs bash

Then on the host:
docker cp /var/log/syslog- crumbs:/data/syslog

After running the docker cp command, the volume is unmounted via go-plugins-helpers. The logs show this:

DEBU[2017-08-25T15:21:05Z] Mount(&{foo b995165f99e38f246a0236cda8c2505b5922dc8c0632cc685a28f18ed67f1814})
DEBU[2017-08-25T15:21:05Z] Mounting volume foo on /var/lib/docker-volumes/solidfire/foo
DEBU[2017-08-25T15:21:05Z] Begin osutils.GetDFOutput
DEBU[2017-08-25T15:21:05Z] /var/lib/docker-volumes/solidfire/foo already mounted, returning existing mount
2017/08/25 15:21:05 Entering go-plugins-helpers unmountPath
DEBU[2017-08-25T15:21:05Z] Unmount(&{foo b995165f99e38f246a0236cda8c2505b5922dc8c0632cc685a28f18ed67f1814})
DEBU[2017-08-25T15:21:05Z] SolidfireSANStorageDriver#Detach(foo, /var/lib/docker-volumes/solidfire/foo)
DEBU[2017-08-25T15:21:05Z] Begin osutils.Umount: /var/lib/docker-volumes/solidfire/foo
DEBU[2017-08-25T15:21:05Z] Response from umount /var/lib/docker-volumes/solidfire/foo:

I think this might be an upstream issue with the Docker volume API but not sure yet.

@j-griffith
Contributor Author

After further investigation this appears to be a problem with Docker and some assumptions that may be made regarding iSCSI connections.

There are different ways to create an iSCSI target for a device. One is to use a unique target IQN for each "volume" that exists on a device. This is how SolidFire works.

The other is to use a shared target IQN and access individual volumes via iSCSI portal/LUN assignment. A number of devices use this method, including the ONTAP devices.

The docker cp operation in this case does some things with a volume attached to a container that don't work out well for unique targets (SolidFire, for example). As part of the copy process, docker first issues an Unmount command to the plugin API, which in our case does an iSCSI logout. This is the first problem: we have just disconnected and unmounted an in-use file system. Next, docker issues a new Mount command and copies the data from the local system to the mount point on the local system. Once that completes, it issues another Unmount, disconnects, and then Mounts the volume again back on the container.

This won't work for devices that use individual targets and could potentially cause issues for others. To work around this, the best option is to mount the data source into the container as well. From there, use the container FS to copy the data from the source to the attached NDVP volume in the container. So something like this:
docker run -it -v <path-to-source-data>:/src-data --volume-driver ndvp -v my-ndvp-vol:/dest debian cp /src-data/<my-stuff> /dest/
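
For anyone trying to picture why the Mount/Unmount pair from docker cp hurts a per-volume target, here is a minimal Go sketch. It assumes the call order seen in the log excerpt above (container Mount, then a Mount and an Unmount from the copy) and uses a hypothetical `naiveDriver` type that is not part of NDVP. It only models "every Unmount tears the session down"; it does no real iSCSI work.

```go
package main

import "fmt"

// naiveDriver is a hypothetical stand-in for a volume plugin that tears the
// mount (and iSCSI session) down on every Unmount, without tracking who else
// is still using the volume.
type naiveDriver struct {
	mounted map[string]bool // volume name -> is a mount currently in place?
}

func newNaiveDriver() *naiveDriver {
	return &naiveDriver{mounted: make(map[string]bool)}
}

// Mount attaches the volume, or reuses the existing mount if one is present.
func (d *naiveDriver) Mount(volume string) {
	d.mounted[volume] = true
}

// Unmount always removes the mount, even if a container still depends on it.
func (d *naiveDriver) Unmount(volume string) {
	delete(d.mounted, volume)
}

// simulateCopy replays the container start followed by the Mount/Unmount pair
// issued during the copy, and reports whether the volume is still mounted.
func simulateCopy(d *naiveDriver, volume string) bool {
	d.Mount(volume)   // docker run: volume attached for the container
	d.Mount(volume)   // docker cp: "already mounted, returning existing mount"
	d.Unmount(volume) // docker cp finished: mount is ripped out
	return d.mounted[volume]
}

func main() {
	d := newNaiveDriver()
	fmt.Println("still mounted for container:", simulateCopy(d, "foo")) // false
}
```

The container never asked for an Unmount, yet its volume is gone after the copy. With a shared target the session may survive for other LUNs, which is probably why this shows up first on unique-target devices.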

@dutchiechris
Contributor

@j-griffith Eeww! Is there an issue open with the Moby project? I searched but couldn't find one.

@j-griffith j-griffith reopened this Aug 29, 2017
@j-griffith
Contributor Author

I have a fix using ref counting here. There are definitely some things we should've been doing better.

j-griffith added a commit to j-griffith/netappdvp that referenced this issue Aug 29, 2017
SolidFire is a bit different than some other devices in the portfolio
in that it uses a unique target for each volume and it doesn't do
any internal reference counting in terms of attachment counts.

This gets pretty wonky in Docker because docker does a lot of
mount/unmount calls in the background, particularly when using
commands like `docker cp ./some-data container-foo:/data/`

It's debatable whether docker is well behaved here, but regardless
we can and should do better in terms of handling things like this.

This change adds a simple mechanism where we count the number of
Mounts issued against a volume.  We increment this count each time
a new mount request is received, and decrement it each time an
unmount request is received.  This way we don't accidentally rip
a volume out from under a container during things like docker cp
and more importantly we don't blow up if somebody is trying to do
multiple attachments to a single volume from multiple containers on
the same node.

For now this isn't persistent, and it can also cause some difficulty
with hanging connections if the counter somehow gets out of sync.  Those
types of things should be considered/addressed in the future, and we
will probably need somebody to get on creating an admin cli like the
solidfire-docker-driver has to do clean ups etc.

Addresses Github Issue: NetApp#93
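
The counting scheme described in this commit message could look roughly like the sketch below. This is an illustration only, not the code from the commit: `mountCounter`, `Acquire` and `Release` are hypothetical names, and the counter lives in memory, so it has the same non-persistence caveat the message mentions.

```go
package main

import (
	"fmt"
	"sync"
)

// mountCounter is a hypothetical in-memory count of Mount requests per volume.
// It is not persistent: a plugin restart resets every count to zero.
type mountCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newMountCounter() *mountCounter {
	return &mountCounter{counts: make(map[string]int)}
}

// Acquire records a Mount request. It returns true only for the first
// reference, which is when the caller would really attach and mount.
func (c *mountCounter) Acquire(volume string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[volume]++
	return c.counts[volume] == 1
}

// Release records an Unmount request. It returns true only when the last
// reference goes away, which is when the caller would really detach.
// An Unmount with no recorded Mount is ignored; after a lost counter this
// is exactly how a connection could be left hanging.
func (c *mountCounter) Release(volume string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch n := c.counts[volume]; {
	case n == 0:
		return false
	case n == 1:
		delete(c.counts, volume)
		return true
	default:
		c.counts[volume] = n - 1
		return false
	}
}

func main() {
	c := newMountCounter()
	fmt.Println("container mount, attach needed:", c.Acquire("foo")) // true
	fmt.Println("docker cp mount, attach needed:", c.Acquire("foo")) // false
	fmt.Println("docker cp unmount, detach now:", c.Release("foo"))  // false
	fmt.Println("container unmount, detach now:", c.Release("foo"))  // true
}
```

The Unmount from docker cp only drops the count from 2 to 1, so the container keeps its volume. The same bookkeeping covers several containers on one node sharing a volume.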
@j-griffith
Contributor Author

j-griffith added a commit to j-griffith/netappdvp that referenced this issue Aug 31, 2017
There are some questionable behaviors on the docker side wrt
sequences of mounting and unmounting.  The `docker cp` command
in particular has a good bit of seemingly unnecessary mount/unmount
operations that for some devices (those that use unique targets)
can trip things up a bit.

The problems include removing a device that's in use by a container
as well as removing an attachment that is expected to be present.

One way of dealing with this (and it's called out in the Docker
Volume API docs) is to utilize reference counting of the mounts
and track things that way.  We could do that, but there's some
concern around that in terms of things getting out of sync and
also storing state in the driver isn't the most ideal thing either.

It turns out however that maybe none of this is really necessary.
Instead of always unmounting and removing iSCSI connections we can
instead leave mounts and iscsi connections in place for the life
of a volume.

So what this change does is makes the Docker API Unmount basically
a NOOP.  We still do things like Mount and create iSCSI connection
the same as before, but now we ignore the Unmount command.  If
subsequent or multiple Mount requests come in for a volume that's
fine, we just reuse the connection and mountpoint that we already
have in place.  We leave this in place until the volume is removed.

This works well in the case of things like reboots, or restarts of
NDVP as well, because we don't need to worry about state or ref
counting.  If a Mount cmd is received in those scenarios and the
mount was lost due to a reboot, we just create one and go about
our business.

Addresses Github Issue: NetApp#93
j-griffith added a commit to j-griffith/netappdvp that referenced this issue Aug 31, 2017
There are some questionable behaviors on the docker side wrt
sequences of mounting and unmounting.  The `docker cp` command
in particular has a good bit of seemingly unnecessary mount/unmount
operations that for some devices (those that use unique targets)
can trip things up a bit.

The problems include removing a device that's in use by a container
as well as removing an attachment that is expected to be present.

One way of dealing with this (and it's called out in the Docker
Volume API docs) is to utilize reference counting of the mounts
and track things that way.  We could do that, but there's some
concern around that in terms of things getting out of sync and
also storing state in the driver isn't the most ideal thing either.

It turns out however that maybe none of this is really necessary.
Instead of always unmounting and removing iSCSI connections we can
instead leave mounts and iscsi connections in place for the life
of a volume.

So what this change does is makes the Docker API Unmount basically
a NOOP.  We still do things like Mount and create iSCSI connection
the same as before, but now we ignore the Unmount command.  If
subsequent or multiple Mount requests come in for a volume that's
fine, we just reuse the connection and mount point that we already
have in place.  We leave this in place until the volume is removed.

This works well in the case of things like reboots, or restarts of
NDVP as well, because we don't need to worry about state or ref
counting.  If a Mount cmd is received in those scenarios and the
mount was lost due to a reboot, we just create one and go about
our business.

Addresses Github Issue: NetApp#93
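
A rough Go sketch of the behavior this commit message describes: Mount is idempotent, Unmount does nothing, and teardown only happens when the volume is removed. All names here (`lazyDriver`, `baseDir`, `attachCalls`) are hypothetical, and the in-memory map merely stands in for asking the host whether the mount point already exists, which is what lets the real approach avoid stored state.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// lazyDriver is a hypothetical driver that keeps mounts in place for the life
// of a volume. The map stands in for a check against the host's mount table.
// Not safe for concurrent use; a real driver would need locking.
type lazyDriver struct {
	baseDir     string
	mountpoints map[string]string // volume name -> mount point
	attachCalls int               // number of real attach operations performed
}

func newLazyDriver(baseDir string) *lazyDriver {
	return &lazyDriver{baseDir: baseDir, mountpoints: make(map[string]string)}
}

// Mount reuses an existing mount point when there is one and only attaches
// when nothing is in place (first use, or after a reboot lost the mount).
func (d *lazyDriver) Mount(volume string) string {
	if mp, ok := d.mountpoints[volume]; ok {
		return mp
	}
	d.attachCalls++
	mp := filepath.Join(d.baseDir, volume)
	d.mountpoints[volume] = mp
	return mp
}

// Unmount is deliberately a no-op: the connection and mount stay in place.
func (d *lazyDriver) Unmount(volume string) {}

// Remove is the only place where the mount is torn down. It reports whether
// there was anything to tear down.
func (d *lazyDriver) Remove(volume string) bool {
	_, ok := d.mountpoints[volume]
	delete(d.mountpoints, volume)
	return ok
}

func main() {
	d := newLazyDriver("/mnt/example")
	first := d.Mount("foo") // docker run
	d.Mount("foo")          // docker cp
	d.Unmount("foo")        // docker cp finished: nothing happens
	fmt.Println("mountpoint:", first)
	fmt.Println("real attaches:", d.attachCalls)
	fmt.Println("torn down on remove:", d.Remove("foo"))
}
```

The trade-off is that sessions stay open on the host until the volume is deleted, in exchange for not having to keep a counter in sync.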
@ntap-rippy
Copy link
Contributor

The change in 29aceaf should work around the problem by delaying the iSCSI session teardown until we delete the volume. The 'docker cp' commands that failed above work for me after this change.
