This repository has been archived by the owner on Jul 28, 2021. It is now read-only.

GCS will delete contents of mapped directories on exit if they are under /tmp/gcs/<containerID>/ #131

Closed
beweedon opened this issue Sep 7, 2017 · 14 comments

@beweedon
Contributor

beweedon commented Sep 7, 2017

This is because the GCS apparently doesn't unmount mapped directories during cleanup (see https://github.com/Microsoft/opengcs/blob/master/service/gcs/core/gcs/cleanup.go). We should add code to perform this unmounting during the cleanup phase.

@jterry75 @jhowardmsft
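
For illustration, a minimal sketch of the kind of unmount pass the cleanup phase seems to be missing. This is not taken from the GCS source; `mountPoints` is a hypothetical list of per-container mapped-directory mount points.

```go
package cleanup

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// UnmountMappedDirectories lazily unmounts every mapped directory of a
// container so that a later recursive delete cannot reach through a
// live mount into host-backed shares. mountPoints is a hypothetical
// list the GCS would track per container.
func UnmountMappedDirectories(mountPoints []string) error {
	var firstErr error
	for _, mp := range mountPoints {
		// MNT_DETACH detaches the mount immediately and finishes the
		// unmount lazily, so cleanup can proceed even if the path is busy.
		if err := unix.Unmount(mp, unix.MNT_DETACH); err != nil && firstErr == nil {
			firstErr = fmt.Errorf("unmounting %s: %v", mp, err)
		}
	}
	return firstErr
}
```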

@jstarks
Member

jstarks commented Sep 8, 2017

Holy smokes! As an extra precaution, can we avoid crossing file-system boundaries when we delete /tmp/gcs/.../ ?
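
One way to do that, sketched here rather than taken from the GCS code: record the device ID of the rootfs and refuse to recurse into anything that reports a different device, since a Plan 9 share or mapped disk that failed to unmount will show up as a foreign file system.

```go
package cleanup

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"

	"golang.org/x/sys/unix"
)

// SafeRemoveAll deletes root recursively but refuses to cross a
// file-system boundary: if an entry sits on a different device than
// root itself, something is still mounted there and the walk stops
// instead of deleting through into host-backed data.
func SafeRemoveAll(root string) error {
	var rootStat unix.Stat_t
	if err := unix.Lstat(root, &rootStat); err != nil {
		return err
	}
	return removeOnDevice(root, rootStat.Dev)
}

func removeOnDevice(path string, dev uint64) error {
	var st unix.Stat_t
	if err := unix.Lstat(path, &st); err != nil {
		return err
	}
	if st.Dev != dev {
		return fmt.Errorf("refusing to delete %s: it is on a different file system", path)
	}
	if st.Mode&unix.S_IFMT == unix.S_IFDIR {
		entries, err := ioutil.ReadDir(path)
		if err != nil {
			return err
		}
		for _, e := range entries {
			if err := removeOnDevice(filepath.Join(path, e.Name()), dev); err != nil {
				return err
			}
		}
	}
	return os.Remove(path)
}
```

This wouldn't catch a bind mount that happens to live on the same device, but it covers the mapped-directory and mapped-disk cases discussed here.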

@simonferquel

@jstarks I agree. If for whatever reason the unmounts fail, we need to be sure the content does not get erased by some code further down the path. We were hit very hard by a similar issue at one point with Docker for Windows, and really, you don't want to have to patch this kind of bug in production.

@jterry75
Contributor

jterry75 commented Sep 8, 2017

Do you have a proposed mitigation? I agree there is a risk here when an unmount fails, but I don't know how to clean up the container in that case. Any delete of the rootfs will also delete the mounted folder.

@simonferquel

With Unix filesystem semantics, I would assume that mv <mountpoint> /to-not-remove might work... In the case of Docker for Windows, we decided that if an unmount fails, we don't remove the mount points at all.

@gupta-ak
Member

gupta-ak commented Sep 8, 2017

Reopening for further discussion.

After unmounting, you might be able to do a rename(rootfs, rootfs-removing) or something to make sure you're not crossing file system boundaries and then delete it.
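
A sketch of that idea, assuming the layer mount sits directly at rootfs: rename(2) fails with EXDEV if the destination would land on a different file system and with EBUSY if rootfs is itself still a mount point, so a successful rename is a reasonable signal that the delete stays on the scratch device.

```go
package cleanup

import (
	"fmt"
	"os"
)

// RenameThenRemove stages rootfs under a "-removing" name before
// deleting it. The rename stays on the same file system and fails if
// rootfs is still a mount point, in which case nothing is deleted.
func RenameThenRemove(rootfs string) error {
	staging := rootfs + "-removing"
	if err := os.Rename(rootfs, staging); err != nil {
		return fmt.Errorf("not deleting %s, rename failed: %v", rootfs, err)
	}
	return os.RemoveAll(staging)
}
```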

@gupta-ak gupta-ak reopened this Sep 8, 2017
@lowenna
Contributor

lowenna commented Sep 8, 2017

Would it just be easier to put it back at /binds/containerID/... instead? The location is under the control of Docker, so it's a simple fix. At least that would guarantee it isn't cleaned up by the GCS.

@gupta-ak
Member

gupta-ak commented Sep 8, 2017

I think the issue is that the GCS does the equivalent of rm -rf rootfs, which will delete everything, including layer and Plan 9 data, if the previous unmounts did not succeed. It doesn't matter where the mount sources are located, because they eventually have to be mounted inside rootfs so that the container can access them.

Right now the cleanup code tries to unmount the mapped disks, the mapped directories, and the layers, and then does rm -rf rootfs. If any of the unmounts fails, it just logs the error and continues, which can corrupt data on the delete.

It makes sense to me to have the same behavior as Docker for Windows, where we simply don't clean up if the unmounts fail. Otherwise, we could corrupt the layers.

@rn
Contributor

rn commented Sep 8, 2017

Out of curiosity (and showing my ignorance of the finer implementation details here), why do you attempt to clean up in the first place? If the container goes away, the VM gets destroyed, and then can't the layer just be removed on the host?

@beweedon
Contributor Author

beweedon commented Sep 8, 2017

Unmounting is necessary because if the share isn't unmounted, not all file operations on it are guaranteed to be flushed to the host. As for actually deleting the layers, I think that's less relevant now, but it could be useful when running multiple containers in the same UVM; otherwise a container would leak its private storage on exit.

@beweedon
Contributor Author

beweedon commented Sep 8, 2017

And, yeah, @simonferquel @gupta-ak, I think it makes sense to emulate Docker for Windows' behavior of not deleting layers that were not unmounted.

@friism

friism commented Sep 12, 2017

@beweedon @jhowardmsft are you making progress on this? It seems like this is on the critical path for LCOW, no?

@beweedon
Contributor Author

@jterry75, does the behavior of only deleting directories that we successfully unmounted seem good? We can make those changes if so.

Also, FYI to everyone else, Justin has been taking over the GCS from me, and at this point I won't really be doing any more work on it. I'll still be around to take part in discussions and stuff though :)

@jterry75
Contributor

I think we can skip the delete call on the layers if the unmount fails. The original issue was that we were not calling unmount for mapped directories, so we are really talking about an unlikely failure path here. As far as I can see, not deleting the layers on a failed unmount has only two consequences:

  1. The UVM storage size will grow over time if this happens a lot. (Highly unlikely)
  2. The UVM will not be able to use that containerID again. (Unclear how likely this would be. I assume Docker creates a new ID per instance, so we're probably fine with this even in a world where we have multiple containers per UVM. If this becomes a problem, we can always map external IDs to unique GCS IDs.)

Everyone ok with me only deleting the layer if we have no unmount failures?
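
Roughly, and only as a sketch of the behavior being proposed (mounts and rootfs stand in for whatever the GCS actually tracks per container):

```go
package cleanup

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// DestroyContainerStorage attempts every unmount, remembers any
// failures, and only deletes the container's storage when all of them
// succeeded; otherwise the layer directory is deliberately leaked.
func DestroyContainerStorage(mounts []string, rootfs string) error {
	unmountsOK := true
	for _, m := range mounts {
		if err := unix.Unmount(m, 0); err != nil {
			// Log and keep going so the remaining mounts still get a chance.
			fmt.Fprintf(os.Stderr, "failed to unmount %s: %v\n", m, err)
			unmountsOK = false
		}
	}
	if !unmountsOK {
		// Leaking storage in the UVM is the lesser evil compared with
		// deleting through a live mount into host data.
		return fmt.Errorf("skipping deletion of %s: one or more unmounts failed", rootfs)
	}
	return os.RemoveAll(rootfs)
}
```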

@beweedon
Contributor Author

Sounds good to me!

jterry75 added a commit to jterry75/opengcs that referenced this issue Sep 14, 2017
Will only destroy the container storage in the UVM if all mounts are successfully unmounted. Without this, we have no way of knowing whether it is safe to delete the files or whether doing so could have an effect on the host files.

Resolves: microsoft#131