This repository has been archived by the owner on Jul 28, 2021. It is now read-only.

GCS will delete contents of mapped directories on exit if they are under /tmp/gcs/<containerID>/ #131

Closed
beweedon opened this issue Sep 7, 2017 · 14 comments

@beweedon
Contributor

beweedon commented Sep 7, 2017

This is because the GCS apparently doesn't unmount mapped directories during cleanup (see https://github.com/Microsoft/opengcs/blob/master/service/gcs/core/gcs/cleanup.go). We should add code to perform this unmounting during the cleanup phase.

@jterry75 @jhowardmsft
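
For illustration, a minimal sketch of the kind of unmount pass the cleanup phase seems to be missing. This is not taken from the GCS source; `mountPoints` is a hypothetical list of per-container mapped-directory mount points.

```go
package cleanup

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// UnmountMappedDirectories lazily unmounts every mapped directory of a
// container so that a later recursive delete cannot reach through a
// live mount into host-backed shares. mountPoints is a hypothetical
// list the GCS would track per container.
func UnmountMappedDirectories(mountPoints []string) error {
	var firstErr error
	for _, mp := range mountPoints {
		// MNT_DETACH detaches the mount immediately and finishes the
		// unmount lazily, so cleanup can proceed even if the path is busy.
		if err := unix.Unmount(mp, unix.MNT_DETACH); err != nil && firstErr == nil {
			firstErr = fmt.Errorf("unmounting %s: %v", mp, err)
		}
	}
	return firstErr
}
```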

@jstarks
Member

jstarks commented Sep 8, 2017

Holy smokes! As an extra precaution, can we avoid crossing file-system boundaries when we delete /tmp/gcs/.../ ?
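
One way to do that, sketched here rather than taken from the GCS code: record the device ID of the rootfs and refuse to recurse into anything that reports a different device, since a Plan 9 share or mapped disk that failed to unmount will show up as a foreign file system.

```go
package cleanup

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"

	"golang.org/x/sys/unix"
)

// SafeRemoveAll deletes root recursively but refuses to cross a
// file-system boundary: if an entry sits on a different device than
// root itself, something is still mounted there and the walk stops
// instead of deleting through into host-backed data.
func SafeRemoveAll(root string) error {
	var rootStat unix.Stat_t
	if err := unix.Lstat(root, &rootStat); err != nil {
		return err
	}
	return removeOnDevice(root, rootStat.Dev)
}

func removeOnDevice(path string, dev uint64) error {
	var st unix.Stat_t
	if err := unix.Lstat(path, &st); err != nil {
		return err
	}
	if st.Dev != dev {
		return fmt.Errorf("refusing to delete %s: it is on a different file system", path)
	}
	if st.Mode&unix.S_IFMT == unix.S_IFDIR {
		entries, err := ioutil.ReadDir(path)
		if err != nil {
			return err
		}
		for _, e := range entries {
			if err := removeOnDevice(filepath.Join(path, e.Name()), dev); err != nil {
				return err
			}
		}
	}
	return os.Remove(path)
}
```

This wouldn't catch a bind mount that happens to live on the same device, but it covers the mapped-directory and mapped-disk cases discussed here.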

@simonferquel

@jstarks I agree. If for whatever reason the unmounts fail, we need to be sure the content does not get erased by some code further down the path. We were hit very hard by a similar issue at one point with Docker for Windows, and really, you don't want to have to patch this kind of bug in production.

@jterry75
Contributor

jterry75 commented Sep 8, 2017

Do you have a proposed mitigation? I agree there is a risk here when an unmount fails, but I don't know how to clean up the container in that case. Any delete of the rootfs will also delete the mounted folder.

@simonferquel

With Unix filesystem semantics, I would assume that mv <mountpoint> /to-not-remove might work... In the case of Docker for Windows, we decided that if an unmount fails, we don't remove the mount points at all.

@gupta-ak
Member

gupta-ak commented Sep 8, 2017

Reopening for further discussion.

After unmounting, you might be able to do a rename(rootfs, rootfs-removing) or something to make sure you're not crossing file system boundaries and then delete it.
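
A sketch of that idea, assuming the layer mount sits directly at rootfs: rename(2) fails with EXDEV if the destination would land on a different file system and with EBUSY if rootfs is itself still a mount point, so a successful rename is a reasonable signal that the delete stays on the scratch device.

```go
package cleanup

import (
	"fmt"
	"os"
)

// RenameThenRemove stages rootfs under a "-removing" name before
// deleting it. The rename stays on the same file system and fails if
// rootfs is still a mount point, in which case nothing is deleted.
func RenameThenRemove(rootfs string) error {
	staging := rootfs + "-removing"
	if err := os.Rename(rootfs, staging); err != nil {
		return fmt.Errorf("not deleting %s, rename failed: %v", rootfs, err)
	}
	return os.RemoveAll(staging)
}
```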

@gupta-ak gupta-ak reopened this Sep 8, 2017
@lowenna
Contributor

lowenna commented Sep 8, 2017

Would it just be easier to put it back at /binds/containerID/... instead? The location is under the control of Docker, so it's a simple fix. At least that would guarantee it isn't cleaned up by the GCS.

@gupta-ak
Member

gupta-ak commented Sep 8, 2017

I think the issue is that the GCS does the equivalent of rm -rf rootfs, which will delete everything, including layer and Plan 9 data, if the previous unmounts did not succeed. It doesn't matter where the mount sources are located, because they eventually have to be mounted inside rootfs so that the container can access them.

Right now the cleanup code tries to unmount the mapped disks, the mapped directories, and the layers, and then does rm -rf rootfs. If any of the unmounts fails, it just logs the error and continues, which can corrupt data on the delete.

It makes sense to me to have the same behavior as Docker for Windows, where we simply don't clean up if the unmounts fail. Otherwise, we could corrupt the layers.

@rn
Contributor

rn commented Sep 8, 2017

Out of curiosity (and showing my ignorance of the finer implementation details here), why do you attempt to clean up in the first place? If the container goes away, the VM gets destroyed, and then can't the layer just be removed on the host?

@beweedon
Contributor Author

beweedon commented Sep 8, 2017

Unmounting is necessary because if the share isn't unmounted, not all file operations on it are guaranteed to be flushed to the host. As for actually deleting the layers, I think that's less relevant now, but it could be useful when running multiple containers in the same UVM; otherwise a container would leak its private storage on exit.

@beweedon
Contributor Author

beweedon commented Sep 8, 2017

And, yeah, @simonferquel @gupta-ak, I think it makes sense to emulate Docker for Windows' behavior of not deleting layers that were not unmounted.

@friism

friism commented Sep 12, 2017

@beweedon @jhowardmsft are you making progress on this? It seems like this is on the critical path for LCOW, no?

@beweedon
Contributor Author

@jterry75, does the behavior of only deleting directories that we successfully unmounted seem good? We can make those changes if so.

Also, FYI to everyone else, Justin has been taking over the GCS from me, and at this point I won't really be doing any more work on it. I'll still be around to take part in discussions and stuff though :)

@jterry75
Contributor

I think we can skip the delete call on the layers if the unmount fails. The original issue was that we were not calling unmount for mapped directories, so we are really talking about an unlikely failure path here. As far as I can see, not deleting the layers on a failed unmount has only two consequences:

  1. The UVM storage size will grow over time if this happens a lot. (Highly unlikely)
  2. The UVM will not be able to use that containerID again. (Unclear how likely this would be. I assume Docker creates a new ID per instance, so we're probably fine with this even in a world where we have multiple containers per UVM. If this becomes a problem, we can always map external IDs to unique GCS IDs.)

Everyone ok with me only deleting the layer if we have no unmount failures?
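
Roughly, and only as a sketch of the behavior being proposed (mounts and rootfs stand in for whatever the GCS actually tracks per container):

```go
package cleanup

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// DestroyContainerStorage attempts every unmount, remembers any
// failures, and only deletes the container's storage when all of them
// succeeded; otherwise the layer directory is deliberately leaked.
func DestroyContainerStorage(mounts []string, rootfs string) error {
	unmountsOK := true
	for _, m := range mounts {
		if err := unix.Unmount(m, 0); err != nil {
			// Log and keep going so the remaining mounts still get a chance.
			fmt.Fprintf(os.Stderr, "failed to unmount %s: %v\n", m, err)
			unmountsOK = false
		}
	}
	if !unmountsOK {
		// Leaking storage in the UVM is the lesser evil compared with
		// deleting through a live mount into host data.
		return fmt.Errorf("skipping deletion of %s: one or more unmounts failed", rootfs)
	}
	return os.RemoveAll(rootfs)
}
```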

@beweedon
Contributor Author

Sounds good to me!

jterry75 added a commit to jterry75/opengcs that referenced this issue Sep 14, 2017
Will only destroy the container storage in the UVM if all mounts are successfully unmounted. Without this, we have no way of knowing whether it is safe to delete the files or whether doing so could have an effect on the host files.

Resolves: microsoft#131