GCS will delete contents of mapped directories on exit if they are under /tmp/gcs/<containerID>/ #131
Comments
Holy smokes! As an extra precaution, can we avoid crossing file-system boundaries when we delete /tmp/gcs/.../ ? |
@jstarks I agree. If for whatever reason the unmount fails, we need to be sure the content does not get erased by some code further down the path. We were hurt very badly by a similar issue at some point with Docker for Windows, and really, you don't want to have to patch this kind of bug while in production. |
Do you have a proposed mitigation? I agree there is certainly a risk here when the unmount fails, but I don't know how I can clean up the container in this case. Any delete on the rootfs will also delete the mounted folder. |
with unix fs semantics, I would assume that |
Reopening for further discussion. After unmounting, you might be able to do a rename(rootfs, rootfs-removing) or something to make sure you're not crossing file system boundaries and then delete it. |
Would it just be easier to put it back at /binds/containerID/... instead? The location is under the control of docker, so it's a simple fix. At least that would guarantee not to be cleaned by GCS. |
I think the issue is that the gcs does the equivalent of Right now the current cleanup code tries to unmount the mapped disks, the mapped directories and the layers and then does It makes sense to me to have the same behavior as Docker for Windows where we simply don't clean up if the unmounts fail. Otherwise, we could corrupt the layers. |
Out of curiosity (and showing my ignorance of the finer implementation details here), why do you attempt to clean up in the first place? If the container goes away, the VM gets destroyed, and then can't the layer just be removed on the host? |
Unmounting is necessary, because if the share isn't unmounted not all file operations on it are guaranteed to be flushed to the host. As for actually deleting the layers, I think this is less relevant now, but could be useful if attempting to run multiple containers in the same UVM. Otherwise a container would leak its private storage on exit. |
And, yeah, @simonferquel @gupta-ak I think it makes sense to emulate Docker for Windows' behavior of not deleting layers that were not unmounted. |
@beweedon @jhowardmsft are you making progress on this? It seems like this is on the critical path for LCOW, no? |
@jterry75, does the behavior of only deleting directories which we successfully unmounted seem good? We can make those changes if so. Also, FYI to everyone else, Justin has been taking over the GCS from me, and at this point I won't really be doing any more work on it. I'll still be around to take part in discussions and stuff though :) |
I think we can skip the delete call on the layers if the unmount fails. The original issue was that we were not calling unmount for mapped directories, so we are really talking about an unlikely failure path here. As far as I can see, not deleting the layers on a failed unmount has only two consequences:
Everyone ok with me only deleting the layer if we have no unmount failures? |
Sounds good to me! |
Will only destroy the container storage in the UVM if all mounts are successfully unmounted. Without this we have no way of knowing if it is safe to delete the files or if this could have an effect on the host files. Resolves: microsoft#131
This is because the GCS apparently doesn't unmount mapped directories during cleanup (see https://github.com/Microsoft/opengcs/blob/master/service/gcs/core/gcs/cleanup.go). We should add code to perform this unmounting during the cleanup phase.
@jterry75 @jhowardmsft