Fix GCI mounter issue #38124

Merged 2 commits on Dec 7, 2016
2 changes: 1 addition & 1 deletion cluster/gce/gci/mounter/mounter
@@ -31,7 +31,7 @@ function gc {
# Rkt pods end up creating new copies of mounts on the host. Hence it is ideal to clean them up right away.
attempt=0
until [ $attempt -ge 5 ]; do
- ${RKT_BINARY} gc --grace-period=0s &> /dev/null && break
+ ${RKT_BINARY} gc --grace-period=0s &> /dev/null
Contributor:
Why are you making this change?

Contributor Author:
Otherwise, it only runs gc once.

Contributor:
Why is that a problem?

Contributor Author:
When I tested it, the container was sometimes still in the running state at the moment gc executed. Giving it a few tries provides more chances to garbage collect the containers. I also thought it was intended to execute 5 times.

Contributor:
It was only intended to run multiple times if any of the previous attempts fail. Blocking for 5 seconds doesn't sound like a good idea. Is it possible to detect incomplete gc and then retry?
Also why is GC of concern?
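One hypothetical way to detect incomplete gc and retry (a sketch only; the `rkt list --no-legend` flag and its one-pod-per-line output with a state column are assumptions, not verified against this rkt version):

# Sketch: retry gc only while exited pods remain, instead of looping unconditionally.
attempt=0
until [ ${attempt} -ge 5 ]; do
  ${RKT_BINARY} gc --grace-period=0s &> /dev/null
  # Stop as soon as rkt no longer reports exited pods to collect.
  ${RKT_BINARY} list --no-legend 2>/dev/null | grep -q exited || break
  attempt=$((attempt+1))
  sleep 1
done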

Contributor Author:
If the container is not garbage collected, it leaves behind extra mounts, which causes two problems:

  1. The mount table gets very big when there are many mount points, which might be a performance concern since we list the mount points during unmount operations.
  2. Without this PR's fix, unmounting the device will fail because of those unrelated mount references (the unmount-device function first checks whether there are other mount references for the mount path, and fails the operation if any exist); see the illustration below.
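For illustration only (these paths are hypothetical, including the rkt stage1 layout), the pod started by the mounter script ends up with a nested copy of each mount under its own rootfs, so listing mounts for the same device shows both entries:

$ grep /dev/sdb /proc/mounts
/dev/sdb /var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gce-pd ext4 rw 0 0
/dev/sdb /var/lib/rkt/pods/run/<pod-uuid>/stage1/rootfs/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gce-pd ext4 rw 0 0

The second path contains the first as a substring, which is what the GetMountRefs change below keys on.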

Contributor:
Point 1 might be a concern. Do we have any data to back it up?
I don't get point 2. Why is umount failing if other shared recursive mounts exist? Is the umount syscall failing, or is it our own custom logic?

Contributor:
@jingxu97

> it is our own logic to check the mount references.

Can't we fix that logic then? As long as there is no separate mount point with the same device (a bind mount), we can safely unmount. Check the mount reference number in /proc/self/mountinfo to make that decision.
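A minimal sketch of that check in shell (assuming the standard /proc/self/mountinfo layout, where the mount point is field 5 and optional tags such as shared:N sit between the mount options and the "-" separator):

# Print the peer-group tag (e.g. shared:2) of a mount point, if it has one.
get_peer_group() {
  awk -v mp="$1" '$5 == mp {
    for (i = 7; i <= NF; i++) {
      if ($i == "-") exit                      # end of optional fields
      if ($i ~ /^shared:/) { print $i; exit }  # found the peer-group tag
    }
  }' /proc/self/mountinfo
}

Two mount points that print the same tag belong to one peer group, so a reference with a matching tag could be ignored when deciding whether an unmount is safe; the example below shows the tag in context.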

Contributor:
If a mount appears in two locations via a shared mount, /proc/self/mountinfo will explicitly indicate that. Here is an example.

$ cat /proc/self/mountinfo
...
90 80 0:66 / /tmp/original/tmpfs rw,relatime **shared:2** - tmpfs tmpfs rw
103 101 0:66 / /tmp/shared/tmpfs rw,relatime **shared:2** - tmpfs tmpfs rw

Look at the mount propagation type followed by a propagation number. If the number is the same, then that mount can be ignored. A umount from any one of those paths will result in the mount disappearing from both locations.

$ sudo umount /tmp/original/tmpfs 
vishnuk@vishnuk-linux:~$ cat /proc/self/mountinfo
...

Contributor Author:
The current solution used here checks the mount path against the shared mount path created by the rkt container. If the shared mount path contains the original mount path, it is filtered out during GetMountRefs. See comments #38124 (comment)

attempt=$[$attempt+1]
sleep 1
done
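After this change, the gc loop in the mounter script (reconstructed from the hunks above) always runs gc five times, one second apart, instead of breaking out of the loop on the first success:

attempt=0
until [ $attempt -ge 5 ]; do
  ${RKT_BINARY} gc --grace-period=0s &> /dev/null
  attempt=$[$attempt+1]
  sleep 1
done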
10 changes: 9 additions & 1 deletion pkg/util/mount/mount.go
@@ -130,13 +130,21 @@ func GetMountRefs(mounter Interface, mountPath string) ([]string, error) {
}
}

// TODO: this is a workaround for the unmount device issue caused by gci mounter.
Contributor:
This is a total hack. gci_mounter is a GKE-specific deployment detail. We should not be leaking that logic into upstream.

Contributor Author:
We discussed a few options, but they all seem hacky...

// In a GCI cluster, if the gci mounter is used for mounting, the container started by the mounter
// script will cause additional mounts to be created in the container. Since these mounts are
// irrelevant to the original mounts, they should not be considered when checking the
// mount references. The current solution is to filter out those mount paths that contain
// the string of the original mount path.
// Plan to work on a better approach to solve this issue.

// Find all references to the device.
var refs []string
if deviceName == "" {
glog.Warningf("could not determine device for path: %q", mountPath)
} else {
for i := range mps {
- if mps[i].Device == deviceName && mps[i].Path != slTarget {
+ if mps[i].Device == deviceName && !strings.Contains(mps[i].Path, slTarget) {
refs = append(refs, mps[i].Path)
}
}
4 changes: 2 additions & 2 deletions pkg/util/mount/mount_linux_test.go
@@ -98,7 +98,7 @@ func TestGetMountRefs(t *testing.T) {
{Device: "/dev/sdb", Path: "/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gce-pd"},
{Device: "/dev/sdb", Path: "/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd-in-pod"},
{Device: "/dev/sdc", Path: "/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gce-pd2"},
{Device: "/dev/sdc", Path: "/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd2-in-pod"},
{Device: "/dev/sdc", Path: "/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd2-in-pod1"},
{Device: "/dev/sdc", Path: "/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd2-in-pod2"},
},
}
@@ -114,7 +114,7 @@ func TestGetMountRefs(t *testing.T) {
},
},
{
"/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd2-in-pod",
"/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd2-in-pod1",
[]string{
"/var/lib/kubelet/pods/some-pod/volumes/kubernetes.io~gce-pd/gce-pd2-in-pod2",
"/var/lib/kubelet/plugins/kubernetes.io/gce-pd/mounts/gce-pd2",
2 changes: 1 addition & 1 deletion pkg/volume/util/util.go
@@ -94,7 +94,7 @@ func UnmountPath(mountPath string, mounter mount.Interface) error {
return err
}
if notMnt {
- glog.V(4).Info("%q is unmounted, deleting the directory", mountPath)
+ glog.V(4).Infof("%q is unmounted, deleting the directory", mountPath)
return os.Remove(mountPath)
}
return nil