fix leaking memory backed volumes of terminated pods #36779

Merged
@@ -129,10 +129,18 @@ func (dswp *desiredStateOfWorldPopulator) populatorLoopFunc() func() {
}
}

func isPodTerminated(pod *api.Pod) bool {
return pod.Status.Phase == api.PodFailed || pod.Status.Phase == api.PodSucceeded
}

// Iterate through all pods and add to desired state of world if they don't
// exist but should
func (dswp *desiredStateOfWorldPopulator) findAndAddNewPods() {
for _, pod := range dswp.podManager.GetPods() {
if isPodTerminated(pod) {
// Do not (re)add volumes for terminated pods
Member:

Unless "non-memory backed volume"?

Contributor Author:

I thought about this but I couldn't think of a good reason to add a volume back to a node for a pod that is terminated. The only reason I'm filtering the ones we remove is to not make waves for the release. In general, I can't think of a reason we would want to leave volumes attached to a node for terminated pods, except for debugging.

Member:

> I thought about this but I couldn't think of a good reason to add a volume back to a node for a pod that is terminated. The only reason I'm filtering the ones we remove is to not make waves for the release. In general, I can't think of a reason we would want to leave volumes attached to a node for terminated pods, except for debugging.

Neither can I, but in the interest of minimizing potentially disruptive changes, it makes sense to limit the scope of the change as much as possible. Thoughts?

Contributor Author:

@saad-ali the change required to check that actually makes the change more invasive, because I need access to the volume. In findAndRemoveDeletedPods(), that is already available via volumeToMount. In findAndAddNewPods(), the logic is inverted: for each pod, process its volumes. I'm not sure whether it is safe to call dswp.desiredStateOfWorld.GetVolumesToMount() in findAndAddNewPods().

I can extend my e2e tests to include a check that ensures non-memory backed volumes are untouched by this change?
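
For illustration only, a minimal sketch of the per-volume check such a filter would need; the helper name and placement are hypothetical and not part of this PR, and the package name is assumed to be that of the file being changed:

```go
// Hypothetical sketch, not part of this PR. The package name is assumed; the
// import matches what the diff already uses.
package populator

import "k8s.io/kubernetes/pkg/api"

// isMemoryBackedVolume matches the volume types this PR treats as memory
// backed: an emptyDir on tmpfs, a configMap, or a secret.
func isMemoryBackedVolume(vol *api.Volume) bool {
	return (vol.EmptyDir != nil && vol.EmptyDir.Medium == api.StorageMediumMemory) ||
		vol.ConfigMap != nil ||
		vol.Secret != nil
}
```

A predicate like this would then have to be consulted inside processPodVolumes (or a per-volume variant of it) rather than once per pod, which is the extra plumbing being weighed against keeping the change small.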

Member:

Ack. That's fine. I'm a little nervous about it, but let's just get it in and give it time to bake and keep an eye out for strange volume behaviors.

continue
}
dswp.processPodVolumes(pod)
}
}
@@ -144,9 +152,18 @@ func (dswp *desiredStateOfWorldPopulator) findAndRemoveDeletedPods() {

runningPodsFetched := false
for _, volumeToMount := range dswp.desiredStateOfWorld.GetVolumesToMount() {
-if _, podExists :=
-dswp.podManager.GetPodByUID(volumeToMount.Pod.UID); podExists {
-continue
+pod, podExists := dswp.podManager.GetPodByUID(volumeToMount.Pod.UID)
+if podExists {
+// Skip running pods
+if !isPodTerminated(pod) {
+continue
+}
+// Skip non-memory backed volumes belonging to terminated pods
+volume := volumeToMount.VolumeSpec.Volume
+if (volume.EmptyDir == nil || volume.EmptyDir.Medium != api.StorageMediumMemory) &&
+volume.ConfigMap == nil && volume.Secret == nil {
+continue
+}
}

// Once a pod has been deleted from kubelet pod manager, do not delete
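
Note on scope: the skip added above only applies while the terminated pod still exists in the kubelet's pod manager and is in a terminal phase (PodFailed or PodSucceeded, per isPodTerminated). Once the pod is removed from the pod manager entirely, control falls through to the pre-existing deletion path, which continues to handle all volume types as before.
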
1 change: 1 addition & 0 deletions test/e2e_node/BUILD
@@ -67,6 +67,7 @@ go_test(
"restart_test.go",
"runtime_conformance_test.go",
"summary_test.go",
"volume_manager_test.go",
],
library = "go_default_library",
tags = ["automanaged"],
127 changes: 127 additions & 0 deletions test/e2e_node/volume_manager_test.go
@@ -0,0 +1,127 @@
/*
Copyright 2016 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package e2e_node

import (
"time"

"k8s.io/kubernetes/pkg/api"
"k8s.io/kubernetes/pkg/util/uuid"
"k8s.io/kubernetes/test/e2e/framework"

"fmt"

. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"
)

var _ = framework.KubeDescribe("Kubelet Volume Manager", func() {
f := framework.NewDefaultFramework("kubelet-volume-manager")
Describe("Volume Manager", func() {
Context("On terminatation of pod with memory backed volume", func() {
It("should remove the volume from the node", func() {
var (
memoryBackedPod *api.Pod
volumeName string
)
By("Creating a pod with a memory backed volume that exits success without restart", func() {
volumeName = "memory-volume"
memoryBackedPod = f.PodClient().Create(&api.Pod{
ObjectMeta: api.ObjectMeta{
Name: "pod" + string(uuid.NewUUID()),
Namespace: f.Namespace.Name,
},
Spec: api.PodSpec{
RestartPolicy: api.RestartPolicyNever,
Containers: []api.Container{
{
Image: "gcr.io/google_containers/busybox:1.24",
Name: "container" + string(uuid.NewUUID()),
Command: []string{"sh", "-c", "echo"},
VolumeMounts: []api.VolumeMount{
{
Name: volumeName,
MountPath: "/tmp",
},
},
},
},
Volumes: []api.Volume{
{
Name: volumeName,
VolumeSource: api.VolumeSource{
EmptyDir: &api.EmptyDirVolumeSource{Medium: api.StorageMediumMemory},
},
},
},
},
})
err := framework.WaitForPodSuccessInNamespace(f.ClientSet, memoryBackedPod.Name, f.Namespace.Name)
Expect(err).NotTo(HaveOccurred())
})
By("Verifying the memory backed volume was removed from node", func() {
volumePath := fmt.Sprintf("/tmp/%s/volumes/kubernetes.io~empty-dir/%s", string(memoryBackedPod.UID), volumeName)
var err error
for i := 0; i < 10; i++ {
// need to create a new verification pod on each pass since updates
// to the HostPath volume aren't propagated to the pod
pod := f.PodClient().Create(&api.Pod{
ObjectMeta: api.ObjectMeta{
Name: "pod" + string(uuid.NewUUID()),
Namespace: f.Namespace.Name,
},
Spec: api.PodSpec{
RestartPolicy: api.RestartPolicyNever,
Containers: []api.Container{
{
Image: "gcr.io/google_containers/busybox:1.24",
Name: "container" + string(uuid.NewUUID()),
Command: []string{"sh", "-c", "if [ -d " + volumePath + " ]; then exit 1; fi;"},
VolumeMounts: []api.VolumeMount{
{
Name: "kubelet-pods",
MountPath: "/tmp",
},
},
},
},
Volumes: []api.Volume{
{
Name: "kubelet-pods",
VolumeSource: api.VolumeSource{
// TODO: remove hardcoded kubelet volume directory path
// framework.TestContext.KubeVolumeDir is currently not populated for node e2e
HostPath: &api.HostPathVolumeSource{Path: "/var/lib/kubelet/pods"},
Member:

Can you put a // TODO to pull this value from the test context for node (it needs to be added)?

},
},
},
},
})
err = framework.WaitForPodSuccessInNamespace(f.ClientSet, pod.Name, f.Namespace.Name)
gp := int64(1)
f.PodClient().Delete(pod.Name, &api.DeleteOptions{GracePeriodSeconds: &gp})
if err == nil {
break
}
<-time.After(10 * time.Second)
}
Expect(err).NotTo(HaveOccurred())
})
})
})
})
})
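
As a usage note, the check in this test relies on the kubelet's on-disk layout for pod volumes (/var/lib/kubelet/pods/<pod UID>/volumes/kubernetes.io~empty-dir/<volume name>), exposed to the probe pod through the hostPath mount at /tmp. A test like this is normally run through the node e2e suite (for example, make test-e2e-node with a FOCUS expression matching "Kubelet Volume Manager"), though the exact invocation can vary by release and environment.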