Add GracefulNodeShutdown e2e test #98658

Merged
1 change: 1 addition & 0 deletions test/e2e_node/BUILD
@@ -132,6 +132,7 @@ go_test(
"mirror_pod_test.go",
"node_container_manager_test.go",
"node_perf_test.go",
"node_shutdown_linux_test.go",
"pids_test.go",
"pod_hostnamefqdn_test.go",
"pods_container_manager_test.go",
228 changes: 228 additions & 0 deletions test/e2e_node/node_shutdown_linux_test.go
@@ -0,0 +1,228 @@
// +build linux

/*
Copyright 2021 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package e2enode

import (
"context"
"fmt"
"strconv"
"time"

"k8s.io/apimachinery/pkg/fields"

"github.com/onsi/ginkgo"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/kubernetes/pkg/apis/scheduling"
"k8s.io/kubernetes/pkg/features"
kubeletconfig "k8s.io/kubernetes/pkg/kubelet/apis/config"
kubelettypes "k8s.io/kubernetes/pkg/kubelet/types"
"k8s.io/kubernetes/test/e2e/framework"
)

var _ = framework.KubeDescribe("GracefulNodeShutdown [Serial] [NodeAlphaFeature:GracefulNodeShutdown]", func() {
Member: It would be great to explain the side effects of this test on other tests.

f := framework.NewDefaultFramework("graceful-node-shutdown")
ginkgo.Context("when gracefully shutting down", func() {

const (
pollInterval = 1 * time.Second
podStatusUpdateTimeout = 5 * time.Second
nodeStatusUpdateTimeout = 10 * time.Second
nodeShutdownGracePeriod = 20 * time.Second
nodeShutdownGracePeriodCriticalPods = 10 * time.Second
)
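// With these values, once the shutdown signal is received the kubelet gives
// non-critical pods nodeShutdownGracePeriod - nodeShutdownGracePeriodCriticalPods
// = 20s - 10s = 10s to terminate, and critical pods the remaining
// nodeShutdownGracePeriodCriticalPods = 10s after that.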

tempSetCurrentKubeletConfig(f, func(initialConfig *kubeletconfig.KubeletConfiguration) {
initialConfig.FeatureGates = map[string]bool{
string(features.GracefulNodeShutdown): true,
}
initialConfig.ShutdownGracePeriod = metav1.Duration{Duration: nodeShutdownGracePeriod}
initialConfig.ShutdownGracePeriodCriticalPods = metav1.Duration{Duration: nodeShutdownGracePeriodCriticalPods}
})

ginkgo.AfterEach(func() {
ginkgo.By("Emitting Shutdown false signal; cancelling the shutdown")
err := emitSignalPrepareForShutdown(false)
framework.ExpectNoError(err)
})

ginkgo.It("should be able to gracefully shutdown pods with various grace periods", func() {
nodeName := getNodeName(f)
nodeSelector := fields.Set{
"spec.nodeName": nodeName,
Member: Is it guaranteed by [Serial] that there will be no other pods on the node? Should it query by the test namespace instead?
Member Author (@wzshiming, Feb 8, 2021): This should be guaranteed by the e2e framework; we don't have to reinvent the wheel.

}.AsSelector().String()
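A minimal sketch of the namespace-scoped alternative raised in the thread above (illustrative only, not part of this PR): f.PodClient() is already scoped to the framework's test namespace, so the query could drop the field selector entirely.

// List only the pods in the test namespace, relying on framework isolation
// instead of pinning the listing to this node.
list, err := f.PodClient().List(context.TODO(), metav1.ListOptions{})
framework.ExpectNoError(err)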

// Define test pods
pods := []*v1.Pod{
getGracePeriodOverrideTestPod("period-120", nodeName, 120, false),
getGracePeriodOverrideTestPod("period-5", nodeName, 5, false),
getGracePeriodOverrideTestPod("period-critical-120", nodeName, 120, true),
getGracePeriodOverrideTestPod("period-critical-5", nodeName, 5, true),
}

Member: Let's add some verbose logging of the main steps the test performs, to make debugging failures easier in the future. The pattern seems to be to use ginkgo.By, e.g. https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/docker_test.go#L60. I added some suggestions of places to log the important steps in the test.

ginkgo.By("Creating batch pods")
f.PodClient().CreateBatch(pods)

list, err := f.PodClient().List(context.TODO(), metav1.ListOptions{
FieldSelector: nodeSelector,
})
framework.ExpectNoError(err)
framework.ExpectEqual(len(list.Items), len(pods), "the number of pods is not as expected")

for _, pod := range list.Items {
Member: Check that the list contains 4 elements.

framework.ExpectEqual(
pod.Status.Phase,
v1.PodRunning,
"pod is not ready",
)
Member: nit: framework.ExpectEqual(pod.Status.Phase, v1.PodRunning, "pod is not ready") (wzshiming marked this conversation as resolved.)

}

ginkgo.By("Emitting shutdown signal")
err = emitSignalPrepareForShutdown(true)
framework.ExpectNoError(err)

// Non-critical pods should be shut down
gomega.Eventually(func() error {
list, err = f.PodClient().List(context.TODO(), metav1.ListOptions{
FieldSelector: nodeSelector,
})
if err != nil {
return err
}
framework.ExpectEqual(len(list.Items), len(pods), "the number of pods is not as expected")

for _, pod := range list.Items {
if kubelettypes.IsCriticalPod(&pod) {
if pod.Status.Phase != v1.PodRunning {
return fmt.Errorf("critical pod should not be shutdown, phase: %s", pod.Status.Phase)
}
} else {
if pod.Status.Phase != v1.PodFailed || pod.Status.Reason != "Shutdown" {
return fmt.Errorf("pod should be shutdown, phase: %s", pod.Status.Phase)
}
}
}
return nil
}, podStatusUpdateTimeout, pollInterval).Should(gomega.BeNil())
Member: Should it validate that pods were given enough time for graceful termination?
Member: There is a risk this will start flaking. Test machines are sometimes very slow, and it may take a while for this loop to execute, so the critical pods' termination will already have started. I don't know what the best way to address it would be: either pods can signal back using some logs so the test can validate how long they were alive, or this loop could run more often than once a second.
Member: Good point. Maybe one simple solution to start with is just to increase nodeShutdownGracePeriod from 20s and nodeShutdownGracePeriodCriticalPods from 10s to something much higher, say nodeShutdownGracePeriod: 5min and nodeShutdownGracePeriodCriticalPods: 2min. Having pods signal back would be ideal, but I think it makes the test more complicated; maybe it makes sense as a follow-up. I also brought this up in #98658 (comment).
Member Author: I don't think adding time is a good option; it would add a few minutes of checking time for each PR. Having pods use logs to send out signals is a good idea, I think.
Member: If we want to proceed with this, can you add a comment noting that this may become flaky, so it will be easier to investigate?
Member: I'm fine going as is and seeing whether it actually becomes a problem in the e2e environment.
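One way pods could signal back via logs, as discussed in the thread above, sketched here (illustrative only, not part of this PR; the log format and the parsing step are assumptions): the test pod echoes epoch timestamps at startup and on SIGTERM, and the test reads them back through client-go.

// Illustrative pod command: record when the container starts and when it
// receives SIGTERM, so the test can bound how long the pod stayed alive.
Args: []string{`
start=$(date +%s)
echo "started at ${start}"
_term() {
echo "caught SIGTERM at $(date +%s) after $(( $(date +%s) - start ))s"
sleep infinity
}
trap _term SIGTERM
sleep infinity
`},

// In the test, fetch and inspect the container log via client-go:
logs, err := f.ClientSet.CoreV1().Pods(pod.Namespace).GetLogs(pod.Name, &v1.PodLogOptions{}).Do(context.TODO()).Raw()
framework.ExpectNoError(err)
// Parse the "caught SIGTERM at ..." line from string(logs) and compare the
// elapsed time against the expected grace period.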


// All pods should be shut down
gomega.Eventually(func() error {
list, err = f.PodClient().List(context.TODO(), metav1.ListOptions{
FieldSelector: nodeSelector,
})
if err != nil {
return err
}
framework.ExpectEqual(len(list.Items), len(pods), "the number of pods is not as expected")

for _, pod := range list.Items {
if pod.Status.Phase != v1.PodFailed || pod.Status.Reason != "Shutdown" {
return fmt.Errorf("pod should be shutdown, phase: %s", pod.Status.Phase)
}
}
return nil
},
// Critical pod starts shutdown after (nodeShutdownGracePeriod-nodeShutdownGracePeriodCriticalPods)
podStatusUpdateTimeout+(nodeShutdownGracePeriod-nodeShutdownGracePeriodCriticalPods),
pollInterval).Should(gomega.BeNil())
})

ginkgo.It("should be able to handle a cancelled shutdown", func() {
ginkgo.By("Emitting Shutdown signal")
err := emitSignalPrepareForShutdown(true)
Member: Add above: ginkgo.By("Emitting Shutdown signal")

framework.ExpectNoError(err)
gomega.Eventually(func() error {
isReady := getNodeReadyStatus(f)
if isReady {
return fmt.Errorf("node did not become shutdown as expected")
}
return nil
}, nodeStatusUpdateTimeout, pollInterval).Should(gomega.BeNil())

Member: Add: ginkgo.By("Emitting Shutdown false signal; cancelling the shutdown")

ginkgo.By("Emitting Shutdown false signal; cancelling the shutdown")
err = emitSignalPrepareForShutdown(false)
Member: Should this signal be emitted in test cleanup, so we guarantee that the kubelet ends these tests with the signal cancelled even if a test has failed?

framework.ExpectNoError(err)
gomega.Eventually(func() error {
isReady := getNodeReadyStatus(f)
if !isReady {
return fmt.Errorf("node did not recover as expected")
}
return nil
}, nodeStatusUpdateTimeout, pollInterval).Should(gomega.BeNil())
})
})
})

func getGracePeriodOverrideTestPod(name string, node string, gracePeriod int64, critical bool) *v1.Pod {
pod := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: name,
Member: I think it is best practice to namespace pods created for this test with the test namespace.

},
Spec: v1.PodSpec{
Containers: []v1.Container{
{
Name: name,
Image: busyboxImage,
Command: []string{"sh", "-c"},
Args: []string{`
_term() {
echo "Caught SIGTERM signal!"
sleep infinity
}
trap _term SIGTERM
sleep infinity
`},
},
},
TerminationGracePeriodSeconds: &gracePeriod,
NodeName: node,
},
}
if critical {
pod.ObjectMeta.Annotations = map[string]string{
kubelettypes.ConfigSourceAnnotationKey: kubelettypes.FileSource,
Member: Curious what this is / why it is needed?
Member Author:
if critical {
pod.ObjectMeta.Namespace = kubeapi.NamespaceSystem
pod.ObjectMeta.Annotations = map[string]string{
kubelettypes.ConfigSourceAnnotationKey: kubelettypes.FileSource,
}
pod.Spec.PriorityClassName = scheduling.SystemNodeCritical
framework.ExpectEqual(kubelettypes.IsCriticalPod(pod), true, "pod should be a critical pod")
} else {
framework.ExpectEqual(kubelettypes.IsCriticalPod(pod), false, "pod should not be a critical pod")
}
kubelettypes.IsCriticalPod uses this key to determine whether the pod is critical.

}
pod.Spec.PriorityClassName = scheduling.SystemNodeCritical

framework.ExpectEqual(kubelettypes.IsCriticalPod(pod), true, "pod should be a critical pod")
} else {
framework.ExpectEqual(kubelettypes.IsCriticalPod(pod), false, "pod should not be a critical pod")
}
return pod
}

// Emits a fake PrepareForShutdown dbus message on system dbus. Will cause kubelet to react to an active shutdown event.
func emitSignalPrepareForShutdown(b bool) error {
cmd := "gdbus emit --system --object-path /org/freedesktop/login1 --signal org.freedesktop.login1.Manager.PrepareForShutdown " + strconv.FormatBool(b)
Member: Let's leave a comment describing what this does: // Emits a fake PrepareForShutdown dbus message on system dbus. Will cause kubelet to react to an active shutdown event.

_, err := runCommand("sh", "-c", cmd)
return err
}

func getNodeReadyStatus(f *framework.Framework) bool {
Member: You should be able to use getNode, which already does this: https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L314. I think all you need is something like this: https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L273-L280 (or maybe above, where you check the node status, you could use waitForNodeReady directly and remove this function?)
Member Author (@wzshiming, Feb 3, 2021): getNode cannot be used because it is not compatible with framework.Framework. waitForNodeReady is not sufficient either: this case requires waiting not only for the node to become ready but also for it to become not ready.

nodeList, err := f.ClientSet.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
framework.ExpectNoError(err)
// Assuming that there is only one node, because this is a node e2e test.
framework.ExpectEqual(len(nodeList.Items), 1)
return isNodeReady(&nodeList.Items[0])
}
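A possible follow-up to the thread above, sketched under the assumption that getNodeReadyStatus stays as defined here (waitForNodeReadyState is a hypothetical helper, not part of this PR): both Eventually blocks in the cancelled-shutdown test could share one helper that waits for either Ready state.

// waitForNodeReadyState polls the single test node until its Ready condition
// matches the desired state, covering both the not-ready and ready directions.
func waitForNodeReadyState(f *framework.Framework, ready bool, timeout, interval time.Duration) {
gomega.Eventually(func() bool {
return getNodeReadyStatus(f)
}, timeout, interval).Should(gomega.Equal(ready), "node did not reach Ready=%v", ready)
}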