Fixes for graceful node shutdown test #106108
/test pull-kubernetes-node-kubelet-serial
/priority important-soon
@bobbypage: The following test failed, say
* Bump the pod status and node status update timeouts to avoid flakes
* Add a small delay after the dbus restart so that dbus has enough time to start up before the shutdown signal is sent
* Change the check for a pod being terminated by graceful shutdown. Previously, the test checked that the pod phase was `Failed` and that the pod reason string matched. This logic needs to change after the 1.22 graceful node shutdown change introduced in PR kubernetes#102344, which changed the behavior so that pods are no longer put into a failed phase. Instead, the test now checks that containers are not ready, and that the pod status message and reason are set appropriately.

Signed-off-by: David Porter <david@porter.me>
7206a61 to ddd0d8a
This is ready for review. With these changes, the graceful node shutdown test passed in the serial run above.
/cc @wzshiming @smarterclayton @mrunalp
/kind failing-test
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bobbypage, SergeyKanzhelev, wzshiming
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: David Porter <david@porter.me>
What type of PR is this?
/kind failing-test
What this PR does / why we need it:
* Bump the pod status and node status update timeouts to avoid flakes.
* Add a small delay after the dbus restart so that dbus has enough time to start up before the shutdown signal is sent.
* Change the check for a pod being terminated by graceful shutdown. Previously, the test checked that the pod phase was `Failed` and that the pod reason string matched. This logic needs to change after the 1.22 graceful node shutdown change introduced in PR #102344 (Prevent Kubelet from incorrectly interpreting "not yet started" pods as "ready to terminate pods" by unifying responsibility for pod lifecycle into pod worker), which changed the behavior so that pods are no longer put into a failed phase. Instead, the test now checks that containers are not ready, and that the pod status message and reason are set appropriately.
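
For reviewers who want a concrete picture of the new check, here is a minimal sketch of the logic described above. It is not the code from this PR: `PodStatus` and `ContainerStatus` are hypothetical stand-ins that mirror only the relevant fields of the Kubernetes API types, `terminatedByNodeShutdown` is a made-up helper name, and the reason and message strings are illustrative placeholders rather than confirmed kubelet values.

```go
package main

import "fmt"

// ContainerStatus is a hypothetical stand-in for the API type,
// keeping only the fields this sketch needs.
type ContainerStatus struct {
	Name  string
	Ready bool
}

// PodStatus is a hypothetical stand-in for the API type,
// keeping only the fields this sketch needs.
type PodStatus struct {
	Reason            string
	Message           string
	ContainerStatuses []ContainerStatus
}

// Placeholder values for illustration only (assumption: the real test
// compares against whatever reason and message the kubelet sets).
const (
	exampleShutdownReason  = "Terminated"
	exampleShutdownMessage = "Pod was terminated in response to imminent node shutdown."
)

// terminatedByNodeShutdown reports whether a pod status looks like the
// result of graceful node shutdown: no container is ready, and the
// status reason and message match the expected values. It deliberately
// does not look at the pod phase.
func terminatedByNodeShutdown(s PodStatus, wantReason, wantMessage string) bool {
	for _, c := range s.ContainerStatuses {
		if c.Ready {
			return false
		}
	}
	return s.Reason == wantReason && s.Message == wantMessage
}

func main() {
	shutDown := PodStatus{
		Reason:            exampleShutdownReason,
		Message:           exampleShutdownMessage,
		ContainerStatuses: []ContainerStatus{{Name: "app", Ready: false}},
	}
	stillRunning := PodStatus{
		ContainerStatuses: []ContainerStatus{{Name: "app", Ready: true}},
	}

	fmt.Println(terminatedByNodeShutdown(shutDown, exampleShutdownReason, exampleShutdownMessage))
	fmt.Println(terminatedByNodeShutdown(stillRunning, exampleShutdownReason, exampleShutdownMessage))
}
```

Taking the expected reason and message as parameters keeps the helper independent of the exact strings, which may differ between releases.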
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: