New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release-4.9] Bug 2065786: Backport 108366 OutofCpu Fixes #1222
[release-4.9] Bug 2065786: Backport 108366 OutofCpu Fixes #1222
Conversation
…od is terminated Other components must know when the Kubelet has released critical resources for terminal pods. Do not set the phase in the apiserver to terminal until all containers are stopped and cannot restart. As a consequence of this change, the Kubelet must explicitly transition a terminal pod to the terminating state in the pod worker which is handled by returning a new isTerminal boolean from syncPod. Finally, if a pod with init containers hasn't been initialized yet, don't default container statuses or not yet attempted init containers to the unknown failure state.
Exploring termination revealed we have race conditions in certain parts of pod initialization and termination. To better catch these issues refactor the existing test so it can be reused, and then test a number of alternate scenarios.
Create an E2E test that creates a job that spawns a pod that should succeed. The job reserves a fixed amount of CPU and has a large number of completions and parallelism. Use to repro github.com/kubernetes/issues/106884 Signed-off-by: David Porter <david@porter.me>
… while ready Signed-off-by: David Porter <david@porter.me>
@rphillips: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@rphillips: the contents of this pull request could not be automatically validated. The following commits could not be validated and must be approved by a top-level approver:
Comment |
@rphillips: This pull request references Bugzilla bug 2065786, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
@rphillips: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/bugzilla refresh |
@rphillips: This pull request references Bugzilla bug 2065786, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
/label backport-risk-assessed |
@rphillips: This pull request references Bugzilla bug 2065786, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Bugzilla (schoudha@redhat.com), skipping review request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: ehashman, rphillips The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/label cherry-pick-approved |
@rphillips: All pull requests linked via external trackers have merged: Bugzilla bug 2065786 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
ref: kubernetes#108749