Tight retry loops should not cause cascading failure of the cluster #74405
/remove-sig cluster-lifecycle
@bgrant0607: Those labels are not set on the issue: sig/cluster-lifecycle. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-sig cluster-lifecycle
Ref #2529
Thanks for the report. Was it a Deployment controller in the Helm chart? After a short time, OOMing containers should have resulted in CrashLoopBackOff. Did you observe that? Was the node considered unready at some point? Was that because its local disk filled up? Something would have to have caused the pods to be killed so they could be replaced by their controller and rescheduled. By "crash of a whole cluster", do you mean that your etcd filled up with pods and/or events, or that the apiserver was DoSed with requests, or both? What is your pod GC threshold set to? Did you set ResourceQuota for pods and events in your namespaces?
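For context, a minimal sketch of the kind of ResourceQuota the last question refers to. The quota name, namespace, and numbers here are illustrative assumptions, not values from this report:

```yaml
# Illustrative only: caps the number of Pod and Event objects in a namespace,
# so a hot replacement loop cannot flood etcd with pods and events.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts   # hypothetical name
  namespace: my-app     # hypothetical namespace
spec:
  hard:
    pods: "50"          # assumed cap; tune per workload
    count/events: "500" # generic object-count quota syntax
```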
Other refs: cc @kow3ns. It looks like it was decided at some point (#35342) to leave broken pods/containers on nodes in a "waiting" state to avoid controller hot loops. This makes it a problem for clients to figure out that something is wrong and what to do about it.
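To make that concrete, this is roughly what a client has to dig out of the pod status to detect the situation. The field names are from the core v1 API; the container name and counts are illustrative:

```yaml
# Excerpt of a pod stuck in a restart loop (e.g. from `kubectl get pod -o yaml`).
# The pod phase alone does not reveal the problem; a client must inspect
# containerStatuses for the waiting reason and a climbing restartCount.
status:
  phase: Running
  containerStatuses:
  - name: app                  # hypothetical container name
    restartCount: 47           # illustrative; climbs rapidly in a hot loop
    state:
      waiting:
        reason: CrashLoopBackOff
```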
Another ref: #76370
Other possible scenarios may be found in https://github.com/hjacobs/kubernetes-failure-stories
This is also a problem for conformance tests and for other clients trying to determine success or failure of workloads.
long-term-issue (note to self)
The controller in question was a Deployment? Do you know what specifically caused the VM to crash? The Kubelet should have put the pods into CrashLoopBackOff, as shown here: http://cloudgeekz.com/1605/kubernetes-application-oomkilled.html When you mentioned "whole cluster crash", did you also mean the control plane (apiserver) died, or just all of the worker nodes? Approximately how many nodes were in the cluster? Which Kubernetes distribution or service were you using, or how did you create the cluster?
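For reference, an OOM-killed container that the kubelet is backing off typically reports its last termination like this (illustrative excerpt; exit code 137 is 128 + SIGKILL):

```yaml
# What the pod status usually shows after the kernel OOM-kills a container
# that exceeded its memory limit.
status:
  containerStatuses:
  - name: app                  # hypothetical container name
    restartCount: 12           # illustrative
    state:
      waiting:
        reason: CrashLoopBackOff
    lastState:
      terminated:
        reason: OOMKilled      # container exceeded its memory limit
        exitCode: 137          # 128 + SIGKILL(9)
```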
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
@dims: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This issue still seems very relevant, but I can't find actionable items on the node side. Please add sig/node back once we have a clearer picture and items we can work on.
What happened:
The problem was found by accident during cluster stability testing. The application consumes nearly 7 MB of memory per pod.
However, the Helm chart was written in such a way that the Deployment set its resource limits too low; see the snippet:
```yaml
containers:
  - # ...
    resources:
      limits:
        cpu: 200m
        memory: 4Mi
      requests:
        cpu: 100m
        memory: 4Mi
```
A pod was started; once initialized, it began consuming more memory than the limit allowed (7 MB > 4 MB), and the pod was killed by the system.
This happened fast enough to crash the hosting VM, and with many replicas, all of the worker nodes crashed within a few minutes.
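For reference, a sketch of the limits the chart would have needed to avoid the loop: values sized above the ~7 MB working set. The 32Mi/16Mi figures are illustrative assumptions with headroom, not values from the chart:

```yaml
containers:
  - # ...
    resources:
      limits:
        cpu: 200m
        memory: 32Mi   # assumed value, comfortably above the ~7 MB actual usage
      requests:
        cpu: 100m
        memory: 16Mi   # assumed value
```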
Impact:
It can lead to a crash of the whole cluster.
What needs to be enhanced in Kubernetes:
And here is the problem: we have no control over the chart contents in this area, and a logic error like this can crash the whole cluster. Some mechanism must be introduced that prevents the system from entering such a tight loop of killing and restarting pods whose memory limits are set incorrectly.
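One guardrail that cluster operators can already apply today is a namespace LimitRange with a minimum memory limit; it does not fix the retry loop itself, but it rejects charts that set limits this low at admission time. A minimal sketch, assuming a hypothetical namespace my-app and a 16Mi floor chosen for illustration:

```yaml
# Rejects any container whose memory request/limit falls below 16Mi, so a
# chart with `memory: 4Mi` fails at admission instead of OOM-looping at runtime.
apiVersion: v1
kind: LimitRange
metadata:
  name: min-memory    # hypothetical name
  namespace: my-app   # hypothetical namespace
spec:
  limits:
  - type: Container
    min:
      memory: 16Mi    # illustrative floor
```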
Environment:
Kubernetes version (use `kubectl version`): v1.12.2. This is likely a common issue regardless of release.
/sig node
/kind feature
/sig architecture
/sig scheduling
/sig cluster-lifecycle