revert failed pod marked as NotReady #296

Merged
merged 1 commit into from Dec 3, 2019

pascalwhoop
Contributor

This issue (#215) turned a "Failed" pod into a "PodNotReady". The creator of the PR wanted this for his evicted pods. I believe this is not correct. I would suggest looking at

kube_pod_container_status_terminated_reason{reason!="Completed"}
OR
kube_pod_container_status_terminated

to build a metric if people need it. Why are we reverting? Because any failed pod now triggers this alert: failed pods stay in the API for quite a long time, so every failed pod counts as PodNotReady.

Failed pods occur all the time. Build systems, batch processing, etc.
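
A minimal sketch of how the expressions suggested above could back a separate alert. It assumes kube-state-metrics exposes the metric with `namespace`, `pod`, `container`, and `reason` labels; the aggregation and the `> 0` threshold are illustrative choices, not part of this PR:

```promql
# Containers currently terminated for any reason other than "Completed".
# Assumption: the series has value 1 while the container is in that state.
sum by (namespace, pod, container, reason) (
  kube_pod_container_status_terminated_reason{reason!="Completed"}
) > 0
```

Kept as its own rule, this would leave KubePodNotReady to cover only pods that are still expected to become ready.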

@pascalwhoop
Contributor Author

@metalmatze, what's your take on this? Currently, any failed pod triggers this alarm, and the alarm doesn't clear until the pod is removed from the API.
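
To illustrate why a failed pod keeps the alarm firing, here is a sketch that assumes the alert is built on `kube_pod_status_phase` with a regex on the `phase` label. The exact expression in the repository may differ (extra selectors or joins are omitted here), so treat both queries as assumptions:

```promql
# Assumed shape after #215: "Failed" is part of the phase matcher, so a failed
# pod keeps matching for as long as its object stays in the API.
sum by (namespace, pod) (
  kube_pod_status_phase{phase=~"Failed|Pending|Unknown"}
) > 0

# Assumed shape after this revert: failed pods no longer match.
sum by (namespace, pod) (
  kube_pod_status_phase{phase=~"Pending|Unknown"}
) > 0
```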

@Capitrium

It would be great to get this merged in soon. The KubePodNotReady alert is currently very noisy, and it isn't the right way to track Evicted pods.

From the docs (emphasis added):

The kubelet can proactively monitor for and prevent total starvation of a compute resource. In those cases, the kubelet can reclaim the starved resource by proactively failing one or more Pods. When the kubelet fails a Pod, it terminates all of its containers and transitions its PodPhase to Failed. If the evicted Pod is managed by a Deployment, the Deployment will create another Pod to be scheduled by Kubernetes.

The original issue in #215 (emphasis added) was:

Evicted pods can be in phase failed and I would expect this alert to catch this.

I don't think this expectation is correct; Evicted pods are always in phase=Failed.

@metalmatze @brancz thoughts?

@wu0407

wu0407 commented Dec 21, 2019

To detect pods that are not ready, the alert should look at "kube_pod_container_status_ready" and "kube_pod_status_ready".
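
A rough sketch of what queries on those two metrics could look like, assuming `kube_pod_status_ready` carries a `condition` label and `kube_pod_container_status_ready` is a 0/1 gauge per container. Neither query is taken from this PR:

```promql
# Pods whose Ready condition is currently "false".
sum by (namespace, pod) (
  kube_pod_status_ready{condition="false"}
) > 0

# Individual containers that are not ready.
kube_pod_container_status_ready == 0
```

Pods that ran to completion may also report Ready as false, so a real rule would likely need further filtering, for example by phase or owner.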
