Adding more labels to kube_pod_status_phase #332

Closed
rajatjindal opened this issue Jan 4, 2018 · 4 comments
Comments

@rajatjindal

We had a situation where the number of failed pods increased dramatically, and we were wondering what had happened.

On debugging, we found that 2 nodes were having Docker issues and most of the failed pods had been scheduled on those problematic nodes.

I think it would be useful to add more labels to kube_pod_status_phase, so that we can run a query like the count of all failed pods grouped by node.

[Screenshot attached: 2018-01-04, 10:23:43 AM]

@rajatjindal
Author

I will be more than happy to open a PR if we agree that this is a reasonable ask.

@brancz
Member

brancz commented Jan 5, 2018

That can be done at query time. Prometheus supports joins, so you can join kube_pod_info onto kube_pod_status_phase on the pod and namespace labels to find out which node each pod is on.

@andyxning
Member

andyxning commented Jan 6, 2018

Closing this per @brancz's comment above.

In case you have not used the Prometheus join syntax before, there is an example in a comment on #137, with a query like:

sum by(node)(avg by(node,pod,namespace)(kube_pod_info{}) * on(pod, namespace) group_right(node) kube_pod_status_phase{phase="Failed"})
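For anyone unfamiliar with vector matching, here is a rough Python sketch of what that query computes. It is only an illustration on made-up samples, not how Prometheus evaluates it internally. The pod, namespace, and node names are hypothetical, and the sketch assumes there is exactly one kube_pod_info series per pod (which is what the `avg by(node,pod,namespace)` part of the query guarantees).

```python
from collections import defaultdict

# Made-up samples standing in for scraped series.
# kube_pod_info carries the node label; its value is always 1.
kube_pod_info = [
    {"namespace": "default", "pod": "web-1", "node": "node-a", "value": 1.0},
    {"namespace": "default", "pod": "web-2", "node": "node-b", "value": 1.0},
    {"namespace": "batch", "pod": "job-1", "node": "node-b", "value": 1.0},
]

# kube_pod_status_phase{phase="Failed"}: 1 if the pod is in that phase, else 0.
# It has no node label, which is why the join is needed.
kube_pod_status_phase_failed = [
    {"namespace": "default", "pod": "web-1", "phase": "Failed", "value": 0.0},
    {"namespace": "default", "pod": "web-2", "phase": "Failed", "value": 1.0},
    {"namespace": "batch", "pod": "job-1", "phase": "Failed", "value": 1.0},
    # No matching kube_pod_info series, so the join drops this one.
    {"namespace": "batch", "pod": "ghost", "phase": "Failed", "value": 1.0},
]


def failed_pods_by_node(info, phase):
    """Mimic: sum by(node)(info * on(pod, namespace) group_right(node) phase)."""
    # on(pod, namespace): index the info side by the matching labels.
    info_by_pod = {(s["namespace"], s["pod"]): s for s in info}

    totals = defaultdict(float)
    for s in phase:
        match = info_by_pod.get((s["namespace"], s["pod"]))
        if match is None:
            continue  # series without a match on the other side are dropped
        # group_right(node): the result takes the node label from the info
        # side; the value is the product of both sides (1 * phase value).
        totals[match["node"]] += match["value"] * s["value"]  # sum by(node)
    return dict(totals)


if __name__ == "__main__":
    print(failed_pods_by_node(kube_pod_info, kube_pod_status_phase_failed))
```

Because the kube_pod_info value is always 1, the multiplication leaves the phase value unchanged and only serves to bring the node label across.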

@rajatjindal
Author

Thank you very much, guys. I will try this out tonight.
