internal/store/pod.go: Only create waiting_reason series if pods are in waiting state #1378

Merged


lilic
Member

@lilic lilic commented Feb 8, 2021

This metric has extremely high cardinality, and since the 2.0 release is already a breaking one, this PR changes the waiting_reason metrics to include only series for containers that are in the Waiting state. We can now also include all the reasons, because cardinality will always be O(count of containers waiting). I made the adjustment for both containers and init containers.

The reduction in series can be seen in the unit tests. I also tried it on my cluster, and the result shows only the waiting containers:

# HELP kube_pod_container_status_waiting_reason Describes the reason the container is currently in waiting state.
# TYPE kube_pod_container_status_waiting_reason gauge
kube_pod_container_status_waiting_reason{namespace="default",pod="nginx-deployment-6b99d49f64-km96g",container="nginx",reason="ImagePullBackOff"} 1
kube_pod_container_status_waiting_reason{namespace="default",pod="nginx-deployment-6b99d49f64-2ghjb",container="nginx",reason="ImagePullBackOff"} 1
kube_pod_container_status_waiting_reason{namespace="default",pod="nginx-deployment-6b99d49f64-pgj2t",container="nginx",reason="ImagePullBackOff"} 1
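The idea behind the output above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Pod` and `ContainerStatus` types and the `waitingReasonSeries` function are hypothetical stand-ins for the real Kubernetes and kube-state-metrics types, and an empty `WaitingReason` is assumed to mean "not waiting". The point it shows is that a series is emitted only for a container that is waiting, so the series count tracks the number of waiting containers rather than containers times known reasons.

```go
package main

import "fmt"

// ContainerStatus is a hypothetical, simplified container status.
// WaitingReason is assumed to be empty when the container is not waiting.
type ContainerStatus struct {
	Name          string
	WaitingReason string
}

// Pod is a hypothetical, trimmed-down pod with only the fields used here.
type Pod struct {
	Namespace  string
	Name       string
	Containers []ContainerStatus
}

// waitingReasonSeries returns one exposition line per waiting container.
// Containers that are not waiting produce no series at all, instead of a
// 0-valued series for every known reason.
func waitingReasonSeries(pods []Pod) []string {
	var series []string
	for _, p := range pods {
		for _, c := range p.Containers {
			if c.WaitingReason == "" {
				continue // not waiting: emit nothing
			}
			series = append(series, fmt.Sprintf(
				"kube_pod_container_status_waiting_reason{namespace=%q,pod=%q,container=%q,reason=%q} 1",
				p.Namespace, p.Name, c.Name, c.WaitingReason))
		}
	}
	return series
}

func main() {
	pods := []Pod{
		{
			Namespace: "default",
			Name:      "nginx-deployment-6b99d49f64-km96g",
			Containers: []ContainerStatus{
				{Name: "nginx", WaitingReason: "ImagePullBackOff"},
				{Name: "sidecar"}, // running, so no series is produced
			},
		},
	}
	for _, line := range waitingReasonSeries(pods) {
		fmt.Println(line)
	}
}
```

Because the reason is copied from the container status rather than matched against a fixed list, any reason can appear as a label value without growing the series count beyond the number of waiting containers.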

I will open a separate PR to adjust the 0 values for other metrics as well.

@brancz @smarterclayton @vsxen can you please check whether this metric continues to work as you want?

Fixes #1321

waiting. This reduces the cardinality of this metric greatly, as it was
one of the highest cardinality metrics pre 2.0.
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 8, 2021
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 8, 2021
@brancz
Member

brancz commented Feb 8, 2021

Seems like CI is not happy, but the strategy and implementation are sound.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2021
@lilic lilic force-pushed the change-waiting-reason-metrics branch from cf137a6 to 9d147ca Compare February 8, 2021 12:56
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2021
@lilic
Member Author

lilic commented Feb 8, 2021

Thanks @brancz I always forget about our main tests :D PTAL, thanks!

/hold

Put a hold until one of the reporters confirms this fixes it 👍

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 8, 2021
docs/pod-metrics.md (outdated review thread, resolved)
@brancz
Member

brancz commented Feb 8, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2021
As with 2.0 we are breaking support for 1.x and previous stabilities.
In case we run into problems this gives us a chance to revert this.
@lilic lilic force-pushed the change-waiting-reason-metrics branch from 61a3881 to 93aeadc Compare February 8, 2021 14:23
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2021
@smarterclayton
Contributor

Looks great, thanks!

@lilic
Member Author

lilic commented Feb 8, 2021

/hold cancel

@brancz PTAL, I adjusted the docs so lgtm got dropped.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 8, 2021
@vsxen

vsxen commented Feb 8, 2021

LGTM

@brancz
Member

brancz commented Feb 9, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 9, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, lilic

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lilic lilic deleted the change-waiting-reason-metrics branch February 15, 2022 10:53