
Should allow aggregation of pod/container metrics by deployment #70

Closed

ghost opened this issue Jan 18, 2017 · 8 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments


ghost commented Jan 18, 2017

Pods and containers export important metrics like health or container restart count. These metrics are most useful when viewed at the Deployment aggregation level (i.e., summed over all pods belonging to the same Deployment) or at the ReplicaSet level. Metrics for individual pods are less useful, because a pod might go away for benign reasons.

To do the aggregation, I need labels that reference the Deployment. For example, for a standard pod named "foo-12345-fpgj" created by a Deployment, I'd need a label with the value "foo" that doesn't include the ReplicaSet identifier ("12345") or the pod identifier ("fpgj").

This bug is for tracking. We're already in touch with the Stackdriver and Kubernetes folks at Google who are hopefully making this happen.
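
For illustration, with such a label in place the aggregation could be as simple as the query below (the `deployment` label is hypothetical; it does not exist today, which is exactly the point of this request):

```promql
# Hypothetical: assumes kube-state-metrics exposed a `deployment` label on pod/container metrics
sum by (namespace, deployment) (kube_pod_container_status_restarts_total)
```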


brancz commented Jan 18, 2017

kube-state-metrics intends to mirror what the Kubernetes API exposes. If there is a clear connection between objects, I'm happy to expose that as a label on the kube_pod_info metric. A candidate for that could be an ownerReference, which, if a Pod belongs to a ReplicaSet, points at that ReplicaSet.

Alternatively, if you are using Prometheus, you can use relabelling rules to parse these things out of the Pod's name.
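
Roughly along these lines — an untested sketch that assumes pod names follow the usual `<deployment>-<replicaset hash>-<pod suffix>` convention, and that uses a placeholder scrape target:

```yaml
# Untested sketch: derive a `deployment` label from the `pod` label on series
# scraped from kube-state-metrics. Relies on the <deployment>-<hash>-<suffix>
# naming convention, which is a convention, not a guarantee.
scrape_configs:
  - job_name: kube-state-metrics
    static_configs:
      - targets: ['kube-state-metrics:8080']   # placeholder address
    metric_relabel_configs:
      - source_labels: [pod]
        regex: '(.+)-[a-z0-9]+-[a-z0-9]+'      # strip the last two name segments
        target_label: deployment
        replacement: '$1'
```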

More generally, regarding metric and label exposure, these rules apply:


juliusv commented Nov 6, 2017

So we have the ReplicaSet information in the kube_pod_info metric, but as also mentioned in #27, one is usually not interested in the ReplicaSet directly, but the Deployment / DaemonSet / ... that created it. That is, the creator of the creator. Though including this two-level info could be seen as somewhat arbitrary (what if there is nothing above the ReplicaSet or a directly created pod, or what if even the Deployment was created by something higher up that you want to track?), it seems that 99% of pods are either directly rooted at something exactly 2 creation levels apart (Deployment, DaemonSet, ...) or started as standalone pods.

So I think it would be useful and not too problematic to include that information in the kube_pod_info metric somehow. The names of those objects should also not change during a pod's lifetime, meaning there shouldn't be a concern about denormalization changing all pod series here.

The question would be what to name the labels for this. We already have created_by_kind and created_by_name labels for the first parent, but what would the labels be called for the grandparent?


brancz commented Nov 6, 2017

Generally I'm all for this if we can get this information presented in a reasonable way.

The created_by_* labels are deprecated, as the underlying annotation on upstream objects is as well, in favor of OwnerReferences. The problem with that is that an object can have multiple owners, and then it's hard to create a single "owner's owner" label, as that can be a list rather than a single value.
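
For reference, this is roughly what the ownerReferences field looks like on a Pod created by a Deployment (illustrative values only). Note that it is a list, and that it points at the intermediate ReplicaSet rather than the Deployment itself:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: foo-12345-fpgj
  namespace: default
  ownerReferences:            # a list: a Pod can have more than one owner
    - apiVersion: apps/v1
      kind: ReplicaSet        # the direct owner; the Deployment is the owner's owner
      name: foo-12345
      uid: 00000000-0000-0000-0000-000000000000   # placeholder
      controller: true
      blockOwnerDeletion: true
```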


juliusv commented Nov 6, 2017

Ah damn, wasn't aware of multiple owners. That makes the whole thing harder indeed. Do you think multiple owners will actually be common, or an exceptional thing?


brancz commented Nov 7, 2017

We're already seeing that today, unfortunately, so we can't just make assumptions. While maybe not as simple as it should be, we can still solve this with a couple of recording rules in Prometheus using joins and group_left statements. I'm guessing that's what you were trying to avoid though 🙂 .
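
Something along these lines, as a rough and untested sketch — it leans on the created_by_* labels discussed above and on the `<deployment>-<hash>` ReplicaSet naming convention, so treat the derived `deployment` label and the regex as approximations rather than a recommendation:

```yaml
groups:
  - name: pod-aggregation-by-deployment
    rules:
      # Untested sketch: attach an approximate `deployment` label to restart
      # counts by joining against kube_pod_info, then aggregate.
      # kube_pod_info has the value 1, so multiplying preserves the counter.
      - record: namespace_deployment:kube_pod_container_status_restarts:sum
        expr: |
          sum by (namespace, deployment) (
            kube_pod_container_status_restarts_total
            * on (namespace, pod) group_left(deployment)
              label_replace(
                kube_pod_info{created_by_kind="ReplicaSet"},
                "deployment", "$1", "created_by_name", "(.+)-[a-z0-9]+"
              )
          )
```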

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) on Feb 7, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

k8s-ci-robot added the lifecycle/rotten label (Denotes an issue or PR that has aged beyond stale and will be auto-closed.) and removed the lifecycle/stale label on Mar 9, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
