Resource Metrics for CRDs #1210

mrueg · 2020-08-18T07:19:17Z

/kind feature

With the growing use of CRDs in K8s it might be useful to monitor those as well and provide some initial metrics for them.

Idea:
Have kube-state-metrics read a configuration file that allows the user to list further resources.

A list of CRDs to monitor:

- servicemonitors.monitoring.coreos.com
- certificaterequests.cert-manager.io

could create metrics similar to https://github.com/kubernetes/kube-state-metrics/blob/master/docs/configmap-metrics.md

kube_CRD_NAME_info
kube_CRD_NAME_created
kube_CRD_NAME_metadata_resource_version
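
A minimal sketch of how the CRD_NAME placeholder above might be filled in from the configured list. The helper names and the choice to use only the plural resource name (the part before the first dot) are assumptions for illustration, not existing kube-state-metrics behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// metricPrefix derives a hypothetical metric prefix from a CRD name such as
// "servicemonitors.monitoring.coreos.com". Assumption: only the plural
// resource name before the first dot is used.
func metricPrefix(crdName string) string {
	plural, _, _ := strings.Cut(crdName, ".")
	return "kube_" + strings.ToLower(plural)
}

// proposedMetrics returns the three metric names suggested in this issue
// for a single CRD.
func proposedMetrics(crdName string) []string {
	p := metricPrefix(crdName)
	return []string{
		p + "_info",
		p + "_created",
		p + "_metadata_resource_version",
	}
}

func main() {
	// The two CRDs listed in the issue description.
	crds := []string{
		"servicemonitors.monitoring.coreos.com",
		"certificaterequests.cert-manager.io",
	}
	for _, crd := range crds {
		for _, m := range proposedMetrics(crd) {
			fmt.Println(m)
		}
	}
}
```

Dropping the API group keeps the names short, but two CRDs with the same plural name in different groups would then collide, so a real design would likely need the group as a label or as part of the name.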

tariq1890 · 2020-08-25T05:45:17Z

Looks like this PR would need to be revived: #515

I agree that it would be a great addition to kube-state-metrics. I am also wondering if we should take out VPA metrics and have them be monitored as CRDs. VPAs aren't considered as k8s primitive APIs.

/help

k8s-ci-robot · 2020-08-25T05:45:18Z

@tariq1890:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

dgrisonnet · 2020-10-21T14:28:05Z

Is anyone currently working on this? I would be interested in implementing this feature.

lilic · 2020-10-22T13:37:29Z

Hey @dgrisonnet, there is no real plan on the roadmap for this. I would prefer native instrumentation for CRs. While I understand it's not always possible, I have my doubts about CR instance metrics being added to kube-state-metrics.

In any case I would prefer to wait for release-2.0 to be out before we do this.

dgrisonnet · 2020-10-22T15:35:53Z

I completely agree with you that CRs should not be monitored by kube-state-metrics. However, in my opinion, CustomResourceDefinitions should be, as they are native Kubernetes resources.
Following what was done in #515 and @mrueg's proposal, I would say that adding the following metrics seems useful:

kube_customresourcedefinition_info{group="", name=""}
kube_customresourcedefinition_created{group="", name=""}
kube_customresourcedefinition_metadata_resource_version{group="", name=""}
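
As an illustration only, the three series could be rendered in the Prometheus text format roughly like this. The `crd` struct is a local stand-in rather than the apiextensions type, and using the resource version directly as the sample value is an assumption.

```go
package main

import "fmt"

// crd holds the few fields of a CustomResourceDefinition this sketch needs.
// It is a hypothetical stand-in, not the apiextensions API type.
type crd struct {
	Group           string
	Name            string // plural resource name, e.g. "servicemonitors"
	CreatedUnix     int64  // creation timestamp in seconds since the epoch
	ResourceVersion string // assumed to be numeric here
}

// renderCRDMetrics returns one exposition line per proposed metric.
// %q is used for label values, which is adequate for plain ASCII names.
func renderCRDMetrics(c crd) []string {
	labels := fmt.Sprintf("group=%q,name=%q", c.Group, c.Name)
	return []string{
		fmt.Sprintf("kube_customresourcedefinition_info{%s} 1", labels),
		fmt.Sprintf("kube_customresourcedefinition_created{%s} %d", labels, c.CreatedUnix),
		fmt.Sprintf("kube_customresourcedefinition_metadata_resource_version{%s} %s", labels, c.ResourceVersion),
	}
}

func main() {
	example := crd{
		Group:           "monitoring.coreos.com",
		Name:            "servicemonitors",
		CreatedUnix:     1603447067,
		ResourceVersion: "12345", // made-up value
	}
	for _, line := range renderCRDMetrics(example) {
		fmt.Println(line)
	}
}
```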

> In any case I would prefer to wait for release-2.0 to be out before we do this.

Sure, no worries. I wasn't intending to put pressure on getting this feature in 2.0.

lilic · 2020-10-23T09:19:25Z

> Sure, no worries. I wasn't intending to put pressure on getting this feature in 2.0.

No pressure at all! :)

Seems interesting. @mrueg @dgrisonnet what value does this bring to you? Can you describe some use cases for how you would use those metrics in queries, alerts, dashboards, etc.? Thanks! We like to have some use cases so that we don't add things just for the sake of adding them, only for folks not to use them and for them to be nothing but extra series. Hope that makes sense.

mrueg · 2020-10-23T09:29:15Z

@lilic
As mentioned in the initial issue post, what immediately comes to my mind are prometheus-operator's servicemonitors.
Having those metrics would allow alerting on a bad configuration if the target does not appear in the Prometheus instance where it should be (which can easily happen, for example, if you set up prometheus{,-operator} to select only servicemonitors with specific labels and the servicemonitor is missing them).

Another use case is cert-manager's CRDs, where one could use the CRD metrics to verify that the actual secrets, including the certificate and key, get created.

lilic · 2020-10-23T10:00:06Z

> As mentioned in the initial issue post, what immediately comes to my mind are prometheus-operator's servicemonitors.
> Having those metrics would allow alerting on a bad configuration if the target does not appear in the Prometheus instance where it should be

Not sure how kube_customresourcedefinition_created{group="monitoring.coreos.com", name="servicemonitors"} 1603447067, which just tells you when the custom resource definition was registered with the API, can give you that info. Can you explain a bit more? I'm not saying it's a bad idea, I just want a bit more detail. Are you planning on combining that with other metrics, for example? I think for this you would need CR instance metrics, no?

brancz · 2020-10-23T11:57:39Z

The Prometheus operator already exposes metrics about its CRs itself, which is how I believe it should be. cert-manager should do the same.

mrueg · 2020-10-26T16:19:22Z

> > As mentioned in the initial issue post, what immediately comes to my mind are prometheus-operator's servicemonitors.
> > Having those metrics would allow alerting on a bad configuration if the target does not appear in the Prometheus instance where it should be
>
> Not sure how kube_customresourcedefinition_created{group="monitoring.coreos.com", name="servicemonitors"} 1603447067, which just tells you when the custom resource definition was registered with the API, can give you that info. Can you explain a bit more? I'm not saying it's a bad idea, I just want a bit more detail. Are you planning on combining that with other metrics, for example? I think for this you would need CR instance metrics, no?

Apologies, I should have read more closely. Yes, I was thinking about instance-level metrics.

The generic ones could be interesting for showing when a CRD was updated (unless the tools using them already provide that info).

lilic · 2020-10-27T09:19:31Z

> Apologies, I should have read more closely. Yes, I was thinking about instance-level metrics.

Not sure we would do instance-level metrics for custom resources at this point, but if you come up with a mini proposal on the pros and cons we can have a look. A Google Doc is fine, or just a more detailed description in a new issue, as I would like to keep this one for CRD metrics. Thanks!

fejta-bot · 2021-01-25T09:35:47Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

lilic · 2021-01-25T09:39:36Z

/remove-lifecycle stale

I believe this is still valid. @mrueg do you want me to label this as help wanted or do you want to tackle the proposal part yourself? Thanks!

fejta-bot · 2021-04-25T10:39:00Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

grzesuav · 2021-04-25T15:28:49Z

Was it implemented? If not, upvoting ;)

grzesuav · 2021-04-25T15:46:44Z

My use case, maybe: I have dashboards showing information about resources related to one of the operators (ConfigMaps, Secrets), and they lack information about CRs.

lilic · 2021-04-26T09:32:30Z

@grzesuav this was not implemented, it's up for grabs if you are interested in tackling it?

grzesuav · 2021-04-28T16:52:30Z

Hi @lilic, I can try. I am not familiar with the project though, so I will try to look at it next weekend and ask more questions about the feasibility of the implementation, if that's ok.

lilic · 2021-04-29T08:04:31Z

@grzesuav sounds great! Best would be to come up with just a small proposal here on the issue before you start doing a PR, might be good to sync with @mrueg who had an initial idea and wants this as well, so both requests are answered. Thanks again! 🎉

grzesuav · 2021-05-18T21:10:06Z

Just one thing @lilic: as I noticed that kube-state-metrics uses the typed API, is it ok to focus on CRD v1, introduced in k8s 1.16, and omit v1beta1?

lilic · 2021-06-03T08:48:15Z

Yes that sounds great! 👍

grzesuav · 2021-06-29T17:46:10Z

Hi, I have a question related to the implementation. I have a simple draft here: #1517.

The challenge I currently have is that the typed client for CRDs is in the k8s.io/apiextensions-apiserver v0.21.0 package, and its interface is a bit different:

func(kubeClient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset".Interface, ns string) "k8s.io/client-go/tools/cache".ListerWatcher

than

func(kubeClient "k8s.io/client-go/kubernetes".Interface, ns string) "k8s.io/client-go/tools/cache".ListerWatcher

therefore I have some questions:

- Is there a nice way to abstract this? The problem I see is that the whole design here heavily uses the typed builder, so I am unsure what the best way would be. One option is to create an internal client interface and make both clientset interfaces compatible with it or wrapped by it (but that would require typing changes in many places).
- Another idea is to use the Unstructured API to get CRDs without the apiextensions dependency. That would require more extraction and tooling in the crd package, but would leave the rest of the design untouched.
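
To make the first option more concrete, here is a rough sketch of wrapping both clientsets behind one function signature. All types below are local stand-ins for `kubernetes.Interface`, the apiextensions `clientset.Interface`, and `cache.ListerWatcher` so that the example compiles on its own. The names `clients`, `listWatchFunc`, `fromKube`, and `fromAPIExt` are hypothetical and are not taken from #1517 or from kube-state-metrics.

```go
package main

import "fmt"

// Stand-ins for the real interfaces; each has one method so the fakes below
// have something to implement.
type kubeInterface interface{ Kind() string }
type apiExtInterface interface{ Kind() string }
type listerWatcher interface{ Source() string }

// clients bundles both clientsets so every store builder can share one
// function signature.
type clients struct {
	Kube   kubeInterface
	APIExt apiExtInterface
}

// listWatchFunc is the unified signature replacing the two differing ones.
type listWatchFunc func(c clients, ns string) listerWatcher

// fromKube adapts a builder that expects the core clientset.
func fromKube(f func(kubeInterface, string) listerWatcher) listWatchFunc {
	return func(c clients, ns string) listerWatcher { return f(c.Kube, ns) }
}

// fromAPIExt adapts a builder that expects the apiextensions clientset.
func fromAPIExt(f func(apiExtInterface, string) listerWatcher) listWatchFunc {
	return func(c clients, ns string) listerWatcher { return f(c.APIExt, ns) }
}

// Fakes used only to demonstrate the wiring.
type fakeClient string

func (f fakeClient) Kind() string { return string(f) }

type fakeLW string

func (f fakeLW) Source() string { return string(f) }

func main() {
	c := clients{Kube: fakeClient("core"), APIExt: fakeClient("apiextensions")}
	builders := map[string]listWatchFunc{
		"secrets": fromKube(func(k kubeInterface, ns string) listerWatcher {
			return fakeLW(k.Kind() + "/" + ns + "/secrets")
		}),
		"customresourcedefinitions": fromAPIExt(func(a apiExtInterface, ns string) listerWatcher {
			return fakeLW(a.Kind() + "/" + ns + "/customresourcedefinitions")
		}),
	}
	for name, build := range builders {
		fmt.Println(name, "->", build(c, "default").Source())
	}
}
```

With adapters like these, existing builders would keep their bodies and only be wrapped at registration time, at the cost of threading the bundled `clients` value through the builder code.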

additional questions:

- For now I am adding metrics for CRD objects. I think it would also be beneficial to have particular CRs in, but I would keep that as a separate PR so this one does not get too big.
- Making Prometheus labels from Kubernetes labels: from what I noticed, there is a certain allowlist to filter which labels should be included. How does it work? I noticed my metrics have additional labels (on the Secrets), so is it configurable? I mean, how can I extend the list of labels attached there (on the cluster)?
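
For context on the allowlist question: kube-state-metrics 2.x documents a `--metric-labels-allowlist` flag for choosing which Kubernetes labels are exposed. The sketch below does not reproduce its implementation. It only illustrates the general idea of filtering labels and turning the keys into Prometheus-safe names with a `label_` prefix. The function name and the exact sanitizing rule are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// invalidLabelChar matches every character that this sketch replaces with "_"
// when building a Prometheus label name.
var invalidLabelChar = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// promLabels keeps only the Kubernetes labels named in allow and converts each
// key to a Prometheus-safe name with a "label_" prefix. A "*" entry allows
// every label.
func promLabels(k8sLabels map[string]string, allow []string) map[string]string {
	allowed := make(map[string]bool, len(allow))
	for _, a := range allow {
		allowed[a] = true
	}
	out := make(map[string]string)
	for k, v := range k8sLabels {
		if allowed["*"] || allowed[k] {
			out["label_"+invalidLabelChar.ReplaceAllString(k, "_")] = v
		}
	}
	return out
}

func main() {
	labels := map[string]string{
		"app.kubernetes.io/name": "vault",
		"team":                   "platform",
	}
	got := promLabels(labels, []string{"app.kubernetes.io/name"})

	// Sort the keys so the output order is stable.
	keys := make([]string, 0, len(got))
	for k := range got {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%q\n", k, got[k])
	}
}
```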

That is all I have, please let me know what you think,

Cheers

grzesuav · 2021-07-12T21:47:37Z

⬆️ @lilic @mrueg would appreciate some feedback on that ;)

robbie-demuth · 2021-08-06T14:24:35Z

It looks like @mrueg's feature request was for instance-level (i.e. custom resource) metrics, but the conversation (and in-flight PR) has shifted to focus on custom resource definition metrics. I'm not going to get into technical details, level of effort, etc, but I'd like to give a plus 1 for instance-level metrics. All custom resources have the same metadata (i.e. ObjectMeta). I think it'd be incredibly valuable to serve metrics about that metadata for free (or via some opt-in API)

k8s-triage-robot · 2021-09-05T15:18:56Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot · 2021-10-05T15:39:43Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot · 2021-10-05T15:40:03Z

@k8s-triage-robot: Closing this issue.

RafaeLeal · 2022-08-11T01:34:56Z

Hey, was this implemented? I'm not familiar with the codebase, but I can come up with a proposal if that's something the maintainers want to put forward. I'm interested in instance-level metadata metrics, especially kube_CRD_NAME_labels would be very useful for me.

My use case: I maintain a CICD cluster using TektonCD. Tekton controllers manage TaskRuns and PipelineRuns, which are CRDs to use as building blocks for a CI/CD system. Tekton already has some metrics at the PipelineRun and TaskRun levels, but I'd like to be able to add labels on them and aggregate them arbitrarily. To give a quick example, I could add a label on PipelineRuns generated by a repository and calculate the average deployment time, and things like that. Joining the existing metrics with kube_pipelineruns_labels would be enough for me.
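
A small sketch of the join described above, written in Go rather than PromQL so it can be run on its own. It mimics a `* on(key) group_left(extra)` style join against an info-style series whose value is always 1. The label names `pipelinerun` and `label_repo` and the sample values are made up for illustration.

```go
package main

import "fmt"

// sample is a simplified metric sample: a label set plus a value.
type sample struct {
	Labels map[string]string
	Value  float64
}

// joinLabels copies the extra label from the matching info sample onto each
// data sample and multiplies the values. Because the info series is a
// constant 1, the data value is unchanged.
func joinLabels(data, info []sample, key, extra string) []sample {
	byKey := make(map[string]sample, len(info))
	for _, s := range info {
		byKey[s.Labels[key]] = s
	}
	var out []sample
	for _, d := range data {
		i, ok := byKey[d.Labels[key]]
		if !ok {
			continue // no matching info series, so the sample is dropped
		}
		labels := make(map[string]string, len(d.Labels)+1)
		for k, v := range d.Labels {
			labels[k] = v
		}
		labels[extra] = i.Labels[extra]
		out = append(out, sample{Labels: labels, Value: d.Value * i.Value})
	}
	return out
}

func main() {
	// Hypothetical duration samples from an existing per-PipelineRun metric.
	durations := []sample{
		{Labels: map[string]string{"pipelinerun": "deploy-1"}, Value: 120},
		{Labels: map[string]string{"pipelinerun": "deploy-2"}, Value: 180},
	}
	// Hypothetical kube_pipelineruns_labels samples.
	info := []sample{
		{Labels: map[string]string{"pipelinerun": "deploy-1", "label_repo": "shop"}, Value: 1},
		{Labels: map[string]string{"pipelinerun": "deploy-2", "label_repo": "shop"}, Value: 1},
	}
	joined := joinLabels(durations, info, "pipelinerun", "label_repo")
	sum := 0.0
	for _, s := range joined {
		sum += s.Value
	}
	fmt.Printf("average duration for repo shop: %.0fs\n", sum/float64(len(joined)))
}
```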

grzesuav · 2022-08-11T13:27:59Z

@RafaeLeal yes it was, (not by me) - https://github.com/kubernetes/kube-state-metrics/blob/master/docs/customresourcestate-metrics.md

RafaeLeal · 2022-08-11T13:43:46Z

@grzesuav reading this, I'm not sure we can implement a labels metric with it.
I could have something like this:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "labels"
          help: "Foo labels"
          each: ???
          labelsFromPath:
            "*": [metadata, labels]

But I don't know exactly what to put in each, because the value is a constant 1 in the kube_*_labels metrics 🤔

Maybe we should extend the API to allow this?

k8s-ci-robot added the kind/feature label Aug 18, 2020

mrueg changed the title from "Ressource Metrics for CRDs" to "Resource Metrics for CRDs" Aug 20, 2020

k8s-ci-robot added the help wanted label Aug 25, 2020

lilic added the after-2.0 label and removed the help wanted label Oct 22, 2020

k8s-ci-robot added the lifecycle/stale label Jan 25, 2021

k8s-ci-robot removed the lifecycle/stale label Jan 25, 2021

k8s-ci-robot added the lifecycle/stale label Apr 25, 2021

grzesuav added a commit to grzesuav/kube-state-metrics that referenced this issue Jun 29, 2021

feat(CRD's): kubernetes#1210 - Add resource metrics for CRD's

aa3fcc7

grzesuav mentioned this issue Jun 29, 2021

feat(CRD's): #1210 - Add resource metrics for CRD's #1517

Closed

robbie-demuth mentioned this issue Aug 6, 2021

Custom resource metrics kubernetes-sigs/kubebuilder#2286

Closed

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Sep 5, 2021

k8s-ci-robot closed this as completed Oct 5, 2021

dgrisonnet mentioned this issue Dec 6, 2021

Extend kube-state-metrics to support Custom Resource metrics #1640

Closed
