Simplify custom resource metrics API by leveraging jq/CEL #1978

CatherineF-dev · 2023-02-07T03:52:06Z

What would you like to be added:

Doc: #2059

A simplified API for CustomResourceStateMetrics that supports only values and labels, instead of each, path, labelFromKey, labelsFromPath, valueFrom, commonLabels, and *.

# new 
kind: CustomResourceStateMetricsV2
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "ready_count"
          help: "Number Foo Bars ready"
          values: jq '[.status.sub[].ready]' # valueFrom: [ready] // [2,4]
          labels:
          - jq '[ .status.sub | keys | .[] | {type: .}]' # labelFromKey: type // [{"type": "type-a"}, {"type": "type-b"}]
          - jq '[{ custom_metric:"yes" }]' # custom_metric: "yes" // [{"custom_metric": "yes"}]
          - jq '[.metadata.labels]' # "*": [metadata, labels] // [{"bar": "baz","qux": "quxx"}]
          - jq '[.metadata.annotations]' # "**": [metadata, annotations] // [{"foo": "bar"}]
          - jq '[{ name: .metadata.name }]' # name: [metadata, name] // [{"name": "foo"}]
          - jq '[{ foo: .metadata.labels.foo }]' # foo: [metadata, labels, foo] // [{"foo": "bar"}]
          - jq '[.status.sub[].active | {active: .}]' # labelsFromPath: active: [active] // [{"active": 1}, {"active": 3}]
          
# old
      metrics:
        - name: "status_phase"
          help: "Foo status_phase"
          each:
            type: StateSet
            stateSet:
              labelName: phase
              path: [status, phase]
              list: [Pending, Bar, Baz]

kube_customresource_ready_count{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", active="1",custom_metric="yes",foo="bar",name="foo",bar="baz",qux="quxx",type="type-a"} 2
kube_customresource_ready_count{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", active="3",custom_metric="yes",foo="bar",name="foo",bar="baz",qux="quxx",type="type-b"} 4
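One way to read the example above: `values` yields a list of N samples, and each `labels` expression yields either a single label set (applied to every sample) or N label sets (paired with samples by index). The Python sketch below mimics only that combination step, using hand-written results in place of real jq evaluation. The broadcasting rule and the `combine` helper are assumptions made for illustration; they are not stated in the proposal and are not kube-state-metrics code.

```python
from typing import Dict, List, Tuple


def combine(
    values: List[float],
    label_sets: List[List[Dict[str, object]]],
) -> List[Tuple[Dict[str, str], float]]:
    """Pair each value with the labels that apply to it.

    Assumption: a label list of length 1 applies to every value, and a list
    whose length equals len(values) is matched by index. Other lengths are
    rejected.
    """
    samples = []
    for i, value in enumerate(values):
        merged: Dict[str, str] = {}
        for label_list in label_sets:
            if len(label_list) == 1:
                entry = label_list[0]
            elif len(label_list) == len(values):
                entry = label_list[i]
            else:
                raise ValueError("label list length must be 1 or match values")
            # Prometheus label values are strings, so stringify everything.
            merged.update({k: str(v) for k, v in entry.items()})
        samples.append((merged, value))
    return samples


# Evaluated results from the example, written out by hand (no jq involved).
values = [2, 4]
label_sets = [
    [{"type": "type-a"}, {"type": "type-b"}],
    [{"custom_metric": "yes"}],
    [{"name": "foo"}],
    [{"active": 1}, {"active": 3}],
]

for labels, value in combine(values, label_sets):
    rendered = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    print(f"kube_customresource_ready_count{{{rendered}}} {value}")
```

Run as-is, this prints one sample per entry in `values`. Whether mismatched list lengths should be an error or be handled some other way is a design question the proposal leaves open.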

Why is this needed:

Simpler and easier for the KSM community to maintain. We have seen several issues around corner cases with custom resource metrics (e.g. Crash on nonexistent metric paths in custom resources #1992).
Easier for users to use and debug custom resource metrics.

Describe the solution you'd like

Additional context
Han recommended CEL.


chrischdi · 2023-02-07T11:25:37Z

I like the idea of cleaning up the configuration. When doing so, we should take care that we can still address all the use cases that are covered today.

IMHO: if we introduce a new version of the configuration, we should do it in a way that allows auto-conversion from the old configuration, using a custom config type as in https://book.kubebuilder.io/component-config-tutorial/config-type.html .
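As a rough illustration of what one piece of such auto-conversion could look like, the sketch below turns an old-style path list such as `[metadata, labels, foo]` into a jq-style path expression, and an old `labelsFromPath` mapping into the one-label expressions used in the issue description. The function names and the quoting rule are hypothetical; a real converter would also have to cover `each`, `labelFromKey`, `valueFrom`, and the other operations.

```python
import json
import re
from typing import Dict, List

# Keys matching this pattern can use jq's short `.key` form.
_SIMPLE_KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def path_to_jq(path: List[str]) -> str:
    """Convert an old-style path list into a jq path expression."""
    if not path:
        return "."
    parts: List[str] = []
    for key in path:
        if _SIMPLE_KEY.match(key):
            parts.append("." + key)
        else:
            # Keys containing dots, slashes, or dashes need the bracket form.
            # json.dumps produces a double-quoted, escaped string literal.
            prefix = "." if not parts else ""
            parts.append(f"{prefix}[{json.dumps(key)}]")
    return "".join(parts)


def labels_from_path_to_jq(labels_from_path: Dict[str, List[str]]) -> List[str]:
    """Render each old `labelsFromPath` entry as a one-label jq object.

    Assumption: label names are plain identifiers and need no quoting.
    """
    return [
        f"[{{ {name}: {path_to_jq(path)} }}]"
        for name, path in labels_from_path.items()
    ]


print(path_to_jq(["metadata", "labels", "foo"]))
print(labels_from_path_to_jq({"name": ["metadata", "name"]}))
```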

Related issue: #1948.

logicalhan · 2023-02-09T17:52:39Z

/triage accepted
/assign @CatherineF-dev

dgrisonnet · 2023-02-09T18:22:43Z

@CatherineF-dev could you perhaps start a design doc highlighting the different options we have to improve the UX of the existing API?

CatherineF-dev · 2023-02-09T18:32:56Z

Okay!

CatherineF-dev · 2023-04-13T13:17:18Z

Verified that we can convert k8s objects into YAML: https://github.com/kubernetes/kube-state-metrics/compare/main...CatherineF-dev:kube-state-metrics:cr-metrics-2?expand=1

CatherineF-dev · 2023-04-13T14:31:54Z

Existing problems for KSM custom resource

1. Custom resource API is complicated and not flexible

Currently, it supports 7 operations:
each, path, labelFromKey, labelsFromPath, valueFrom, commonLabels and *.

This is not easy to use and leads to corner-case issues:

I am using this custom resource document, but I couldn't find a clear yes or no there. I want to capture replsets.<all-the-elements>.size. Is this possible?
"LabelFromKey" not available #1868
Crash on nonexistent metric paths in custom resources #1992
Can’t aggregate metrics across multiple CRs. For example, it can’t answer “How many CRs are there under one CRD?”
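To make the corner case in #1992 concrete: resolving a path against an object where an intermediate field is missing should yield "no value" rather than a crash. A minimal Python sketch of that behavior follows. `lookup` is a hypothetical helper written for this illustration; it is not kube-state-metrics code, which is written in Go.

```python
from typing import Any, List, Optional


def lookup(obj: Any, path: List[str]) -> Optional[Any]:
    """Follow `path` through nested dicts and lists.

    Returns None as soon as a step cannot be resolved, instead of raising.
    Note that a field that is present but null also comes back as None.
    """
    current = obj
    for key in path:
        if isinstance(current, dict):
            if key not in current:
                return None
            current = current[key]
        elif isinstance(current, list):
            # List elements are addressed by an integer index given as a string.
            try:
                current = current[int(key)]
            except (ValueError, IndexError):
                return None
        else:
            # Scalars have no children, so the path cannot continue.
            return None
    return current


foo = {"status": {"phase": "Pending", "sub": [{"ready": 2}]}}
print(lookup(foo, ["status", "phase"]))
print(lookup(foo, ["status", "missing", "ready"]))
```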

Existing proposals:
- PR #2014 proposes a metric generation tool to create configurations from the monitored CRD.
- Issue #1978 proposes to simplify the API from 7 operations to 2 operations using jq expressions. It can also support some aggregations, to answer questions such as “How many CRs are there under one CRD?”
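The aggregation question can be illustrated independently of jq or CEL: group the observed objects by group, version, and kind, then count them. The sketch below does this over plain dictionaries. The `count_by_gvk` name and the sample objects are made up for illustration.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_by_gvk(objects: Iterable[Dict[str, Any]]) -> Counter:
    """Count objects per (group, version, kind)."""
    counts: Counter = Counter()
    for obj in objects:
        # "myteam.io/v1" splits into group "myteam.io" and version "v1";
        # a bare "v1" (core API group) yields an empty group.
        group, _, version = obj.get("apiVersion", "").rpartition("/")
        counts[(group, version, obj.get("kind", ""))] += 1
    return counts


objects = [
    {"apiVersion": "myteam.io/v1", "kind": "Foo", "metadata": {"name": "a"}},
    {"apiVersion": "myteam.io/v1", "kind": "Foo", "metadata": {"name": "b"}},
    {"apiVersion": "myteam.io/v1", "kind": "Bar", "metadata": {"name": "c"}},
]

for (group, version, kind), n in sorted(count_by_gvk(objects).items()):
    print(group, version, kind, n)
```

The per-object operations in the current API each see a single resource at a time, which is why a count like this cannot be expressed with them.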

2. Coupled monitoring pipeline and monitoring target

To monitor a custom resource, you need to modify the configuration of the kube-state-metrics agent itself:

--custom-resource-state-config "inline yaml (see example)" 
--custom-resource-state-config-file /path/to/config.yaml

Existing proposals:
Issue #1948 proposes to support a CustomResourceDefinition (CRD).

Proposal

Support both:
- Issue #1948 proposes to support a CustomResourceDefinition (CRD).
- Issue #1978 proposes to simplify the API from 7 operations to 2 operations.

CatherineF-dev · 2023-04-13T14:32:46Z

cc @dgrisonnet,

What else do I need to add into #1978 (comment)? Thx!

dgrisonnet · 2023-04-13T15:10:28Z

Feel free to open a PR adding your design doc under docs/design and we can review it from there

nathanperkins · 2023-04-19T23:23:35Z

Would this be capable of parsing annotation values as JSON? In some cases, we have controllers that save state on an object as JSON annotations. It would be nice to expose fields within those JSON objects.
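For reference, the kind of extraction being asked about looks roughly like this when written out in Python: decode the annotation string as JSON, then read fields from the result. The annotation key `example.com/state` and the `annotation_fields` helper are hypothetical and exist only for this sketch.

```python
import json
from typing import Any, Dict


def annotation_fields(obj: Dict[str, Any], annotation: str) -> Dict[str, str]:
    """Decode a JSON-valued annotation and return its scalar fields as labels."""
    raw = obj.get("metadata", {}).get("annotations", {}).get(annotation)
    if raw is None:
        return {}
    try:
        decoded = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON (or not a string at all): expose nothing.
        return {}
    if not isinstance(decoded, dict):
        return {}
    # Keep only scalar fields; nested values do not map cleanly onto labels.
    return {
        key: str(value)
        for key, value in decoded.items()
        if isinstance(value, (str, int, float, bool))
    }


foo = {
    "metadata": {
        "name": "foo",
        "annotations": {
            "example.com/state": '{"phase": "Ready", "retries": 3, "history": [1, 2]}'
        },
    }
}
print(annotation_fields(foo, "example.com/state"))
```

In jq, the decoding step corresponds to the `fromjson` builtin; whether the new API would expose an equivalent is up to the design.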

chrischdi · 2023-04-25T09:57:10Z

One point that came to mind, and that we should consider if this gets done: performance!

logicalhan · 2023-04-25T20:49:22Z

I feel like we should use CEL since that's the direction that Kubernetes is moving.

CatherineF-dev · 2023-05-08T14:17:52Z

@nathanperkins, I think CEL can parse annotation values as JSON, so it's feasible.

The design doc is here: Simplify custom resource state metrics API using CEL (#2059)

Also, it can support counting the number of CRs under one CRD. Or anything else which can be queried using CEL.

CatherineF-dev · 2024-01-16T18:18:37Z

I found that we can reuse some code from the Custom Resource Field Selectors KEP:

https://github.com/kubernetes/enhancements/blob/b3f29fe1223ebf09858ad3289dbfe3f652dd6069/keps/sig-api-machinery/4358-custom-resource-field-selectors/README.md

CatherineF-dev added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 7, 2023

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 7, 2023

k8s-ci-robot assigned CatherineF-dev Feb 9, 2023

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 9, 2023

CatherineF-dev linked a pull request May 7, 2023 that will close this issue

Simplify custom resource state metrics API using CEL #2059

Draft

CatherineF-dev changed the title ~~Simplify custom resource metrics API by leveraging jq or yq~~ Simplify custom resource metrics API by leveraging CEL May 8, 2023

CatherineF-dev changed the title ~~Simplify custom resource metrics API by leveraging CEL~~ Simplify custom resource metrics API by leveraging jq/CEL May 8, 2023

CatherineF-dev mentioned this issue May 8, 2023

CustomResourceStateMetrics Wildcard in paths #2057

Closed

CatherineF-dev mentioned this issue May 18, 2023

Support for Unknown status condition in Custom Resources #2070

Open

CatherineF-dev mentioned this issue Jun 14, 2023

feat: Support list expansions #2068

Open

rexagod mentioned this issue Jun 23, 2023

refactor: KSM Cyclomatic fix for Job and PV files #2092

Closed

david-martin mentioned this issue Aug 17, 2023

metrics: Migrate config to jq/CEL when it's supported Kuadrant/gateway-api-state-metrics#2

Open

CatherineF-dev mentioned this issue Jan 16, 2024

KEP: Custom Resource Field Selectors kubernetes/enhancements#4359

Merged
