Add retention policy metrics for KEP-1847 #1876

Merged
merged 1 commit into from Nov 8, 2022

Conversation

mattcary
Contributor

@mattcary mattcary commented Nov 2, 2022

What this PR does / why we need it:

Add metrics for new StatefulSet fields from KEP-1847.

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)
Increases it by adding a new metric with a cardinality of 5 × the number of StatefulSets.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 2, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 2, 2022
@mattcary mattcary changed the title Add retention policy metrics for KEP-1847 [WIP] Add retention policy metrics for KEP-1847 Nov 2, 2022
@mattcary
Contributor Author

mattcary commented Nov 7, 2022

/retest

@mattcary
Contributor Author

mattcary commented Nov 7, 2022

/retest

@mattcary mattcary force-pushed the kep1847 branch 2 times, most recently from c71fd2a to 64a468c Compare November 7, 2022 18:11
@mattcary
Contributor Author

mattcary commented Nov 7, 2022

/retest

@mattcary mattcary marked this pull request as ready for review November 7, 2022 18:39
@mattcary mattcary changed the title [WIP] Add retention policy metrics for KEP-1847 Add retention policy metrics for KEP-1847 Nov 7, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 7, 2022
@mattcary
Contributor Author

mattcary commented Nov 7, 2022

Hi, this is failing the e2e tests because the field in question is behind a feature gate.

I have a conversation going on kubernetes/kubernetes#111299 which led to this PR, but haven't gotten any response.

Does anyone here have any hints on how to get around the feature gate problem? Thanks!

@@ -63,6 +63,7 @@ func TestStatefulSetStore(t *testing.T) {
# HELP kube_statefulset_created [STABLE] Unix creation timestamp
# HELP kube_statefulset_labels [STABLE] Kubernetes labels converted to Prometheus labels.
# HELP kube_statefulset_metadata_generation [STABLE] Sequence number representing a specific generation of the desired state for the StatefulSet.
# HELP kube_statefulset_persistent_volume_claim_retention_policy Count of retention policy for StatefulSet template PVCs
Member

indentation here seems to be off

Contributor Author

thx, tabs vs spaces is the gift that keeps on giving...

if err := c.run(); err != nil {
t.Errorf("unexpected collecting result in %vth run:\n%s", i, err)
t.Errorf("unexpected collecting result in %v%s run:\n%s", i+1, ord, err)
Member

Suggested change
t.Errorf("unexpected collecting result in %v%s run:\n%s", i+1, ord, err)
t.Errorf("unexpected collecting result in run %v:\n%s", i, err)

Let's reduce complexity here and rephrase the error message instead

Contributor Author

Done, note that we need to use i+1 to have the index match the names used in the test cases (which is the practical way to figure out which test failed).
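
The point about `i+1` can be illustrated with a minimal sketch. It assumes the test cases are referred to by one-based run numbers; the `runLabel` helper and the `cases` slice are hypothetical names used for illustration and are not taken from this PR.

```go
package main

import "fmt"

// runLabel is a hypothetical helper: it converts the zero-based loop index
// into the one-based run number used when referring to test cases.
func runLabel(i int) string {
	return fmt.Sprintf("run %d", i+1)
}

func main() {
	// Hypothetical case names, for illustration only.
	cases := []string{"first case", "second case", "third case"}
	for i, name := range cases {
		// A failure reported with runLabel(i) points at the case a reader
		// would count to by hand, starting from 1.
		fmt.Printf("%s: %s\n", runLabel(i), name)
	}
}
```

Keeping the offset in one place means the error message format can stay simple while the reported number still matches the way the cases are counted.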

@mattcary
Contributor Author

mattcary commented Nov 8, 2022

@mrueg thanks for the comments, do you have any advice on the e2e test failure? I'm stuck with what to do.

@logicalhan
Member

@mrueg thanks for the comments, do you have any advice on the e2e test failure? I'm stuck with what to do.

This seems to be the error:

Error from server (BadRequest): error when creating "tests/manifests/statefulset.yaml": StatefulSet in version "v1" cannot be handled as a StatefulSet: strict decoding error: unknown field "spec.template.spec.persistentVolumeClaimRetentionPolicy"

Is this a new field?

@mattcary
Contributor Author

mattcary commented Nov 8, 2022

@mrueg thanks for the comments, do you have any advice on the e2e test failure? I'm stuck with what to do.

This seems to be the error:

Error from server (BadRequest): error when creating "tests/manifests/statefulset.yaml": StatefulSet in version "v1" cannot be handled as a StatefulSet: strict decoding error: unknown field "spec.template.spec.persistentVolumeClaimRetentionPolicy"

Is this a new field?

Yes, exactly.

I guess I can just remove the new field from the testing manifest, and skip e2e testing on this feature?

@@ -21,7 +21,7 @@ spec:
spec:
automountServiceAccountToken: true
containers:
- image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.6.0
- image: registry.k8s.io/kube-state-metrics/kube-state-metrics-amd64:v2.6.0
Member

This probably got added by accident.

Contributor Author

Yeah, every time I run the e2e tests by hand this gets changed :-/

@mattcary mattcary force-pushed the kep1847 branch 3 times, most recently from 5f5af02 to 0a8b56d Compare November 8, 2022 18:08
Member

@logicalhan logicalhan left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2022
@@ -193,6 +193,30 @@ func statefulSetMetricFamilies(allowAnnotationsList, allowLabelsList []string) [
}
}),
),
*generator.NewFamilyGeneratorWithStability(
"kube_statefulset_persistent_volume_claim_retention_policy",
Member

Suggested change
"kube_statefulset_persistent_volume_claim_retention_policy",
"kube_statefulset_persistentvolumeclaim_retention_policy",

Since we are talking about the PersistentVolumeClaim Kubernetes resource, it should be written as one word.

Contributor Author

done

@@ -95,6 +97,7 @@ func TestStatefulSetStore(t *testing.T) {
kube_statefulset_status_observed_generation{namespace="ns1",statefulset="statefulset1"} 1
kube_statefulset_replicas{namespace="ns1",statefulset="statefulset1"} 3
kube_statefulset_metadata_generation{namespace="ns1",statefulset="statefulset1"} 3
kube_statefulset_persistent_volume_claim_retention_policy{namespace="ns1",statefulset="statefulset1",when_deleted="nil",when_scaled="nil"} 1
Member

If the retention policy is nil, it would be better not to expose the metric: it doesn't bring much value, and the fewer time series we have to store, the better.

Contributor Author

done
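
The behavior requested in this thread (emit nothing when the policy is unset) could look roughly like the sketch below. It is a simplified illustration, not the code merged in this PR: `retentionPolicy`, `sample`, and `retentionPolicySamples` are hypothetical stand-ins for the real API and kube-state-metrics types, and only the `when_deleted` and `when_scaled` label names come from the diff above.

```go
package main

import "fmt"

// retentionPolicy is a simplified stand-in for the StatefulSet retention
// policy field; it is not the real Kubernetes API type.
type retentionPolicy struct {
	WhenDeleted string
	WhenScaled  string
}

// sample is a simplified stand-in for one exposed time series.
type sample struct {
	LabelKeys   []string
	LabelValues []string
	Value       float64
}

// retentionPolicySamples returns no samples when the policy is unset, so no
// time series is stored for StatefulSets that do not use the field.
func retentionPolicySamples(p *retentionPolicy) []sample {
	if p == nil {
		return nil
	}
	return []sample{{
		LabelKeys:   []string{"when_deleted", "when_scaled"},
		LabelValues: []string{p.WhenDeleted, p.WhenScaled},
		Value:       1,
	}}
}

func main() {
	// Unset policy: nothing is exposed.
	fmt.Println(len(retentionPolicySamples(nil)))
	// Set policy: exactly one series carrying both label values.
	fmt.Println(len(retentionPolicySamples(&retentionPolicy{WhenDeleted: "Retain", WhenScaled: "Delete"})))
}
```

Returning an empty result instead of a series labeled `nil` keeps the stored series count proportional to the StatefulSets that actually set the field.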

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2022
@@ -11,6 +11,7 @@
| kube_statefulset_status_observed_generation | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; | STABLE |
| kube_statefulset_replicas | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; | STABLE |
| kube_statefulset_metadata_generation | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; | STABLE |
| kube_statefulset_persistent_volume_claim_retention_policy | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; <br> `when_deleted`=&lt;statefulset-when-deleted-pvc-policy&gt; <br> `when_scaled`=&lt;statefulset-when-scaled-pvc-policy&gt; | ALPHA |
Member

Suggested change
| kube_statefulset_persistent_volume_claim_retention_policy | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; <br> `when_deleted`=&lt;statefulset-when-deleted-pvc-policy&gt; <br> `when_scaled`=&lt;statefulset-when-scaled-pvc-policy&gt; | ALPHA |
| kube_statefulset_persistentvolumeclaim_retention_policy | Gauge | `statefulset`=&lt;statefulset-name&gt; <br> `namespace`=&lt;statefulset-namespace&gt; <br> `when_deleted`=&lt;statefulset-when-deleted-pvc-policy&gt; <br> `when_scaled`=&lt;statefulset-when-scaled-pvc-policy&gt; | ALPHA |

Contributor Author

Yup, just saw the presubmit fail on that :-) thx

Member

@logicalhan logicalhan left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2022
@mattcary
Contributor Author

mattcary commented Nov 8, 2022

@mrueg Any other comments? Sorry to be a pest but this is blocking a k8s change that wants to make the code freeze. Thank you for your quick review!

@mrueg
Member

mrueg commented Nov 8, 2022

Thanks for the contribution!

/lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: logicalhan, mattcary, mrueg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 8, 2022
@k8s-ci-robot k8s-ci-robot merged commit dc1f1f3 into kubernetes:master Nov 8, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants