Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate kubelet custom metrics to stability framework part 2 #84987

Conversation

RainbowMango
Copy link
Member

What type of PR is this?
/kind feature

What this PR does / why we need it:
Currently, custom metrics emitted from kubelet do not offer any stability guarantees.

And #83062 tries to make it possible to do so.

About metrics stability please refer to KEP .

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:
IMO, we can use the stability framework to offer stability guarantee and drop the traditional versioned endpoint like xxx/v1alpha.

More details please refer to why not traditional versioned endpoint.

Please let me know if we need a follow-up issue or KEP.

Does this PR introduce a user-facing change?:

Following metrics from kubelet are now marked as with the ALPHA stability level:
node_cpu_usage_seconds_total
node_memory_working_set_bytes
container_cpu_usage_seconds_total
container_memory_working_set_bytes
scrape_error

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/20190404-kubernetes-control-plane-metrics-stability.md
- [Usage]: <link>
- [Other doc]: <link>

/priority important-soon
/milestone v1.17

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 8, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Nov 8, 2019
@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 8, 2019
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 8, 2019
@RainbowMango
Copy link
Member Author

/cc @dashpole @brancz @logicalhan

@dashpole
Copy link
Contributor

dashpole commented Nov 8, 2019

/lgtm
/approve

I agree that we should migrate away from the versioned endpoint after this PR. I think adding this prometheus handler at /metrics/resource in addition to /metrics/resource/v1alpha1 should be non-controversial. I'm not sure how long we should keep the /metrics/resource/v1alpha1 around, but 3 releases seems reasonable so.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2019
Copy link
Member

@logicalhan logicalhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@RainbowMango
Copy link
Member Author

/assign @vishh @yujuhong
@kubernetes/sig-node-pr-reviews

@josiahbjorgaard
Copy link
Contributor

A quick note that this PR will need to be merged before code freeze on 11/14 at 5PM pacific time for v1.17.

@derekwaynecarr
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, derekwaynecarr, logicalhan, RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 11, 2019
@k8s-ci-robot k8s-ci-robot merged commit ec86baf into kubernetes:master Nov 11, 2019
@RainbowMango RainbowMango deleted the pr_migrate_custom_collector_kubelet_part2 branch November 11, 2019 07:30
@bethforsyth
Copy link

@RainbowMango Given the reversion of the container_cpu_usage_seconds_total -> container_cpu_usage_seconds change (from #84987) in #89540 , is it still correct that container_cpu_usage_seconds_total is listed as deprecated? e.g.

"container_cpu_usage_seconds_total", // non-counter metrics should not have "_total" suffix

@RainbowMango
Copy link
Member Author

@bethforsyth Thanks. Would like to file a PR for this?

@bethforsyth
Copy link

@RainbowMango Great, I will do - just wanted to confirm I'd not misinterpreted first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants