Change kubelet metrics to conform metrics guidelines #72470

Merged
5 commits merged on Feb 19, 2019

Conversation

danielqsj
Contributor

@danielqsj danielqsj commented Jan 2, 2019

What type of PR is this?

/kind feature
/sig instrumentation

What this PR does / why we need it:

  1. As part of the Kubernetes metrics overhaul, change kubelet metrics to conform to the Kubernetes metrics instrumentation guidelines.

  2. Convert kubelet metrics to histograms for better aggregation.
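
The description does not spell out why histograms aggregate better. One common argument is that cumulative bucket counts from several kubelets can simply be added together, whereas precomputed quantiles cannot be combined meaningfully. The Go sketch below illustrates that idea with a hypothetical, simplified `Histogram` type; it is not the Prometheus client library and not the kubelet's code, and the bucket bounds are made up for illustration.

```go
package main

import "fmt"

// Histogram is a hypothetical, simplified stand-in for a Prometheus-style
// histogram: cumulative bucket counts plus a running sum and count.
type Histogram struct {
	Bounds []float64 // bucket upper bounds in seconds, ascending
	Counts []uint64  // Counts[i] = number of observations <= Bounds[i]
	Sum    float64
	Count  uint64
}

// NewHistogram returns an empty histogram with the given bucket bounds.
func NewHistogram(bounds []float64) *Histogram {
	return &Histogram{Bounds: bounds, Counts: make([]uint64, len(bounds))}
}

// Observe records one duration, given in seconds.
func (h *Histogram) Observe(seconds float64) {
	for i, b := range h.Bounds {
		if seconds <= b {
			h.Counts[i]++
		}
	}
	h.Sum += seconds
	h.Count++
}

// Merge adds two histograms that share the same bucket bounds.
// This element-wise sum is what makes cross-node aggregation possible.
func Merge(a, b *Histogram) *Histogram {
	out := NewHistogram(a.Bounds)
	for i := range a.Counts {
		out.Counts[i] = a.Counts[i] + b.Counts[i]
	}
	out.Sum = a.Sum + b.Sum
	out.Count = a.Count + b.Count
	return out
}

func main() {
	bounds := []float64{0.005, 0.05, 0.5, 5} // illustrative bounds only
	nodeA := NewHistogram(bounds)
	nodeB := NewHistogram(bounds)
	nodeA.Observe(0.003)
	nodeA.Observe(0.2)
	nodeB.Observe(0.04)
	nodeB.Observe(7)

	total := Merge(nodeA, nodeB)
	fmt.Println(total.Counts, total.Count)
}
```

The merged bucket counts describe the combined distribution across both nodes, so a quantile can be estimated after aggregation rather than before it.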

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

This patch does not remove the existing metrics but marks them as deprecated.
We need to keep them for 2 releases so that users can convert their monitoring configuration.

Does this PR introduce a user-facing change?:

Change kubelet metrics to conform to metrics guidelines.
The following metrics are deprecated, and will be removed in a future release:
* `kubelet_pod_worker_latency_microseconds`
* `kubelet_pod_start_latency_microseconds`
* `kubelet_cgroup_manager_latency_microseconds`
* `kubelet_pod_worker_start_latency_microseconds`
* `kubelet_pleg_relist_latency_microseconds`
* `kubelet_pleg_relist_interval_microseconds`
* `kubelet_eviction_stats_age_microseconds`
* `kubelet_runtime_operations`
* `kubelet_runtime_operations_latency_microseconds`
* `kubelet_runtime_operations_errors`
* `kubelet_device_plugin_registration_count`
* `kubelet_device_plugin_alloc_latency_microseconds`
Please convert to the following metrics:
* `kubelet_pod_worker_duration_seconds`
* `kubelet_pod_start_duration_seconds`
* `kubelet_cgroup_manager_duration_seconds`
* `kubelet_pod_worker_start_duration_seconds`
* `kubelet_pleg_relist_duration_seconds`
* `kubelet_pleg_relist_interval_seconds`
* `kubelet_eviction_stats_age_seconds`
* `kubelet_runtime_operations_total`
* `kubelet_runtime_operations_duration_seconds`
* `kubelet_runtime_operations_errors_total`
* `kubelet_device_plugin_registration_total`
* `kubelet_device_plugin_alloc_duration_seconds`
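
For anyone migrating dashboards or alert rules, the two lists can be captured as a lookup table. The Go sketch below assumes the lists pair up in order (first deprecated name with first replacement, and so on), which the release note implies but does not state explicitly. `Replacement` and `MicrosToSeconds` are hypothetical helper names, not part of Kubernetes.

```go
package main

import "fmt"

// renamed maps each deprecated kubelet metric to its replacement.
// Assumption: the two lists in the release note correspond line by line.
var renamed = map[string]string{
	"kubelet_pod_worker_latency_microseconds":          "kubelet_pod_worker_duration_seconds",
	"kubelet_pod_start_latency_microseconds":           "kubelet_pod_start_duration_seconds",
	"kubelet_cgroup_manager_latency_microseconds":      "kubelet_cgroup_manager_duration_seconds",
	"kubelet_pod_worker_start_latency_microseconds":    "kubelet_pod_worker_start_duration_seconds",
	"kubelet_pleg_relist_latency_microseconds":         "kubelet_pleg_relist_duration_seconds",
	"kubelet_pleg_relist_interval_microseconds":        "kubelet_pleg_relist_interval_seconds",
	"kubelet_eviction_stats_age_microseconds":          "kubelet_eviction_stats_age_seconds",
	"kubelet_runtime_operations":                       "kubelet_runtime_operations_total",
	"kubelet_runtime_operations_latency_microseconds":  "kubelet_runtime_operations_duration_seconds",
	"kubelet_runtime_operations_errors":                "kubelet_runtime_operations_errors_total",
	"kubelet_device_plugin_registration_count":         "kubelet_device_plugin_registration_total",
	"kubelet_device_plugin_alloc_latency_microseconds": "kubelet_device_plugin_alloc_duration_seconds",
}

// Replacement returns the new metric name and whether the given name
// appears in the deprecated list.
func Replacement(old string) (string, bool) {
	n, ok := renamed[old]
	return n, ok
}

// MicrosToSeconds converts a threshold expressed in microseconds to seconds,
// the unit used by the *_seconds replacements.
func MicrosToSeconds(us float64) float64 {
	return us / 1e6
}

func main() {
	if n, ok := Replacement("kubelet_pleg_relist_latency_microseconds"); ok {
		fmt.Println(n)
	}
	// An alert threshold of 250000 microseconds becomes 0.25 seconds.
	fmt.Println(MicrosToSeconds(250000))
}
```

Thresholds on the time-based metrics need the unit conversion; the counter renames (`_total`) change only the name.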

@k8s-ci-robot added the release-note, kind/feature, size/L, sig/instrumentation, cncf-cla: yes, and needs-priority labels Jan 2, 2019
@danielqsj
Contributor Author

/assign @brancz

@k8s-ci-robot added the area/kubelet, sig/node, and sig/testing labels Jan 2, 2019
@k8s-ci-robot k8s-ci-robot requested review from dims and enj January 2, 2019 03:11
@brancz
Member

brancz commented Jan 8, 2019

Could you add the deprecation hint in the description here as well? Thanks!

@danielqsj
Contributor Author

@brancz Added "Deprecated" to the metric descriptions. PTAL

@brancz
Member

brancz commented Jan 8, 2019

/retest

4 similar comments
@brancz
Member

brancz commented Jan 8, 2019

/retest

@brancz
Member

brancz commented Jan 8, 2019

/retest

@brancz
Member

brancz commented Jan 8, 2019

/retest

@danielqsj
Contributor Author

/retest

@brancz
Member

brancz commented Jan 10, 2019

/lgtm
/assign @derekwaynecarr

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2019
@danielqsj
Contributor Author

@derekwaynecarr can you help review this? Thanks

@danielqsj
Contributor Author

@derekwaynecarr @vishh ping

@danielqsj
Contributor Author

@thockin @derekwaynecarr @vishh can you help review this? Thanks

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 7, 2019
@danielqsj
Contributor Author

Rebased.
@thockin @derekwaynecarr @vishh: this is the last task of #72333. Could you help review it so that I can close the tracking issue?

@brancz
Member

brancz commented Feb 7, 2019

/retest

1 similar comment
@brancz
Member

brancz commented Feb 7, 2019

/retest

@ehashman
Member

ehashman commented Feb 7, 2019

I believe this fixes #66791.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 18, 2019
@danielqsj
Contributor Author

PR rebased.

@tallclair: could you help review it? Thanks

KubeletSubsystem = "kubelet"
NodeNameKey = "node_name"
NodeLabelKey = "node"
PodWorkerLatencyKey = "pod_worker_latency_seconds"
Member

I just noticed this on another PR, these should all be duration instead of latency

Contributor Author

Does this look good? 79a3eb8

Contributor Author

So should we make this change to all the other metrics?

Member

Yes we should.

Contributor Author

Sure, I will change them.
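
The naming rule agreed on in this thread ("duration" rather than "latency", with seconds as the unit) can be expressed as a small check. The Go sketch below is purely illustrative: `FullName` and `CheckName` are hypothetical helpers, and joining the subsystem and key with an underscore is an assumption based on how the `kubelet_...` names in the release note relate to the keys in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// kubeletSubsystem mirrors the KubeletSubsystem constant quoted in the diff.
const kubeletSubsystem = "kubelet"

// FullName joins the subsystem and a metric key with an underscore.
// Assumption: this matches how the exported metric name is formed.
func FullName(key string) string {
	return kubeletSubsystem + "_" + key
}

// CheckName is a hypothetical lint helper that reports violations of the
// two conventions discussed in the review thread.
func CheckName(name string) []string {
	var problems []string
	if strings.Contains(name, "latency") {
		problems = append(problems, `say "duration" instead of "latency"`)
	}
	if strings.HasSuffix(name, "_microseconds") {
		problems = append(problems, "report time in seconds, not microseconds")
	}
	return problems
}

func main() {
	for _, key := range []string{
		"pod_worker_latency_microseconds", // deprecated name
		"pod_worker_latency_seconds",      // name under review in the diff
		"pod_worker_duration_seconds",     // name in the final release note
	} {
		name := FullName(key)
		fmt.Printf("%s: %d problem(s)\n", name, len(CheckName(name)))
	}
}
```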

@brancz
Member

brancz commented Feb 18, 2019

Thanks for the quick update!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2019
@brancz
Member

brancz commented Feb 18, 2019

@smarterclayton
Contributor

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danielqsj, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 19, 2019
@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@mikedanese
Member

I just noticed this on another PR, these should all be duration instead of latency

@brancz this is probably worth adding to https://github.com/kubernetes/community/blob/81ec4af0ed02b4c5c0917a16563250b2f45250c2/contributors/devel/sig-instrumentation/instrumentation.md#naming

@brancz
Member

brancz commented Sep 9, 2019

@mikedanese You're totally right. There is this already, but we don't link to it yet. Let me fix that.

Labels
approved, area/kubelet, cncf-cla: yes, kind/feature, lgtm, needs-priority, release-note, sig/instrumentation, sig/node, sig/testing, size/L

8 participants