Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kubelet_node_name metrics #72910

Merged
merged 1 commit into from Feb 5, 2019

Conversation

@danielqsj
Copy link
Member

danielqsj commented Jan 15, 2019

What type of PR is this?

/sig instrumentation
/kind feature

What this PR does / why we need it:

This would allow us to join the kubelet metrics against metrics exported by kube-state-metrics.

Which issue(s) this PR fixes:

Fixes #72715

Special notes for your reviewer:

example:

# HELP kubelet_node_name The node's name. The count is always 1.
# TYPE kubelet_node_name gauge
kubelet_node_name{node="nodeName-1"} 1

Does this PR introduce a user-facing change?:

Add kubelet_node_name metrics.
@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 15, 2019

@k8s-ci-robot k8s-ci-robot requested a review from brancz Jan 15, 2019

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Jan 15, 2019

@danielqsj: GitHub didn't allow me to request PR reviews from the following users: tomwilkie.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @tomwilkie @brancz

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@brancz
Copy link
Member

brancz left a comment

just one comment I'm wondering about, otherwise this would lgtm

@@ -1303,6 +1303,7 @@ func (kl *Kubelet) initializeModules() error {
collectors.NewVolumeStatsCollector(kl),
collectors.NewLogMetricsCollector(kl.StatsProvider.ListPodStats),
)
metrics.SetNodeName(kl.nodeName)

This comment has been minimized.

@brancz

brancz Jan 15, 2019

Member

can a node name be changed without a process restart? /cc @derekwaynecarr

This comment has been minimized.

@danielqsj

danielqsj Jan 15, 2019

Author Member

I do not find the place to change node name in code. But not sure.

This comment has been minimized.

@derekwaynecarr

This comment has been minimized.

@brancz

brancz Jan 17, 2019

Member

Thanks for the confirmation!

@danielqsj danielqsj force-pushed the danielqsj:kn branch from 436dffa to 1d73c7d Jan 15, 2019

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 15, 2019

/retest

1 similar comment
@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 15, 2019

/retest

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Jan 17, 2019

/lgtm
/retest

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 18, 2019

/retest

@tomwilkie

This comment has been minimized.

Copy link

tomwilkie commented Jan 19, 2019

Thanks @danielqsj!

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 19, 2019

/retest

2 similar comments
@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 19, 2019

/retest

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Jan 21, 2019

/retest

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 28, 2019

/retest

1 similar comment
@brancz

This comment has been minimized.

Copy link
Member

brancz commented Jan 28, 2019

/retest

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 29, 2019

/retest

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Jan 29, 2019

@Random-Liu could you help review this, thanks

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Feb 2, 2019

@thockin @Random-Liu @dchen1107 @vishh @yujuhong if you have time, can you help review this? Thanks

@danielqsj

This comment has been minimized.

Copy link
Member Author

danielqsj commented Feb 2, 2019

/retest

1 similar comment
@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 4, 2019

/retest

@vishh

This comment has been minimized.

Copy link
Member

vishh commented Feb 4, 2019

It feels strange to expose node name a constant as a time series metric. Instead why not expose node name as a label for all metrics exposed by the kubelet and join using labels?
Along the way, kube-state-metrics should export node-name in addition to node ip address based on the published schema

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 4, 2019

The reasoning as described in #72715 is to be able to get the instance label of a node merged onto kube-state-metrics metrics.

@vishh

This comment has been minimized.

Copy link
Member

vishh commented Feb 4, 2019

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 4, 2019

This is about allowing consistent querying (more specifically joining) no matter what your Prometheus relabeling config is (this enables alert/dashboard sharing without necessity to configure things, and a single metric is extremely cheap so it's worth it).

@vishh

This comment has been minimized.

Copy link
Member

vishh commented Feb 4, 2019

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 5, 2019

Queries such as what is described in kubernetes-monitoring/kubernetes-mixin#135. Not having this metric, makes Kubernetes dictate how the Prometheus configuration must be, when Kubernetes should try to let users express their opinion in their configuration and not stand in their way.

@derekwaynecarr

This comment has been minimized.

Copy link
Member

derekwaynecarr commented Feb 5, 2019

/approve

@derekwaynecarr

This comment has been minimized.

Copy link
Member

derekwaynecarr commented Feb 5, 2019

/hold

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Feb 5, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danielqsj, derekwaynecarr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@derekwaynecarr

This comment has been minimized.

Copy link
Member

derekwaynecarr commented Feb 5, 2019

missed @vishh comment, holding until he has a chance to ack @brancz response.

@vishh

This comment has been minimized.

Copy link
Member

vishh commented Feb 5, 2019

@brancz let's put aside philosophical stance aside for a bit and I'd appreciate if you can help me understand this problem a bit more in detail!

kubernetes-monitoring/kubernetes-mixin#135 (comment) states that there is a mismatch in labels between two metrics. kubelet_running_pod_count is exposed by kubelet & kube_node_status_capacity is exposed by kube-state-metrics. It looks like there is a mismatch in labels.
From the code, kubelet doesn't seem to export any standard labels for most metrics, and kube-state-metrics seems to export node address as a labels which isn't a good for user experience.
I'm asking if we can the do the following, and if it would help address the underlying problem:

  1. Have kubelet expose a standard set of labels for all metrics - we can start with node-name for start with, but I'd expect other ones to show up like kernel version, OS type/distro & version, k'let version, etc.
  2. Extend kube-state-metrics to expose node name in addition to node address as a label.
@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 5, 2019

What you're suggesting is almost, the case 🙂 .

  1. Adding a label to all metrics goes against best practices, if that's what a user wants, this should be a label on the target (which would enforce a certain Prometheus configuration, which is what we're trying to prevent with this), as all labels of a target are added to the metrics at ingestion time. All we're looking for is 1 metric on the kubelet that can, regardless of Prometheus configuration, reliably be joined on. All of the other labels you're suggesting, can then reliably be joined onto kubelet metrics via the metrics from kube-state-metrics 🙂 .
  2. I think there was a misunderstanding here, kube-state-metrics already exposes the node label for metrics about node objects, no address.
@vishh

This comment has been minimized.

Copy link
Member

vishh commented Feb 5, 2019

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 5, 2019

That would be wasted bytes transferred over the wire, when Prometheus can just as well attach the label at ingestion time (if the label in question is static, which is what we're talking about here). Even if pushed, the metric added here is useful for the very same reason mentioned above.

You're right the documentation on kube-state-metrics is incorrect, that should not say node-address, but node-name. I'll fix that.

Let me know if there is anything else I can clarify 🙂 .

@vishh

This comment has been minimized.

Copy link
Member

vishh commented Feb 5, 2019

@brancz

This comment has been minimized.

Copy link
Member

brancz commented Feb 5, 2019

Glad everything got cleared up! This type of metric would be an "info" metric, which is already described in the instrumentation guidelines under normalization 🙂 https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/instrumentation.md#normalization

/retest

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Feb 5, 2019

@danielqsj: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-local-e2e-containerized 1d73c7d link /test pull-kubernetes-local-e2e-containerized

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot merged commit a20cd49 into kubernetes:master Feb 5, 2019

17 of 18 checks passed

pull-kubernetes-local-e2e-containerized Job failed.
Details
cla/linuxfoundation danielqsj authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-kops-aws Context retired without replacement.
pull-kubernetes-e2e-kubeadm-gce Skipped
pull-kubernetes-godeps Skipped
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
tide In merge pool.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.