Add gauge metric for master of leader election. #71731

Open
wants to merge 1 commit into
base: master
from

Conversation

@cheftako
Member

cheftako commented Dec 5, 2018

What type of PR is this?
/kind feature

What this PR does / why we need it:
Adds a gauge for leader election.
0 indicates standby, 1 indicates master, label indicates which lease.

Which issue(s) this PR fixes:
Fixes #71730

Special notes for your reviewer:
$ curl http://127.0.0.1:10252/metrics | egrep -i leader
# HELP leader_election_master_gauge Gauge of if the reporting system is master of the relevant election.
# TYPE leader_election_master_gauge gauge
leader_election_master_gauge{name="kube-controller-manager"} 1

Does this PR introduce a user-facing change?:

NONE
@cheftako

Member

cheftako commented Dec 5, 2018

@k8s-ci-robot k8s-ci-robot requested review from eparis and wojtek-t Dec 5, 2018

@@ -214,6 +215,7 @@ func (le *LeaderElector) acquire(ctx context.Context) bool {
         klog.Infof("successfully acquired lease %v", desc)
         cancel()
     }, le.config.RetryPeriod, JitterFactor, true, ctx.Done())
+    leaderGauge.WithLabelValues(le.config.Name).Set(1.0)

@logicalhan

logicalhan Dec 5, 2018

Contributor

Shouldn't we check the value of succeeded prior to toggling the gauge? I'm not super familiar with this code, but at least according to the comment for this function, there seems to be a situation where we did not succeed in acquiring the lease.

@cheftako

cheftako Dec 5, 2018

Member

Good point. If succeeded is false we are about to exit, but there is no point in putting a blip on the graphs.
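
The guard suggested in this thread could look roughly like the sketch below. It uses a small in-memory stand-in (`fakeGauge`, hypothetical) instead of the real Prometheus gauge vector, and `recordAcquire` is an illustrative helper, not the actual `acquire` method. The only point is that the gauge is set to 1 when `succeeded` is true and left alone otherwise.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeGauge is a hypothetical in-memory stand-in for a labeled gauge.
// It is not the Prometheus client API.
type fakeGauge struct {
	mu     sync.Mutex
	values map[string]float64
}

func newFakeGauge() *fakeGauge {
	return &fakeGauge{values: make(map[string]float64)}
}

// Set records the value for the given lease name.
func (g *fakeGauge) Set(name string, v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values[name] = v
}

// Get returns the recorded value and whether one was ever set.
func (g *fakeGauge) Get(name string) (float64, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	v, ok := g.values[name]
	return v, ok
}

// recordAcquire sets the gauge to 1 only if the lease was actually acquired.
func recordAcquire(g *fakeGauge, leaseName string, succeeded bool) {
	if !succeeded {
		// The caller is about to exit without the lease: leave the gauge untouched.
		return
	}
	g.Set(leaseName, 1.0)
}

func main() {
	g := newFakeGauge()

	recordAcquire(g, "kube-controller-manager", false)
	_, set := g.Get("kube-controller-manager")
	fmt.Println("gauge set after failed acquire:", set)

	recordAcquire(g, "kube-controller-manager", true)
	v, _ := g.Get("kube-controller-manager")
	fmt.Println("gauge value after successful acquire:", v)
}
```

With this shape, a failed acquire never produces a momentary 1 on the graph for an instance that is about to exit.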

@@ -245,6 +247,7 @@ func (le *LeaderElector) renew(ctx context.Context) {
             klog.V(5).Infof("successfully renewed lease %v", desc)
             return
         }
+        leaderGauge.WithLabelValues(le.config.Name).Set(0.0)

@logicalhan

logicalhan Dec 5, 2018

Contributor

Out of curiosity, what is the value of le.config.Name?

@cheftako

cheftako Dec 5, 2018

Member

It depends on the lease in question. For the KCM it's "kube-controller-manager" and for the scheduler it's "kube-scheduler".

@krmayankk

Contributor

krmayankk commented Dec 5, 2018

@cheftako is this a per-node (local) metric? How will I know from this metric which node holds the lock? Today I can run kubectl get endpoints kube-controller-manager -n kube-system -o yaml | grep holderIdentity and it tells me who the master is in one place.

@cheftako

Member

cheftako commented Dec 5, 2018

@krmayankk the metric will show up in each individual server's /metrics endpoint, so Prometheus (or whichever system collects the metrics) should associate it with the server it was gathered from. This is common to all the metrics coming out of the /metrics endpoint (e.g. "daemonset_queue_latency_count 104"). Presumably you will have something periodically scraping this endpoint and creating a graph for your HA cluster.

@cheftako cheftako force-pushed the cheftako:leaseMetric branch from 2cc976a to ebf2183 Dec 6, 2018

@k8s-ci-robot

Contributor

k8s-ci-robot commented Dec 6, 2018

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cheftako
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approvers: cblecker, mikedanese

If they are not already assigned, you can assign the PR to them by writing /assign @cblecker @mikedanese in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cheftako

Member

cheftako commented Dec 6, 2018

/test pull-kubernetes-integration

Add gauge metric for master of leader election.
Fixes #71730
0 indicates standby, 1 indicates master, label indicates which lease.
Tweaked name and documentation

@cheftako cheftako force-pushed the cheftako:leaseMetric branch from ebf2183 to 4e26492 Dec 6, 2018

@cheftako

Member

cheftako commented Dec 7, 2018

/test pull-kubernetes-kubemark-e2e-gce-big

@jpbetz

Contributor

jpbetz commented Dec 7, 2018

The metric help string now makes perfect sense to me. Thanks @cheftako!

@jpbetz

jpbetz approved these changes Dec 7, 2018

@jpbetz

Contributor

jpbetz commented Dec 7, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Dec 7, 2018

@logicalhan logicalhan referenced this pull request Dec 8, 2018

Closed

REQUEST: New membership for @logicalhan #292

6 of 6 tasks complete
@logicalhan

/lgtm to me too!

@cheftako

Member

cheftako commented Dec 12, 2018

/assign @mikedanese
