*: support additional '/metrics' endpoints #8242

gyuho · 2017-07-11T16:00:42Z

https://github.com/prometheus/prometheus/wiki/Default-port-allocations

9378 - etcd gRPC Proxy Exporter
9379 - etcd Exporter

gyuho · 2017-07-11T16:01:19Z

Note: /metrics uses the same client TLS, and v2 proxy still uses the same port

TODO: ~~update migration guide to highlight this breaking change~~, add new flags to https://github.com/coreos/container-linux-config-transpiler.

heyitsanthony · 2017-07-12T02:34:20Z

etcdmain/help.go

+		Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
+	--expose-metrics 'false'
+		Additionally expose '/metrics' endpoints via '/metrics-urls'.
+	--metrics-urls 'http://localhost:9379'


--listen-metrics-urls and default to "" so expose-metrics isn't needed? blank list => no listen

heyitsanthony · 2017-07-12T02:43:51Z

embed/etcd.go

 	// Start a client server goroutine for each listen address
 	var h http.Handler
 	if e.Config().EnableV2 {
-		h = v2http.NewClientHandler(e.Server, e.Server.Cfg.ReqTimeout())
+		h = v2http.NewClientHandler(e.Server, metricsMux, e.Server.Cfg.ReqTimeout())


NewClientHandler and HandleBasic shouldn't need to change. I think all that's needed is:

var metricsMux *http.ServeMux if len(e.cfg.MetricsUrls) > 0 { metricsMux = http.NewServeMux() metricsMux.Handle("metrics", prometheus.Handler()) // maybe etcdhttp.MetricsPath or get the path from the user-provided URL }

heyitsanthony · 2017-07-12T02:46:52Z

etcdmain/grpc_proxy.go

@@ -79,7 +82,8 @@ func newGRPCProxyStartCommand() *cobra.Command {
 	}

 	cmd.Flags().StringVar(&grpcProxyListenAddr, "listen-addr", "127.0.0.1:23790", "listen address")
-	cmd.Flags().StringVar(&grpcProxyDNSCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster")
+	cmd.Flags().BoolVar(&grpcProxyExposeMetrics, "expose-metrics", false, "Additionally expose '/metrics' endpoints via '/metrics-url'.")
+	cmd.Flags().StringVar(&grpcProxyMetricsListenAddr, "metrics-url", "http://127.0.0.1:9378", "'/metrics' listen address (secured via client TLS)")


metrics-addr to be consistent with listen-addr and default to nothing?

heyitsanthony · 2017-07-12T02:50:57Z

Documentation/op-guide/monitoring.md

@@ -5,7 +5,8 @@ Each etcd server exports metrics under the `/metrics` path on its client port.
 The metrics can be fetched with `curl`:

 ```sh
-$ curl -L http://localhost:2379/metrics | grep -v debugging # ignore unstable debugging metrics
+# OR http://localhost:9379/metrics when etcd is run with '--expose-metrics' flag
+$ curl -L http://localhost:9379/metrics | grep -v debugging # ignore unstable debugging metrics


Should stick to the default 2379 here since if the user is changing the ports, they probably don't need this guide any more. Instead of hiding an explanation in a comment, the doc can directly say that optionally metrics can be exported to other listen addresses with --listen-metrics-urls where it's talking about the client port

heyitsanthony · 2017-07-12T02:52:09Z

Procfile

-etcd1: bin/etcd --name infra1 --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof
-etcd2: bin/etcd --name infra2 --listen-client-urls http://127.0.0.1:22379 --advertise-client-urls http://127.0.0.1:22379 --listen-peer-urls http://127.0.0.1:22380 --initial-advertise-peer-urls http://127.0.0.1:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof
-etcd3: bin/etcd --name infra3 --listen-client-urls http://127.0.0.1:32379 --advertise-client-urls http://127.0.0.1:32379 --listen-peer-urls http://127.0.0.1:32380 --initial-advertise-peer-urls http://127.0.0.1:32380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof
+etcd1: bin/etcd --name infra1 --listen-client-urls http://127.0.0.1:2379 --advertise-client-urls http://127.0.0.1:2379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof --expose-metrics --metrics-urls http://127.0.0.1:9379


don't change? would like to keep the procfile from being too cluttered

heyitsanthony · 2017-07-12T19:19:35Z

embed/etcd.go

@@ -400,6 +409,17 @@ func (e *Etcd) serve() (err error) {
 			e.errHandler(s.serve(e.Server, &e.cfg.ClientTLSInfo, h, e.errHandler))
 		}(sctx)
 	}
+	for _, murl := range e.cfg.ListenMetricsUrls {


how will Etcd.Close() shut this down?

heyitsanthony · 2017-07-12T19:21:02Z

etcdmain/config.go

@@ -131,6 +133,7 @@ func newConfig() *config {
 	fs.StringVar(&cfg.WalDir, "wal-dir", cfg.WalDir, "Path to the dedicated wal directory.")
 	fs.Var(flags.NewURLsValue(embed.DefaultListenPeerURLs), "listen-peer-urls", "List of URLs to listen on for peer traffic.")
 	fs.Var(flags.NewURLsValue(embed.DefaultListenClientURLs), "listen-client-urls", "List of URLs to listen on for client traffic.")
+	fs.StringVar(&cfg.ListenMetricsUrlsJSON, "listen-metrics-urls", "", "List of URLs to listen on for metrics (secured via client TLS if any).")


is this smart about http/https schemes instead of defaulting to TLS? there may be cases where it's OK to expose the metrics as plaintext but access to the kv store is locked down with tls authentication

heyitsanthony · 2017-07-12T19:22:29Z

etcdmain/grpc_proxy.go

@@ -79,7 +81,7 @@ func newGRPCProxyStartCommand() *cobra.Command {
 	}

 	cmd.Flags().StringVar(&grpcProxyListenAddr, "listen-addr", "127.0.0.1:23790", "listen address")
-	cmd.Flags().StringVar(&grpcProxyDNSCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster")
+	cmd.Flags().StringVar(&grpcProxyMetricsListenAddr, "metrics-addr", "", "Additional '/metrics' listen address (secured via client TLS)")


listen for /metrics requests on an additional interface?

heyitsanthony

a few last nits

heyitsanthony · 2017-07-14T17:38:17Z

etcdmain/config.go

@@ -131,6 +133,7 @@ func newConfig() *config {
 	fs.StringVar(&cfg.WalDir, "wal-dir", cfg.WalDir, "Path to the dedicated wal directory.")
 	fs.Var(flags.NewURLsValue(embed.DefaultListenPeerURLs), "listen-peer-urls", "List of URLs to listen on for peer traffic.")
 	fs.Var(flags.NewURLsValue(embed.DefaultListenClientURLs), "listen-client-urls", "List of URLs to listen on for client traffic.")
+	fs.StringVar(&cfg.ListenMetricsUrlsJSON, "listen-metrics-urls", "", "List of URLs to listen on for metrics (non-secured ignoring client TLS).")


just List of URLs to listen on for metrics? The parenthetical will probably cause confusion

heyitsanthony · 2017-07-14T17:39:29Z

etcdmain/grpc_proxy.go

@@ -79,7 +81,7 @@ func newGRPCProxyStartCommand() *cobra.Command {
 	}

 	cmd.Flags().StringVar(&grpcProxyListenAddr, "listen-addr", "127.0.0.1:23790", "listen address")
-	cmd.Flags().StringVar(&grpcProxyDNSCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster")


heyitsanthony · 2017-07-14T17:40:03Z

etcdmain/help.go

-	  Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
+		Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
+	--listen-metrics-urls ''
+		List of URLs to listen on for metrics (non-secured ignoring client TLS).


remove parenthetical bit here too?

heyitsanthony · 2017-07-14T17:41:35Z

embed/etcd.go

@@ -205,6 +208,11 @@ func (e *Etcd) Close() {
 			e.Clients[i].Close()
 		}
 	}
+	for i := range e.metricsListeners {
+		if e.metricsListeners[i] != nil {


no need for the nil check? metricsListeners is only appended with non-nil listeners

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>

owenthereal · 2017-11-06T22:54:21Z

The changes are not available in the latest stable release v3.2.9. When will a newer version be cut with --listen-metrics-urls?

gyuho · 2017-11-06T22:56:45Z

This will be in v3.3, which is planned for late this year, or early next year.

Quentin-M · 2018-01-03T13:02:35Z

I think it may be a bit unfortunate to couple the ClientTLSInfo with the new /metrics endpoint.

Users may want TLS on the /metrics endpoint.
Users may run the etcd clients (i.e. kube-apiserver) on the same nodes as Prometheus
Users may protect the the etcd client endpoint with TLS Client Cert auth.

This means that both client port and metrics port must be reachable from the the same nodes. If Prometheus is given a TLS Client Certs to access the /metrics endpoint, it will be also be able to reach the client endpoint with the same TLS Client Cert, and with full privileges.

It is also generally a burden and a security risk to carry TLS Client Certs to the monitoring system, as it often run in a separate environment. For instance, in the case of Prometheus on Kubernetes:

While the TLS Client Certs are stored on the master nodes, with host network and host mounts forbidden, scheduler constraints preventing anyone but the administrator to schedule there, and good chmod - these certificates are pretty much unreachable from tenants' pods (unless there is container escape).
To get Prometheus to scrape etcd, the TLS Client Certs will have to be stored in a Secret, now potentially reachable by any tenant of Kubernetes, who manage to read that secret (e.g. bad RBAC permission, or hack of a pod that has permissions for it).

I am of the opinion that if etcd listens over TLS, and the metrics URLs are https://, the /metrics endpoint could also listen over TLS, but without the TLS Client Cert auth. This could be a flag though if opinions are strong that this use case of auth'd /metrics should be auth'd.

Ideally tho, and as mentioned in kubernetes/kubernetes#53405, we could just have another TLS configuration for TLS Client Cert Auth with different certs, to the /metrics endpoint. But I am not sure if that's something you would like to support.

/cc @fabxc @xiang90

brancz · 2018-01-03T13:07:53Z

@Quentin-M we've already talked to @xiang90 about this and he mentioned it is possible to have a separate set of client certs for Prometheus that only has permission to access the /metrics endpoint of etcd, using etcd's authorization. I imagine that's what you are looking for, as then you are serving the metrics via https, and have them authorized, but not full access to etcd.

- etcd-io/etcd#9108 - etcd-io/etcd#8242

etcd v3.3 introduced a new flag to allow serving `/metrics` and `/health` under a different port than e.g. `/v2/keys`. This allows us to protect etcd's data via firewall rules but still let monitoring tools to access the monitoring information. See feature request in etcd repo: etcd-io/etcd#8060. The implementation landed in v3.3: etcd-io/etcd#8242 This PR instructs etcd to serve metrics and health under the additonal port `2381` *unconditionally* **when the used etcd binary is** `>=v3.3.x`. However, if not explicitely set in the `senza.yaml` this port won't be mapped to the outside and therefore isn't accessible. It doesn't expose more information than anything under `2379` already does.

wenjiaswe · 2019-03-18T20:53:18Z

@gyuho Do you think it would be OK to cherrypick this PR back to v3.2 and v3.1? I took a quick look at the fix and it seems fine to backport to me. Would you please let me know if otherwise?

If OK, I can do the cherry pick work. Thanks!

gyuho added the kind/release-note label Jul 11, 2017

gyuho added this to the v3.3.0 milestone Jul 11, 2017

gyuho requested review from xiang90, fanminshi and heyitsanthony July 11, 2017 16:00

gyuho mentioned this pull request Jul 11, 2017

handle separate metrics endpoints etcd-io/etcdlabs#258

Closed

gyuho force-pushed the ppp branch from fc92277 to 2af1249 Compare July 11, 2017 16:07

gyuho changed the title *: expose '/metrics' in separate ports (9378 for etcd, 9379 for gRPC proxy) *: expose '/metrics' in separate ports (9379 for etcd, 9378 for gRPC proxy) Jul 11, 2017

gyuho added the WIP label Jul 11, 2017

gyuho force-pushed the ppp branch 6 times, most recently from 0dc9c80 to aa9cfe8 Compare July 11, 2017 19:55

gyuho removed the WIP label Jul 11, 2017

heyitsanthony suggested changes Jul 12, 2017

View reviewed changes

gyuho added the WIP label Jul 12, 2017

gyuho changed the title *: expose '/metrics' in separate ports (9379 for etcd, 9378 for gRPC proxy) *: optionally duplicate '/metrics' in separate ports Jul 12, 2017

gyuho force-pushed the ppp branch 2 times, most recently from cd82e05 to 893a76a Compare July 12, 2017 17:14

gyuho removed the WIP label Jul 12, 2017

heyitsanthony suggested changes Jul 12, 2017

View reviewed changes

gyuho added the WIP label Jul 12, 2017

gyuho force-pushed the ppp branch 3 times, most recently from 4a4d627 to 21588e0 Compare July 12, 2017 20:46

gyuho changed the title *: optionally duplicate '/metrics' in separate ports *: support additional '/metrics' endpoints with non-TLS Jul 12, 2017

gyuho removed the WIP label Jul 12, 2017

heyitsanthony suggested changes Jul 14, 2017

View reviewed changes

gyuho added 3 commits July 14, 2017 11:14

embed: configure 'ListenMetricsUrls'

9bb5ede

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>

etcdmain: add 'listen-metrics-urls' option

8463b37

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>

etcdmain/grpc-proxy: add 'metrics-addr' option

c47d445

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>

gyuho force-pushed the ppp branch from 9c71d2c to c47d445 Compare July 14, 2017 18:14

heyitsanthony approved these changes Jul 14, 2017

View reviewed changes

gyuho merged commit d283348 into etcd-io:master Jul 14, 2017

gyuho deleted the ppp branch July 14, 2017 22:06

etcd-io deleted a comment from codecov-io Jul 17, 2017

Quentin-M mentioned this pull request Jan 3, 2018

Expose etcd alerts and quota via metrics from the api servers kubernetes/kubernetes#53405

Closed

Quentin-M added a commit to Quentin-M/etcd-cloud-operator that referenced this pull request Jan 7, 2018

{glide.*, vendor/}: update to etcd 6f76e46c0

2236d9d

- etcd-io/etcd#9108 - etcd-io/etcd#8242

Quentin-M added a commit to Quentin-M/etcd-cloud-operator that referenced this pull request Feb 1, 2018

{glide.*, vendor/}: update to etcd 3.3.0

3df0b59

- etcd-io/etcd#9108 - etcd-io/etcd#8242

linki mentioned this pull request Feb 28, 2018

Serve metrics under a dedicated port zalando-stups/stups-etcd-cluster#48

Merged

dghubble mentioned this pull request Mar 29, 2018

Add etcd metrics, Prometheus scrapes, and Grafana dash poseidon/typhoon#175

Merged

sebastien-prudhomme mentioned this pull request Jul 5, 2018

Update Etcd default version to 3.3.x to facilitate Prometheus monitoring rancher/rke#753

Closed

seh mentioned this pull request Sep 7, 2018

Expose /metrics an a separate port #8060

Closed

This was referenced Mar 18, 2019

Explore possibility to enable serving etcd health check on a different port so we can have different tls requirement for etcd server and etcd health #10545

Closed

override ETCD_SERVER with https instead http when mTLS is enabled kubernetes/kubernetes#74690

Merged

wenjiaswe mentioned this pull request Mar 19, 2019

Automated cherry pick of #8242 to release-3.2 #10559

Closed

wenjiaswe added a commit to wenjiaswe/etcd that referenced this pull request Mar 19, 2019

Automated cherry pick of etcd-io#8242

0de9ea3

wenjiaswe mentioned this pull request Mar 19, 2019

Automated cherry pick of #8242 to release-3.1 #10560

Closed

hexfusion mentioned this pull request Apr 8, 2019

*: Change gRPC proxy to expose endpoint /metrics #10618

Merged

anton-johansson mentioned this pull request May 28, 2019

Prometheus metrics for Kubernetes components amimof/kubernetes-the-right-way#46

Closed

wenjiaswe mentioned this pull request Oct 17, 2019

e2e test: remove metricsURLScheme from e2e tests #11277

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

*: support additional '/metrics' endpoints #8242

*: support additional '/metrics' endpoints #8242

gyuho commented Jul 11, 2017

gyuho commented Jul 11, 2017 •

edited

Loading

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony Jul 12, 2017

heyitsanthony left a comment

heyitsanthony Jul 14, 2017

heyitsanthony Jul 14, 2017

heyitsanthony Jul 14, 2017

heyitsanthony Jul 14, 2017

owenthereal commented Nov 6, 2017

gyuho commented Nov 6, 2017

Quentin-M commented Jan 3, 2018 •

edited

Loading

brancz commented Jan 3, 2018

wenjiaswe commented Mar 18, 2019

*: support additional '/metrics' endpoints #8242

*: support additional '/metrics' endpoints #8242

Conversation

gyuho commented Jul 11, 2017

gyuho commented Jul 11, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

heyitsanthony left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

owenthereal commented Nov 6, 2017

gyuho commented Nov 6, 2017

Quentin-M commented Jan 3, 2018 • edited Loading

brancz commented Jan 3, 2018

wenjiaswe commented Mar 18, 2019

gyuho commented Jul 11, 2017 •

edited

Loading

Quentin-M commented Jan 3, 2018 •

edited

Loading