From 50d38598e8eb83339d89f8771c0df8e1a1249d08 Mon Sep 17 00:00:00 2001 From: Fabian Reinartz Date: Mon, 19 Dec 2016 17:13:08 +0100 Subject: [PATCH] Update metric instrumentation guide --- contributors/devel/instrumentation.md | 204 +++++++++++++++++++++++--- 1 file changed, 186 insertions(+), 18 deletions(-) diff --git a/contributors/devel/instrumentation.md b/contributors/devel/instrumentation.md index b73221a942b..66992e40359 100644 --- a/contributors/devel/instrumentation.md +++ b/contributors/devel/instrumentation.md @@ -1,11 +1,28 @@ -## Instrumenting Kubernetes with a new metric +## Instrumenting Kubernetes -The following is a step-by-step guide for adding a new metric to the Kubernetes -code base. +The following references and outlines general guidelines for metric instrumentation +in Kubernetes components. Components are instrumented using the +[Prometheus Go client library](https://github.com/prometheus/client_golang). For non-Go +components. [Libraries in other languages](https://prometheus.io/docs/instrumenting/clientlibs/) +are available. -We use the Prometheus monitoring system's golang client library for -instrumenting our code. Once you've picked out a file that you want to add a -metric to, you should: +The metrics are exposed via HTTP in the +[Prometheus metric format](https://prometheus.io/docs/instrumenting/exposition_formats/), +which is open and well-understood by a wide range of third party applications and vendors +outside of the Prometheus eco-system. + +The [general instrumentation advice](https://prometheus.io/docs/practices/instrumentation/) +from the Prometheus documentation applies. This document reiterates common pitfalls and some +Kubernetes specific considerations. + +Prometheus metrics are cheap as they have minimal internal memory state. Set and increment +operations are thread safe and take 10-25 nanoseconds (Go & Java). +Thus, instrumentation can and should cover all operationally relevant aspects of an application, +internal and external. + +## Quick Start + +The following describes the basic steps required to add a new metric (in Go). 1. Import "github.com/prometheus/client_golang/prometheus". @@ -22,29 +39,180 @@ the values. labels on the metric. If so, add "Vec" to the name of the type of metric you want and add a slice of the label names to the definition. - https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L53 - https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/kubelet/metrics/metrics.go#L31 + [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L53) + ```go + requestCounter = prometheus.NewCounterVec( + prometheus.CounterOpts{ + Name: "apiserver_request_count", + Help: "Counter of apiserver requests broken out for each verb, API resource, client, and HTTP response code.", + }, + []string{"verb", "resource", "client", "code"}, + ) + ``` 3. Register the metric so that prometheus will know to export it. - https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/kubelet/metrics/metrics.go#L74 - https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L78 + [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L78) + ```go + func init() { + prometheus.MustRegister(requestCounter) + prometheus.MustRegister(requestLatencies) + prometheus.MustRegister(requestLatenciesSummary) + } + ``` 4. Use the metric by calling the appropriate method for your metric type (Set, Inc/Add, or Observe, respectively for Gauge, Counter, or Histogram/Summary), first calling WithLabelValues if your metric has any labels - https://github.com/kubernetes/kubernetes/blob/3ce7fe8310ff081dbbd3d95490193e1d5250d2c9/pkg/kubelet/kubelet.go#L1384 - https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87 + [Example](https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87) + ```go + requestCounter.WithLabelValues(*verb, *resource, client, strconv.Itoa(*httpCode)).Inc() + ``` + + +## Instrumentation types + +Components have metrics capturing events and states that are inherent to their +application logic. Examples are request and error counters, request latency +histograms, or internal garbage collection cycles. Those metrics are instrumented +directly in the application code. + +Secondly, there are business logic metrics. Those are not about observed application +behavior but abstract system state, such as desired replicas for a deployment. +They are not directly instrumented but collected from otherwise exposed data. + +In Kubernetes they are generally captured in the [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) +component, which reads them from the API server. +For this types of metric exposition, the +[exporter guidelines](https://prometheus.io/docs/instrumenting/writing_exporters/) +apply additionally. + +## Naming + +Metrics added directly by application or package code should have a unique name. +This avoids collisions of metrics added via dependencies. They also clearly +distinguish metrics collected with different semantics. This is solved through +prefixes: + +``` +_ +``` + +For example, suppose the kubelet instrumented its HTTP requests but also uses +an HTTP router providing its own implementation. Both expose metrics on total +http requests. They should be distinguishable as in: + +``` +kubelet_http_requests_total{path=”/some/path”,status=”200”} +routerpkg_http_requests_total{path=”/some/path”,status=”200”,method=”GET”} +``` + +As we can see they expose different labels and thus a naming collision would +not have been possible to resolve even if both metrics counted the exact same +requests. + +Resource objects that occur in names should inherit the spelling that is used +in kubectl, i.e. daemon sets are `daemonset` rather than `daemon_set`. + +## Dimensionality & Cardinality + +Metrics can often replace more expensive logging as they are time-aggregated +over a sampling interval. The [multidimensional data model](https://prometheus.io/docs/concepts/data_model/) +enables deep insights and all metrics should use those label dimensions +where appropriate. + +A common error that often causes performance issues in the ingesting metric +system is considering dimensions that inhibit or eliminate time aggregation +by being too specific. Typically those are user IDs or error messages. +More generally: one should know a comprehensive list of all possible values +for a label at instrumentation time. + +Notable exceptions are exporters like kube-state-metrics, which expose per-pod +or per-deployment metrics, which are theoretically unbound over time as one could +constantly create new ones, with new names. However, they have +a reasonable upper bound for a given size of infrastructure they refer to and +its typical frequency of changes. + +In general, “external” labels like pod or node name do not belong into the +instrumentation itself. They are to be attached to metrics by the collecting +system that has the external knowledge ([blog post](https://www.robustperception.io/target-labels-are-for-life-not-just-for-christmas/)). + +## Normalization + +Metrics should be normalized with respect to their dimensions. They should +expose the minimal set of labels, each of which provides additional information. +Labels that are composed from values of different labels are not desirable. +For example: + +``` +example_metric{pod=”abc”,container=”proxy”,container_long=”abc/proxy”} +``` + +It often seems feasible to add additional meta information about an object +to all metrics about that object, e.g.: + +``` +kube_pod_container_restarts{namespace=...,pod=...,container=...} +``` + +A common use case is wanting to look at such metrics w.r.t to the node the +pod is scheduled on. So it seems convenient to add a “node” label. + +``` +kube_pod_container_restarts{namespace=...,pod=...,container=...,node=...} +``` + +This however only caters to one specific query use case. There are many more +pieces of metadata that could be added, effectively blowing up the instrumentation. +They are also not guaranteed to be stable over time. What if pods at some +point can be live migrated? +Those pieces of information should be normalized into an info-level metric +([blog post](https://www.robustperception.io/exposing-the-software-version-to-prometheus/)), +which is always set to 1. For example: + +``` +kube_pod_info{pod=...,namespace=...,pod_ip=...,host_ip=..,node=..., ...} +``` + +The metric system can later denormalize those along the identifying labels +“pod” and “namespace” labels. This leads to... + +## Resource Referencing + +It is often desirable to correlate different metrics about a common object, +such as a pod. Label dimensions can be used to match up different metrics. +This is most easy if label names and values are following a common pattern. +For metrics exposed by the same application, that often happens naturally. + +For a system composed of several independent, and also pluggable components, +it makes sense to set cross-component standards to allow easy querying in +metric systems without extensive post-processing of data. +In Kubernetes, those are the resource objects such as deployments, +pods, or services and the namespace they belong to. + +The following should be consistently used: + +``` +example_metric_ccc{pod=”example-app-5378923”, namespace=”default”} +``` + +An object is referenced by its unique name in a label named after the resource +itself (i.e. `pod`/`deployment`/... and not `pod_name`/`deployment_name`) +and the namespace it belongs to in the `namespace` label. +Note: namespace/name combinations are only unique at a certain point in time. +For time series this is given by the timestamp associated with any data point. +UUIDs are truly unique but not convenient to use in user-facing time series +queries. +They can still be incorporated using an info level metric as described above for +`kube_pod_info`. A query to a metric system selecting by UUID via a the info level +metric could look as follows: -These are the metric type definitions if you're curious to learn about them or -need more information: +``` +kube_pod_restarts and on(namespace, pod) kube_pod_info{uuid=”ABC”} +``` -https://github.com/prometheus/client_golang/blob/master/prometheus/gauge.go -https://github.com/prometheus/client_golang/blob/master/prometheus/counter.go -https://github.com/prometheus/client_golang/blob/master/prometheus/histogram.go -https://github.com/prometheus/client_golang/blob/master/prometheus/summary.go