Add documentation for Component SLIs feature (#37767)
* add component SLIs documentation
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* remove prometheus metric definitions and shell colorization
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/instrumentation/slis.md
  Co-authored-by: Rey Lejano <rlejano@gmail.com>

Co-authored-by: Tim Bannister <tim@scalefactory.com>
Co-authored-by: Rey Lejano <rlejano@gmail.com>
1 parent b0fa875, commit 1591d7d
Showing 2 changed files with 79 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
--- | ||
reviewers: | ||
- logicalhan | ||
title: Kubernetes Component SLI Metrics | ||
linkTitle: Service Level Indicator Metrics | ||
content_type: reference | ||
weight: 20 | ||
--- | ||
|
||
<!-- overview --> | ||
|
||
{{< feature-state for_k8s_version="v1.26" state="alpha" >}} | ||
|
||
As an alpha feature, Kubernetes lets you configure Service Level Indicator (SLI) metrics | ||
for each Kubernetes component binary. This metric endpoint is exposed on the serving | ||
HTTPS port of each component, at the path `/metrics/slis`. You must enable the | ||
`ComponentSLIs` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) | ||
for every component from which you want to scrape SLI metrics. | ||
|
||
<!-- body --> | ||
|
||
## SLI Metrics | ||
|
||
With SLI metrics enabled, each Kubernetes component exposes two metrics, | ||
labeled per healthcheck: | ||
|
||
- a gauge (which represents the current state of the healthcheck) | ||
- a counter (which records the cumulative counts observed for each healthcheck state) | ||
|
||
You can use the metric information to calculate per-component availability statistics. | ||
For example, the API server checks the health of etcd. You can work out and report how | ||
available or unavailable etcd has been - as reported by its client, the API server. | ||
|
||
|
||
The Prometheus gauge data looks like this: | |
|
||
``` | ||
# HELP kubernetes_healthcheck [ALPHA] This metric records the result of a single healthcheck. | ||
# TYPE kubernetes_healthcheck gauge | ||
kubernetes_healthcheck{name="autoregister-completion",type="healthz"} 1 | ||
kubernetes_healthcheck{name="autoregister-completion",type="readyz"} 1 | ||
kubernetes_healthcheck{name="etcd",type="healthz"} 1 | ||
kubernetes_healthcheck{name="etcd",type="readyz"} 1 | ||
kubernetes_healthcheck{name="etcd-readiness",type="readyz"} 1 | ||
kubernetes_healthcheck{name="informer-sync",type="readyz"} 1 | ||
kubernetes_healthcheck{name="log",type="healthz"} 1 | ||
kubernetes_healthcheck{name="log",type="readyz"} 1 | ||
kubernetes_healthcheck{name="ping",type="healthz"} 1 | ||
kubernetes_healthcheck{name="ping",type="readyz"} 1 | ||
``` | ||
|
||
The counter data looks like this: | |
|
||
``` | ||
# HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthcheck. | ||
# TYPE kubernetes_healthchecks_total counter | ||
kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1 | ||
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="healthz"} 15 | ||
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14 | ||
kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15 | ||
kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15 | ||
kubernetes_healthchecks_total{name="etcd-readiness",status="success",type="readyz"} 15 | ||
kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1 | ||
kubernetes_healthchecks_total{name="informer-sync",status="success",type="readyz"} 14 | ||
kubernetes_healthchecks_total{name="log",status="success",type="healthz"} 15 | ||
kubernetes_healthchecks_total{name="log",status="success",type="readyz"} 15 | ||
kubernetes_healthchecks_total{name="ping",status="success",type="healthz"} 15 | ||
kubernetes_healthchecks_total{name="ping",status="success",type="readyz"} 15 | ||
``` | ||
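
The counter samples above can be turned into a per-healthcheck success ratio. The following is a minimal Python sketch, not part of Kubernetes: the function name `success_ratios` and the inline `sample` text are hypothetical, and it assumes that `status` only takes the values `success` and `error`, as in the output shown above. Because the counters are cumulative, the ratio covers the whole time since the counters started; for a windowed figure you would compare two scrapes instead.

```python
import re
from collections import defaultdict

# Matches one sample line such as:
#   kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
SAMPLE_RE = re.compile(
    r'^kubernetes_healthchecks_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def success_ratios(exposition_text):
    """Return {(name, type): success / (success + error)} per healthcheck."""
    totals = defaultdict(lambda: {"success": 0.0, "error": 0.0})
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match is None:
            continue  # skip "# HELP" / "# TYPE" comments and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        status = labels.get("status")
        if status not in ("success", "error"):
            continue  # assumption: only these two status values are counted
        key = (labels["name"], labels["type"])
        totals[key][status] += float(match.group("value"))
    return {
        key: counts["success"] / (counts["success"] + counts["error"])
        for key, counts in totals.items()
        if counts["success"] + counts["error"] > 0
    }


# Hypothetical input: three lines copied from the counter output above.
sample = """\
kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="informer-sync",status="success",type="readyz"} 14
kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
"""
ratios = success_ratios(sample)
```

Here `ratios[("informer-sync", "readyz")]` is 14/15, because one of the fifteen recorded checks failed, while the `etcd` readyz check has a ratio of 1.0.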
|
||
## Using this data | ||
|
||
The component SLIs metrics endpoint is intended to be scraped at a high frequency. Scraping | |
at a high frequency gives you finer granularity in the gauge's signal, which you can | |
then use to calculate SLOs. The `/metrics/slis` endpoint provides the raw data necessary | |
to calculate an availability SLO for the respective Kubernetes component. |
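
As an illustration of how scraped gauge values could feed an availability calculation, here is a minimal Python sketch. It is not an official method: the `availability` function, the `scrapes` data, and the `slo_target` value are all hypothetical, and it assumes that a gauge value of 1 means the healthcheck passed and that scrapes happen at a fixed interval, so the fraction of healthy scrapes approximates the fraction of healthy time.

```python
def availability(samples):
    """Fraction of scrapes in which the healthcheck gauge reported 1."""
    if not samples:
        raise ValueError("need at least one scraped sample")
    healthy = sum(1 for _, value in samples if value == 1)
    return healthy / len(samples)


# Hypothetical scrape results for one healthcheck, one scrape per second:
# (unix_timestamp_seconds, gauge_value)
scrapes = [(1700000000 + i, 1) for i in range(10)]
scrapes[3] = (1700000003, 0)  # a single scrape where the check failed

observed = availability(scrapes)
slo_target = 0.99  # hypothetical availability objective
meets_slo = observed >= slo_target
```

With one failed scrape out of ten, `observed` is 0.9 and the hypothetical 99% objective is not met. A shorter scrape interval produces more samples per window, which is why higher scrape frequency gives a finer-grained availability figure.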