-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Update metric instrumentation guide #195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
piosz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable for me. Can someone who attended the meeting and participated in the discussion take a look?
| first calling WithLabelValues if your metric has any labels | ||
|
|
||
| https://github.com/kubernetes/kubernetes/blob/3ce7fe8310ff081dbbd3d95490193e1d5250d2c9/pkg/kubelet/kubelet.go#L1384 | ||
| https://github.com/kubernetes/kubernetes/blob/cd3299307d44665564e1a5c77d0daa0286603ff5/pkg/apiserver/apiserver.go#L87 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about including short snippets of the code here instead of having link to source code?
| behavior but abstract system state, such as desired replicas for a deployment. | ||
| They are not directly instrumented but collected from otherwise exposed data. | ||
|
|
||
| In Kubernetes they are generally capture in the kube-state-metrics repository, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe component instead of repository?
| for a label at instrumentation time. | ||
|
|
||
| Notable exceptions are exporters like kube-state-metrics, which expose per-pod | ||
| or per-deployment metrics, which are theoretically unbound. However, they have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are pretty well bounded: 30 pods per node, 2000 (5000) nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At any point in time, yes. But not over time. Will make that clearer.
DirectXMan12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some minor nits and missing links
|
|
||
| In general, “external” labels like pod or node name do not belong into the | ||
| instrumentation itself. They are to be attached to metrics by the collecting | ||
| system that has the external knowledge. (blog post) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you mean to put a link here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks.
| ## Normalization | ||
|
|
||
| Metrics should be normalized with respect to their dimensions. They should | ||
| expose the minimal set of labels everyone of which provides additional information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set of labels everyone of which should be expose the minimal set of labels, every one of which or expose the minimal set of labels, each of which
| ``` | ||
|
|
||
| This however only caters to one specific query use case. There are many more | ||
| meta infos that could be added effectively blowing up the instrumentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meta infos --> pieces of metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that could be added effectively --> that could be added, effectively
| They are also not guaranteed to be stable over time. What if pods at some | ||
| point can be live migrated? | ||
| Those pieces of information should be normalized into an info-level metric | ||
| (blog post), which is always set to 1. For example: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
link?
fabxc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All comments addressed. PTAL.
| for a label at instrumentation time. | ||
|
|
||
| Notable exceptions are exporters like kube-state-metrics, which expose per-pod | ||
| or per-deployment metrics, which are theoretically unbound. However, they have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At any point in time, yes. But not over time. Will make that clearer.
|
|
||
| In general, “external” labels like pod or node name do not belong into the | ||
| instrumentation itself. They are to be attached to metrics by the collecting | ||
| system that has the external knowledge. (blog post) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one small nit, but in general LGTM 👍
| behavior but abstract system state, such as desired replicas for a deployment. | ||
| They are not directly instrumented but collected from otherwise exposed data. | ||
|
|
||
| In Kubernetes they are generally capture in the [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are generally captured in the...
|
lgtm 👍 |
|
It seems there is no merge bot in this repo, so merging manually. Thanks for writing it down! |
| ```go | ||
| requestCounter = prometheus.NewCounterVec( | ||
| prometheus.CounterOpts{ | ||
| Name: "apiserver_request_count", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be "_total" according to https://prometheus.io/docs/practices/naming/#metric-names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes!
Create Flake Report for 2017-03-03
This extends a simple quick start guide by links to reference documentation and reiterations of the most important points to hopefully unify the instrumentation approach across components over time.
The contents were discussed in the last @kubernetes/sig-instrumentation-misc meeting and decided to be added in here.