Commit 71b09fc

Proposal: Introduce Custom Metrics API
This proposal details the custom metrics API as proposed in the new monitoring vision. It is designed for use with the HPA, but should be generally useful for controllers that wish to consume custom metrics.
1 parent 5329551 commit 71b09fc

1 file changed: 219 additions, 0 deletions
@@ -0,0 +1,219 @@
Custom Metrics API
==================

The new [metrics monitoring vision](monitoring_architecture.md) proposes
an API that the Horizontal Pod Autoscaler can use to access arbitrary
metrics.

Similarly to the [master metrics API](resource-metrics-api.md), the new
API should be structured around accessing metrics by referring to
Kubernetes objects (or groups thereof) and a metric name. For this
reason, the API could be useful for other consumers (most likely
controllers) that want to consume custom metrics, just as the
master metrics API is generally useful to multiple cluster components.

The HPA can refer to metrics describing all pods matching a label
selector, as well as an arbitrary named object.

API Paths
---------

The root API path will look like `/apis/general-metrics/v1alpha1`. For
brevity, this will be left off below.

- `/nodes/{node-name}/metrics/{metric-name...}`: retrieve the given metric
  on the given node.

- `/namespaces/{namespace-name}/metrics/{metric-name...}`: retrieve the
  given metric on the given namespace.

- `/namespaces/{namespace-name}/object-metrics/{object-type}/{metric-name...}`:
  retrieve the given metric for all objects of the given type in the given
  namespace.

- `/namespaces/{namespace-name}/object-metrics/{object-type}/{metric-name...}?labelSelector=foo`:
  retrieve the given metric for all objects of the given type matching the
  given label selector in the given namespace.

- `/namespaces/{namespace-name}/object-metrics/{object-type}/{metric-name...}?names=foo,bar`:
  retrieve the given metric for the objects of the given type with the
  given names in the given namespace.

For example, to retrieve the custom metric `hits-per-second` for all pods
matching `app=frontend` in the namespace `webapp`, the request might look
like:

`/apis/general-metrics/v1alpha1/namespaces/webapp/object-metrics/pods/hits-per-second?labelSelector=app%3Dfrontend`.
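
The path construction above can be sketched in Go. This is only an illustration: `objectMetricsURL` is a hypothetical helper, not part of any proposed client, and it assumes the query parameters are percent-encoded in the usual way.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// rootPath is the API root given in this proposal.
const rootPath = "/apis/general-metrics/v1alpha1"

// objectMetricsURL is a hypothetical helper that builds the request path for
// a namespaced object metric. labelSelector and names are both optional; with
// neither set, the path refers to all objects of the given type.
func objectMetricsURL(namespace, objectType, metricName, labelSelector string, names []string) string {
	u := url.URL{
		Path: fmt.Sprintf("%s/namespaces/%s/object-metrics/%s/%s",
			rootPath, namespace, objectType, metricName),
	}
	q := url.Values{}
	if labelSelector != "" {
		q.Set("labelSelector", labelSelector)
	}
	if len(names) > 0 {
		q.Set("names", strings.Join(names, ","))
	}
	// Encode percent-escapes reserved characters such as "=" and ",".
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	fmt.Println(objectMetricsURL("webapp", "pods", "hits-per-second", "app=frontend", nil))
}
```

Note that `url.Values.Encode` also escapes the comma in a `names` list, so `names=foo,bar` is sent as `names=foo%2Cbar`; a server decoding the query sees the same value either way.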

API Objects
-----------

The request URLs listed above will return objects of the `Metrics`
type, described below:

```go
// a list of values for a given metric for some set of objects
type Metrics struct {
	unversioned.TypeMeta `json:",inline"`
	unversioned.ListMeta `json:"metadata,omitempty"`

	// the name of the metric
	MetricName string `json:"metricName"`

	// the value of the metric across the described objects
	MetricValues []MetricValue `json:"metricValues"`
}

// a metric value for some object
type MetricValue struct {
	// a reference to the described object
	DescribedObject ObjectReference `json:"describedObject"`

	// indicates the end of the time window containing these metrics (i.e.
	// these metrics come from some time in [Timestamp-Window, Timestamp])
	Timestamp unversioned.Time `json:"timestamp"`

	// indicates the duration of the time window containing these metrics
	Window unversioned.Duration `json:"window"`

	// the value of the metric for this object
	Value resource.Quantity `json:"value"`
}
```

For instance, the example request above would yield the following object:

```json
{
  "kind": "Metrics",
  "apiVersion": "general-metrics/v1alpha1",
  "metricName": "hits-per-second",
  "metricValues": [
    {
      "describedObject": {
        "kind": "Pod",
        "name": "server1",
        "namespace": "webapp"
      },
      "timestamp": SOME_TIMESTAMP_HERE,
      "window": "10s",
      "value": "10"
    },
    {
      "describedObject": {
        "kind": "Pod",
        "name": "server2",
        "namespace": "webapp"
      },
      "timestamp": SOME_TIMESTAMP_HERE,
      "window": "10s",
      "value": "15"
    }
  ]
}
```
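
A consumer might decode such a response roughly as follows. This sketch uses simplified local types rather than the real `unversioned` and `resource` packages: the window and value are kept as plain strings, and the concrete timestamp in the sample is an assumption standing in for the placeholder above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Simplified stand-ins for the proposed API types (assumptions for this sketch).
type objectReference struct {
	Kind      string `json:"kind"`
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

type metricValue struct {
	DescribedObject objectReference `json:"describedObject"`
	Timestamp       time.Time       `json:"timestamp"`
	Window          string          `json:"window"`
	Value           string          `json:"value"`
}

type metrics struct {
	Kind         string        `json:"kind"`
	APIVersion   string        `json:"apiVersion"`
	MetricName   string        `json:"metricName"`
	MetricValues []metricValue `json:"metricValues"`
}

// decodeMetrics parses a response body into the simplified metrics type.
func decodeMetrics(body []byte) (metrics, error) {
	var m metrics
	err := json.Unmarshal(body, &m)
	return m, err
}

// sampleResponse mirrors the example object, with an assumed RFC 3339 timestamp.
const sampleResponse = `{
  "kind": "Metrics",
  "apiVersion": "general-metrics/v1alpha1",
  "metricName": "hits-per-second",
  "metricValues": [
    {"describedObject": {"kind": "Pod", "name": "server1", "namespace": "webapp"},
     "timestamp": "2016-11-01T12:00:00Z", "window": "10s", "value": "10"},
    {"describedObject": {"kind": "Pod", "name": "server2", "namespace": "webapp"},
     "timestamp": "2016-11-01T12:00:00Z", "window": "10s", "value": "15"}
  ]
}`

func main() {
	m, err := decodeMetrics([]byte(sampleResponse))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, v := range m.MetricValues {
		fmt.Printf("%s/%s %s=%s\n", v.DescribedObject.Namespace, v.DescribedObject.Name, m.MetricName, v.Value)
	}
}
```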

Semantics
---------

The `object-type` parameter should be the string form of
`unversioned.GroupKind`. Note that we do not include version in this; we
simply wish to uniquely identify all the different types of objects in
Kubernetes.

In the case of cross-group object renames, the adapter should maintain
a list of "equivalent versions" that the monitoring system uses. This is
monitoring-system dependent (for instance, the monitoring system might
record all HorizontalPodAutoscalers as in `autoscaling`, but should be
aware that HorizontalPodAutoscalers also exist in `extensions`).

The returned metrics should be the most recently available metrics, as with
the resource metrics API. The timestamp and window should indicate to the
consumer what timeframe the metric has come from. The timestamp indicates
the "batch" of metrics, while the window indicates the length of time
between batches.
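
As a rough illustration of these semantics, a consumer could recover the interval `[Timestamp-Window, Timestamp]` from a returned value like this. `windowBounds` is a hypothetical helper, and the sketch assumes the timestamp is serialized as RFC 3339 and the window as a Go-style duration string such as `10s`.

```go
package main

import (
	"fmt"
	"time"
)

// windowBounds returns the start and end of the interval that a metric value
// covers, i.e. [Timestamp-Window, Timestamp]. The string formats accepted
// here are assumptions made for this sketch.
func windowBounds(timestamp, window string) (time.Time, time.Time, error) {
	end, err := time.Parse(time.RFC3339, timestamp)
	if err != nil {
		return time.Time{}, time.Time{}, err
	}
	d, err := time.ParseDuration(window)
	if err != nil {
		return time.Time{}, time.Time{}, err
	}
	return end.Add(-d), end, nil
}

func main() {
	start, end, err := windowBounds("2016-11-01T12:00:10Z", "10s")
	if err != nil {
		fmt.Println("invalid input:", err)
		return
	}
	fmt.Println("metric covers", start.Format(time.RFC3339), "to", end.Format(time.RFC3339))
}
```

A consumer could compare the end of the interval against the current time to decide whether a value is too stale to act on; the acceptable age is left to the consumer.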

For metrics systems that support differentiating metrics beyond the Kubernetes
object hierarchy (such as using additional labels), the metrics systems should
have a metric which represents all such series aggregated together.
Additionally, implementors may choose to expose the individual "sub-metrics" via
the metric name, but this is expected to be fairly rare, since it most
likely requires specific knowledge of individual metrics. For instance,
suppose we record filesystem usage by filesystem inside the container.
There should then be a metric `filesystem/usage`, and the implementors of
the API may choose to expose more detailed metrics like
`filesystem/usage/my-first-filesystem`.
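
The naming scheme in the filesystem example could be sketched as follows. The proposal does not say how series are aggregated; summing is an assumption that happens to suit a usage metric, and both `exposedMetrics` and the second filesystem name are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// exposedMetrics returns the metric names an implementation might serve for a
// single object: the aggregate under base, plus one sub-metric per series.
// Summation is an assumed aggregation; values are integers in some unit
// (for example, bytes).
func exposedMetrics(base string, series map[string]int64) map[string]int64 {
	out := make(map[string]int64, len(series)+1)
	var total int64
	for label, v := range series {
		out[base+"/"+label] = v
		total += v
	}
	out[base] = total
	return out
}

func main() {
	m := exposedMetrics("filesystem/usage", map[string]int64{
		"my-first-filesystem":  100,
		"my-second-filesystem": 200,
	})
	// Sort the names so the listing is stable across runs.
	names := make([]string, 0, len(m))
	for name := range m {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name, m[name])
	}
}
```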

Relationship to HPA v2
----------------------

The URL paths in this API are designed to correspond to different source
types in the [HPA v2](hpa-v2.md). Specifically, the `pods` source type
corresponds to a URL of the form
`/namespaces/$NS/object-metrics/pod/$METRIC_NAME?labelSelector=foo`, while
the `object` source type corresponds to a URL of the form
`/namespaces/$NS/object-metrics/$KIND.$GROUP/$METRIC_NAME?names=$OBJECT_NAME`.

The HPA then takes the results, aggregates them together (in the case of
the former source type), and uses the resulting value to produce a usage
ratio.
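
A minimal sketch of that aggregation step, assuming the HPA averages the per-pod values and divides by a target value. The actual algorithm is defined by the HPA v2 proposal, not here; `usageRatio` is hypothetical, and values are held as integer milli-units to avoid floats until the final division.

```go
package main

import (
	"errors"
	"fmt"
)

// usageRatio averages per-pod metric values and divides by the target.
// Both values and target are in milli-units (e.g. 10 hits/s == 10000).
// Averaging is an assumption made for this sketch.
func usageRatio(milliValues []int64, milliTarget int64) (float64, error) {
	if len(milliValues) == 0 {
		return 0, errors.New("no metric values returned")
	}
	if milliTarget <= 0 {
		return 0, errors.New("target must be positive")
	}
	var sum int64
	for _, v := range milliValues {
		sum += v
	}
	average := float64(sum) / float64(len(milliValues))
	return average / float64(milliTarget), nil
}

func main() {
	// Two pods reporting 10 and 15 hits per second against a target of 10.
	ratio, err := usageRatio([]int64{10000, 15000}, 10000)
	if err != nil {
		fmt.Println("cannot compute ratio:", err)
		return
	}
	fmt.Println("usage ratio:", ratio)
}
```

A ratio above 1 would suggest scaling up and a ratio below 1 scaling down, under the averaging assumption stated above.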

The resource source type is taken from the API provided by the
"metrics" API group (the master/resource metrics API).

The HPA will consume the API as a federated API server.

Mechanical Concerns
-------------------

This API is intended to be implemented by monitoring pipelines (e.g.
inside Heapster, or as an adapter on top of a solution like Prometheus).
It shares many mechanical requirements with normal Kubernetes APIs, such
as needing to support encoding different versions of objects in both JSON
and protobuf, as well as acting as a discoverable API server. For these
reasons, it is expected that implementors will make use of the Kubernetes
genericapiserver code. If implementors choose not to use this, they must
still follow all of the Kubernetes API server conventions in order to work
properly with consumers of the API.

Location
--------

The types and clients for this API will live in a separate repository
under the Kubernetes organization (e.g. `kubernetes/metrics`). This
repository will most likely also house other metrics-related APIs for
Kubernetes (e.g. historical metrics API definitions, the resource metrics
API definitions, etc).

Note that there will not be a canonical implementation of the custom
metrics API under Kubernetes, just the types and clients. Implementations
will be left up to the monitoring pipelines.

Alternative Considerations
--------------------------

### Pods vs Objects API ###

Since the HPA itself is only interested in groups of pods (by name or
label selector) or in individual objects, one could potentially argue that
it would be better to have separate endpoints for pods vs other objects.
This complicates the API structure a bit, but does make it possible to
return container-level metrics in the pod results. Container-level
metrics are generally less useful for custom metrics, since the smallest
abstraction that the HPA cares about is pods (for custom metrics, at
least).

### Quantity vs Float ###

In the past, custom metrics were represented as floats. In general,
however, Kubernetes APIs are not supposed to use floats. The API proposed
above thus uses `resource.Quantity`. This adds a bit of encoding
overhead, but makes the API line up nicely with other Kubernetes APIs.

### Labeled Metrics ###

Many metric systems support labeled metrics, allowing for dimensionality
beyond the Kubernetes object hierarchy. Since the HPA currently doesn't
support specifying metric labels, this is not supported via this API. We
may wish to explore this in the future.
