|
| 1 | +Custom Metrics API |
| 2 | +================== |
| 3 | + |
| 4 | +The new [metrics monitoring vision](monitoring_architecture.md) proposes |
| 5 | +an API that the Horizontal Pod Autoscaler can use to access arbitrary |
| 6 | +metrics. |
| 7 | + |
| 8 | +Similarly to the [master metrics API](resource-metrics-api.md), the new |
| 9 | +API should be structured around accessing metrics by referring to |
| 10 | +kubernetes objects (or groups thereof) and a metric name. For this |
| 11 | +reason, the API could be useful for other consumers (most likely |
| 12 | +controllers) that want to consume custom metrics (similarly to how the |
| 13 | +master metrics API is generally useful to multiple cluster components). |
| 14 | + |
| 15 | +The HPA can refer to metrics describing all pods matching a label |
| 16 | +selector, as well as an arbitrary named object. |
| 17 | + |
| 18 | +API Paths |
| 19 | +--------- |
| 20 | + |
| 21 | +The root API path will look like `/apis/general-metrics/v1alpha1`. For |
| 22 | +brevity, this will be left off below. |
| 23 | + |
| 24 | +- `/nodes/{node-name}/metrics/{metric-name...}`: retrieve the given metric |
| 25 | + on the given node |
| 26 | + |
| 27 | +- `/namespaces/{namespace-name}/metrics/{metric-name...}`: retrieve the |
| 28 | + given metric on the given namespace |
| 29 | + |
| 30 | +- `/namespaces/{namespace-name}/object-metrics/{object-type}/{metric-name...}`: |
| 31 | + retrieve the given metric for all objects of the given type in the given |
| 32 | + namespace. |
| 33 | + |
| 34 | +- `/namespaces/{namespace-name}/object-metrics/{object-type}/{metric-name...}?labelSelector=foo`: |
| 35 | + retrieve the given metric for all objects of the given type matching the |
| 36 | + given label selector in the given namespace. |
| 37 | + |
| 38 | +- `/namespaces/{namespace-name}/object-metrics/{object-type}/{metric-name...}?names=foo,bar`: |
| 39 | + retrieve the given metric for the objects of the given type with the |
| 40 | + given names in the given namespace |
| 41 | + |
| 42 | +For example, to retrieve the custom metric "hits-per-second" for all pods |
| 43 | +matching "app=frontend" in the namespace "webapp", the request might look |
| 44 | +like: |
| 45 | + |
| 46 | +`/apis/general-metrics/v1alpha1/namespaces/webapp/object-metrics/pods/hits-per-second?labelSelector=app%3Dfrontend`. |
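
As a rough illustration of how a client might assemble such a request path, here is a minimal Go sketch. The helper name `objectMetricsURL` and its parameters are hypothetical and not part of this proposal; only the root path and the path layout come from the text above.

```go
package main

import (
	"net/url"
	"path"
)

// apiRoot is the root API path given in this proposal.
const apiRoot = "/apis/general-metrics/v1alpha1"

// objectMetricsURL is a hypothetical helper (not part of the proposal) that
// builds the request path for a metric on objects of a given type in a
// namespace. An empty labelSelector means "all objects of that type".
func objectMetricsURL(namespace, objectType, metricName, labelSelector string) string {
	u := url.URL{
		Path: path.Join(apiRoot, "namespaces", namespace, "object-metrics", objectType, metricName),
	}
	if labelSelector != "" {
		q := url.Values{}
		q.Set("labelSelector", labelSelector)
		// Encode percent-escapes the selector, e.g. "=" becomes "%3D".
		u.RawQuery = q.Encode()
	}
	return u.String()
}
```

Calling `objectMetricsURL("webapp", "pods", "hits-per-second", "app=frontend")` reproduces the example path shown above.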
| 47 | + |
| 48 | +API Objects |
| 49 | +----------- |
| 50 | + |
| 51 | +The request URLs listed above will return objects of the `Metrics` |
| 52 | +type, described below: |
| 53 | + |
| 54 | +```go |
| 55 | + |
| 56 | +// a list of values for a given metric for some set of objects |
| 57 | +type Metrics struct { |
| 58 | + unversioned.TypeMeta `json:",inline"` |
| 59 | + unversioned.ListMeta `json:"metadata,omitempty"` |
| 60 | + |
| 61 | + // the name of the metric |
| 62 | + MetricName string `json:"metricName"` |
| 63 | + |
| 64 | + // the value of the metric across the described objects |
| 65 | + MetricValues []MetricValue `json:"metricValues"` |
| 66 | +} |
| 67 | + |
| 68 | +// a metric value for some object |
| 69 | +type MetricValue struct { |
| 70 | + // a reference to the described object |
| 71 | + DescribedObject ObjectReference `json:"describedObject"` |
| 72 | + |
| 73 | + // indicates the end of the time window containing these metrics (i.e. |
| 74 | + // these metrics come from some time in [Timestamp-Window, Timestamp]) |
| 75 | + Timestamp unversioned.Time `json:"timestamp"` |
| 76 | + |
| 77 | + // indicates the duration of the time window containing these metrics |
| 78 | + Window unversioned.Duration `json:"window"` |
| 79 | + |
| 80 | + // the value of the metric for this object |
| 81 | + Value resource.Quantity `json:"value"` |
| 82 | +} |
| 83 | +``` |
| 84 | + |
| 85 | +For instance, the example request above would yield the following object: |
| 86 | + |
| 87 | +```json |
| 88 | +{ |
| 89 | + "kind": "Metrics", |
| 90 | + "apiVersion": "general-metrics/v1alpha1", |
| 91 | + "metricName": "hits-per-second", |
| 92 | + "metricValues": [ |
| 93 | + { |
| 94 | + "describedObject": { |
| 95 | + "kind": "Pod", |
| 96 | + "name": "server1", |
| 97 | + "namespace": "webapp" |
| 98 | + }, |
| 99 | + "timestamp": SOME_TIMESTAMP_HERE, |
| 100 | + "window": "10s", |
| 101 | + "value": "10" |
| 102 | + }, |
| 103 | + { |
| 104 | + "describedObject": { |
| 105 | + "kind": "Pod", |
| 106 | + "name": "server2", |
| 107 | + "namespace": "webapp" |
| 108 | + }, |
| 109 | + "timestamp": SOME_TIMESTAMP_HERE, |
| 110 | + "window": "10s", |
| 111 | + "value": "15" |
| 112 | + } |
| 113 | + ] |
| 114 | +} |
| 115 | +``` |
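
A consumer might decode such a response roughly as follows. This is only a sketch: the lowercase types are simplified local stand-ins for the proposed `Metrics` and `MetricValue` types, the value and window are kept as plain strings rather than `resource.Quantity` and `unversioned.Duration`, and the timestamp field is left out.

```go
package main

import "encoding/json"

// objectRef is a simplified stand-in for ObjectReference (assumption:
// only the three fields used in the example response are modeled).
type objectRef struct {
	Kind      string `json:"kind"`
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// metricValue is a simplified stand-in for MetricValue. Window and Value
// are plain strings here; the proposal uses unversioned.Duration and
// resource.Quantity. The timestamp is intentionally omitted.
type metricValue struct {
	DescribedObject objectRef `json:"describedObject"`
	Window          string    `json:"window"`
	Value           string    `json:"value"`
}

// metricsList is a simplified stand-in for the Metrics type.
type metricsList struct {
	MetricName   string        `json:"metricName"`
	MetricValues []metricValue `json:"metricValues"`
}

// decodeMetrics parses a response body into the stand-in types. Fields
// that are not modeled (kind, apiVersion, timestamp) are ignored.
func decodeMetrics(body []byte) (metricsList, error) {
	var m metricsList
	err := json.Unmarshal(body, &m)
	return m, err
}
```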
| 116 | + |
| 117 | +Semantics |
| 118 | +--------- |
| 119 | + |
| 120 | +The `object-type` parameter should be the string form of |
| 121 | +`unversioned.GroupKind`. Note that we do not include version in this; we |
| 122 | +simply wish to uniquely identify all the different types of objects in |
| 123 | +Kubernetes. |
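
To make the string form concrete, the sketch below uses a local stand-in for `unversioned.GroupKind`. It assumes the string form is `Kind.Group`, with the bare kind used when the group is empty, which is consistent with the `$KIND.$GROUP` pattern used in the HPA section of this document.

```go
package main

// groupKind is a local stand-in for unversioned.GroupKind (assumption:
// it is defined here only to keep the example self-contained).
type groupKind struct {
	Group string
	Kind  string
}

// String returns the assumed string form: "Kind.Group", or just "Kind"
// when the group is empty (as for core objects such as pods).
func (gk groupKind) String() string {
	if gk.Group == "" {
		return gk.Kind
	}
	return gk.Kind + "." + gk.Group
}
```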
| 124 | + |
| 125 | +In the case of cross-group object renames, the adapter should maintain |
| 126 | +a list of "equivalent versions" that the monitoring system uses. This is |
| 127 | +monitoring-system dependent (for instance, the monitoring system might |
| 128 | +record all HorizontalPodAutoscalers as being in `autoscaling`, but should be |
| 129 | +aware that HorizontalPodAutoscalers also exist in `extensions`). |
| 130 | + |
| 131 | +The returned metrics should be the most recently available metrics, as with |
| 132 | +the resource metrics API. The timestamp and window should indicate to the |
| 133 | +consumer what timeframe the metric has come from. The timestamp indicates |
| 134 | +the "batch" of metrics, while the window indicates the length of time |
| 135 | +between batches. |
| 136 | + |
| 137 | +For metrics systems that support differentiating metrics beyond the Kubernetes |
| 138 | +object hierarchy (such as using additional labels), the metrics systems should |
| 139 | +have a metric which represents all such series aggregated together. |
| 140 | +Additionally, implementors may choose to expose the individual "sub-metrics" via |
| 141 | +the metric name, but this is expected to be fairly rare, since it most |
| 142 | +likely requires specific knowledge of individual metrics. For instance, |
| 143 | +suppose we record filesystem usage by filesystem inside the container. |
| 144 | +There should then be a metric `filesystem/usage`, and the implementors of |
| 145 | +the API may choose to expose more detailed metrics like |
| 146 | +`filesystem/usage/my-first-filesystem`. |
| 147 | + |
| 148 | +Relationship to HPA v2 |
| 149 | +---------------------- |
| 150 | + |
| 151 | +The URL paths in this API are designed to correspond to different source |
| 152 | +types in the [HPA v2](hpa-v2.md). Specifically, the `pods` source type |
| 153 | +corresponds to a URL of the form |
| 154 | +`/namespaces/$NS/object-metrics/pod/$METRIC_NAME?labelSelector=foo`, while |
| 155 | +the `object` source type corresponds to a URL of the form |
| 156 | +`/namespaces/$NS/object-metrics/$KIND.$GROUP/$METRIC_NAME?names=$OBJECT_NAME`. |
| 157 | + |
| 158 | +The HPA then takes the results, aggregates them together (in the case of |
| 159 | +the former source type), and uses the resulting value to produce a usage |
| 160 | +ratio. |
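
The proposal does not spell out the aggregation itself, so the following is only a sketch under stated assumptions: it averages the per-pod values and divides by a target value to obtain a usage ratio. The function name `usageRatio`, the choice of averaging, and the use of `float64` (the API itself uses `resource.Quantity`) are all illustrative assumptions.

```go
package main

import "errors"

// usageRatio is a hypothetical helper: it averages the per-pod metric
// values and divides the average by the target value. A ratio above 1
// suggests the pods are, on average, above the target.
func usageRatio(values []float64, target float64) (float64, error) {
	if len(values) == 0 {
		return 0, errors.New("no metric values to aggregate")
	}
	if target <= 0 {
		return 0, errors.New("target must be positive")
	}
	sum := 0.0
	for _, v := range values {
		sum += v
	}
	average := sum / float64(len(values))
	return average / target, nil
}
```

With the two example pods above (values 10 and 15) and an assumed target of 12.5 hits per second, the average equals the target, so the ratio is 1.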
| 161 | + |
| 162 | +The resource source type is taken from the API provided by the |
| 163 | +"metrics" API group (the master/resource metrics API). |
| 164 | + |
| 165 | +The HPA will consume the API as a federated API server. |
| 166 | + |
| 167 | +Mechanical Concerns |
| 168 | +------------------- |
| 169 | + |
| 170 | +This API is intended to be implemented by monitoring pipelines (e.g. |
| 171 | +inside Heapster, or as an adapter on top of a solution like Prometheus). |
| 172 | +It shares many mechanical requirements with normal Kubernetes APIs, such |
| 173 | +as needing to support encoding different versions of objects in both JSON |
| 174 | +and protobuf, as well as acting as a discoverable API server. For these |
| 175 | +reasons, it is expected that implementors will make use of the Kubernetes |
| 176 | +genericapiserver code. If implementors choose not to use this, they must |
| 177 | +still follow all of the Kubernetes API server conventions in order to work |
| 178 | +properly with consumers of the API. |
| 179 | + |
| 180 | +Location |
| 181 | +-------- |
| 182 | + |
| 183 | +The types and clients for this API will live in a separate repository |
| 184 | +under the Kubernetes organization (e.g. `kubernetes/metrics`). This |
| 185 | +repository will most likely also house other metrics-related APIs for |
| 186 | +Kubernetes (e.g. historical metrics API definitions, the resource metrics |
| 187 | +API definitions, etc). |
| 188 | + |
| 189 | +Note that there will not be a canonical implementation of the custom |
| 190 | +metrics API under Kubernetes, just the types and clients. Implementations |
| 191 | +will be left up to the monitoring pipelines. |
| 192 | + |
| 193 | +Alternative Considerations |
| 194 | +-------------------------- |
| 195 | + |
| 196 | +### Pods vs Objects API ### |
| 197 | + |
| 198 | +Since the HPA itself is only interested in groups of pods (by name or |
| 199 | +label selector) or in individual objects, one could potentially argue that |
| 200 | +it would be better to have separate endpoints for pods vs other objects. |
| 201 | +This complicates the API structure a bit, but does make it possible to |
| 202 | +return container-level metrics in the pod results. Container-level |
| 203 | +metrics are generally less useful for custom metrics, since the smallest |
| 204 | +abstraction that the HPA cares about is pods (for custom metrics, at |
| 205 | +least). |
| 206 | + |
| 207 | +### Quantity vs Float ### |
| 208 | + |
| 209 | +In the past, custom metrics were represented as floats. In general, |
| 210 | +however, Kubernetes APIs are not supposed to use floats. The API proposed |
| 211 | +above thus uses `resource.Quantity`. This adds a bit of encoding |
| 212 | +overhead, but makes the API line up nicely with other Kubernetes APIs. |
| 213 | + |
| 214 | +### Labeled Metrics ### |
| 215 | + |
| 216 | +Many metric systems support labeled metrics, allowing for dimensionality |
| 217 | +beyond the Kubernetes object hierarchy. Since the HPA currently doesn't |
| 218 | +support specifying metric labels, this is not supported via this API. We |
| 219 | +may wish to explore this in the future. |