HPA on custom metrics #3134

josephburnett · 2019-02-07T23:58:51Z

Proposal

HPA-class PodAutoscalers should be able to autoscale on any custom metric emitted from the user-container. For example, if the user serves JVM memory usage on a Prometheus endpoint from their application container, they should be able to wire that up to a Kubernetes HPA with custom metrics.

For example, given this Service, Knative would create an HPA which points to the "average.memory.heartshapedbox.com" metric.

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: love
spec:
  runLatest:
    configuration:
      revisionTemplate:
        metadata:
          annotations:
            autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
            autoscaling.knative.dev/metric: "custom"
            autoscaling.knative.dev/customMetric: "average.memory.heartshapedbox.com"
        spec:
          container:
            image: gcr.io/joe-does-knative/love:latest

Non-requirements

Knative would not configure Prometheus to scrape the right metrics from the right path. That would be left to the operator to set up. Reconfiguring a custom metrics server implementation on the fly is out of scope for what the Knative autoscaler should do out of the box. This plumbing is just to unlock a full custom metrics solution if necessary.

Note: #3132 does propose configuring Prometheus to scrape a metric off the queue-proxy. But since it's our metric and well known, it doesn't require reconfiguration.


josephburnett · 2019-04-11T12:09:36Z

TODO:

upgrade Knative to autoscaling/v2beta2 for custom metrics
point the HPA to a custom metric if found in the PodAutoscaler annotations in the makeHpa method.

josephburnett · 2019-04-11T12:10:32Z

Note: you'll put the actual annotation key strings in register.go.

josephburnett · 2019-04-12T10:03:12Z

@kevinswiber is working on this.

kevinswiber · 2019-04-15T18:25:45Z

Based on the latest autoscaling API version supported by GKE, I'm upgrading to autoscaling/v2beta1, which is similar though not identical.

To match the fields required to create an HPA resource, I propose using the following annotations.

autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
autoscaling.knative.dev/metric: "custom"
autoscaling.knative.dev/metricSourceType: "resource" # or... pods, object, external
autoscaling.knative.dev/metricName: "average.memory.heartshapedbox.com"
autoscaling.knative.dev/targetAverageUtilization: "50" # valid on resource type
autoscaling.knative.dev/targetAverageValue: 400m # valid on object, resource, pods, external types
autoscaling.knative.dev/targetValue: 800m # valid on object, external types

autoscaling.knative.dev/targetName: main-route # valid on object types
autoscaling.knative.dev/targetKind: Ingress # valid on object types
autoscaling.knative.dev/targetAPIVersion: extensions/v1beta1 # valid on object types

Reference: https://godoc.org/k8s.io/api/autoscaling/v2beta1#MetricSpec
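The "valid on" comments in the annotation list above amount to a small validation table. A minimal sketch of such a check follows, assuming the annotations arrive as a plain map; `validateTargets` and `validOn` are hypothetical names for illustration, not part of Knative.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const prefix = "autoscaling.knative.dev/"

// validOn lists, for each proposed target annotation, the metric source types
// on which the proposal marks it as valid.
var validOn = map[string][]string{
	prefix + "targetAverageUtilization": {"resource"},
	prefix + "targetAverageValue":       {"object", "resource", "pods", "external"},
	prefix + "targetValue":              {"object", "external"},
	prefix + "targetName":               {"object"},
	prefix + "targetKind":               {"object"},
	prefix + "targetAPIVersion":         {"object"},
}

// validateTargets (hypothetical helper) reports every target annotation that is
// set but not valid for the declared metricSourceType.
func validateTargets(annotations map[string]string) error {
	sourceType := annotations[prefix+"metricSourceType"]
	switch sourceType {
	case "resource", "pods", "object", "external":
	default:
		return fmt.Errorf("unsupported metricSourceType %q", sourceType)
	}
	var invalid []string
	for key, types := range validOn {
		if _, set := annotations[key]; set && !contains(types, sourceType) {
			invalid = append(invalid, key)
		}
	}
	if len(invalid) > 0 {
		sort.Strings(invalid) // map iteration order is random; keep the message stable
		return fmt.Errorf("not valid on %q metrics: %s", sourceType, strings.Join(invalid, ", "))
	}
	return nil
}

func contains(list []string, want string) bool {
	for _, item := range list {
		if item == want {
			return true
		}
	}
	return false
}

func main() {
	err := validateTargets(map[string]string{
		prefix + "metricSourceType":         "pods",
		prefix + "metricName":               "average.memory.heartshapedbox.com",
		prefix + "targetAverageUtilization": "50", // only valid on resource metrics
	})
	fmt.Println(err)
}
```

Every additional annotation adds a row to this table, which is one way to see the complexity cost mentioned in the comment.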

This does add a non-trivial amount of complexity to this feature. Happy to receive feedback.

markusthoemmes · 2019-04-16T05:37:58Z

Thanks for taking this on @kevinswiber.

In the last Autoscaling WG we had a discussion on the API surface of our autoscaling configuration. The bottom line was a question mark over whether it is a good idea to offload the whole API surface into what is essentially a map[string]string. It'll be a lot to document, hard to get right (typos...) and hard to validate.

In this case it also becomes almost easier to leave the HPA spec alone and tell the user to manipulate that herself vs. configuring it through annotations and reconciliation of the revision.

@josephburnett What's your take here? Should we think about a new API surface as part of this?

kevinswiber · 2019-04-16T17:29:45Z

@markusthoemmes Thanks for your input. This all makes sense. The set-it-and-forget-it flow of just deploying a serving.knative.dev/v1alpha1/Service is nice, but I agree that this is a copy/paste of hefty config that's owned--and will always be owned--by the Kubernetes autoscaling API.

I'm not sure of the simplest solution. I think we'd also want the PodAutoscaler to retain ownership over the HPA. It might be possible for the user to add an annotation to a manually created HPA and have a reconciler auto-associate the Knative PodAutoscaler resource with the HPA. At that point, Knative could take control to include useful functionality like cascading deletes.

kevinswiber · 2019-04-16T20:28:17Z

FYI: I have the above proposal mostly working. I hesitate to update docs until we get through the design-prototyping cycle. I can put up a WIP PR if you like or hold off until design is solidified.

josephburnett · 2019-05-08T07:16:45Z

I don't think we need all those annotations. We can focus on pod metrics and leave object metrics unimplemented. With that limitation, we would need the following annotations:

autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev" # preexisting
autoscaling.knative.dev/metric: "custom" # preexisting
autoscaling.knative.dev/target: 400m # preexisting, translates to averageValue
autoscaling.knative.dev/metricName: "average.memory.heartshapedbox.com" # NEW

This adds only one new annotation, the metricName. If someone needs access to the full HPA spec, they can implement a custom controller.
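A rough sketch of how these four annotations could translate into a pods metric is shown below. The `podsMetric` struct is a local stand-in that only mirrors the metric name and average value fields of the Kubernetes pods metric source (the real API uses a resource quantity for the target, not a string), and `podsMetricFromAnnotations` is a hypothetical helper, not the actual makeHpa code.

```go
package main

import "fmt"

const (
	classKey      = "autoscaling.knative.dev/class"
	metricKey     = "autoscaling.knative.dev/metric"
	targetKey     = "autoscaling.knative.dev/target"
	metricNameKey = "autoscaling.knative.dev/metricName"

	hpaClass = "hpa.autoscaling.knative.dev"
)

// podsMetric is a local stand-in for a pods metric source. The target is kept
// as a string here; this is a simplification for the sketch.
type podsMetric struct {
	MetricName         string
	TargetAverageValue string
}

// podsMetricFromAnnotations (hypothetical helper) builds a pods metric from
// the reduced annotation set, treating target as the average value.
func podsMetricFromAnnotations(a map[string]string) (*podsMetric, error) {
	if a[classKey] != hpaClass {
		return nil, fmt.Errorf("%s must be %q", classKey, hpaClass)
	}
	if a[metricKey] != "custom" {
		return nil, fmt.Errorf("%s must be %q", metricKey, "custom")
	}
	name := a[metricNameKey]
	if name == "" {
		return nil, fmt.Errorf("%s is required for custom metrics", metricNameKey)
	}
	target := a[targetKey]
	if target == "" {
		return nil, fmt.Errorf("%s is required for custom metrics", targetKey)
	}
	return &podsMetric{MetricName: name, TargetAverageValue: target}, nil
}

func main() {
	m, err := podsMetricFromAnnotations(map[string]string{
		classKey:      hpaClass,
		metricKey:     "custom",
		targetKey:     "400m",
		metricNameKey: "average.memory.heartshapedbox.com",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pods metric %s, average value %s\n", m.MetricName, m.TargetAverageValue)
}
```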

@markusthoemmes @kevinswiber what do you think?

kevinswiber · 2019-05-14T20:48:24Z

@josephburnett This is fine. I'll have to add some defaults for the other fields. I'm happy to be involved in the discussion for the next round of design on this, as well. Piecing together a PR today.

vagababov · 2019-06-26T21:01:19Z

@markusthoemmes, does the work you did for HPA cover this?

markusthoemmes · 2019-06-27T06:06:46Z

@vagababov it does not, no.

vagababov · 2019-11-14T22:30:46Z

@kevinswiber I presume you're no longer working on this?

gadelkareem · 2020-01-29T15:04:15Z

Is it already possible to disable the custom-metrics-server and switch the class to https://github.com/kubernetes-sigs/metrics-server ?

knative-housekeeping-robot · 2020-04-29T00:01:45Z

Issues go stale after 90 days of inactivity.
Mark the issue as fresh by adding the comment /remove-lifecycle stale.
Stale issues rot after an additional 30 days of inactivity and eventually close.
If this issue is safe to close now please do so by adding the comment /close.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/lifecycle stale

markusthoemmes · 2020-05-27T09:01:02Z

I'm leaning towards closing this as we haven't talked about it in ages and nobody asked about it in ages. Feel free to reopen if you disagree.

siddharth-mitra · 2021-07-18T13:07:31Z

Has this feature been incorporated? I am trying to use it in the definition of a KFServing inference service, and the following is the error I get.

julz · 2021-07-19T09:00:17Z

Hi @siddharth-mitra - I don't believe this feature was ever implemented (the issue was closed in May 2020 since there weren't many requests for the feature). The HPA autoscaler class in Knative currently only supports "cpu" (and, when #11668 lands, "memory") as a metric.

siddharth-mitra · 2021-07-22T08:13:27Z

Hi @julz - Thank you for letting me know about the same.
Do you know of any workaround I could use to configure custom metrics for the HPA in the Knative environment?

knative-prow-robot added the area/autoscale and kind/feature labels Feb 7, 2019

josephburnett added this to To do in Scaling: Pluggability and HPA via automation Feb 7, 2019

mattmoor added this to the Ice Box milestone Feb 11, 2019

tanzeeb mentioned this issue Apr 1, 2019

Autoscale on OPS #3416

Closed

kevinswiber mentioned this issue May 15, 2019

Adding support for custom metrics in HPA autoscalers #4112

Closed

dprotaso mentioned this issue Oct 17, 2019

Autoscaling support in the core runtime projectriff/system#143

Closed

knative-prow-robot added the lifecycle/stale label Apr 29, 2020

markusthoemmes closed this as completed May 27, 2020

Scaling: Pluggability and HPA automation moved this from To do to Done May 27, 2020

dprotaso removed this from the Ice Box milestone Oct 6, 2021

enoodle mentioned this issue Nov 15, 2021

Custom Metrics HPA Autoscaler #12277

Merged
