diff --git a/keps/prod-readiness/sig-autoscaling/2021.yaml b/keps/prod-readiness/sig-autoscaling/2021.yaml new file mode 100644 index 00000000000..7f77331cf17 --- /dev/null +++ b/keps/prod-readiness/sig-autoscaling/2021.yaml @@ -0,0 +1,3 @@ +kep-number: 2021 +alpha: + approver: "@johnbelamaric" diff --git a/keps/sig-autoscaling/2021-scale-from-zero/README.md b/keps/sig-autoscaling/2021-scale-from-zero/README.md new file mode 100644 index 00000000000..2190de5b574 --- /dev/null +++ b/keps/sig-autoscaling/2021-scale-from-zero/README.md @@ -0,0 +1,971 @@ + +# KEP-2021: HPA supports scaling to/from zero pods for object/external metrics + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1: Scale a heavy queue consumer on-demand](#story-1-scale-a-heavy-queue-consumer-on-demand) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Alpha](#alpha) + - [Beta](#beta) + - [GA](#ga) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+
+- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [ ] (R) KEP approvers have approved the KEP status as `implementable`
+- [ ] (R) Design details are appropriately documented
+- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+  - [ ] e2e Tests for all Beta API Operations (endpoints)
+  - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+  - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
+- [ ] (R) Graduation criteria is in place
+  - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+- [ ] (R) Production readiness review completed
+- [ ] (R) Production readiness review approved
+- [ ] "Implementation History" section is up-to-date for milestone
+- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+
+[kubernetes.io]: https://kubernetes.io/
+[kubernetes/enhancements]: https://git.k8s.io/enhancements
+[kubernetes/website]: https://git.k8s.io/website
+
+## Summary
+
+The [Horizontal Pod Autoscaler][] (HPA) automatically scales the number of pods in any resource that supports the `scale` subresource, based on observed CPU or memory utilization
+(or, with custom metrics support, on some other application-provided metric), from one to many replicas. This proposal adds support for scaling from zero to many replicas and back to zero for object and external metrics.
+
+Scaling to zero is particularly effective for cost reduction when individual pods demand substantial resource requests, such as dedicated CPUs or GPUs. Since CPU and memory utilization can only be measured on running pods, scaling to zero will be limited to object and external metrics.
+
+[Horizontal Pod Autoscaler]: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
+
+## Motivation
+
+With the addition of scaling based on object and external metrics it became possible to automatically adjust the number of running replicas based on an
+application-provided metric. A typical use case for this is scaling the number of queue consumers based on the length of the consumed queue.
+
+In the case of a frequently idle queue or a less latency-sensitive workload, there is no need to keep one replica running at all times.
+Instead, you can dynamically scale to zero replicas, especially for workloads with high resource demands, such as those requiring GPUs.
+This approach not only reduces costs but also has significant energy-saving potential, particularly as GPU workloads become more prevalent.
+When replicas are scaled to zero, the HPA must also be capable of scaling back up as soon as messages become available.
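+
+For illustration, a minimal HPA manifest for such a queue consumer could look like the following once `minReplicas: 0` is accepted (this requires the `HPAScaleToZero` feature gate; the workload name, metric name and labels below are hypothetical examples, not part of this proposal):
+
+```yaml
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: queue-consumer                  # hypothetical workload name
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: queue-consumer
+  minReplicas: 0                        # new: scale all the way down while the queue is empty
+  maxReplicas: 10
+  metrics:
+  - type: External
+    external:
+      metric:
+        name: queue_messages_ready      # hypothetical external metric
+        selector:
+          matchLabels:
+            queue: video-processing     # hypothetical queue label
+      target:
+        type: AverageValue
+        averageValue: "5"
+```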
+
+### Goals
+
+- Provide scaling to zero replicas for object and external metrics
+- Provide scaling from zero replicas for object and external metrics
+
+### Non-Goals
+
+- Provide scaling to/from zero replicas for resource metrics
+- Provide request buffering at the Kubernetes Service level
+
+## Proposal
+
+Allow the HPA to scale from and to zero using `minReplicas: 0` and an HPA status condition.
+
+### User Stories (Optional)
+
+#### Story 1: Scale a heavy queue consumer on-demand
+
+As the operator of a video processing pipeline, I would like to reduce costs. While video processing is CPU intensive, it is not a latency-sensitive workload. Therefore I want
+my video processing workers to be created only when there is actually a video to process, and to be terminated afterwards.
+
+### Notes/Constraints/Caveats (Optional)
+
+Currently, disabling the HPA is possible by manually setting the scaled resource to `replicas: 0`. This works because the HPA could never reach this state itself.
+As `replicas: 0` becomes a reachable state when using `minReplicas: 0`, it can no longer be used to differentiate between a manually disabled workload and one automatically scaled to zero.
+
+Additionally, the `replicas: 0` state is problematic because updating an HPA object's `minReplicas` from `0` to `1` behaves differently depending on the current replica count. If `replicas` was `0` during the update, the HPA
+will be disabled for the resource; if it was `> 0`, the HPA will continue with the new `minReplicas` value.
+
+To resolve these issues, this KEP introduces an explicit `ScaledToZero` condition inside the `HorizontalPodAutoscalerStatus`. When `ScaledToZero=True` has been recorded, the HPA will scale
+the workload up from `0 ~> 1` and remove the `ScaledToZero=True` condition. If the condition is not found, the HPA maintains the current behavior of performing no change.
+
+When the HPA scales a workload from `1 ~> 0`, it records the `ScaledToZero=True` condition inside the status.
+
+### Risks and Mitigations
+
+As `ScaledToZero` is not an explicit spec property but only a status condition, applying a new Deployment with `replicas: 0` together with an HPA using `minReplicas: 0` can be confusing, as the Deployment will never be scaled: without the condition the HPA treats it as manually disabled.
+
+This needs to be documented and is detectable by looking at the existing `ScalingActive` condition.
+
+In the future, pausing the HPA could become an explicit feature and the implicit pausing via `replicas: 0` could be deprecated to remove this confusion.
+
+## Design Details
+
+Add `ScaledToZero` as a new HPA `HorizontalPodAutoscalerConditionType`:
+
+```golang
+const (
+    // ScaledToZero indicates that the HPA controller scaled the workload to zero.
+    ScaledToZero HorizontalPodAutoscalerConditionType = "ScaledToZero"
+)
+```
+
+### Test Plan
+
+[x] I/we understand the owners of the involved components may require updates to
+existing tests to make this code solid enough prior to committing the changes necessary
+to implement this enhancement.
+
+Most logic related to this KEP is contained in the HPA controller, so testing the various `minReplicas`, `replicas` and `ScaledToZero` state transitions should be achievable with unit tests.
+
+Additionally, integration tests should be added that enable scale to zero by setting `minReplicas: 0`, wait for `replicas` to become `0` and confirm that the `ScaledToZero: true` condition has been recorded, plus another test that raises `minReplicas` back to `1`, observes that `replicas` becomes `1` again and confirms that `ScaledToZero: true` has been removed.
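+
+For reference, a sketch of how the new condition could surface in an HPA status after the controller has scaled a workload down to zero; the `reason`, `message` and timestamp values are purely illustrative and not part of the API proposal:
+
+```yaml
+status:
+  currentReplicas: 0
+  desiredReplicas: 0
+  conditions:
+  - type: ScaledToZero                            # recorded on the 1 ~> 0 transition
+    status: "True"
+    lastTransitionTime: "2025-02-06T10:00:00Z"    # illustrative timestamp
+    reason: MetricBelowTarget                     # illustrative reason
+    message: the HPA scaled the workload to zero replicas
+```
+
+The integration tests described above would assert that this condition is present after scaling to zero and removed again once the workload has been scaled back up.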
+
+##### Prerequisite testing updates
+
+##### Unit tests
+
+- `/pkg/controller/podautoscaler`: `2025-02-06` - `96.4%`
+
+##### Integration tests
+
+- N/A in this case.
+
+##### e2e tests
+
+E2E tests will be added to cover scaling down to `0` and scaling back up with this feature enabled, and scaling down to `1` with the feature disabled.
+
+- :
+
+### Graduation Criteria
+
+#### Alpha
+
+- Implement the `ScaledToZero` condition recording
+- Ensure that all `minReplicas` state transitions from `0` to `1` are working as expected
+- E2E tests are passing without flakiness
+
+#### Beta
+
+- Allow time for feedback
+
+#### GA
+
+- Allow time for feedback
+
+### Upgrade / Downgrade Strategy
+
+As this KEP changes the allowed values for `minReplicas`, special care is required for the downgrade case so that updates to HPA objects using `minReplicas: 0` are not blocked. Because the alpha code has accepted `minReplicas: 0` (with the feature gate enabled or disabled) since Kubernetes 1.16, downgrades to any version >= 1.16 are not an issue.
+
+Before downgrading, all HPAs need to be set to `minReplicas: 1` to avoid any deployments being stuck at `replicas: 0`.
+
+### Version Skew Strategy
+
+## Production Readiness Review Questionnaire
+
+### Feature Enablement and Rollback
+
+###### How can this feature be enabled / disabled in a live cluster?
+
+- [x] Feature gate (also fill in values in `kep.yaml`)
+  - Feature gate name: `HPAScaleToZero`
+  - Components depending on the feature gate: `kube-apiserver`
+- [x] Other
+  - Describe the mechanism:
+
+    When the `HPAScaleToZero` feature gate is enabled, the HPA supports scaling to zero pods based on object or external metrics. The HPA remains active as long as at least one metric value is available.
+
+  - Will enabling / disabling the feature require downtime of the control
+    plane?
+
+    No
+
+  - Will enabling / disabling the feature require downtime or reprovisioning
+    of a node?
+
+    No
+
+###### Does enabling the feature change any default behavior?
+
+HPA creation/update with `minReplicas: 0` is no longer rejected.
+
+###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
+
+Yes. To downgrade the cluster to a version that does not support the scale-to-zero feature, or to disable the feature gate:
+
+1. Make sure there are no HPA objects with `minReplicas=0`. Here is a one-liner that patches them to `minReplicas: 1` and `maxReplicas: 1`:
+
+   `$ kubectl get hpa --all-namespaces --no-headers=true | awk '{if($6==0) printf "kubectl patch hpa/%s --namespace=%s -p \"{\\\"spec\\\":{\\\"minReplicas\\\":1,\\\"maxReplicas\\\":1}}\"\n", $2, $1 }' | sh`
+2. Disable the `HPAScaleToZero` feature gate.
+3. In case step 1 has been omitted, workloads might be stuck at `replicas: 0` and need to be manually scaled up to `replicas: 1` to re-enable autoscaling.
+
+###### What happens if we reenable the feature if it was previously rolled back?
+
+Nothing special; the feature can be re-enabled without problems, and workloads with `replicas: 0` targeted by an HPA will be scaled again.
+
+###### Are there any tests for feature enablement/disablement?
+
+There are currently unit tests for the alpha cases, and tests are planned to be added for the new functionality.
+
+### Rollout, Upgrade and Rollback Planning
+
+As every usage of this feature is opt-in (via `minReplicas: 0`), existing HPAs are unaffected by the rollout. In case the Kubernetes version is downgraded, workloads currently scaled to 0 might need to be manually scaled to 1, as the controller would otherwise treat them as
+paused.
+
+If a rollback is planned, the following steps should be performed before downgrading the Kubernetes version:
+
+1. Make sure there are no HPA objects with `minReplicas=0`. Here is a one-liner that patches them to `minReplicas: 1` and `maxReplicas: 1`:
+
+   `$ kubectl get hpa --all-namespaces --no-headers=true | awk '{if($6==0) printf "kubectl patch hpa/%s --namespace=%s -p \"{\\\"spec\\\":{\\\"minReplicas\\\":1,\\\"maxReplicas\\\":1}}\"\n", $2, $1 }' | sh`
+2. Disable the `HPAScaleToZero` feature gate.
+3. Downgrade the Kubernetes version.
+
+###### How can a rollout or rollback fail? Can it impact already running workloads?
+
+There are no expected side effects when the rollout fails, as the new `ScaledToZero` condition is only recorded once the version upgrade has completed.
+
+If the `kube-apiserver` has been upgraded before the `kube-controller-manager`, an HPA object has been updated to `minReplicas: 0`, and the workload is already scaled down to 0 replicas, you must manually scale the workload to at least one replica.
+
+You can detect this situation in one of two ways:
+
+- Manually, by checking the HPA status and verifying that all entries show `ScalingActive` set to `true` and do not mention `ScalingDisabled`, or
+
+- Automatically, by using the `kube_horizontalpodautoscaler_status_condition` metric provided by [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics)
+  to ensure the `ScalingActive` condition is `true`.
+
+If a rollback is attempted, all HPAs should be updated to `minReplicas: 1`, as otherwise the HPA for deployments with zero replicas will be disabled until
+`replicas` has been explicitly raised to at least `1`.
+
+###### What specific metrics should inform a rollback?
+
+If an unexpected number of HPA objects report the `ScalingActive` condition as `false` with a `ScalingDisabled` reason, the feature isn't working as desired; all HPA objects should be updated to `minReplicas` > 0 again and their managed workloads should be scaled to at least 1 replica.
+
+This condition can also be detected using the `kube_horizontalpodautoscaler_status_condition` metric provided by [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics), but the reason should be manually confirmed for flagged HPA objects.
+
+###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
+
+Not yet, as no implementation based on the new condition is available.
+
+###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+
+No.
+
+### Monitoring Requirements
+
+###### How can an operator determine if the feature is in use by workloads?
+
+The new `ScaledToZero` status condition will be visible inside the `kube_horizontalpodautoscaler_status_condition` metric provided by [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics),
+and the `minReplicas: 0` setting is reflected in `kube_horizontalpodautoscaler_spec_min_replicas`.
+
+###### How can someone using this feature know that it is working for their instance?
+
+When this feature is enabled for a workload scaled based on an object or external metric, the workload should be scaled to 0 replicas when the metric is 0.
+
+###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
+
+No changes to the autoscaling SLOs.
+
+###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
+
+No changes to the autoscaling SLIs.
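+
+To make the detection described above actionable, the following is an illustrative sketch of a Prometheus alert on the condition metric. It assumes kube-state-metrics v2 label conventions (`condition`, `status` and `horizontalpodautoscaler` labels) and the Prometheus Operator `PrometheusRule` CRD; the rule name, duration and severity are arbitrary examples:
+
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: hpa-scale-to-zero-alerts          # hypothetical rule name
+spec:
+  groups:
+  - name: hpa-scale-to-zero
+    rules:
+    - alert: HPAScalingDisabled
+      # Fires when an HPA reports ScalingActive=false, e.g. a workload left at zero
+      # replicas that the controller now treats as paused.
+      expr: kube_horizontalpodautoscaler_status_condition{condition="ScalingActive", status="false"} == 1
+      for: 15m
+      labels:
+        severity: warning
+      annotations:
+        description: "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} reports ScalingActive=false; verify the target workload was not left at zero replicas."
+```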
+
+###### Are there any missing metrics that would be useful to have to improve observability of this feature?
+
+None with regard to this KEP.
+
+### Dependencies
+
+###### Does this feature depend on any specific services running in the cluster?
+
+The addition has the same dependencies as the current autoscaling controller.
+
+### Scalability
+
+###### Will enabling / using this feature result in any new API calls?
+
+No, the amount of autoscaling-related API calls will remain unchanged. No other components are affected.
+
+###### Will enabling / using this feature result in introducing new API types?
+
+No, this only modifies the existing API types.
+
+###### Will enabling / using this feature result in any new calls to the cloud provider?
+
+No, the amount of autoscaling-related cloud provider calls will remain unchanged. No other components are affected.
+
+###### Will enabling / using this feature result in increasing size or count of the existing API objects?
+
+Yes, one additional condition entry inside the `status` of every `HorizontalPodAutoscaler` resource that has been scaled to zero.
+
+###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
+
+No, there are no visible latency changes expected for existing autoscaling operations.
+
+###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
+
+No, there are no visible changes expected for existing autoscaling operations.
+
+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+No.
+
+### Troubleshooting
+
+###### How does this feature react if the API server and/or etcd is unavailable?
+
+Autoscaling will not occur; this is the same as the current behaviour.
+
+###### What are other known failure modes?
+
+- Failed to fetch the relevant object or external metrics.
+  - Detection: `ScalingActive: false` condition with `FailedGetExternalMetric` or `FailedGetObjectMetric` reason.
+  - Mitigations: manually scale the resource.
+  - Diagnostics: Related errors are reported in the message of the `ScalingActive: false` condition.
+  - Testing:
+
+###### What steps should be taken if SLOs are not being met to determine the problem?
+
+Check `metric_computation_duration_seconds` to see which metric encountered the latency issue.
+If the latency problem is caused by metrics used for scaling to zero, you can remove those metrics from your HPA(s) again.
+
+## Implementation History
+
+- (2019/02/25) Original design doc:
+- (2019/07/16) Alpha implementation () merged for Kubernetes 1.16
+
+## Drawbacks
+
+## Alternatives
+
+Third-party solutions like [KEDA][] already support scaling to zero for various resources (e.g. [RabbitMQ Queues](https://keda.sh/docs/2.16/scalers/rabbitmq-queue/)). However, these solutions often introduce additional paradigms and complexity. Since Horizontal Pod Autoscaling is already a core feature of Kubernetes and supports scaling to one, adding native support for scaling to zero would be a valuable and low-complexity enhancement.
+ +[KEDA]: https://keda.sh/ + +## Infrastructure Needed (Optional) + + diff --git a/keps/sig-autoscaling/2021-scale-from-zero/kep.yaml b/keps/sig-autoscaling/2021-scale-from-zero/kep.yaml new file mode 100644 index 00000000000..394666e9747 --- /dev/null +++ b/keps/sig-autoscaling/2021-scale-from-zero/kep.yaml @@ -0,0 +1,43 @@ +title: HPA supports scaling to/from zero pods for object/external metrics +kep-number: 2021 +authors: + - "@johanneswuerbach" +owning-sig: sig-autoscaling +participating-sigs: +status: implementable +creation-date: "2020-09-26" +reviewers: + - "@gjtempleton" + - "@adrianmoisey" +approvers: + - "@gjtempleton" + +see-also: +replaces: + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.35" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.16" + beta: "x.y" + stable: "x.y" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: HPAScaleToZero + components: + - kube-apiserver +disable-supported: true + +# The following PRR answers are required at beta release +metrics: + - kube_horizontalpodautoscaler_spec_min_replicas + - kube_horizontalpodautoscaler_status_condition