Fix HPA feedback from writing status.replicas to spec.replicas. #79035
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @josephburnett. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
Force-pushed from 2df8e2e to d63688d
New comment.
Done.
/test pull-kubernetes-integration
/test pull-kubernetes-verify
/lgtm
Cherry pick #79035 to 1.13 (Fix HPA feedback from writing status.replicas to spec.replicas)
Cherry pick #79035 to 1.14 (Fix HPA feedback from writing status.replicas to spec.replicas)
…-#79035-upstream-release-1.15: Automated cherry pick of #79035
Hi, is this currently released?
Handle replica counts derived from spec and status as separate types so we don't accidentally write observed replicas from status back into spec, causing a positive feedback loop (kubernetes#79035).
During a Deployment update there may be more Pods in the scale target ref status than in the spec. This test verifies that we do not scale to the status value. Instead we should stay at the spec value. Fails before kubernetes#79035 and passes after.
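A minimal sketch of what a test like that checks (illustrative names only; this is not the actual Kubernetes test code). Mid-rollout the scale subresource may report more replicas in status than in spec, and when the HPA decides not to change the scale it must stay at the spec value:

```go
package main

import "fmt"

// scale models the two replica counts on the scale subresource.
type scale struct {
	specReplicas   int32 // scale.Spec.Replicas
	statusReplicas int32 // scale.Status.Replicas
}

// punt models the "no change" decision: before #79035 the status value
// leaked back into spec; after the fix, the spec value is returned.
func punt(s scale) int32 { return s.specReplicas }

func main() {
	// Deployment mid-rollout with surge: spec=3, but status=5.
	s := scale{specReplicas: 3, statusReplicas: 5}
	fmt.Println(punt(s)) // must print the spec value, not 5
}
```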
@juanpmarin it looks like this is gonna be in 1.16+
@juanpmarin This is also cherry picked to 1.13 (#79707), 1.14 (#79708), and 1.15 (#79709, #79727). The latest patch versions of those releases should have this fix too.
I would just like to draw some attention to #72775. Many are still facing issues with rolling updates.
What this PR does / why we need it:
There are various reasons that the HPA will decide not to change the current scale. Two important ones are when missing metrics might change the direction of scaling, and when the recommended scale is within tolerance of the current scale.

The way that `ReplicaCalculator` signals its desire not to change the current scale is by returning the current scale. However, the current scale comes from `scale.Status.Replicas` and can be larger than `scale.Spec.Replicas` (e.g. during a Deployment rollout with a configured surge). This causes a positive feedback loop because `scale.Status.Replicas` is written back into `scale.Spec.Replicas`, further increasing the current scale.

This PR fixes the feedback loop by plumbing the replica count from spec through `horizontal.go` and `replica_calculator.go` so the calculator can punt with the right value.

It also introduces separate types for replica counts derived from `scale.Spec` and `scale.Status` (`specReplicas` and `statusReplicas` respectively) to guard against this kind of cross-talk. When returning a desired scale to be written into spec, the calculator must either return the given, current scale or explicitly call `newSpecReplicas`.

With separate types, other sources of cross-talk became compiler errors, e.g. recording `Status.Replicas` as an initial recommendation. This would manifest if a Deployment was in the process of rolling out when the HPA reboots.
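The separate-types guard described above can be sketched as follows. This is a minimal illustration, not the actual Kubernetes code: `computeDesired` and its tolerance handling are hypothetical, while `specReplicas`, `statusReplicas`, and `newSpecReplicas` are the names from the PR description. The point is that a status-derived count cannot be returned where a spec-derived count is expected without an explicit conversion:

```go
package main

import "fmt"

// Distinct defined types for the two sources of replica counts. Assigning
// one to the other without an explicit conversion is a compile error.
type specReplicas int32
type statusReplicas int32

// newSpecReplicas is the single explicit conversion point for writing a raw
// recommendation into spec.
func newSpecReplicas(n int32) specReplicas { return specReplicas(n) }

// computeDesired punts by returning the current *spec* replicas when the
// recommendation is within tolerance, so the observed status count (which
// may be higher during a rollout surge) can never leak back into spec.
func computeDesired(current specReplicas, observed statusReplicas, recommended, tolerance int32) specReplicas {
	if diff := recommended - int32(current); diff <= tolerance && diff >= -tolerance {
		// Within tolerance: keep the current spec value. Returning
		// `observed` here would not even compile.
		return current
	}
	return newSpecReplicas(recommended)
}

func main() {
	// Deployment rollout with surge: spec=4, status=5, recommendation=4.
	fmt.Println(computeDesired(specReplicas(4), statusReplicas(5), 4, 0)) // stays at spec
}
```

With plain `int32` everywhere, the bug (returning the status count as the desired spec count) type-checks silently; the defined types turn that mistake into a compiler error.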
Which issue(s) this PR fixes:
Fixes #78712
Fixes #72775
Special notes for your reviewer:
Does this PR introduce a user-facing change?: