Fix deployment status propagation when scaling from zero #15550
base: main
Conversation
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

@@            Coverage Diff             @@
##             main   #15550      +/-   ##
==========================================
- Coverage   84.49%   84.32%   -0.18%
==========================================
  Files         219      219
  Lines       13608    13662      +54
==========================================
+ Hits        11498    11520      +22
- Misses       1740     1769      +29
- Partials      370      373       +3

☔ View full report in Codecov by Sentry.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: skonto
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
// Mark resource unavailable if we are scaling back to zero, but we never achieved the required scale
// and deployment status was not updated properly by K8s. For example due to an image pull error.
if ps.ScaleTargetNotScaled() {
	condScaled := ps.GetCondition(autoscalingv1alpha1.PodAutoscalerConditionScaleTargetScaled)
We could set ContainerHealthyFalse here too but we need #15503
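For context on how a check like the one in the diff above could feed the owning resource's readiness, here is a minimal, self-contained Go sketch. The `Condition`/`PodAutoscalerStatus` types and the `propagateScaleTargetScaled` helper are simplified stand-ins for illustration only, not the actual knative/serving types or the code in this PR.

```go
package main

import "fmt"

// Simplified stand-ins for the PodAutoscaler status and condition types;
// the real code uses knative.dev/serving's condition helpers.
type Condition struct {
	Type, Status, Reason, Message string // Status is "True", "False", or "Unknown"
}

type PodAutoscalerStatus struct {
	Conditions []Condition
}

// GetCondition returns the condition with the given type, or nil if absent.
func (ps *PodAutoscalerStatus) GetCondition(t string) *Condition {
	for i := range ps.Conditions {
		if ps.Conditions[i].Type == t {
			return &ps.Conditions[i]
		}
	}
	return nil
}

// ScaleTargetNotScaled reports whether the ScaleTargetScaled condition is explicitly False.
func (ps *PodAutoscalerStatus) ScaleTargetNotScaled() bool {
	c := ps.GetCondition("ScaleTargetScaled")
	return c != nil && c.Status == "False"
}

// propagateScaleTargetScaled mirrors the shape of the diff above: if the target
// never reached the required scale before scaling back to zero, surface the
// recorded reason/message instead of letting the resource report Ready.
func propagateScaleTargetScaled(ps *PodAutoscalerStatus) (reason, message string, unavailable bool) {
	if ps.ScaleTargetNotScaled() {
		c := ps.GetCondition("ScaleTargetScaled")
		return c.Reason, c.Message, true
	}
	return "", "", false
}

func main() {
	ps := &PodAutoscalerStatus{Conditions: []Condition{{
		Type: "ScaleTargetScaled", Status: "False",
		Reason: "ImagePullBackOff", Message: "pods never became ready before scale to zero",
	}}}
	if reason, msg, bad := propagateScaleTargetScaled(ps); bad {
		fmt.Printf("mark resources unavailable: %s: %s\n", reason, msg)
	}
}
```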
Fixes #14157
Proposed Changes
Introduces a new PA condition (`PodAutoscalerConditionScaleTargetScaled`) that detects failures during scaling to zero, covering the K8s gaps where the deployment status is not updated correctly. The condition is set to `false` just before we scale down to zero (before the deployment update happens) and when pods are crashing. We set it back to `true` when we scale from zero and have enough ready pods.

Previously, when the deployment was scaled down to zero, the revision's ready status would be true (and stay that way), but with this patch the pod error is detected and propagated to the revision status.