validate hpa target values #120373
Welcome @Lukasz-AWS!
Hi @Lukasz-AWS. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/sig autoscaling
	allErrs = append(allErrs, field.Invalid(fldPath.Child("value"), mt.Value, "must be set for metric target type Value"))
}

if mt.Type == autoscaling.AverageValueMetricType && mt.AverageValue == nil && mt.Value != nil {
We only care that when type is AverageValue, averageValue is specified, and when type is Value, value is specified. I think we can ignore whether Value is set or not in this specific check.
The only value of additionally checking whether Value is set for type AverageValue is to show a more helpful message saying "you set value but instead you need to set averageValue", which I don't see in the code.
Thanks for taking a look! I've made the requested change to only care about the mapping we're looking for. I had removed some now-redundant tests, as they call validateMetricTarget beforehand, but actually ended up just fixing these just in case.
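A minimal sketch of the check agreed on in this thread: require only the field that matches the target type, and ignore whether the other field is set. The types below (`MetricTarget`, `validateTarget`, quantities modeled as `*int64`) are simplified stand-ins assumed for illustration, not the real autoscaling API types or the code in this PR.

```go
package main

import "fmt"

// MetricTargetType and MetricTarget are hypothetical, simplified stand-ins
// for the autoscaling API types; quantities are modeled as *int64.
type MetricTargetType string

const (
	ValueMetricType        MetricTargetType = "Value"
	AverageValueMetricType MetricTargetType = "AverageValue"
)

type MetricTarget struct {
	Type         MetricTargetType
	Value        *int64
	AverageValue *int64
}

// validateTarget reports an error only when the field matching Type is unset.
// It deliberately does not look at the non-matching field.
func validateTarget(mt MetricTarget) []string {
	var errs []string
	switch mt.Type {
	case ValueMetricType:
		if mt.Value == nil {
			errs = append(errs, "target.value: required when type=Value")
		}
	case AverageValueMetricType:
		if mt.AverageValue == nil {
			errs = append(errs, "target.averageValue: required when type=AverageValue")
		}
	}
	return errs
}

func main() {
	v := int64(10)
	// Wrong field set for the type: one error.
	fmt.Println(validateTarget(MetricTarget{Type: AverageValueMetricType, Value: &v}))
	// Matching field set (extra field tolerated): no errors.
	fmt.Println(validateTarget(MetricTarget{Type: AverageValueMetricType, Value: &v, AverageValue: &v}))
}
```

With this shape, the "you set value but need averageValue" hint would be an optional refinement of the error message rather than a separate condition.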
/ok-to-test
/triage accepted
@kubernetes/sig-autoscaling-leads for review
/retest
Couple of nits to remove deprecated methods and make the linter happy.
/retest
Hey @liggitt could you take another look please.
Sorry, ~all PRs got preempted by design reviews through design freeze today. A separate PR to fix the nil panic we can backport would be good regardless. This one is in my queue to review now that design freeze is upon us.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Lukasz-AWS The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
No worries! And understood, thanks! I moved the nil check change to its own PR here: #121015
Hey @liggitt, bumping this PR again in case it got lost in the shuffle.
Paging this back in.

We have five types of metrics: object, pod, resource, containerResource, and external. For each type, we have existing controller behavior and existing validation behavior.

When validation allows in things that make the controller fail at runtime, that's a bug (🐞) and we can tighten validation in a ratcheting way.

When validation allows in things that are weird / underspecified but the controller also doesn't check for or tolerates, people could currently be successful setting those weird / underspecified things, so we should (at most) issue a warning.

I swept and annotated the current behavior:
@@ -36,6 +37,33 @@ const (
	MaxStabilizationWindowSeconds int32 = 3600
)

// HPAValidationOptions contains the different settings for HPA validation
type HPAValidationOptions struct {
	AllowMismatchedTargetTypeAndValue bool
since the only thing we're ratcheting here is objectMetrics, make the name explicitly about that metric type:
- AllowMismatchedTargetTypeAndValue bool
+ AllowMismatchedObjectMetricTypeAndValue bool
opts := HPAValidationOptions{
	AllowMismatchedTargetTypeAndValue: false,
}
// Don't allow mismatched target type/value fields for new HPAs
- // Don't allow mismatched target type/value fields for new HPAs
+ // Don't allow mismatched objectMetric target type/value fields for new HPAs
// Check for existing bad mismatched target type/value fields for ObjectMetricSource
// If UtilizationMetricType was set previously then either one of the values could have been used to pass validation before
for _, ms := range oldAutoscaler.Spec.Metrics {
	if ms.Object != nil && ((ms.Object.Target.Type == autoscaling.ValueMetricType && ms.Object.Target.Value == nil) ||
		(ms.Object.Target.Type == autoscaling.AverageValueMetricType && ms.Object.Target.AverageValue == nil) ||
		(ms.Object.Target.Type == autoscaling.UtilizationMetricType)) {
		opts.AllowMismatchedTargetTypeAndValue = true
		return opts
	}
}
This bit would be easier to read in a standalone method that could return early:
func hasMismatchedObjectMetricType(hpa *autoscaling.HorizontalPodAutoscaler) bool {
	if hpa == nil {
		return false
	}
	// Iterate over the HPA passed in (not an outer oldAutoscaler variable).
	for _, ms := range hpa.Spec.Metrics {
		switch {
		case ms.Object == nil:
			continue
		case ms.Object.Target.Type == autoscaling.ValueMetricType && ms.Object.Target.Value == nil:
			return true
		case ms.Object.Target.Type == autoscaling.AverageValueMetricType && ms.Object.Target.AverageValue == nil:
			return true
		case ms.Object.Target.Type == autoscaling.UtilizationMetricType:
			// If object metrics accept UtilizationMetricType in the future, expand this case to also check ms.Object.Target.AverageUtilization == nil
			return true
		}
	}
	return false
}
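To make the ratcheting idea concrete, here is a self-contained sketch of how such a helper could feed the validation options: creates are always strict, while updates tolerate a mismatch only when the stored object already had one. All names below (`ObjectTarget`, `Options`, `optionsForUpdate`, `validate`) are simplified assumptions for illustration and do not reproduce the actual Kubernetes validation code.

```go
package main

import (
	"errors"
	"fmt"
)

type TargetType string

const (
	TypeValue        TargetType = "Value"
	TypeAverageValue TargetType = "AverageValue"
	TypeUtilization  TargetType = "Utilization"
)

// ObjectTarget is a hypothetical, simplified model of an object metric target.
type ObjectTarget struct {
	Type         TargetType
	Value        *int64
	AverageValue *int64
}

// Options mirrors the idea of HPAValidationOptions with a single toggle.
type Options struct {
	AllowMismatchedObjectMetricTypeAndValue bool
}

// hasMismatch returns early on the first target whose type lacks its matching value.
func hasMismatch(targets []ObjectTarget) bool {
	for _, t := range targets {
		switch {
		case t.Type == TypeValue && t.Value == nil:
			return true
		case t.Type == TypeAverageValue && t.AverageValue == nil:
			return true
		case t.Type == TypeUtilization:
			return true
		}
	}
	return false
}

// optionsForUpdate relaxes validation only if the stored object was already mismatched.
func optionsForUpdate(old []ObjectTarget) Options {
	return Options{AllowMismatchedObjectMetricTypeAndValue: hasMismatch(old)}
}

// validate rejects mismatched targets unless the options tolerate them.
func validate(targets []ObjectTarget, opts Options) error {
	if opts.AllowMismatchedObjectMetricTypeAndValue {
		return nil
	}
	if hasMismatch(targets) {
		return errors.New("object metric target type and value do not match")
	}
	return nil
}

func main() {
	bad := []ObjectTarget{{Type: TypeValue}} // value is missing
	fmt.Println("create:", validate(bad, Options{}))
	fmt.Println("update of already-mismatched object:", validate(bad, optionsForUpdate(bad)))
}
```

The design point is that tightening applies to new objects immediately, while objects that were persisted under the old rules can still be updated.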
if src.Target.AverageValue != nil {
	allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("averageValue"), src.Target.AverageValue, "must not set target averageValue for value metric type"))
}
if src.Target.AverageUtilization != nil {
	allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("averageUtilization"), src.Target.AverageUtilization, "must not set target averageUtilization for value metric type"))
}
Drop the AverageValue / AverageUtilization checks here... these go further than the controller does and can fail an existing object that is currently working:
object:
  ...
  target:
    type: Value
    value: "..."
    averageValue: "..."
    averageUtilization: "..."
That would pass validation today, work successfully with the controller today (averageValue and averageUtilization would be ignored), result in AllowMismatchedTargetTypeAndValue=false, and then get rejected on update by validation, which is not what we want.
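The failure mode described here can be sketched in a few lines. The example assumes a simplified `Target` type and two hypothetical validators: `strictValidate` also rejects extra fields, while `lenientValidate` only requires the field matching the type. `updateUnchanged` models a no-op update under the ratchet, which exempts an object only when its required field was already missing.

```go
package main

import (
	"errors"
	"fmt"
)

// Target is a hypothetical, simplified model of an object metric target.
type Target struct {
	Type         string
	Value        *int64
	AverageValue *int64
}

// requiredFieldMissing is what the ratchet looks for on the stored object.
func requiredFieldMissing(t Target) bool {
	return (t.Type == "Value" && t.Value == nil) ||
		(t.Type == "AverageValue" && t.AverageValue == nil)
}

// lenientValidate only requires the field that matches the type.
func lenientValidate(t Target) error {
	if requiredFieldMissing(t) {
		return errors.New("required field for target type is not set")
	}
	return nil
}

// strictValidate additionally rejects extra fields (the checks the review asks to drop).
func strictValidate(t Target) error {
	if err := lenientValidate(t); err != nil {
		return err
	}
	if t.Type == "Value" && t.AverageValue != nil {
		return errors.New("must not set averageValue when type=Value")
	}
	if t.Type == "AverageValue" && t.Value != nil {
		return errors.New("must not set value when type=AverageValue")
	}
	return nil
}

// updateUnchanged simulates re-submitting a stored object without changes.
func updateUnchanged(old Target, validate func(Target) error) error {
	if requiredFieldMissing(old) {
		return nil // already invalid under the new rules: tolerated by the ratchet
	}
	return validate(old)
}

func main() {
	v := int64(1)
	// Works with the controller today: value is set, averageValue is ignored.
	existing := Target{Type: "Value", Value: &v, AverageValue: &v}
	fmt.Println("strict: ", updateUnchanged(existing, strictValidate))
	fmt.Println("lenient:", updateUnchanged(existing, lenientValidate))
}
```

Under these assumptions the strict validator rejects an unchanged, working object, because the ratchet never sees it as mismatched; the lenient validator does not.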
	allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("type"), src.Target.Type, "must not set Utilization type for Object source type"))
case autoscaling.ValueMetricType:
	if src.Target.Value == nil {
		allErrs = append(allErrs, field.Required(fldPath.Child("target").Child("value"), "must set target value for value metric type"))
- allErrs = append(allErrs, field.Required(fldPath.Child("target").Child("value"), "must set target value for value metric type"))
+ allErrs = append(allErrs, field.Required(fldPath.Child("target").Child("value"), "required when type=Value"))
	if src.Target.AverageUtilization != nil {
		allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("averageUtilization"), src.Target.AverageUtilization, "must not set target averageUtilization for value metric type"))
	}
case autoscaling.AverageValueMetricType:
drop the Value / AverageUtilization checks in this case for the same reason as mentioned above
		allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("value"), src.Target.Value, "must not set target value for average value metric type"))
	}
	if src.Target.AverageValue == nil {
		allErrs = append(allErrs, field.Required(fldPath.Child("target").Child("averageValue"), "must set target averageValue for average value metric type"))
- allErrs = append(allErrs, field.Required(fldPath.Child("target").Child("averageValue"), "must set target averageValue for average value metric type"))
+ allErrs = append(allErrs, field.Required(fldPath.Child("target").Child("averageValue"), "required when type=AverageValue"))
if !opts.AllowMismatchedTargetTypeAndValue {
	switch src.Target.Type {
	case autoscaling.UtilizationMetricType:
		allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("type"), src.Target.Type, "must not set Utilization type for Object source type"))
- allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("type"), src.Target.Type, "must not set Utilization type for Object source type"))
+ allErrs = append(allErrs, field.Invalid(fldPath.Child("target").Child("type"), src.Target.Type, "must be either Value, or AverageValue"))
for the gaps flagged as being warning-worthy ( |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Fixes a bug where, if the field corresponding to the target type (value for Value, averageValue for AverageValue) is not set but the other field is, the HPA object passes validation but then causes kube-controller-manager to crash-loop due to a nil pointer dereference when it reads the corresponding value. This PR adds a check to make sure the corresponding value is set.
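As an illustration of the crash described above (not the controller's actual code), the sketch below uses a simplified, hypothetical `Target` type. `unsafeTargetValue` dereferences the field selected by the type without a nil check, which panics when the other field was set instead; `safeTargetValue` returns an error in that case.

```go
package main

import (
	"errors"
	"fmt"
)

// Target is a hypothetical, simplified model of a metric target.
type Target struct {
	Type         string
	Value        *int64
	AverageValue *int64
}

// unsafeTargetValue dereferences the field selected by Type without a nil check.
func unsafeTargetValue(t Target) int64 {
	if t.Type == "Value" {
		return *t.Value
	}
	return *t.AverageValue
}

// safeTargetValue returns an error instead of panicking when the field is unset.
func safeTargetValue(t Target) (int64, error) {
	switch {
	case t.Type == "Value" && t.Value != nil:
		return *t.Value, nil
	case t.Type == "AverageValue" && t.AverageValue != nil:
		return *t.AverageValue, nil
	}
	return 0, errors.New("target field for type " + t.Type + " is not set")
}

// panics reports whether calling f results in a panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	v := int64(3)
	// Type says AverageValue, but only value is set.
	mismatched := Target{Type: "AverageValue", Value: &v}
	fmt.Println("unsafe access panics:", panics(func() { unsafeTargetValue(mismatched) }))
	_, err := safeTargetValue(mismatched)
	fmt.Println("safe access error:", err)
}
```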
Which issue(s) this PR fixes:
N/A
Special notes for your reviewer:
Observed in 1.27 and 1.28; earlier versions may be affected as well.
Can be replicated by deploying an hpa with a valid scaleTargetRef and adding this object to the metrics
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: