-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mce-2.3] Allow MachinePool autoscaler maxReplicas < #AZs #2218
[mce-2.3] Allow MachinePool autoscaler maxReplicas < #AZs #2218
Conversation
/assign @lleshchi |
/lgtm |
Our algorithm to spread MachinePool.Spec.Autoscaling.MinReplicas and .MaxReplicas across spoke MachineAutoscalers previously assumed that it was sane to create one MachineAutoscaler per AZ and set maxReplicas to zero if that's what our computation came out with. Not so. MachineAutoscaler.Spec.MaxReplicas -- in contrast to MachineSet.Spec.Replicas -- is [not allowed to be zero](https://github.com/openshift/cluster-autoscaler-operator/blob/67999a5e79d0200ee0a4aab3dcfbfd18e097b514/pkg/apis/autoscaling/v1beta1/machineautoscaler_types.go#L18). The resulting behavior would manifest as a hive-controllers error similar to: ``` time="2024-02-21T16:33:56.802Z" level=error msg="unable to create machine autoscaler" controller=machinepool error="MachineAutoscaler.autoscaling.openshift.io \"efried-rg2wn-worker-test-us-east-1c\" is invalid: spec.maxReplicas: Invalid value: 0: spec.maxReplicas in body should be greater than or equal to 1" machinePool=efried/efried-worker-test reconcileID=nwpxskln ``` So instead we have to include a special case for this and delete such MachineAutoscalers instead. (Further, since this error causes us to bail out of the machinepool controller's reconcile loop before updating the MachinePool status, the user doesn't have a great way to discover what went wrong. They just have to notice that MachineSets et al stop responding. We'll address this in a separate commit.) HIVE-2415 (cherry picked from commit 80d9694) (cherry picked from commit e70ff98)
1d256cf
to
5f668b7
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: 2uasimojo, lleshchi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## mce-2.3 #2218 +/- ##
========================================
Coverage 57.81% 57.82%
========================================
Files 186 186
Lines 25226 25226
========================================
+ Hits 14585 14587 +2
+ Misses 9401 9400 -1
+ Partials 1240 1239 -1
|
/test e2e NAT gateway quota 😕 |
@2uasimojo: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Our algorithm to spread MachinePool.Spec.Autoscaling.MinReplicas and .MaxReplicas across spoke MachineAutoscalers previously assumed that it was sane to create one MachineAutoscaler per AZ and set maxReplicas to zero if that's what our computation came out with.
Not so.
MachineAutoscaler.Spec.MaxReplicas -- in contrast to MachineSet.Spec.Replicas -- is
not allowed to be zero.
The resulting behavior would manifest as a hive-controllers error similar to:
So instead we have to include a special case for this and delete such MachineAutoscalers instead.
(Further, since this error causes us to bail out of the machinepool controller's reconcile loop before updating the MachinePool status, the user doesn't have a great way to discover what went wrong. They just have to notice that MachineSets et al stop responding. We'll address this in a separate commit.)
HIVE-2415
(cherry picked from commit 80d9694) (cherry picked from commit e70ff98)