AWS: Add note about suspending AZRebalance #1802
NOTE: I've only been running in this configuration for about a day, so I can't personally vouch for the correctness of this workaround. As mentioned in the commit text, other users have reported running in this configuration without issues. It would be great to get confirmation from an expert though ;)
According to some user reports, you can actually run cluster-autoscaler against an ASG that spans multiple AZs; you just have to suspend the AZRebalance scaling process to avoid unexpected node termination. https://kubernetes.slack.com/archives/C8SH2GSL9/p1552600210276600?thread_ts=1552420686.257000&cid=C8SH2GSL9
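For reference, a minimal sketch of what suspending only AZRebalance could look like with boto3. The helper names `suspend_az_rebalance` and `az_rebalance_suspended` are made up for this example, and the client is passed in so the helpers can be exercised without AWS credentials. It relies on the boto3 Auto Scaling client calls `suspend_processes` and `describe_auto_scaling_groups`.

```python
def suspend_az_rebalance(autoscaling_client, asg_name):
    """Suspend only the AZRebalance process on one Auto Scaling group."""
    # Passing an explicit ScalingProcesses list leaves every other
    # scaling process (Launch, Terminate, ...) running as before.
    autoscaling_client.suspend_processes(
        AutoScalingGroupName=asg_name,
        ScalingProcesses=["AZRebalance"],
    )


def az_rebalance_suspended(autoscaling_client, asg_name):
    """Return True if AZRebalance is listed as suspended for the group."""
    response = autoscaling_client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name]
    )
    groups = response["AutoScalingGroups"]
    if not groups:
        raise ValueError(f"no such Auto Scaling group: {asg_name}")
    suspended = {
        process["ProcessName"]
        for process in groups[0].get("SuspendedProcesses", [])
    }
    return "AZRebalance" in suspended
```

With boto3 installed and credentials configured, this would be called as `suspend_az_rebalance(boto3.client("autoscaling"), "my-asg")`, where `my-asg` is a placeholder group name.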
[APPROVALNOTIFIER] This PR is NOT APPROVED
Rebalance is not the only reason why CA doesn't support multi-AZ node groups. The core logic of CA works by taking a random existing node and assuming that any new node in the same ASG will look exactly the same. In a multi-AZ ASG, the new node can land in a different zone than CA assumes, which can lead to incorrect autoscaling decisions (an unnecessary scale-up, or no scale-up when one was needed). This commonly causes problems, especially when using PVs in multi-AZ clusters. A recent example: kubernetes/kubernetes#75402.
It may work ok-ish with a multi-AZ ASG if you disable rebalancing, don't use storage, don't use podAffinity with a topology other than host, don't use nodeAffinity on the zone label, never scale any zone to 0, ...
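To make the template-node problem described above concrete, here is a toy simulation. It is not CA's actual code: the `Node`, `Pod`, `fits`, and `outcome` names are invented for illustration, the zone names are placeholders, and the only scheduling constraint modelled is a pod pinned to one zone (for example by a zonal PV).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    name: str
    zone: str


@dataclass(frozen=True)
class Pod:
    name: str
    required_zone: Optional[str] = None  # e.g. pinned by a zonal PV


def fits(pod, node):
    """A pod fits if it has no zone requirement or the zones match."""
    return pod.required_zone is None or pod.required_zone == node.zone


def outcome(pod, template, launched_zone):
    """Compare the prediction made from a template node with reality.

    The prediction assumes the new node is a copy of `template`;
    `launched_zone` is where the ASG would really place the new node.
    """
    predicted = fits(pod, template)
    actual = fits(pod, Node("new-node", launched_zone))
    if predicted and not actual:
        return "unnecessary scale-up"  # node added, pod still pending
    if not predicted and actual:
        return "no scale-up"  # a new node would have helped, none added
    return "correct"
```

For a pod pinned to `us-east-1b`, a template node in `us-east-1b` with a launch in `us-east-1a` gives an unnecessary scale-up, and the reverse gives no scale-up. A single-zone ASG always takes the "correct" branch in this model, because the template and the launched node share a zone.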