Re: cluster-autoscaler support #629
Conversation
I'm trying to complete the cluster-autoscaler support according to our roadmap for v0.9.7: https://github.com/kubernetes-incubator/kube-aws/blob/master/ROADMAP.md#v097
Codecov Report
@@ Coverage Diff @@
## master #629 +/- ##
=========================================
- Coverage 37.14% 37.1% -0.05%
=========================================
Files 51 52 +1
Lines 3201 3210 +9
=========================================
+ Hits 1189 1191 +2
- Misses 1836 1842 +6
- Partials 176 177 +1
Continue to review full report at Codecov.
This work depends on kubernetes/autoscaler#11
e2e/run
@@ -156,13 +163,17 @@ customize_cluster_yaml() {
worker:
  nodePools:
  - name: asg1
    clusterAutoscalerSupport:
      enabled: true
`clusterAutoscalerSupport` is meant to provide enough permissions to call the AWS APIs required to host cluster-autoscaler. Then, we should probably provide a nodeSelector for cluster-autoscaler so that we can ensure CA gets scheduled onto nodes with the required permissions?
Edit: And node labels corresponding to `clusterAutoscalerSupport.enabled`, matching the nodeSelector?
We should add a validation which emits an error when there is no worker node pool or controller node whose `clusterAutoscalerSupport` is enabled.
Otherwise cluster-autoscaler can end up unschedulable, or unable to work due to missing IAM permissions.
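A minimal sketch of what such a validation could look like; the function and parameter names below are illustrative, not kube-aws's actual config structs:

```go
package config

import "errors"

// validateClusterAutoscalerSupport is an illustrative helper (names are made up):
// it fails fast when the CA addon is enabled but neither the controller nor any
// worker node pool grants the IAM permissions cluster-autoscaler needs to run.
func validateClusterAutoscalerSupport(controllerEnabled bool, workerPoolsEnabled []bool) error {
	if controllerEnabled {
		return nil
	}
	for _, enabled := range workerPoolsEnabled {
		if enabled {
			return nil
		}
	}
	return errors.New("addons.clusterAutoscaler is enabled, but no controller node or worker node pool has clusterAutoscalerSupport enabled")
}
```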
Can't we run CA on controller nodes? They already have elevated permissions; adding more won't make it much worse.
@redbaron Yes, we can. kube-aws as of today already supports `controller.clusterAutoscalerSupport.enabled` to provide the appropriate IAM permissions to controller nodes. It would then be a matter of just adding appropriate node labels and node selectors to stick cluster-autoscaler to whatever nodes (worker or controller) have clusterAutoscalerSupport enabled.
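For illustration, pinning CA to such nodes could look roughly like the following; the label key used here is hypothetical, not necessarily the one kube-aws would emit:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      # Schedule CA only onto nodes whose IAM role actually allows the autoscaling API calls.
      nodeSelector:
        kube-aws.coreos.com/cluster-autoscaler-supported: "true"
      containers:
      - name: cluster-autoscaler
        # Image repository mentioned later in this thread; tag intentionally left out here.
        image: quay.io/kube-aws/cluster-autoscaler
```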
@@ -145,6 +145,13 @@
],
"MinSize": "{{.MinCount}}",
"Tags": [
{{if gt .ClusterAutoscaler.MaxSize 0}}
Should be if .ClusterAutoscaler.Enabled
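For example, the guarded tag entry might then read roughly like this; the tag key shown is the one cluster-autoscaler's ASG auto-discovery conventionally watches for, and treating it as the exact key kube-aws emits is an assumption here:

```
{{if .ClusterAutoscaler.Enabled}}
{
  "Key": "k8s.io/cluster-autoscaler/enabled",
  "Value": "true",
  "PropagateAtLaunch": "false"
},
{{end}}
```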
- name: AWS_REGION
  value: {{.Region}}
volumeMounts:
- name: ssl-certs
This volumeMount and the corresponding volume will be unnecessary once kubernetes/autoscaler#48 is merged to CA
The PR is merged
Assuming I have no control over when a docker image for cluster-autoscaler containing the improvements this PR depends on gets released, I have opened https://github.com/kube-aws/autoscaler and https://quay.io/repository/kube-aws/cluster-autoscaler for hosting our own. Update: This is how our docker image is built and released.
And images can be found at https://quay.io/repository/kube-aws/cluster-autoscaler?tab=tags
kubernetes/autoscaler#11 is now merged, but a docker image containing it isn't released yet. Update: #629 (comment)
Force-pushed from 27e9c6a to f112823.
@@ -203,12 +203,13 @@
"Resource": [ "*" ]
},
{{end}}
{{if .Experimental.ClusterAutoscalerSupport.Enabled}}
{{if .Addons.ClusterAutoscaler.Enabled}}
```json
{
  "Effect": "Allow",
  "Action": [
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:DescribeAutoScalingInstances",
    "autoscaling:DescribeTags"
  ],
  "Resource": "*"
},
{
  "Action": [
    "autoscaling:SetDesiredCapacity",
    "autoscaling:TerminateInstanceInAutoScalingGroup"
  ],
  "Condition": {
    "Null": { "autoscaling:ResourceTag/kubernetes.io/cluster/{{.ClusterName}}": "false" }
  },
  "Resource": "*",
  "Effect": "Allow"
},
```
What is the `Null` condition? Could you point me to a doc for it?
Conditional keys supported by AWS Autoscaling:
http://docs.aws.amazon.com/autoscaling/latest/userguide/control-access-using-iam.html
Null condition:
http://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements.html#Conditions_Null
Thanks! So it permits an action only when the targeted resource has the tag? Great - I will incorporate it.
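Concretely, setting the `Null` condition to `"false"` means "this tag key must exist on the target resource"; the tag's value doesn't matter. So the scale-in actions only apply to Auto Scaling groups carrying the cluster tag, e.g. an ASG tag like the following (the value and propagation flag here are illustrative, not necessarily what kube-aws sets):

```json
{
  "Key": "kubernetes.io/cluster/mycluster",
  "Value": "owned",
  "PropagateAtLaunch": "true"
}
```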
Done for controller nodes.
I'm still unsure whether we'd want to support customizing this to schedule CA onto worker nodes: #629 (comment)
@@ -361,6 +368,7 @@
"Action": [
"autoscaling:DescribeAutoScalingGroups",
I still don't understand: what is the benefit of running cluster-autoscaler on worker nodes?
@redbaron Thanks, good question 👍
The bigger a k8s cluster is, the more resources CA uses.
So I guess one might want a dedicated node pool with a single worker node, which can be recreated more easily than controller nodes, to run CA, possibly scaled up/down by a vertical-pod-autoscaler.
That's why I tried to support running CA on worker nodes, even though it isn't the default setup.
What do you think? 😃
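For instance, such a dedicated pool might be declared roughly like this in cluster.yaml; whether `count` is the exact key name is an assumption on my part:

```yaml
worker:
  nodePools:
  # A tiny pool whose only job is to host cluster-autoscaler; it can be
  # recreated independently of the controller nodes.
  - name: ca
    count: 1
    clusterAutoscalerSupport:
      enabled: true
```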
For the usual use-case, running CA on a controller node is recommended, and that's the default setup.
On larger clusters you'd have large controller nodes too :) Anyway, I see the point, and I guess adding this support isn't much work.
> On larger clusters you'd have large controller nodes too :)

Yeah, that's certainly true!
Do you think we can just recommend users to make controllers large enough, instead of complicating cluster.yaml like this?
}
sort.Strings(keys)
for _, k := range keys {
	v := l[k]
```go
v := l[k]
if len(v) > 0 {
	labels = append(labels, fmt.Sprintf("%s=%s", k, v))
} else {
	labels = append(labels, k)
}
```
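For reference, a self-contained sketch of what the whole helper might end up looking like with this suggestion applied; the function name and the sample labels are illustrative, not the actual kube-aws code:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nodeLabelArgs renders a label map into deterministic "k=v" (or bare "k" for
// empty values) fragments, suitable for a kubelet --node-labels style argument.
func nodeLabelArgs(l map[string]string) string {
	keys := make([]string, 0, len(l))
	for k := range l {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	labels := make([]string, 0, len(keys))
	for _, k := range keys {
		if v := l[k]; len(v) > 0 {
			labels = append(labels, fmt.Sprintf("%s=%s", k, v))
		} else {
			labels = append(labels, k)
		}
	}
	return strings.Join(labels, ",")
}

func main() {
	fmt.Println(nodeLabelArgs(map[string]string{
		"kube-aws.coreos.com/autoscaler": "true",
		"role":                           "",
	}))
	// Output: kube-aws.coreos.com/autoscaler=true,role
}
```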
@redbaron Thanks! Is a node label spec like `mykey=` invalid?
No, it isn't, but seeing args on a command line with an empty `=` is a little confusing, at least to me. The end result is the same.
Thanks for the confirmation. Ok, I will make the suggested change 👍
Done
I've updated the PR description according to the current state of the work.
Added a cluster-autoscaling section in the doc.
Testing E2E after merging with the master branch locally.
In kubernetes-retired#151, we've introduced the initial, incomplete, almost theoretical support for cluster-autoscaler. The situation has changed considerably since then; now we can finally complete the cluster-autoscaler support.

* Automatically scale out a node group by adding node(s) when one or more pods in the group become unschedulable due to insufficient resources.
* Automatically scale in a node group by removing node(s) when it is safe to do so. Basically, a node is safe to remove when it is not running a critical k8s component.
* CA is now deployed automatically. More concretely, a k8s deployment for cluster-autoscaler is automatically created on a k8s cluster when the cluster-autoscaler addon is enabled in cluster.yaml.

A valid cluster.yaml for a CA-enabled cluster would now look like:

```yaml
addons:
  clusterAutoscaler:
    enabled: true
worker:
  nodePools:
  - name: scaled
    # Make this node pool an autoscaling target
    autoscaling:
      clusterAutoscaler:
        enabled: true
  - name: notScaled
    # This node pool is not an autoscaling target
```

* The former `experimental.clusterAutoscalerSupport.enabled` is dropped for controller nodes in favor of `addons.clusterAutoscaler.enabled`.
* `worker.nodePools[].clusterAutoscaler.minSize` and `maxSize` are dropped in favor of `worker.nodePools[].autoscaling.clusterAutoscaler.enabled` and the auto-discovery feature of cluster-autoscaler.
* `worker.nodePools[].clusterAutoscalerSupport` is kept as-is, but it doesn't necessarily have to be `true`, because when you've enabled `addons.clusterAutoscaler`, kube-aws by default gives enough IAM permissions to controller nodes only and CA is scheduled there.
* This work currently relies on the docker image built from a fork of cluster-autoscaler which supports the automatic node group discovery feature.
* Add a cluster-autoscaler deployment to be created when the CA addon is enabled.
* The former `ClusterAutoscalerImage` is renamed to `ClusterProportionalAutoscalerImage`.
* Introduce `ClusterAutoscalerImage` (`clusterAutoscalerImage` in cluster.yaml) for the cluster-autoscaler docker image reference.
* `ClusterAutoscalerSupport` is a no-op for controller nodes and used only for worker nodes. `Addons.ClusterAutoscaler` is used instead to give controller nodes appropriate IAM permissions and deploy CA to them.
* Most of the CA-related types and funcs are moved from `core/controlplane/config` to the `model` package.
* `autoscaling:DescribeTags` is allowed in IAM to enable the automatic node group discovery feature of cluster-autoscaler.

Note that a node running cluster-autoscaler or kube-resources-autosave cannot be scaled in:

```
I0509 05:29:21.523426 1 cluster.go:74] Fast evaluation: ip-10-0-0-68.ap-northeast-1.compute.internal for removal
I0509 05:29:21.523467 1 cluster.go:88] Fast evaluation: node ip-10-0-0-68.ap-northeast-1.compute.internal cannot be removed: non-deamons set, non-mirrored, kube-system pod present: cluster-autoscaler-998591511-thcpj
I0509 05:29:21.523479 1 cluster.go:74] Fast evaluation: ip-10-0-0-133.ap-northeast-1.compute.internal for removal
I0509 05:29:21.523488 1 cluster.go:103] Fast evaluation: node ip-10-0-0-133.ap-northeast-1.compute.internal may be removed
I0509 05:29:21.523493 1 cluster.go:74] Fast evaluation: ip-10-0-0-150.ap-northeast-1.compute.internal for removal
I0509 05:29:21.523530 1 cluster.go:88] Fast evaluation: node ip-10-0-0-150.ap-northeast-1.compute.internal cannot be removed: non-deamons set, non-mirrored, kube-system pod present: kube-resources-autosave-2845171460-1cbs5
```
Force-pushed from 8363215 to 166c34b.
Whoa, all the tests have passed.
Sorry to comment on a closed ticket (happy to open a new one if required). What happens if the condition is missing from the IAM role?
I don't have much control over IAM roles, and my roles are shared between all my clusters.
@Vincemd If the condition were missing from the read-only statement (the `Describe*` actions), not much would change, as those actions can't modify anything. However, if it were missing from the `SetDesiredCapacity`/`TerminateInstanceInAutoScalingGroup` statement, a cluster-autoscaler using that shared role could scale in or terminate instances in node groups belonging to any cluster sharing the role, not just its own. In your case, my suggestion is to tag the EC2 instances/ASGs of each cluster with something like a per-cluster tag and scope the condition to it. Does my explanation make sense? 😃
@mumoshu thanks for the clear explanation. I got it. Good news is that my production Kubernetes cluster will have its own IAM role, so I can be specific and use the conditions properly (it can only terminate/setDesired for the prod cluster based on a specific tag). It's only my non-prod clusters, which are not public-facing, that share the same IAM role at the moment. I will work with a tag, as suggested. Problem solved!
Re: cluster-autoscaler support
In #151, we've introduced the initial, incomplete, almost theoretical support for cluster-autoscaler. The situation has changed considerably since then; now we can finally complete the cluster-autoscaler support.

Features

* Automatically scale out a node group by adding node(s) when one or more pods in the group become unschedulable due to insufficient resources.
* Automatically scale in a node group by removing node(s) when it is safe to do so. Basically, a node is safe to remove when it is not running a critical k8s component.

kube-aws scope changes

* CA is now deployed automatically: a k8s deployment for cluster-autoscaler is automatically created on a k8s cluster when the cluster-autoscaler addon is enabled in cluster.yaml.
* This work currently relies on the docker image built from a fork of cluster-autoscaler which supports the automatic node group discovery feature.

Configuration changes

A valid cluster.yaml for a CA-enabled cluster would now look like:

```yaml
addons:
  clusterAutoscaler:
    enabled: true
worker:
  nodePools:
  - name: scaled
    # Make this node pool an autoscaling target
    autoscaling:
      clusterAutoscaler:
        enabled: true
  - name: notScaled
    # This node pool is not an autoscaling target
```

* The former `experimental.clusterAutoscalerSupport.enabled` is dropped for controller nodes in favor of `addons.clusterAutoscaler.enabled`.
* `worker.nodePools[].clusterAutoscaler.minSize` and `maxSize` are dropped in favor of `worker.nodePools[].autoscaling.clusterAutoscaler.enabled` and the auto-discovery feature of cluster-autoscaler.
* `worker.nodePools[].clusterAutoscalerSupport` is kept as-is, but it doesn't necessarily have to be `true`, because when you've enabled `addons.clusterAutoscaler`, kube-aws by default gives enough IAM permissions to controller nodes only and CA is scheduled there.

cloud-config-controller changes

* Add a cluster-autoscaler deployment to be created when the CA addon is enabled.

Go changes

* The former `ClusterAutoscalerImage` is renamed to `ClusterProportionalAutoscalerImage`.
* Introduce `ClusterAutoscalerImage` (`clusterAutoscalerImage` in cluster.yaml) for the cluster-autoscaler docker image reference.
* `ClusterAutoscalerSupport` is a no-op for controller nodes and used only for worker nodes. `Addons.ClusterAutoscaler` is used instead to give controller nodes appropriate IAM permissions and deploy CA to them.
* Most of the CA-related types and funcs are moved from `core/controlplane/config` to the `model` package.

IAM changes

* `autoscaling:DescribeTags` is allowed in IAM to enable the automatic node group discovery feature of cluster-autoscaler.

Gotchas

Note that a node running cluster-autoscaler or kube-resources-autosave cannot be scaled in:

```
I0509 05:29:21.523426 1 cluster.go:74] Fast evaluation: ip-10-0-0-68.ap-northeast-1.compute.internal for removal
I0509 05:29:21.523467 1 cluster.go:88] Fast evaluation: node ip-10-0-0-68.ap-northeast-1.compute.internal cannot be removed: non-deamons set, non-mirrored, kube-system pod present: cluster-autoscaler-998591511-thcpj
I0509 05:29:21.523479 1 cluster.go:74] Fast evaluation: ip-10-0-0-133.ap-northeast-1.compute.internal for removal
I0509 05:29:21.523488 1 cluster.go:103] Fast evaluation: node ip-10-0-0-133.ap-northeast-1.compute.internal may be removed
I0509 05:29:21.523493 1 cluster.go:74] Fast evaluation: ip-10-0-0-150.ap-northeast-1.compute.internal for removal
I0509 05:29:21.523530 1 cluster.go:88] Fast evaluation: node ip-10-0-0-150.ap-northeast-1.compute.internal cannot be removed: non-deamons set, non-mirrored, kube-system pod present: kube-resources-autosave-2845171460-1cbs5
```