
AWS: Can't scale up from 0 #2418

Closed
mgalgs opened this issue Oct 2, 2019 · 19 comments

@mgalgs (Contributor) commented Oct 2, 2019

Possibly related: #1754

I recently added three new node groups to my cluster using AWS spot instances. I initially set the minSize on each of the three new groups to 0, but CA was refusing to scale them up from 0. If I go into the EC2 console and manually force the ASG minSize up to 1 then CA gets unstuck and will continue scaling the group up as new requests come in.

I'm attaching the following files:

  • ca_logs.txt :: At this point I had forced one of my ASGs to have a minSize of 1 and maxSize of 4. That group filled up so CA was unable to scale it up any further. At this point it should have been scaling up the other two node groups, but they still had minSize=0 and thus CA refused to scale them up.
  • ca_logs_after_setting_min.txt :: This is after manually forcing the two other ASGs to have minSize=1. At this point CA starts scaling them up as expected.
  • ca_pod.txt :: Full get pod -o yaml of my CA

Is it not supported to have minSize=0 on AWS?

I'm running CA v1.14.5.

@mgalgs (Contributor, Author) commented Oct 2, 2019

I should also mention that all of our workers do have the ec2:DescribeLaunchTemplateVersions permission. Our workers (including the one running CA) all have the following IAM policy attached:

[screenshot: IAM policy attached to the worker node role]
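In case the screenshot doesn't come through, the policy is essentially the standard cluster-autoscaler policy from the AWS docs; sketching it from memory here rather than pasting our exact attachment:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*"
        }
    ]
}
```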

So I think we're satisfying the requirements from Scaling a node group to 0 in the docs.

FWIW, the cluster was created with eksctl v0.5.2 with nodeGroups[].iam.withAddonPolicies.autoScaler = true for all nodegroups.

@chnsh commented Oct 3, 2019

I'm facing this as well. Additionally, I have 2 nodegroups: one is an on-demand group on AWS running the autoscaler; the other is supposed to be a spot group where I want to deploy jobs, and it does not scale up from 0.

I'm using the autodiscovery feature.

@chnsh commented Oct 3, 2019

So I upgraded to k8s.gcr.io/cluster-autoscaler:v1.16.1 and it did trigger autoscaling. The problem now is that when I execute kubectl get nodes, I never see the nodes that have been spawned.

@mgalgs (Contributor, Author) commented Oct 3, 2019

Interesting. Hopefully if there's a fix for the 1.16 line it can be cherry-picked back to 1.14, etc., since the docs recommend matching your CA version with your k8s version (maybe that's why yours isn't working, @chnsh).

@chnsh commented Oct 4, 2019

Aah, possibly. I'll try tomorrow and update the thread.

So, I too am on v1.14.5 now, and it did trigger autoscaling for me; that bit works fine. I still don't see those nodes in kubectl get nodes, though.

@chnsh commented Oct 4, 2019

Okay, so I am on CA v1.14.5 and I finally got it working. I had to upgrade eksctl from v0.5.2 to v0.6.0, and it scales from 0 now!

@mgalgs (Contributor, Author) commented Oct 7, 2019

Interesting. I just tried eksctl v0.6.0 as well and it's still not scaling up from 0...

As a workaround I guess I'll set the minimum on these guys to 1, but it would be great for one of the devs to take a look at this.

@Jeffwan (Contributor) commented Oct 8, 2019

@mgalgs Can you share your eksctl cluster config? I can help reproduce the issue on our side.
I assume you have one OnDemand instance group to host your CA? You'd like to scale up a spot instance group? Did you use node affinity or a node selector in your tests? Did you use MixedInstancesPolicy for your instance group?

@mgalgs (Contributor, Author) commented Oct 8, 2019

@Jeffwan Sure, here's my config:

cluster.yml
# A simple example of ClusterConfig object:
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: mycluster
  region: us-west-2

vpc: {cidr: 10.42.0.0/16}

# cluster AZs must be set explicitly for single AZ nodegroup example to
# work
# https://github.com/weaveworks/eksctl/blob/c37657c1f4ff55ffed40139cf74aa828b37c2a1b/examples/05-advanced-nodegroups.yaml#L43
availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]

# Need separate nodegroups for cluster-autoscaler to work reliably.
# See https://github.com/kubernetes/autoscaler/pull/1802#issuecomment-474295002
nodeGroups:
  - name: ng-2a-ami-038a987c6425a84ad-m5-2xlarge-v2
    taints:
      spotty: "false:PreferNoSchedule"
    instanceType: m5.2xlarge
    availabilityZones: ["us-west-2a"]
    ami: ami-038a987c6425a84ad
    minSize: 1
    maxSize: 15
    privateNetworking: true
    ssh:
      publicKeyName: mykey
    iam:
      withAddonPolicies:
        autoScaler: true
  - name: ng-2b-ami-038a987c6425a84ad-m5-2xlarge-v2
    taints:
      spotty: "false:PreferNoSchedule"
    instanceType: m5.2xlarge
    availabilityZones: ["us-west-2b"]
    ami: ami-038a987c6425a84ad
    minSize: 1
    maxSize: 15
    privateNetworking: true
    ssh:
      publicKeyName: mykey
    iam:
      withAddonPolicies:
        autoScaler: true
  - name: ng-2c-ami-038a987c6425a84ad-m5-2xlarge-v2
    taints:
      spotty: "false:PreferNoSchedule"
    instanceType: m5.2xlarge
    availabilityZones: ["us-west-2c"]
    ami: ami-038a987c6425a84ad
    minSize: 1
    maxSize: 15
    privateNetworking: true
    ssh:
      publicKeyName: mykey
    iam:
      withAddonPolicies:
        autoScaler: true
  - name: ng-2a-ami-038a987c6425a84ad-spotty-v1
    labels:
      spotty: "true"
    taints:
      spotty: "true:NoSchedule"
    availabilityZones: ["us-west-2a"]
    ami: ami-038a987c6425a84ad
    minSize: 1
    maxSize: 10
    privateNetworking: true
    ssh:
      publicKeyName: mykey
    iam:
      withAddonPolicies:
        autoScaler: true
    instancesDistribution:
      maxPrice: 0.2
      instanceTypes: ["m4.2xlarge", "m5.2xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0  # no on-demand! only spot!
      spotInstancePools: 2  # use the 2 lowest price spot pools
  - name: ng-2b-ami-038a987c6425a84ad-spotty-v1
    labels:
      spotty: "true"
    taints:
      spotty: "true:NoSchedule"
    availabilityZones: ["us-west-2b"]
    ami: ami-038a987c6425a84ad
    minSize: 1
    maxSize: 10
    privateNetworking: true
    ssh:
      publicKeyName: mykey
    iam:
      withAddonPolicies:
        autoScaler: true
    instancesDistribution:
      maxPrice: 0.2
      instanceTypes: ["m4.2xlarge", "m5.2xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0  # no on-demand! only spot!
      spotInstancePools: 2  # use the 2 lowest price spot pools
  - name: ng-2c-ami-038a987c6425a84ad-spotty-v1
    labels:
      spotty: "true"
    taints:
      spotty: "true:NoSchedule"
    availabilityZones: ["us-west-2c"]
    ami: ami-038a987c6425a84ad
    minSize: 1
    maxSize: 10
    privateNetworking: true
    ssh:
      publicKeyName: mykey
    iam:
      withAddonPolicies:
        autoScaler: true
    instancesDistribution:
      maxPrice: 0.2
      instanceTypes: ["m4.2xlarge", "m5.2xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0  # no on-demand! only spot!
      spotInstancePools: 2  # use the 2 lowest price spot pools

> I assume you have one OnDemand instance group to host your CA?

I have three OnDemand instance groups, each with a handful of instances inside, and yes, CA is hosted there.

> You'd like to scale up spot instance group?

Yes

> Did you use node affinity or node selector in your tests?

I tainted the spot nodes and added tolerations to workloads that can run on spots. It's all working as expected (spot-tolerant workloads are scheduled on spot nodes, non-spot-tolerant workloads avoid spot nodes) as long as I set the minSize of the group to 1.
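Roughly, the spot-tolerant workloads carry something like this in their pod spec (an illustrative sketch based on the labels/taints in the config above, not a paste from our actual manifests):

```yaml
spec:
  nodeSelector:
    spotty: "true"            # only land on nodes from the spot nodegroups
  tolerations:
    - key: "spotty"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"    # tolerate the taint carried by the spot nodegroups
```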

> Did you use MixedInstancesPolicy for your instance group?

I believe eksctl uses that under the hood, yes. The resulting groups do seem to be using the feature:

[screenshot: EC2 console showing the ASG's mixed instances configuration]

@faheem-cliqz commented Oct 9, 2019

I get a similar error with kops-provisioned mixed instance groups when scaling from zero.

Cluster-autoscaler: v1.14.5
kops: 1.14.0

Error:
Unable to build proper template node for <masked_asg_name>: Unable to get instance type from launch config or launch template

Kops instance group config:

kind: InstanceGroup
metadata:
  creationTimestamp: 2019-10-09T23:43:07Z
  generation: 4
  labels:
    kops.k8s.io/cluster: <masked_cluster_name>
  name: nodes-us-east-1a-gp-mix
spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: "true"
    kubernetes.io/cluster/<masked_cluster_name>: "true"
  image: kope.io/k8s-1.14-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: t3a.xlarge
  maxSize: 2
  minSize: 0
  mixedInstancesPolicy:
    instances:
    - t3a.xlarge
    - m5a.xlarge
    onDemandAboveBase: 50
    spotInstancePools: 2
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-us-east-1a-gp-mix
  role: Node
  rootVolumeSize: 50
  rootVolumeType: gp2
  subnets:
  - us-east-1a

IAM role policy attached to cluster autoscaler:

    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeTags",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*"
        }
    ]
}

If I can help with debugging or providing any further logs / configurations to resolve this, please let me know :) 

@Jeffwan (Contributor) commented Oct 11, 2019

@mgalgs
I did one test and it can scale up from 0 with your cluster spec. I used one ASG with OnDemand and the other one with Spot.

Checking your logs, did you use any node selectors? It seems to fail on GeneralPredicates.
If you use a node selector, you need to tag your ASG to let CA know the node labels, since there are 0 nodes available to be used as a template.

I1002 20:08:23.584696       1 scale_up.go:411] No pod can fit to eksctl-streks3-nodegroup-ng-2b-ami-038a987c6425a84ad-spotty-v1-NodeGroup-GK80456OQZIA
I1002 20:08:23.584720       1 utils.go:237] Pod str-debug-pod-mgalgs-1570046860-579cb979f4-bgh25 can't be scheduled on eksctl-streks3-nodegroup-ng-2c-ami-038a987c6425a84ad-m5-2xlarge-NodeGroup-1HAI8RHKZ1X45, predicate failed: GeneralPredicates predicate mismatch, reason: node(s) didn't match node selector
I1002 20:08:23.584738       1 scale_up.go:411] No pod can fit to eksctl-streks3-nodegroup-ng-2c-ami-038a987c6425a84ad-m5-2xlarge-NodeGroup-1HAI8RHKZ1X45
I1002 20:08:23.584762       1 utils.go:237] Pod str-debug-pod-mgalgs-1570046860-579cb979f4-bgh25 can't be scheduled on eksctl-streks3-nodegroup-ng-2c-ami-038a987c6425a84ad-spotty-v1-NodeGroup-S9U82ABUOL3M, predicate failed: GeneralPredicates predicate mismatch, reason: node(s) didn't match node selector
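For example (a sketch based on the labels/taints in your nodegroup config), the spot ASGs would need tags along these lines so CA can build a node template that matches the pod's selector:

```
k8s.io/cluster-autoscaler/node-template/label/spotty = true
k8s.io/cluster-autoscaler/node-template/taint/spotty = true:NoSchedule
```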

@Jeffwan (Contributor) commented Oct 11, 2019

@faheem-cliqz I have probably already fixed the issue you're hitting. Please check 58f3f23#diff-ade7b95627ea0dd6b6f4deee7f24fa7eR323-R331

We will have a release next week.

@Jeffwan (Contributor) commented Oct 11, 2019

/assign @Jeffwan

@Jeffwan (Contributor) commented Oct 11, 2019

/area provider/aws

@mgalgs (Contributor, Author) commented Oct 11, 2019

@Jeffwan
Hmm, if it was a nodeSelector problem, why does it work fine when I put a minSize of 1 on the groups? Wouldn't it still refuse to schedule on that group if it were a nodeSelector problem?

Regarding the logs:

> Checking your logs, did you use any node selectors? It seems to fail on GeneralPredicates.
> If you use a node selector, you need to tag your ASG to let CA know the node labels, since there are 0 nodes available to be used as a template.
>
> I1002 20:08:23.584696       1 scale_up.go:411] No pod can fit to eksctl-streks3-nodegroup-ng-2b-ami-038a987c6425a84ad-spotty-v1-NodeGroup-GK80456OQZIA
> I1002 20:08:23.584720       1 utils.go:237] Pod str-debug-pod-mgalgs-1570046860-579cb979f4-bgh25 can't be scheduled on eksctl-streks3-nodegroup-ng-2c-ami-038a987c6425a84ad-m5-2xlarge-NodeGroup-1HAI8RHKZ1X45, predicate failed: GeneralPredicates predicate mismatch, reason: node(s) didn't match node selector

This one is expected since I had a nodeSelector on this pod to force it onto a node from the spot group (and this isn't the spot group).

> I1002 20:08:23.584738       1 scale_up.go:411] No pod can fit to eksctl-streks3-nodegroup-ng-2c-ami-038a987c6425a84ad-m5-2xlarge-NodeGroup-1HAI8RHKZ1X45
> I1002 20:08:23.584762       1 utils.go:237] Pod str-debug-pod-mgalgs-1570046860-579cb979f4-bgh25 can't be scheduled on eksctl-streks3-nodegroup-ng-2c-ami-038a987c6425a84ad-spotty-v1-NodeGroup-S9U82ABUOL3M, predicate failed: GeneralPredicates predicate mismatch, reason: node(s) didn't match node selector

This one is strange because that nodeGroup should have had the necessary labels to allow that pod (with its nodeSelector) to be scheduled on one of its nodes... Again, I don't see this problem when the minSize of the group is set to 1, and if this were a nodeSelector problem it seems like I'd still have an issue scheduling the pod...

Are you testing with 1.14? If this is definitely fixed in 1.15 it might not even be worth troubleshooting here since we have a workaround (setting the group's minSize to 1).

@faheem-cliqz commented Oct 12, 2019

> @faheem-cliqz I have probably already fixed the issue you're hitting. Please check 58f3f23#diff-ade7b95627ea0dd6b6f4deee7f24fa7eR323-R331
>
> We will have a release next week.

Cool, will get back to you with updates once you guys release :)

@Jeffwan (Contributor) commented Oct 12, 2019

@mgalgs
The difference between the two is:

  1. Scale from 0 - CA builds the node template from the ASG's LaunchTemplate or LaunchConfiguration, so it won't know some of the Kubernetes node labels. To build it properly, we need to add ASG tags; CA converts those tags into labels when constructing the node template.
  2. Scale from 1 - Since a node is already there, CA will use the real node as the template.

Do you have tags in your ASG? Check here: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws#scaling-a-node-group-to-0

If you still have issues, I will try to see if anything is wrong in 1.14.

@mgalgs (Contributor, Author) commented Oct 24, 2019

> Do you have tags in your ASG?

I see... So node labels and taints need to be applied to the ASGs themselves as well. Looking at my aws autoscaling describe-tags output, it appears that my ASGs were not tagged with corresponding tags for labels and taints. If anything, it sounds like this might be a bug in eksctl. I'll file an issue over there.
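For anyone else hitting this, something along these lines should add the missing tags until eksctl does it automatically (a sketch; substitute your actual ASG name and adjust the label/taint keys to match your own nodegroups):

```bash
aws autoscaling create-or-update-tags --tags \
  "ResourceId=<spot-asg-name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/spotty,Value=true,PropagateAtLaunch=false" \
  "ResourceId=<spot-asg-name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/taint/spotty,Value=true:NoSchedule,PropagateAtLaunch=false"
```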

@mgalgs mgalgs closed this Oct 24, 2019
@mgalgs (Contributor, Author) commented Oct 24, 2019

doh... This has already been raised on the eksctl project with a solution proposed.

Thank you for your help!

mgalgs added a commit to mgalgs/eksctl that referenced this issue Oct 24, 2019
When the cluster-autoscaler adds a new node to a group, it grabs an
existing node in the group and builds a "template" to launch a new node
identical to the one it grabbed from the group.

However, when scaling up from 0 there aren't any live nodes to reference to
build this template.  Instead, the cluster-autoscaler relies on tags in the
ASG to build the new node template.  This can cause unexpected behavior if
the pods triggering the scale-out are using node selectors or taints; CA
doesn't have sufficient information to decide if a new node launched in the
group will satisfy the request.

The long and short of it is that for CA to do its job properly we must tag
our ASGs corresponding to our labels and taints.  Add a note in the docs
about this since scaling up from 0 is a fairly common use case.

References:

  - kubernetes/autoscaler#2418
  - weaveworks#1066