Kops commands fail in AWS with a cluster with more than 50 IGs #6045

KierranM · 2018-11-05T19:32:07Z

1. What kops version are you running? The command kops version will display
this information.
Version 1.10.0 (git-8b52ea6d1)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?
Create a cluster with more than 50 InstanceGroups. Attempt to validate the cluster or delete an instance group.

5. What happened after the commands executed?
error finding CloudInstanceGroups: unable to find autoscale groups: error listing autoscaling groups: ValidationError: The number of group names that may be passed in is limited to 50 status code: 400, request id: 2784b0f9-e12f-11e8-9d96-370abf567283

6. What did you expect to happen?
The instance group would be deleted

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
I don't believe this would be helpful in this case. The cluster does indeed have that many ASGs.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

I1106 08:15:24.556177    7579 factory.go:68] state store s3://my-bucket
I1106 08:15:26.077252    7579 s3context.go:198] Checking default bucket encryption "my-bucket"
I1106 08:15:26.077337    7579 s3context.go:203] Calling S3 GetBucketEncryption Bucket="my-bucket"
I1106 08:15:26.583856    7579 s3context.go:182] Found bucket "my-bucket" in region "ap-southeast-2" with default encryption set to true
I1106 08:15:26.583922    7579 s3fs.go:219] Reading file "s3://my-bucket/my-cluster/config"
I1106 08:15:26.764982    7579 s3fs.go:219] Reading file "s3://my-bucket/my-cluster/instancegroup/t2.small.spotNodes"
I1106 08:15:26.899568    7579 aws_cloud.go:984] Querying EC2 for all valid zones in region "ap-southeast-2"
I1106 08:15:26.899792    7579 request_logger.go:45] AWS request: ec2/DescribeAvailabilityZones
InstanceGroup "t2.small.spotNodes" found for deletion
I1106 08:15:27.165806    7579 aws_cloud.go:424] Listing all Autoscaling groups matching cluster tags
I1106 08:15:27.166400    7579 request_logger.go:45] AWS request: autoscaling/DescribeTags
I1106 08:15:27.647303    7579 request_logger.go:45] AWS request: autoscaling/DescribeTags
I1106 08:15:27.744627    7579 request_logger.go:45] AWS request: autoscaling/DescribeAutoScalingGroups

9. Anything else we need to know?
Context: We have a cluster with one ASG per instance type, and I was moving to one ASG per instance type per availability zone so that cluster-autoscaler could scale more sensibly while taking AZs into account. While updating the cluster, a whole new set of InstanceGroups was created, bringing the total to over 50, but I can't delete the old instance groups because there are too many for a single AWS API call.

I believe this is the location where the fix needs to be applied. If I get a chance I'll have a go at fixing this myself.

kops/upup/pkg/fi/cloudup/awsup/aws_cloud.go

Lines 494 to 496 in 1fbc633

    
    request := &autoscaling.DescribeAutoScalingGroupsInput{
        AutoScalingGroupNames: asgNames,
    }
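
One way to stay under the limit would be to split the ASG names into batches of at most 50 and issue one DescribeAutoScalingGroups request per batch. The following is only a minimal sketch of that batching step, not the actual kops code or the eventual fix: `chunkNames` and `maxNamesPerCall` are hypothetical names introduced here, the value 50 is taken from the ValidationError message above, and the AWS call itself is left out so the example runs without the SDK.

```go
package main

import "fmt"

// maxNamesPerCall is an assumption taken from the error message in this
// report ("limited to 50"); it is not read from the AWS SDK.
const maxNamesPerCall = 50

// chunkNames (hypothetical helper) splits names into consecutive batches
// of at most size elements. It returns nil when size is not positive.
func chunkNames(names []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(names); start += size {
		end := start + size
		if end > len(names) {
			end = len(names)
		}
		batches = append(batches, names[start:end])
	}
	return batches
}

func main() {
	// Stand-in for the asgNames slice from the snippet above.
	asgNames := make([]string, 120)
	for i := range asgNames {
		asgNames[i] = fmt.Sprintf("asg-%03d", i)
	}

	for i, batch := range chunkNames(asgNames, maxNamesPerCall) {
		// In kops, each batch would be passed as AutoScalingGroupNames in
		// its own DescribeAutoScalingGroupsInput; the request is omitted.
		fmt.Printf("batch %d: %d names\n", i, len(batch))
	}
}
```

With 120 names this yields three batches of 50, 50, and 20 names, so each request would carry no more than 50 group names.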

KierranM mentioned this issue Nov 7, 2018

Request AWS ASGs in batches #6056

Merged

k8s-ci-robot closed this as completed in #6056 Nov 17, 2018
