This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Create fails for ASG - Unable to satisfy 100% MinSuccessfulInstancesPercent requirement #35

Closed
sompnd opened this issue Jun 6, 2018 · 23 comments

Comments

@sompnd

sompnd commented Jun 6, 2018

Hi,

I am running the stack for existing VPC in eu-central-1. Unfortunately every time it fails when creating the autoscaling group with the message:

BastionAutoScalingGroup Received 1 FAILURE signal(s) out of 1. Unable to satisfy 100% MinSuccessfulInstancesPercent requirement

I am using all the default values apart from the QSS3KeyPrefix value.
Thanks.

@andrew-glenn
Member

What OS are you using with your stack? Are there any logs available from the instance before you terminated it?

@pnomolos

This is likely because AWS Auto Scaling is only available in Ireland (of the European regions), I would think. I'm running into the same issue with an ASG in ca-central-1, and I believe it's the same problem.

@andrew-glenn
Member

andrew-glenn commented Jul 11, 2018 via email

@rniksch

rniksch commented Jul 26, 2018

I have not been able to replicate this issue.
I have launched this Quick Start in a new VPC as well as in existing VPCs in us-west-1, us-east-1, eu-central-1, and ca-central-1. In all tests the stack completes and Auto Scaling behaves as expected. I can confirm that the failure above is not related to Auto Scaling support in eu-central-1 or ca-central-1.

At this stage I suspect that something in the existing VPC being launched into is hindering connectivity for the cfn-init process.

Could you please confirm whether this is still happening?

@hierynomus

@sompnd We ran into the same issue, also after changing only QSS3KeyPrefix. We fixed it by ensuring that QSS3KeyPrefix ends with a /. If it doesn't, the URL of the init script does not resolve correctly and the bastion host fails to initialize. I've submitted #50 to change the regex so that it validates that the last character is a /.
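
A minimal sketch of why the trailing slash matters, assuming the template joins bucket, prefix, and script path in the shape quoted elsewhere in this thread (`.../${QSS3BucketName}/${QSS3KeyPrefix}scripts/bastion_bootstrap.sh`). The function names `bootstrap_url` and `prefix_ok`, the `s3.amazonaws.com` host, and the bucket name `my-bucket` are illustrative assumptions, not taken from the template.

```shell
#!/usr/bin/env bash
# Hypothetical helper: concatenates the parts with no separator between
# prefix and "scripts/", as the URL pattern quoted in this thread does.
bootstrap_url() {
  local bucket="$1" prefix="$2"
  printf 'https://s3.amazonaws.com/%s/%sscripts/bastion_bootstrap.sh\n' "$bucket" "$prefix"
}

# Hypothetical guard mirroring the "must end with /" rule.
prefix_ok() {
  case "$1" in
    */) return 0 ;;
    *)  return 1 ;;
  esac
}

bootstrap_url my-bucket quickstart-linux-bastion    # prefix without trailing slash
bootstrap_url my-bucket quickstart-linux-bastion/   # prefix with trailing slash
```

Without the slash, the prefix and `scripts/` run together into `quickstart-linux-bastionscripts/`, so the object key no longer matches anything in the bucket.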

@itskaranshah

Couldn't get around this error :-(

@tomiszili

This bug is still in the script when I run it from my own S3 bucket, even if QSS3KeyPrefix ends with '/'. I run the quickstart-eks nested stack, but it fails because of this error.

@schottsfired

https://github.com/aws-quickstart/quickstart-amazon-eks/issues/9 looks related. I was able to work around it using the technique mentioned in the ticket.

@tomiszili

tomiszili commented Aug 23, 2019

Thanks @schottsfired!
I copied the entire eks-quickstart repo, with submodules, to my S3 bucket and then ran the new-VPC master template from CloudFormation, and it fails. But if I follow the guide at https://docs.aws.amazon.com/quickstart/latest/amazon-eks-architecture/welcome.html it completes properly.

@vsnyc
Member

vsnyc commented Aug 23, 2019

> This bug is still in the script, if I run from my own S3 bucket. Even if the QSSKeyPrefix ends with '/'. I run the quickstart-eks nested stack but it fails because this error.

S3 sig v2 is on the path to deprecation; however, at this time cfn-init does not make sig v4 requests when fetching files from S3. When you test Quick Starts with taskcat and use the "$[taskcat_autobucket]" token for the QSS3BucketName parameter, you'll run into this error unless you pass the --enable-sig-v2 argument to taskcat, because taskcat applies a bucket policy to the autogenerated bucket that disables sig v2. See source code.

There are two options when testing with taskcat:

  1. Pass --enable-sig-v2 when testing from the autogenerated bucket.
  2. Use your own bucket with taskcat and ensure it doesn't have any policies that disallow sig v2. To do this, add the s3bucket: <your-bucket-name> property to the global section of taskcat.yml, and use the same bucket as the value of the QSS3BucketName parameter.

@vsnyc
Member

vsnyc commented Aug 23, 2019

See also comment #44 (comment) and a screen recording I had created for this issue.

@tomiszili

Is it possible to create this stack without taskcat? I can't find any taskcat script in the Quick Start CFN files that could run during stack creation and cause this problem.
What is the main difference between using the native AWS deployment with default parameters and using the eks-quickstart repo?

@vsnyc
Member

vsnyc commented Aug 23, 2019

Yes, absolutely. Taskcat is not a requirement. Please follow the instructions on how to run from your own bucket in our contributor's guide.

> What is the main difference if i'm using the native AWS deploy with default parameters or using the eks-quickstart repo?

There is no difference as far as the bastion stack goes; it is just launched as a nested stack. The three issues I see most commonly are: 1) the stack failing due to sig v2 errors with cfn-init; 2) the repo not being cloned recursively, which causes errors when the nested stack is launched; 3) resource limit errors on the account.
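
For the second issue, here is a rough sketch of a check for submodules that were not fetched. It assumes the Quick Start keeps its submodules under a `submodules/` directory at the repository root; that layout and the function name `empty_submodules` are assumptions for illustration.

```shell
#!/usr/bin/env bash
# List submodule directories that are still empty, which usually means the
# repository was cloned without --recurse-submodules.
# Assumption: submodules live under "<repo root>/submodules".
empty_submodules() {
  local root="${1:-.}"
  find "$root/submodules" -mindepth 1 -maxdepth 1 -type d -empty 2>/dev/null
}

# Typical usage (repository named in this thread):
#   git clone --recurse-submodules https://github.com/aws-quickstart/quickstart-amazon-eks.git
#   cd quickstart-amazon-eks
#   if [ -n "$(empty_submodules .)" ]; then git submodule update --init --recursive; fi
```

Running the check before uploading the repository to S3 catches the case where the nested templates would be missing from the bucket.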

I am happy to create a screen recording for EKS Quick Start if it helps.

Also, looking at the original issue reported "I am using all the default values apart from the QSS3KeyPrefix value." - I have to assume that QSS3BucketName was different as well, else it would never work.

@tomiszili

Thanks for your support @vsnyc!

> Also, looking at the original issue reported "I am using all the default values apart from the QSS3KeyPrefix value." - I have to assume that QSS3BucketName was different as well, else it would never work.

Yes, both parameters were set to match my S3 bucket structure.

If possible, please record an EKS Quick Start deployment with CloudFormation and a new S3 bucket. I'm getting really desperate about the failed stack creation, because I did everything as the contributor's guide says, and my whole bucket is public, as are its objects.

@vsnyc
Member

vsnyc commented Aug 26, 2019

@tomiszili I just realized something that hasn't come up in a while. When hosting from your own bucket, please make sure to create the bucket in a Region that supports sig v2 authentication. us-east-1 is a good option.

I'll post the screen recording as soon as it becomes available on youtube.

Edit to add: the screen recording is now available at https://youtu.be/EugmjAzF5rw. I didn't do much post-processing, so increase the playback speed to 2x to get through it quickly.

@tomiszili

@vsnyc thanks for the video and support.
I found one solution for this problem: the bucket URL in https://github.com/aws-quickstart/quickstart-amazon-eks/blob/master/templates/amazon-eks.template.yaml#L244 should be https://s3.${S3BucketRegion}.amazonaws.com/${QSS3BucketName}/${QSS3KeyPrefix}scripts/bastion_bootstrap.sh

Currently I don't know how to get the region of a bucket from CloudFormation automatically, so I hardcoded it in the YAML template like this: https://s3.eu-central-1.amazonaws.com/${QSS3BucketName}/${QSS3KeyPrefix}scripts/bastion_bootstrap.sh

I haven't tried it, but here is an idea: maybe the bastion instance should download the script with Linux commands and the latest awscli instead of using AWS::CloudFormation::Init:config:files:source in the template.
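
A minimal sketch of looking up the bucket's region outside CloudFormation and building the region-specific URL in the shape proposed above; the result could then be passed to the stack as a parameter. The function names `normalize_region` and `regional_bootstrap_url` and the bucket name `my-bucket` are hypothetical. The only AWS CLI call referenced is `aws s3api get-bucket-location`, which reports no LocationConstraint for buckets in us-east-1.

```shell
#!/usr/bin/env bash
# Map the LocationConstraint text printed by the AWS CLI to a region name.
# Buckets in us-east-1 report no constraint, which "--output text" prints as None.
normalize_region() {
  case "$1" in
    None|"") echo "us-east-1" ;;
    *)       echo "$1" ;;
  esac
}

# Build the region-specific URL in the shape proposed in the comment above.
regional_bootstrap_url() {
  local region="$1" bucket="$2" prefix="$3"
  printf 'https://s3.%s.amazonaws.com/%s/%sscripts/bastion_bootstrap.sh\n' \
    "$region" "$bucket" "$prefix"
}

# Usage (needs AWS credentials; my-bucket is a placeholder):
#   loc="$(aws s3api get-bucket-location --bucket my-bucket --query LocationConstraint --output text)"
#   regional_bootstrap_url "$(normalize_region "$loc")" my-bucket quickstart-amazon-eks/
```

Fetching the printed URL with `curl -fsSI` from a machine with network access is a quick way to confirm that the object is reachable before launching the stack.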

@tonynv
Member

tonynv commented Oct 15, 2019

Version 2 will use cfn-init calls.

You can get the region using a condition, like so:

Add the condition:

Conditions:
  GovCloudCondition: !Equals
    - !Ref 'AWS::Region'
    - us-gov-west-1

Build the S3 path using the condition:

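      UserData: !Base64
        Fn::Sub:
          - |
            #!/bin/bash -x
            # Base URL of the Quick Start assets (QS_BASE_URL is an illustrative name);
            # S3Region is resolved by the mapping below.
            QS_BASE_URL="https://${QSS3BucketName}.${S3Region}.amazonaws.com/${QSS3KeyPrefix}"
          -
            S3Region: !If [GovCloudCondition, s3-us-gov-west-1, s3]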

@tonynv
Member

tonynv commented Oct 15, 2019

Closing this issue. Please track version 2 for progress. If any issues are still open at release, please open a new issue.

@tonynv tonynv closed this as completed Oct 15, 2019
@aryak007

None of the workarounds are working. This issue shouldn't be closed.

@schottsfired

I hit this error message last week in our Quick Start, and fixed it with https://github.com/aws-quickstart/quickstart-cloudbees-core/commit/6902e697c7419677bafd80425260aa569278fe3a. The problem was that CFN never received the signal that the instance was running. Hope it helps!

@nathalieDOXA

> I hit this error message last week in our Quick Start, and fixed it with aws-quickstart/quickstart-cloudbees-core@6902e69. The problem was that CFN never received the signal that the instance was running. Hope it helps!

Hi @schottsfired, could you explain that in more detail? It seems that linux-bastion is different from cloudbees-core, though maybe my knowledge of this is limited.
Thank you in advance.

@schottsfired

Hi @nathalieDOXA, quickstart-cloudbees-core is indeed a separate project, but we interact with quickstart-linux-bastion via submodules. Our QS submodules quickstart-amazon-eks, and that QS in turn submodules quickstart-linux-bastion (and quickstart-aws-vpc).

@eyedean

eyedean commented Jan 19, 2021

It happened to me, and after an hour of debugging I found that the "banner text" cannot be a simple string such as Welcome to bastion!, because it is passed directly as a bash argument!

To debug:

  1. Disable rollback
  2. When the instance is up (even while it shows "initializing..."), SSH into it and run cat /var/log/cfn-init.log

Mine was:

...
2021-01-19 06:37:04,609 [DEBUG] Running command b-bootstrap
2021-01-19 06:37:04,609 [DEBUG] No test for command b-bootstrap
2021-01-19 06:37:07,672 [ERROR] Command b-bootstrap (./bastion_bootstrap.sh --banner Welcome to The Bastion! --enable true --tcp-forwarding true --x11-forwarding false) failed
2021-01-19 06:37:07,672 [DEBUG] Command b-bootstrap output: checkos Ended
which: no aws in ((null))
...
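
A small sketch for pulling the failing command out of a long log like the one above. The function name `cfn_errors` is hypothetical; the default path is the one from the debugging steps.

```shell
#!/usr/bin/env bash
# Show each [ERROR] line from a cfn-init log plus the line that follows it,
# which in the excerpt above carries the output of the failed command.
cfn_errors() {
  local log="${1:-/var/log/cfn-init.log}"
  grep -A1 -F '[ERROR]' "$log"
}

# Usage on the instance (reading the log may require sudo):
#   cfn_errors
#   cfn_errors /path/to/a/copied/cfn-init.log
```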

Maybe in the future, a validator (like the one for RemoteAccessCIDR) in the template would save other folks' time debugging.
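
A minimal sketch of the word splitting visible in the failed command above, where the banner text appears unquoted on the command line. `parse_banner` is a hypothetical stand-in for an option parser, not the actual logic of bastion_bootstrap.sh; it only reports what a `--banner` option would receive and which words are left over.

```shell
#!/usr/bin/env bash
# Hypothetical option parser: takes the single word after --banner as its
# value, skips other "--option value" pairs, and collects stray words.
parse_banner() {
  local banner="" extra=()
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --banner)
        banner="${2-}"
        shift
        if [ "$#" -gt 0 ]; then shift; fi
        ;;
      --*)
        shift
        if [ "$#" -gt 0 ]; then shift; fi
        ;;
      *)
        extra+=("$1")
        shift
        ;;
    esac
  done
  printf 'banner=[%s] extra=[%s]\n' "$banner" "${extra[*]-}"
}

# Unquoted, as in the log: only the first word reaches --banner.
parse_banner --banner Welcome to The Bastion! --enable true
# Quoted: the whole text arrives as one argument.
parse_banner --banner 'Welcome to The Bastion!' --enable true
```

The first call prints `banner=[Welcome] extra=[to The Bastion!]` and the second prints `banner=[Welcome to The Bastion!] extra=[]`, which is why a value containing spaces only survives if the caller quotes it.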

Hope it helps. :)
