Error launching source instance: timeout while waiting for state to become 'success'. last error: %!s(<nil>) #8857

Closed
RoryKiefer opened this issue Sep 15, 2016 · 7 comments

@RoryKiefer

RoryKiefer commented Sep 15, 2016

Hey guys - this is an easy one. It looks like when there's no more capacity for an instance type in an AZ, and TF tries to create an instance and fails because of it, the AWS error message is not passed through TF, and all you get is the message in the subject line instead.

I was able to discover the nature of the error by manually replicating what TF was doing in the WebUI. The WebUI returned the proper error message: "We currently do not have sufficient (instance_type (m4.large, in my scenario)) capacity in the Availability Zone you requested (AZ). Our system will be working on provisioning additional capacity. You can currently get a (instance_type) by not specifying an Availability Zone in your request or choosing (accounts_other_AZs)."

This is in version 0.7.0

@jen20
Contributor

jen20 commented Sep 20, 2016

Hi @kieferrj! This definitely looks like a failure to handle the specific error type. Unfortunately it's also very hard to test for since AWS don't run out of capacity that often or in a deterministic fashion! We'll investigate this since there is clearly a bug given the error message construction. Thanks for opening an issue!

@akoeb-zalando

Requesting m3.large in eu-central-1a currently produces this error (and did so in the past).

Running with TF_LOG=debug, I can see the following response from AWS:

2016/09/20 11:54:43 [DEBUG] plugin: terraform: -----------------------------------------------------
2016/09/20 11:54:43 [DEBUG] plugin: terraform: aws-provider (internal) 2016/09/20 11:54:43 [DEBUG] [aws-sdk-go] DEBUG: Response ec2/RunInstances Details:
2016/09/20 11:54:43 [DEBUG] plugin: terraform: ---[ RESPONSE ]--------------------------------------
2016/09/20 11:54:43 [DEBUG] plugin: terraform: HTTP/1.1 500 Internal Server Error
2016/09/20 11:54:43 [DEBUG] plugin: terraform: Connection: close
2016/09/20 11:54:43 [DEBUG] plugin: terraform: Transfer-Encoding: chunked
2016/09/20 11:54:43 [DEBUG] plugin: terraform: Date: Tue, 20 Sep 2016 09:54:43 GMT
2016/09/20 11:54:43 [DEBUG] plugin: terraform: Server: AmazonEC2
2016/09/20 11:54:43 [DEBUG] plugin: terraform: 
2016/09/20 11:54:43 [DEBUG] plugin: terraform: 1fa
2016/09/20 11:54:43 [DEBUG] plugin: terraform: <?xml version="1.0" encoding="UTF-8"?>
2016/09/20 11:54:43 [DEBUG] plugin: terraform: <Response><Errors><Error><Code>InsufficientInstanceCapacity</Code><Message>We currently do not have sufficient m3.large capacity in the Availability Zone you requested (eu-central-1a). Our system will be working on provisioning additional capacity. You can currently get m3.large capacity by not specifying an Availability Zone in your request or choosing eu-central-1b.</Message></Error></Errors><RequestID>5cbe8339-9fed-49d7-bde0-d4b1c15eb1e3</RequestID></Response>
2016/09/20 11:54:43 [DEBUG] plugin: terraform: 0
2016/09/20 11:54:43 [DEBUG] plugin: terraform: 
2016/09/20 11:54:43 [DEBUG] plugin: terraform: 
2016/09/20 11:54:43 [DEBUG] plugin: terraform: -----------------------------------------------------
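The response body in the log above carries the error code and message that never reach the user. As a rough illustration of how that XML could be decoded, here is a standalone sketch using only Go's standard library. The type and function names (ec2ErrorResponse, ec2Error, parseEC2Error) are hypothetical, and this is not how the AWS provider or the AWS SDK actually handles the response.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// ec2ErrorResponse (hypothetical) mirrors the XML layout seen in the debug log.
type ec2ErrorResponse struct {
	XMLName   xml.Name   `xml:"Response"`
	Errors    []ec2Error `xml:"Errors>Error"`
	RequestID string     `xml:"RequestID"`
}

// ec2Error (hypothetical) holds one <Error> element.
type ec2Error struct {
	Code    string `xml:"Code"`
	Message string `xml:"Message"`
}

// parseEC2Error decodes an error response body into ec2ErrorResponse.
func parseEC2Error(body []byte) (*ec2ErrorResponse, error) {
	var resp ec2ErrorResponse
	if err := xml.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return &resp, nil
}

func main() {
	// Body copied from the debug log in this thread.
	body := []byte(`<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InsufficientInstanceCapacity</Code><Message>We currently do not have sufficient m3.large capacity in the Availability Zone you requested (eu-central-1a). Our system will be working on provisioning additional capacity. You can currently get m3.large capacity by not specifying an Availability Zone in your request or choosing eu-central-1b.</Message></Error></Errors><RequestID>5cbe8339-9fed-49d7-bde0-d4b1c15eb1e3</RequestID></Response>`)

	resp, err := parseEC2Error(body)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, e := range resp.Errors {
		fmt.Printf("%s: %s\n", e.Code, e.Message)
	}
}
```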

@Bowbaq
Contributor

Bowbaq commented Oct 31, 2016

Ran into this today. It seems that in addition to failing the terraform run, in some cases AWS still provisions the instance. This causes a subsequent call to terraform destroy to fail since security groups have "dependent objects" that aren't tracked in the state file.

@archmangler

I'm getting this with TF 0.8.8 and 0.9.1. With r4.16xlarge instances, regardless of whether I try to launch 1 or 10, the deploy fails with a timeout waiting for the instance. Is there a workaround for this?

@twelcome

twelcome commented Mar 22, 2017

Some additional notes:

  • I can consistently launch 10 r4.8xlarge instances ok
  • I get consistent timeouts with launching any number of r4.16xlarge instances.

This definitely seems related to instance deploy time - if there were a reliable tunable in TF for this, I'm sure the problem could be resolved.

@radeksimko
Member

Hi folks,
we have improved the error message by adding the state reason. #14479 was just merged and will be part of the next release (0.9.6), so I'm closing this issue.

Let me know if you have any suggestions on further improvement.

@ghost

ghost commented Apr 12, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 12, 2020