AWS metadata and STS API calls are slow #873
Comments
I encountered an issue somewhat related to this. I was trying to spin up the basic example from Setting Here is some of my environment info in case you need it.
It was stuck at
Last entries from logs:
Interesting. @stack72, is @uLan08's issue the same as #814 and pulumi/pulumi#3604?
Assigning to @pgavlin during M32 to triage as part of the overall performance push. On slow networks, this is by far the dominant performance issue when deploying to AWS, as far as I can tell.
I'm already full up for M32. @leezen, I'm going to bounce this to you and we can figure out where to put it.
The commentary in pulumi/pulumi#3671 (comment) is really mostly about the specific issue tracked here. Copying below as well: A few more observations looking at some detailed logs of the initialization sequence:
Total of above is that
Given the above - I see a few things we could consider changing:
I don't love changing our defaults on these things. It will no doubt lead to confusion in less common cases if we do. But it does feel like defaults that don't penalize 100% of usage in favor of relatively corner-case needs are sensible? Thoughts?
@lukehoban FWIW, there are well-known people in the Terraform ecosystem who override these defaults, as this is known to be slow upstream as well. I would suggest we set these and just document that it's the case.
Seeing as this issue is coming up on nearly 12 months, would it be worth considering some action on Pulumi's side? I'm running into this issue pretty frequently myself with a fairly basic (standard?) cross-account setup using assume role everywhere with AWS profiles. Setting
Hi @shousper, I've actually been looking at this issue tonight to see what we can set the default variables to. Do you happen to know how long your pulumi run took before setting that value? I am trying to gauge whether you are getting the same behaviour as me. Paul
Fixes: #873
* `skipCredentialsValidation` now defaults to `true`.
* `skipGetEc2Platforms` now defaults to `true`.
* `skipMetadataApiCheck` now defaults to `true`.
* `skipRegionValidation` now defaults to `true`.
Sorry @stack72, it doesn't look like this is my issue. It ramped up, so I dug deeper. Looks like I'm in this trap: hashicorp/terraform#27350 golang/go#42700 😞
This PR explores reverting to the default `aws:skipMetadataApiCheck=false` setting so that the provider can authenticate seamlessly against IMDS(v2) endpoints in the AWS environment. Doing so no longer appears to slow down provider startup perceptibly. I tested the speed delta by measuring a local empty preview of an AWS S3 bucket using AWS_PROFILE authentication, local <-> us-east-1; there is no perceptible difference.
Fixes: #1692
An integration test is added that exercises `pulumi preview` on an EC2 instance with IMDSv2 and asserts that the provider can authenticate successfully.
Background:
- #873
- #1288
During configuration, the AWS provider calls into multiple AWS APIs to check various endpoints and identity metadata. Times vary quite a bit depending on the speed of your network; however, it's not uncommon for these to add up to 10-15 seconds of lag before an update even begins running.
Here are the specific calls:
Note that you can set config variables to skip this logic:
pulumi config set aws:skipCredentialsValidation true
pulumi config set aws:skipMetadataApiCheck true
pulumi config set aws:skipRequestingAccountId true
For some reason, the provider seems to call these APIs twice, one of which ignores the config. @lukehoban posited that this could be due to the way we do a prepass over configuration to validate and check for defaults. If so, that seems like it's a bug that we do it without having first applied the configuration.
Note also that you can set
AWS_METADATA_TIMEOUT=0
which shortens the timeouts of the AWS metadata API calls and does have a small but noticeable effect.

I don't know precisely what to do here, but we could consider setting our own defaults differently from the underlying Terraform provider. I don't know enough about what those APIs are doing. It appears, for instance, that the metadata API check determines whether the update is happening from within an AWS data center (though why the code needs to know that, I'm not quite sure).
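To put numbers on the startup lag described above, here is a minimal timing sketch. It is not part of Pulumi or the AWS provider: `time_command` and `compare` are hypothetical helper names, and the sketch only measures wall-clock time of whatever command it is given, with and without extra environment variables applied.

```python
import os
import subprocess
import time


def time_command(cmd, extra_env=None, runs=3):
    """Run cmd `runs` times and return the fastest wall-clock time in seconds.

    extra_env is merged on top of the current environment, so a variable
    such as AWS_METADATA_TIMEOUT can be toggled for a single measurement.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a broken run is never mistaken for a fast one.
        subprocess.run(cmd, env=env, check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(cmd, extra_env, runs=3):
    """Time cmd in the current environment, then again with extra_env applied."""
    baseline = time_command(cmd, runs=runs)
    adjusted = time_command(cmd, extra_env=extra_env, runs=runs)
    return baseline, adjusted
```

Assuming the `pulumi` CLI is on the path and the working directory is a Pulumi project, a call such as `compare(["pulumi", "preview"], {"AWS_METADATA_TIMEOUT": "0"})` would return the two timings in seconds. Taking the fastest of several runs is a deliberate choice to reduce noise from network jitter; it says nothing about which of the individual API calls dominates.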