Terragrunt apply fails (could not find AWS credentials) #2730
Comments
Hello, |
Hi denis256, the Terragrunt and Terraform versions are unchanged on both the local machines and Jenkins. I don't think the AWS credentials were removed: if they had been, no modules should execute at all, but some modules are being executed. |
I've also been encountering this. Jenkins job does run-all init, validate, plan on many directories in parallel, and some of them (not the same ones, not necessarily at the same points in the process) error out saying there are no credentials. I suspect AWS' behavior has changed (rate-limiting maybe?) because the Terragrunt version hasn't. Trying to see if auto-retry for this error helps now. |
Update: auto-retry tuning is dicey. I got it to work sometimes by setting the number of retries to 5, but occasionally that wasn't enough, so I also increased the delay, and then it started failing the job after only one error. So I haven't been able to come up with a consistent method to avoid this. |
Changing auto-retry doesn't seem to work, which probably is because the error Terragrunt surfaces is its own and not caught from elsewhere? I have:
but it doesn't retry at all. |
Contacted AWS support, who told me that they don't publish the throttling/rate limiting numbers because "they're internal" (so, they don't publish the numbers because they don't publish the numbers?) and that Terragrunt should implement a retry with exponential backoff. The AWS support person indicated that the limit might change at any point, which I suspect means they did recently change it. Experimentally: we've got about 150 modules and we hit a few denials each time; setting TERRAGRUNT_PARALLELISM to 100 seems to prevent the failures, though I haven't got many runs to prove it. UPDATE: no, we see it at 100. I think the limit must be under 70. |
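For anyone wanting to try the retry-with-exponential-backoff approach AWS support suggested, here is a minimal sketch of a wrapper around the Terragrunt call, done outside Terragrunt itself. It is not Terragrunt's own retry mechanism. The names `run_with_backoff`, `backoff_delays`, and `terragrunt_plan` are made up for illustration, and the `RETRYABLE` pattern is an assumption, since the exact error text is not shown in this thread; adjust it to match what your job actually prints.

```python
import random
import re
import subprocess
import time

# Assumed error text; replace with the message your Jenkins job actually logs.
RETRYABLE = re.compile(r"no valid credential sources", re.IGNORECASE)


def backoff_delays(max_retries, base=1.0, cap=60.0, rng=random.random):
    """Yield one delay per retry: jitter applied to min(cap, base * 2**n)."""
    for attempt in range(max_retries):
        yield rng() * min(cap, base * 2 ** attempt)


def run_with_backoff(run, max_retries=5, base=1.0, cap=60.0,
                     sleep=time.sleep, rng=random.random):
    """Call run() -> (exit_code, output), retrying only on retryable errors.

    Stops on success, on an error that does not match RETRYABLE,
    or after max_retries retries.
    """
    code, output = run()
    for delay in backoff_delays(max_retries, base, cap, rng):
        if code == 0 or not RETRYABLE.search(output):
            break
        sleep(delay)
        code, output = run()
    return code, output


def terragrunt_plan(directory):
    """Hypothetical helper: run `terragrunt plan` in one module directory."""
    proc = subprocess.run(["terragrunt", "plan"], cwd=directory,
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr
```

Usage would look like `run_with_backoff(lambda: terragrunt_plan("path/to/module"))`. The random jitter is there so that many parallel modules that fail together do not all retry at the same instant, which would just hit the same limit again.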
@denis256 given the latest information, is there anything that should be looked at from this point? |
I will do more tests, but so far I have been thinking about:
|
That would be a great help, Denis; we've been facing this issue for a while.
Hi, |
Hi @denis256 |
It is still difficult for me to reproduce this issue. I tried to set something up in https://github.com/denis256/terragrunt-tests/tree/master/aws-rate-limit but I'm still not getting the same error as reported. It would be helpful if someone could share an example repository where this error happens. |
Hi denis, |
I got this working. My issue was resolved by updating Terragrunt and increasing the RAM. It would be nice if Terragrunt highlighted the memory error and also limited Terragrunt/Terraform memory usage. Articles and blog posts about limiting RAM usage show that quite a few people hit this issue because of module and provider sizes. The problem is not with the modules' API calls but with the AWS provider processes, which are heavy because they support many AWS services at once. In our case, environments like NFT that have a lot of resources to deploy require many provider versions, and doing all of that at once needs a good amount of RAM, so 8GB would crash Terraform. Are there any plans to throttle it without breaking it? |
Hi All,
We use Terraform and Terragrunt to manage AWS infrastructure. When I run Terragrunt locally, everything deploys without issues, but when deploying through Jenkins it errors out because no AWS credentials were found. This only happens in some folders; the services in the other folders deploy successfully. It was working fine until a week ago, and then the issue suddenly appeared. I'm not sure what went wrong. Any suggestions, please?
Previously we saved .terraform.lock.hcl in SCM along with terragrunt.hcl, but we removed it in some folders, which left things inconsistent, so we reinitialised and saved .terraform.lock.hcl in the folders again. Could that be causing the issue?
Exact Errors
Versions
Any suggestions please?