resource/aws_cloudfront_distribution: Allow final creation and update timeout retry for AWS Go SDK retries #7809

Merged: 1 commit, Mar 5, 2019

@bflad bflad commented Mar 4, 2019

Closes #6197

When `resource.Retry()` is used to handle eventual consistency, it timeboxes the inner function to the configured timeout, which we generally set to a minute or two. When the AWS Go SDK encounters recoverable conditions such as 5XX or throttling errors, it automatically retries internally, up to the configured session `MaxRetries` (the Terraform AWS Provider `max_retries` configuration), before returning to the calling code. On heavily utilized AWS accounts, those internal retries on throttling errors can outlast the outer timeout, so the resource gets no opportunity to keep retrying outside the timebox.

Here we implement this final retry by checking for a timeout error from `resource.Retry()` and repeating the call outside the timebox, so the AWS Go SDK can return the proper error message in these situations or (hopefully) finally succeed in the case of throttling. Triggering this error handling would require an excessive amount of resources, and even then might not exercise it, so we do not generally implement acceptance testing that covers this code. It may be a good candidate for special Terraform AWS Provider handling within a future planned Terraform Provider linting tool.
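
For readers unfamiliar with the pattern, here is a minimal standard-library sketch of the idea, not the code from this PR. The names `retry`, `timeoutError`, `errThrottled`, and `createWithFinalRetry` are all hypothetical stand-ins: `retry` only approximates how `resource.Retry()` timeboxes a function, and `create` stands in for the CloudFront API call made through the AWS Go SDK.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// timeoutError is a hypothetical stand-in for the timeout error that
// resource.Retry() returns when its timebox expires.
type timeoutError struct{ lastErr error }

func (e *timeoutError) Error() string {
	return fmt.Sprintf("timeout while waiting: last error: %v", e.lastErr)
}

// errThrottled is a hypothetical stand-in for an AWS throttling error.
var errThrottled = errors.New("Throttling: Rate exceeded")

// retry is a simplified, hypothetical stand-in for resource.Retry(). It calls
// f until f succeeds, f returns a non-retryable error, or the timebox expires.
func retry(timeout time.Duration, f func() (retryable bool, err error)) error {
	deadline := time.Now().Add(timeout)
	for {
		retryable, err := f()
		if err == nil || !retryable {
			return err
		}
		if !time.Now().Before(deadline) {
			return &timeoutError{lastErr: err}
		}
		time.Sleep(time.Millisecond)
	}
}

// createWithFinalRetry runs create inside the timebox and, if the timebox
// expired, calls create one final time outside it so the caller sees either
// success or the underlying error rather than a generic timeout.
func createWithFinalRetry(timeout time.Duration, create func() error) error {
	err := retry(timeout, func() (bool, error) {
		err := create()
		return errors.Is(err, errThrottled), err
	})

	var te *timeoutError
	if errors.As(err, &te) {
		// Final attempt outside the timebox.
		err = create()
	}
	return err
}

func main() {
	calls := 0
	// Hypothetical API call: throttled on the first attempt, then succeeds.
	create := func() error {
		calls++
		if calls == 1 {
			return errThrottled
		}
		return nil
	}

	// A zero timebox makes the timeout fire right after the first failed
	// attempt, which keeps this demonstration deterministic.
	err := createWithFinalRetry(0, create)
	fmt.Println(calls, err) // prints "2 <nil>"
}
```

If the final call also fails, the caller receives that call's error (here `errThrottled`) instead of the timeout wrapper, which is the "proper error messaging" case described above.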

Output from acceptance testing:

--- PASS: TestAccAWSCloudFrontDistribution_Origin_EmptyOriginID (2.08s)
--- PASS: TestAccAWSCloudFrontDistribution_Origin_EmptyDomainName (2.08s)
--- PASS: TestAccAWSCloudFrontDistribution_ViewerCertificate_AcmCertificateArn (1821.71s)
--- PASS: TestAccAWSCloudFrontDistribution_ViewerCertificate_AcmCertificateArn_ConflictsWithCloudFrontDefaultCertificate (1821.72s)
--- PASS: TestAccAWSCloudFrontDistribution_noCustomErrorResponseConfig (2086.99s)
--- PASS: TestAccAWSCloudFrontDistribution_orderedCacheBehavior (2090.63s)
--- PASS: TestAccAWSCloudFrontDistribution_HTTP11Config (2092.43s)
--- PASS: TestAccAWSCloudFrontDistribution_noOptionalItemsConfig (2092.72s)
--- PASS: TestAccAWSCloudFrontDistribution_IsIPV6EnabledConfig (2097.43s)
--- PASS: TestAccAWSCloudFrontDistribution_S3Origin (2277.83s)
--- PASS: TestAccAWSCloudFrontDistribution_multiOrigin (2280.49s)
--- PASS: TestAccAWSCloudFrontDistribution_customOrigin (2282.05s)
--- PASS: TestAccAWSCloudFrontDistribution_S3OriginWithTags (3345.90s)

@bflad bflad added bug Addresses a defect in current functionality. service/cloudfront Issues and PRs that pertain to the cloudfront service. labels Mar 4, 2019
@bflad bflad requested a review from a team March 4, 2019 22:46
@ghost ghost added the size/S Managed by automation to categorize the size of a PR. label Mar 4, 2019
@nywilken nywilken left a comment

LGTM 👍

@bflad bflad added this to the v2.1.0 milestone Mar 5, 2019
@bflad bflad merged commit 0b5b4e6 into master Mar 5, 2019
@bflad bflad deleted the b-aws_cloudfront_distribution-resourcetimeout-retry branch March 5, 2019 15:58
bflad added a commit that referenced this pull request Mar 5, 2019

bflad commented Mar 8, 2019

This has been released in version 2.1.0 of the AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

ghost commented Mar 31, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 31, 2020
Development

Successfully merging this pull request may close these issues.

Inconsistent Timeout on CloudFront distribution creation (150+ distros)