Retry Calls on HTTP 5XX Response Status Codes #210

Closed
bflad opened this issue Apr 23, 2019 · 7 comments · Fixed by #1704
Labels
Status: Stale (used by stalebot to clean house); Status: Up for grabs (issues that are ready to be worked on by anyone); Type: Feature (new feature or request)

Comments

@bflad
Contributor

bflad commented Apr 23, 2019

Terraform Version

Terraform v0.11.13
+ provider.github v1.3.0

Affected Resource(s)

In our use case, mainly:

  • github_issue_label

Terraform Configuration Files

# Terraform module used across dozens of repositories

resource "github_issue_label" "breaking-change" {
  repository = "${github_repository.provider.name}"
  name       = "breaking-change"
  color      = "d93f0b"
}

resource "github_issue_label" "bug" {
  repository = "${github_repository.provider.name}"
  name       = "bug"
  color      = "f7c6c7"
}

resource "github_issue_label" "crash" {
  repository = "${github_repository.provider.name}"
  name       = "crash"
  color      = "e11d21"
}

resource "github_issue_label" "documentation" {
  repository = "${github_repository.provider.name}"
  name       = "documentation"
  color      = "fef2c0"
}

resource "github_issue_label" "dependencies" {
  repository = "${github_repository.provider.name}"
  name       = "dependencies"
  color      = "fad8c7"
}

resource "github_issue_label" "enhancement" {
  repository = "${github_repository.provider.name}"
  name       = "enhancement"
  color      = "d4c5f9"
}

resource "github_issue_label" "good_first_issue" {
  repository = "${github_repository.provider.name}"
  name       = "good first issue"
  color      = "128A0C"
}

resource "github_issue_label" "hashibot_ignore" {
  repository = "${github_repository.provider.name}"
  name       = "hashibot/ignore"
  color      = "000000"
}

resource "github_issue_label" "help_wanted" {
  repository = "${github_repository.provider.name}"
  name       = "help wanted"
  color      = "128A0C"
}

resource "github_issue_label" "new-data-source" {
  repository = "${github_repository.provider.name}"
  name       = "new-data-source"
  color      = "d4c5f9"
}

resource "github_issue_label" "new-resource" {
  repository = "${github_repository.provider.name}"
  name       = "new-resource"
  color      = "d4c5f9"
}

resource "github_issue_label" "provider" {
  repository = "${github_repository.provider.name}"
  name       = "provider"
  color      = "bfd4f2"
}

resource "github_issue_label" "question" {
  repository = "${github_repository.provider.name}"
  name       = "question"
  color      = "cc317c"
}

resource "github_issue_label" "regression" {
  repository = "${github_repository.provider.name}"
  name       = "regression"
  color      = "e11d21"
}

resource "github_issue_label" "stale" {
  repository = "${github_repository.provider.name}"
  name       = "stale"
  color      = "e11d21"
}

resource "github_issue_label" "technical-debt" {
  repository = "${github_repository.provider.name}"
  name       = "technical-debt"
  color      = "1d76db"
}

resource "github_issue_label" "waiting-response" {
  repository = "${github_repository.provider.name}"
  name       = "waiting-response"
  color      = "5319e7"
}

resource "github_issue_label" "upstream" {
  repository = "${github_repository.provider.name}"
  name       = "upstream"
  color      = "fad8c7"
}

resource "github_issue_label" "upstream-terraform" {
  repository = "${github_repository.provider.name}"
  name       = "upstream-terraform"
  color      = "cccccc"
}

variable "sizes" {
  default = ["XS", "S", "M", "L", "XL", "XXL"]
}

resource "github_issue_label" "size" {
  count      = "${length(var.sizes)}"
  repository = "${github_repository.provider.name}"
  name       = "size/${var.sizes[count.index]}"
  color      = "ffffff"
}

Debug Output

Issue is intermittent, can provide if necessary.

Expected Behavior

The Terraform resource should retry the HTTP request on retryable HTTP response status codes (e.g. 5XX).

Actual Behavior

Error: Error refreshing state: 4 error(s) occurred:

* module.fortios.github_issue_label.stale: 1 error(s) occurred:

* module.fortios.github_issue_label.stale: github_issue_label.stale: GET https://api.github.com/repos/terraform-providers/terraform-provider-fortios/labels/stale: 502 Server Error []
* module.brightbox.github_issue_label.size: 1 error(s) occurred:

* module.brightbox.github_issue_label.size[2]: github_issue_label.size.2: GET https://api.github.com/repos/terraform-providers/terraform-provider-brightbox/labels/size/M: 502 Server Error []
* module.rancher.github_issue_label.question: 1 error(s) occurred:

* module.rancher.github_issue_label.question: github_issue_label.question: GET https://api.github.com/repos/terraform-providers/terraform-provider-rancher/labels/question: 502 Server Error []
* module.tfe.github_issue_label.regression: 1 error(s) occurred:

* module.tfe.github_issue_label.regression: github_issue_label.regression: GET https://api.github.com/repos/terraform-providers/terraform-provider-tfe/labels/regression: 502 Server Error []

Steps to Reproduce

  1. terraform apply

Important Factoids

The Terraform state above has 2600+ github_issue_label resources. (The error occurs more often with larger numbers of github_issue_label resources.)

References

bflad added the Type: Feature label Apr 23, 2019
@majormoses
Contributor

Yeah, I have seen this a few times as well; adding some very basic retries would be good. With that many objects, we might also want to consider some kind of circuit breaker, since retries could easily contribute to being rate limited once a small outage is over, leaving things in an odd in-between state.

@captn3m0

The number of failures we've been getting due to these random 502s is steadily increasing. Now every plan takes 3-5 attempts before we get lucky.

@captn3m0

I reached out to GitHub support, and they mentioned that having the x-github-request-id header alongside such failures would be helpful. It would be nice if we could add that to the Terraform error message in case of failures.

@bflad
Contributor Author

bflad commented May 24, 2019

@captn3m0 that sounds like a great enhancement! It should probably be added as a top level feature request so it doesn't get lost in here since they would likely require two different implementations. 👍

@mattmichal

I recently had a failure due to a 503 error. Due to API limits, we're using read delay and write delay which means that plans can take a long time to run. A single 503 error requires that the plan run again, taking up additional time that is better spent on other things. An HTTP call retry would be a very valuable feature.

kfcampbell added the Status: Up for grabs and Priority: High labels Nov 28, 2022
@dcfranca
Contributor

Funny thing, ChatGPT thinks it is already implemented:

Yes, Terraform supports retrying operations for the GitHub provider by setting the retryable_errors and max_retries configuration options in the provider block.

Here's an example of how to configure retrying for the GitHub provider in Terraform:

provider "github" {
  token = "YOUR_GITHUB_TOKEN"
  retryable_errors = ["429", "500", "502", "503", "504"]
  max_retries = 5
}

dcfranca added a commit to dcfranca/terraform-provider-github that referenced this issue May 29, 2023
In order to address the issue integrations#210
I have added 3 new parameters to the provider

- retry_delay_ms
- max_retries
- retryable_errors

In case max_retries is > 0 (defaults to zero)
it will retry the request in case it is an error
the retryable_errors defaults to [500, 502, 503, 504]

It retries after the ms specified in retry_delay_ms (defaults to 1000)
kfcampbell added a commit that referenced this issue Jan 10, 2024
* Add retryable transport for errors

In order to address the issue #210
I have added 3 new parameters to the provider

- retry_delay_ms
- max_retries
- retryable_errors

In case max_retries is > 0 (defaults to zero)
it will retry the request in case it is an error
the retryable_errors defaults to [500, 502, 503, 504]

It retries after the ms specified in retry_delay_ms (defaults to 1000)

* Update documentation with new parameters for retry

* Change default of max_retries to 3

* Fix typo in naming

* Update github/transport_test.go

* Add error check for data seek

* Consolidate the default retriable errors on a function

* Fix typo on the comments

Co-authored-by: Keegan Campbell <me@kfcampbell.com>

* Update vendor

* Fix merging conflicts

* Don't run go mod tidy on release (#1788)

* Don't run go mod tidy on release

* Be more specific about releases

* Fix lint with APIMeta deprecation

---------

Co-authored-by: Keegan Campbell <me@kfcampbell.com>
Co-authored-by: Nick Floyd <139819+nickfloyd@users.noreply.github.com>
kfcampbell added a commit that referenced this issue Feb 16, 2024
* Add retryable transport for errors (#1704)

* fix: remove repository topic from state if it doesnt exist in GitHub anymore (#1918)

* remove repository topic if they cannot be found in GitHub anymore

* Fix build error by bumping package version in offending file

---------

Co-authored-by: Keegan Campbell <me@kfcampbell.com>

* Bump version to v6 (#2106)

* Upgrade to Terraform Plugin SDK v2 (#1780)

* go mod tidy -go=1.16 && go mod tidy -go=1.17

* Run go mod vendor

* Attempt v2 upgrade

* Plugin compiling

* Fix some provider test errors

* Fix test compilation error

* ValidateFunc --> ValidateDiagFunc

* Fix casing

* Sprinkle toDiagFunc everywhere

* More fixes for validation functions

* State --> StateContext

* %s --> %v when printing diags

* ConfigureFunc --> ConfigureContextFunc

* Checking results of d.Set, round one

* Continue checking d.Set results

* Check results of d.Set, round three

* Checking d.Set results, round four

* d.Set round five

* In tests, export GITHUB_TEST_ORGANIZATION

* Remove unnecessary MaxItems on computed value

* Go build now works

* Resolve linting errors

* Apply diag.FromErr twice more

* Pass key names into toDiagFunc helper

* Construct cty.Path from strings

* Tests now working

* Update terraform-plugin-sdk version

* Remove commented attribute setting in resource_github_team.go

* Fix restrict pushes on github_branch_protection. Fix branch protection tests (#2045)

* Update restrict pushes. Fix branch protection tests

Change blocks_creations default to true. Fix broken build.

* add state migration

* rename push_restrictions to push_allowances

* correct state migration issue

* add docs clarification

* update migration func args

* fix test args

* cleanup tests

* Pass context.Background() in test

* fix timestamp fields

---------

Co-authored-by: Keegan Campbell <me@kfcampbell.com>

* Set group_id correctly (#2133)

* Run go get -u github.com/golangci/golangci-lint

* Separate github_team_members import from github_team as create_default_maintainers is not defined for members resource (#2126)

Co-authored-by: Keegan Campbell <me@kfcampbell.com>

* Add missing variable definition for test case

---------

Co-authored-by: Daniel França <github.t6297kgphp.dv@koderama.com>
Co-authored-by: Nick Floyd <139819+nickfloyd@users.noreply.github.com>
Co-authored-by: Felix Luthman <34520175+felixlut@users.noreply.github.com>
Co-authored-by: georgekaz <1391828+georgekaz@users.noreply.github.com>
Co-authored-by: Rich Young <richjyoung@users.noreply.github.com>

👋 Hey Friends, this issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Please add the Status: Pinned label if you feel that this issue needs to remain open/active. Thank you for your contributions and help in keeping things tidy!

github-actions bot added the Status: Stale label Apr 25, 2024
github-actions bot closed this as not planned May 3, 2024