Add retries for operations to account for intermittent network issues #6628

Closed
AnirudhaS opened this issue Jun 17, 2020 · 5 comments

Comments

@AnirudhaS

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

Currently, the Bigtable resources for instance creation and table creation do not seem to retry requests to account for intermittent network failures.

Consider a Terraform task that sets up a Bigtable instance with a column family configuration. This results in multiple requests, and if any of the later requests fails, the whole task fails. Moreover, completing the task then requires deleting the tables manually, because rerunning it fails since the table already exists.

New or Affected Resource(s)

  • google_bigtable_table

Potential Terraform Configuration

resource "google_bigtable_table" "table" {
  name          = replace(var.name, "some-instance", "")
  project       = var.project
  instance_name = var.instance_name
  split_keys    = var.split_keys

  dynamic "column_family" {
    for_each = var.column_families
    content {
      family = column_family.value
    }
  }
}

References

@ghost ghost added enhancement labels Jun 17, 2020
@danawillow danawillow added this to the Goals milestone Jun 22, 2020
modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Oct 6, 2022
Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit that referenced this issue Oct 6, 2022
Signed-off-by: Modular Magician <magic-modules@google.com>

@kevinsi4508
Contributor

kevinsi4508 commented Oct 24, 2022

This issue seems to have two parts:

  1. Resource creation should be retried on transient errors.
  2. Handling the already_exists error.

We still have some work to do on retrying transient errors, but application-level retries should be sufficient.
For example, KCC periodically retries/reconciles the resource.

Handling the already_exists error might be something we can do.

I think the sequence was something like this:

  1. The resource was created.
  2. On an error, the resource was removed from state (its ID was cleared).
  3. The next TF apply then failed with an already_exists error.

In GoogleCloudPlatform/magic-modules#6735, we will remove the resource only if there is a NOT_FOUND error. This should prevent resources from being cleared on transient errors (or any error other than NOT_FOUND).

@melinath
Collaborator

b/259278591

@kevinsi4508
Contributor

We have more work to do in terms of error handling. However, with GoogleCloudPlatform/magic-modules#6735, we don't clear the resource on errors other than NOT_FOUND. This should fix the reported problem. We should close this issue if there are no objections.

@melinath
Collaborator

Closing the issue - if there are additional similar issues please open a new ticket.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 19, 2022