Add instruction to forcefully refresh state from the script #23388

Open
MarcelT-NL opened this issue Nov 15, 2019 · 6 comments
Labels
config, enhancement, waiting for reproduction (unable to reproduce issue without further information)

Comments

@MarcelT-NL

MarcelT-NL commented Nov 15, 2019

Current Terraform Version

0.12.14

Use-cases

It would be great if you could issue a forced “refresh” at the top of the script, forcing the saved state to be deleted and recreated from the live environment.

Rationale:

Sometimes a Terraform script fails for whatever reason, and you need to delete some assets manually (e.g. in the Azure portal). When using Terraform.io (Cloud), there is no way to run CLI commands, so there is no refresh option. Sometimes you have to delete and recreate the workspace because the state is out of sync with reality and Terraform consistently fails when running.

Being able to have the state forcefully refreshed would solve a lot of manual work.

Proposal

Add an instruction that can be used in scripts, that initiates a forced refresh of the state:

terraform_state_refresh

@teamterraform
Contributor

Hi @MarcelT-NL! Thanks for reporting this.

We'd like to understand a bit more about the problem you've encountered here. Terraform automatically performs the same work as terraform refresh as an early step of terraform plan, so we're not sure what exactly it would mean to force a refresh from the configuration. It sounds like you've run into a specific problem that a normal terraform refresh wouldn't have resolved anyway, because otherwise terraform plan would have resolved it automatically too.

It would help if you could share some full error messages you've seen in situations like what you described where a failure on a previous Terraform run caused you to make changes outside of Terraform that then caused Terraform to get "stuck". That would help us to determine what is failing in the refresh step that is preventing Terraform from recovering automatically.

Thanks!

teamterraform added the waiting-response label (an issue/pull request is waiting for a response from the community) on Nov 16, 2019
@MarcelT-NL
Author

MarcelT-NL commented Nov 16, 2019

Hi,

It happens on different occasions. For instance, when you tear down infrastructure with Terraform but some resource cannot be deleted, e.g. a load balancer fails to be deleted because something is still connected to it. This seems to be related to race conditions when deleting infrastructure. Sometimes it happens when you rename a resource and terraform plan cannot remove (find) the old one. The only solution is to delete the resource manually. But then the state is out of sync with reality, and Terraform fails consistently.

When I encounter this bug again I will add the error messages, even though I remember that they are generic.

The only solution so far is to completely remove the workspace from Terraform, delete all resources, and create a new workspace (manually adding your secrets to the portal again) pointing at the same script. It then runs perfectly. It would be great if we could delete the existing state at runtime, because the refresh apparently does not detect all changes.

ghost removed the waiting-response label (an issue/pull request is waiting for a response from the community) on Nov 16, 2019
@MarcelT-NL
Author

@teamterraform Did you get a chance to review my reply?

@danieldreier
Contributor

@MarcelT-NL most of the team will be out for the rest of this week for the US Thanksgiving holiday. I've put this on our internal list to follow up when we're back.

@danieldreier
Contributor

@MarcelT-NL If I understand your description correctly, the root issue you're trying to fix here is that you've encountered a bug that causes resource destruction to fail. Am I understanding you right?

If so, my impulse is to focus on reproducing and fixing that, rather than doing work to make it easier to recover from the broken state you find yourself in.

@MarcelT-NL
Author

Yes, you are correct. I tried reproducing the issue but have not encountered it in the past week.

One use case is:

  • create some resources
  • rename at least two resources
  • plan and apply again
  • the state still contains the old name; the resource is not changed or destroyed by Terraform
  • after manually destroying the resource in Azure, Terraform cannot find the resource with the old name and fails

This happened when creating and changing load balancers and some other instances. It may be that the names of 2 resources changed, when there was a dependency between them.
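
For anyone trying to reproduce this, the rename scenario above might look roughly like the following Terraform 0.12 configuration. This is only a sketch: the resource types, labels, names, and region are assumptions chosen for illustration and are not taken from the original configuration, and the provider block is omitted.

```hcl
# Hypothetical configuration AFTER the rename step.
# In the first revision (assumed), the two dependent resources were labelled
# "frontend" and "main"; here both labels have been changed at once.

resource "azurerm_resource_group" "example" {
  name     = "example-rg"
  location = "westeurope"
}

# Previously: resource "azurerm_public_ip" "frontend"
resource "azurerm_public_ip" "frontend_renamed" {
  name                = "example-pip"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  allocation_method   = "Static"
}

# Previously: resource "azurerm_lb" "main"
resource "azurerm_lb" "main_renamed" {
  name                = "example-lb"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  # The load balancer depends on the public IP, so both renamed
  # resources are linked, as in the dependency case described above.
  frontend_ip_configuration {
    name                 = "primary"
    public_ip_address_id = azurerm_public_ip.frontend_renamed.id
  }
}
```

Changing only the labels changes the resource addresses, so Terraform would normally plan to destroy the objects tracked under the old addresses and create new ones under the new addresses. Whether that destroy step triggers the failure described here is unconfirmed.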

jbardin added the waiting for reproduction label (unable to reproduce issue without further information) on Sep 23, 2020

5 participants