Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backend state locking #25454

Merged
merged 4 commits into from
Aug 11, 2020
Merged

backend state locking #25454

merged 4 commits into from
Aug 11, 2020

Conversation

mildwonkey
Copy link
Contributor

There was a subtle (and confusing) difference between the local and remote backend state locking strategies:

  • The local backend context returns with a locked state, even if Context() failed.
  • The remote backend returns with an unlocked state when Context() failed.

This was only showing up as a problem in a few commands when one was using the remote backend but the command was (only) running locally, such as terraform console: terraform console would always unlock state after getting the context, but if there was an error from Context(), the remote state was already unlocked and would result in a "workspace already unlocked" error.

This PR aims for parity between remote and local by making the following changes:

  • backend/local will unlock the state if Context() has an error, exactly as backend/remote does today
  • terraform console and terraform import will exit before unlocking state in case of error in Context()
  • responsibility for unlocking state in the local backend is pushed down the stack, out of backend.go and into each individual state operation

My first attempt at this PR broke basically everything, and yet it wasn't caught (reasonably: there were probably plenty of tests that expected an error, and continued to find the error. But possibly the wrong error, or at least an extra error). So there's more testing to do, and I have yet to dig into other commands that might need some work (my main concerns are the state commands, which are another set of kind-of-local-but-remote commands).

@codecov
Copy link

codecov bot commented Jul 1, 2020

Codecov Report

Merging #25454 into master will increase coverage by 0.02%.
The diff coverage is 55.81%.

Impacted Files Coverage Δ
backend/local/backend.go 49.78% <ø> (+1.71%) ⬆️
command/console.go 41.66% <25.00%> (ø)
backend/local/backend_apply.go 39.35% <40.00%> (+0.02%) ⬆️
backend/local/backend_plan.go 73.57% <40.00%> (+0.54%) ⬆️
backend/local/backend_refresh.go 41.50% <40.00%> (+6.09%) ⬆️
backend/local/backend_local.go 41.23% <60.00%> (+4.33%) ⬆️
backend/local/testing.go 57.89% <64.28%> (+0.89%) ⬆️
backend/remote/backend.go 59.20% <100.00%> (+0.83%) ⬆️
command/import.go 53.69% <100.00%> (ø)
terraform/node_resource_plan.go 91.80% <0.00%> (-1.64%) ⬇️
... and 7 more

@mildwonkey mildwonkey requested a review from a team July 2, 2020 11:54
@mildwonkey
Copy link
Contributor Author

👋 I know it's weird (and extra work) to request a review on a draft PR, but I could really use the extra eyes to let me know if I'm on the right track. I've got plenty of tests to work on in the meantime.

@@ -591,7 +591,7 @@ func (b *Remote) DeleteWorkspace(name string) error {
}

// StateMgr implements backend.Enhanced.
func (b *Remote) StateMgr(name string) (state.State, error) {
func (b *Remote) StateMgr(name string) (statemgr.Full, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sneaking in an unrelated change to start removing references to the deprecated state package:

// State is a deprecated alias for statemgr.Full

This adds tests to plan, apply and refresh which validate that the state
is unlocked after all operations, regardless of exit status. I've also
added specific tests that force Context() to fail during each operation
to verify that locking behavior specifically.
@mildwonkey mildwonkey marked this pull request as ready for review July 7, 2020 12:12
@mildwonkey mildwonkey added this to the v0.13.1 milestone Jul 7, 2020
@mildwonkey
Copy link
Contributor Author

I'm labeling this 0.13.1 with the caveat that we may decide to hold off till the 0.14 development cycle.

Copy link
Contributor

@pselle pselle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To restate the goal to ensure I understand it: The backend context should return a locked state, unless there was a failure in creating that context. I think I can get behind this, although that "unless" raises some flags to me, but maybe it's okay.

As far as this approach, moving the code out of one place so it can be called in many is also suspicious ... I'm guessing that moving the order of the ctx diags calls alone didn't fix the issue you're describing? (https://github.com/hashicorp/terraform/pull/25454/files#diff-2cf6324dad1cc452ed13778ff1552b43R203-R207) I suppose what I'm expressing is I have some hope for being able to solve this without the need to add the same function calls in multiple places, but perhaps it's unavoidable.

@@ -0,0 +1,57 @@
package local
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet! New test file!

@pselle pselle requested a review from a team July 7, 2020 19:20
@mildwonkey mildwonkey changed the title [DRAFT] backend state locking backend state locking Jul 7, 2020
@mildwonkey
Copy link
Contributor Author

As far as this approach, moving the code out of one place so it can be called in many is also suspicious ... I'm guessing that moving the order of the ctx diags calls alone didn't fix the issue you're describing? (https://github.com/hashicorp/terraform/pull/25454/files#diff-2cf6324dad1cc452ed13778ff1552b43R203-R207) I suppose what I'm expressing is I have some hope for being able to solve this without the need to add the same function calls in multiple places, but perhaps it's unavoidable.

Yes, unfortunately that's required by this change. I could have avoided adding those into every local operation by continuing to return a locked state whether or not Context failed, and instead modifying the remote backend to match locals behavior. This behavior - unlocking the state if Context returns an error - feels more correct, with the added bonus of sparing me digging deeply into the remote backend client 🙃

@mildwonkey mildwonkey merged commit 86e9ba3 into master Aug 11, 2020
@mildwonkey
Copy link
Contributor Author

Fixes #24246

@mildwonkey mildwonkey deleted the mildwonkey/backend-locking branch August 11, 2020 15:25
@ghost
Copy link

ghost commented Sep 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Sep 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants