backend state locking #25454

mildwonkey · 2020-07-01T20:31:19Z

There was a subtle (and confusing) difference between the local and remote backend state locking strategies:

- The local backend's Context() returns with the state locked, even if Context() fails.
- The remote backend's Context() returns with the state unlocked when Context() fails.

This only showed up as a problem in a few commands that run locally even when the remote backend is in use, such as terraform console. terraform console always unlocked state after getting the context; if Context() returned an error, the remote state had already been unlocked, so the second unlock failed with a "workspace already unlocked" error.
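The failure mode described above can be reproduced in miniature. The following is only a sketch under stated assumptions: `fakeLocker`, `remoteStyleContext` and `consoleStyleCaller` are hypothetical names invented for illustration and are not Terraform's actual types or functions. It shows how a caller that always unlocks ends up with a second, spurious error when the callee has already unlocked on failure.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeLocker is a hypothetical stand-in for a lockable state manager.
type fakeLocker struct{ locked bool }

func (l *fakeLocker) Lock() error {
	if l.locked {
		return errors.New("workspace already locked")
	}
	l.locked = true
	return nil
}

func (l *fakeLocker) Unlock() error {
	if !l.locked {
		return errors.New("workspace already unlocked")
	}
	l.locked = false
	return nil
}

// remoteStyleContext mimics the remote behavior described in the PR:
// it takes the lock and, if building the context fails, releases the
// lock before returning the error.
func remoteStyleContext(l *fakeLocker, fail bool) error {
	if err := l.Lock(); err != nil {
		return err
	}
	if fail {
		_ = l.Unlock()
		return errors.New("context failed")
	}
	return nil
}

// consoleStyleCaller mimics a command that unlocks unconditionally
// after asking for a context, collecting every error it sees.
func consoleStyleCaller(l *fakeLocker, fail bool) []error {
	var errs []error
	if err := remoteStyleContext(l, fail); err != nil {
		errs = append(errs, err)
	}
	if err := l.Unlock(); err != nil {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	for _, err := range consoleStyleCaller(&fakeLocker{}, true) {
		fmt.Println("error:", err)
	}
}
```

With `fail` set to true the caller reports both the context error and the unlock error; with `fail` set to false it reports none.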

This PR aims for parity between remote and local by making the following changes:

- backend/local will unlock the state if Context() has an error, exactly as backend/remote does today
- terraform console and terraform import will exit before unlocking state in case of error in Context()
- responsibility for unlocking state in the local backend is pushed down the stack, out of backend.go and into each individual state operation
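The contract these changes aim for can be sketched as follows. This is a minimal illustration, not the PR's code: `stateLock`, `newContext` and `runCommand` are hypothetical names. The context function returns with the lock held only on success, and the caller returns early on error instead of unlocking again.

```go
package main

import (
	"errors"
	"fmt"
)

// stateLock is a hypothetical stand-in for a lockable state manager.
type stateLock struct{ held bool }

func (s *stateLock) Lock() error {
	if s.held {
		return errors.New("state already locked")
	}
	s.held = true
	return nil
}

func (s *stateLock) Unlock() error {
	if !s.held {
		return errors.New("state already unlocked")
	}
	s.held = false
	return nil
}

// newContext returns with the lock held on success. If building the
// context fails, it releases the lock itself before returning.
func newContext(s *stateLock, fail bool) error {
	if err := s.Lock(); err != nil {
		return err
	}
	if fail {
		ctxErr := errors.New("context failed")
		if unlockErr := s.Unlock(); unlockErr != nil {
			return fmt.Errorf("%w (unlock also failed: %v)", ctxErr, unlockErr)
		}
		return ctxErr
	}
	return nil
}

// runCommand mimics a caller such as console or import: on a context
// error it exits before touching the lock, because the lock has
// already been released.
func runCommand(s *stateLock, fail bool) error {
	if err := newContext(s, fail); err != nil {
		return err
	}
	// ... command work would happen here, with the lock held ...
	return s.Unlock()
}

func main() {
	for _, fail := range []bool{false, true} {
		s := &stateLock{}
		err := runCommand(s, fail)
		fmt.Printf("fail=%v err=%v held=%v\n", fail, err, s.held)
	}
}
```

In both cases the lock ends up released, and the failing case surfaces only the context error.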

My first attempt at this PR broke basically everything, and yet the tests didn't catch it. That's understandable: plenty of tests probably expected an error and continued to find one, just possibly the wrong error, or at least an extra one. So there's more testing to do, and I have yet to dig into other commands that might need some work (my main concern is the state commands, which are another set of kind-of-local-but-remote commands).

codecov · 2020-07-01T20:41:16Z

Codecov Report

Merging #25454 into master will increase coverage by 0.02%.
The diff coverage is 55.81%.

| Impacted Files | Coverage Δ | |
|---|---|---|
| backend/local/backend.go | `49.78% <ø> (+1.71%)` | ⬆️ |
| command/console.go | `41.66% <25.00%> (ø)` | |
| backend/local/backend_apply.go | `39.35% <40.00%> (+0.02%)` | ⬆️ |
| backend/local/backend_plan.go | `73.57% <40.00%> (+0.54%)` | ⬆️ |
| backend/local/backend_refresh.go | `41.50% <40.00%> (+6.09%)` | ⬆️ |
| backend/local/backend_local.go | `41.23% <60.00%> (+4.33%)` | ⬆️ |
| backend/local/testing.go | `57.89% <64.28%> (+0.89%)` | ⬆️ |
| backend/remote/backend.go | `59.20% <100.00%> (+0.83%)` | ⬆️ |
| command/import.go | `53.69% <100.00%> (ø)` | |
| terraform/node_resource_plan.go | `91.80% <0.00%> (-1.64%)` | ⬇️ |
... and 7 more

mildwonkey · 2020-07-02T11:55:50Z

👋 I know it's weird (and extra work) to request a review on a draft PR, but I could really use the extra eyes to let me know if I'm on the right track. I've got plenty of tests to work on in the meantime.

mildwonkey · 2020-07-02T14:56:05Z

backend/remote/backend.go

@@ -591,7 +591,7 @@ func (b *Remote) DeleteWorkspace(name string) error {
 }

 // StateMgr implements backend.Enhanced.
-func (b *Remote) StateMgr(name string) (state.State, error) {
+func (b *Remote) StateMgr(name string) (statemgr.Full, error) {


Sneaking in an unrelated change to start removing references to the deprecated state package:

terraform/state/state.go

Line 16 in 6824407

// State is a deprecated alias for statemgr.Full
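If the deprecated name is declared with Go's type alias syntax (an assumption based on the comment quoted above), swapping one name for the other in a signature does not change the function's type. The sketch below uses hypothetical stand-ins (`Full`, `State`, `mgr`) rather than the real statemgr package to show that the two names are interchangeable.

```go
package main

import "fmt"

// Full is a hypothetical stand-in for statemgr.Full.
type Full interface {
	Name() string
}

// State is declared as an alias (note the "="), so State and Full
// are two names for the same type rather than two distinct types.
type State = Full

// mgr is a trivial implementation used only for this illustration.
type mgr struct{}

func (mgr) Name() string { return "default" }

// oldStyle uses the deprecated name in its signature.
func oldStyle() State { return mgr{} }

// newStyle uses the preferred name in its signature.
func newStyle() Full { return mgr{} }

func main() {
	// This assignment compiles only because func() State and
	// func() Full are identical types under the alias.
	var oldSig func() State = newStyle
	fmt.Println(oldSig().Name(), oldStyle().Name())
}
```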

This adds tests to plan, apply and refresh which validate that the state is unlocked after all operations, regardless of exit status. I've also added specific tests that force Context() to fail during each operation to verify that locking behavior specifically.
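One way to check "the state is unlocked afterwards" is to probe the lock: if it can be acquired, nothing was left holding it. The following sketch illustrates that idea with hypothetical names (`lockable`, `memLock`, `checkUnlocked`, `operation`); it is not the test code added in this PR, and it uses a plain program rather than the testing package so it can run on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// lockable is a hypothetical minimal locking interface.
type lockable interface {
	Lock() error
	Unlock() error
}

// memLock is an in-memory lock used only for this illustration.
type memLock struct{ held bool }

func (m *memLock) Lock() error {
	if m.held {
		return errors.New("already locked")
	}
	m.held = true
	return nil
}

func (m *memLock) Unlock() error {
	if !m.held {
		return errors.New("already unlocked")
	}
	m.held = false
	return nil
}

// checkUnlocked probes the lock by acquiring it. Success means the
// lock was free; the probe then releases it so it leaves no trace.
func checkUnlocked(l lockable) error {
	if err := l.Lock(); err != nil {
		return fmt.Errorf("state is still locked: %w", err)
	}
	return l.Unlock()
}

// operation mimics a plan/apply/refresh-style operation that releases
// the lock on every exit path, including a forced context failure.
func operation(l lockable, failContext bool) error {
	if err := l.Lock(); err != nil {
		return err
	}
	defer l.Unlock()
	if failContext {
		return errors.New("context failed")
	}
	return nil
}

func main() {
	for _, fail := range []bool{false, true} {
		l := &memLock{}
		opErr := operation(l, fail)
		fmt.Printf("failContext=%v opErr=%v unlocked=%v\n",
			fail, opErr, checkUnlocked(l) == nil)
	}
}
```

Running the probe after both the successful and the forced-failure path covers the "regardless of exit status" requirement described above.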

mildwonkey · 2020-07-07T15:02:06Z

I'm labeling this 0.13.1 with the caveat that we may decide to hold off till the 0.14 development cycle.

pselle

To restate the goal to ensure I understand it: the backend context should return a locked state, unless there was a failure in creating that context. I think I can get behind this. That "unless" raises some flags for me, but maybe it's okay.

As far as this approach, moving the code out of one place so it can be called in many is also suspicious ... I'm guessing that moving the order of the ctx diags calls alone didn't fix the issue you're describing? (https://github.com/hashicorp/terraform/pull/25454/files#diff-2cf6324dad1cc452ed13778ff1552b43R203-R207) I suppose what I'm expressing is I have some hope for being able to solve this without the need to add the same function calls in multiple places, but perhaps it's unavoidable.

pselle · 2020-07-07T14:08:56Z

backend/local/backend_local_test.go

@@ -0,0 +1,57 @@
+package local


Sweet! New test file!

mildwonkey · 2020-07-07T20:04:27Z

> As far as this approach, moving the code out of one place so it can be called in many is also suspicious ... I'm guessing that moving the order of the ctx diags calls alone didn't fix the issue you're describing? (https://github.com/hashicorp/terraform/pull/25454/files#diff-2cf6324dad1cc452ed13778ff1552b43R203-R207) I suppose what I'm expressing is I have some hope for being able to solve this without the need to add the same function calls in multiple places, but perhaps it's unavoidable.

Yes, unfortunately that's required by this change. I could have avoided adding those calls to every local operation by continuing to return a locked state whether or not Context failed, and instead modifying the remote backend to match local's behavior. But this behavior, unlocking the state if Context returns an error, feels more correct, with the added bonus of sparing me from digging deeply into the remote backend client 🙃

mildwonkey · 2020-08-11T15:23:48Z

Fixes #24246

ghost · 2020-09-11T01:51:34Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

mildwonkey added 2 commits July 1, 2020 14:26

e437b00 snapshot commit to share
4aa1cc6 backend/local: push responsibility for unlocking state into individual operations

mildwonkey force-pushed the mildwonkey/backend-locking branch from bb91ee1 to 4aa1cc6 Compare July 1, 2020 20:36

mildwonkey requested a review from a team July 2, 2020 11:54

c4d1bb0 add tests confirming that state is not locked after apply and plan

mildwonkey marked this pull request as ready for review July 7, 2020 12:12

mildwonkey added this to the v0.13.1 milestone Jul 7, 2020

pselle requested a review from a team July 7, 2020 19:20

mildwonkey changed the title ~~[DRAFT] backend state locking~~ backend state locking Jul 7, 2020

jbardin approved these changes Jul 7, 2020

mildwonkey merged commit 86e9ba3 into master Aug 11, 2020

mildwonkey deleted the mildwonkey/backend-locking branch August 11, 2020 15:25

ghost locked and limited conversation to collaborators Sep 11, 2020
