Enable state locking for plan/apply/destroy/refresh/taint/untaint #11686

Merged
jbardin merged 18 commits into master from jbardin/state-locking on Feb 6, 2017

3 participants
@jbardin
Contributor

jbardin commented Feb 3, 2017

This enables the locking of state through the command UI.

This does change the behavior of 2 tests. Previously when running a plan with no existing state, the plan would be written out and then backed up on the next WriteState by another BackupState instance. Since we now maintain a single State instance throughout an operation, the backup happens before any state exists so no backup file is created. This shouldn't be a problem, as there really was nothing that required backing up. Now those tests will create the state file before running.

The lock/unlock terraform commands will be added in another PR. Only local state is supported so far, so the commands aren't yet required.

jbardin added some commits Feb 1, 2017

enable local state locking for apply
Have the LocalBackend lock the state during operations, and enable this
for the apply command.
Change lock reason -> info
This makes it more apparent that the information passed in isn't
required nor will it conform to any standard. There may be call sites
that can't provide good contextual info, and we don't want to count on
that value.
Create state files first for backup tests
Previously when running a plan with no existing state, the plan would be
written out and then backed up on the next WriteState by another
BackupState instance. Since we now maintain a single State instance
throughout an operation, the backup happens before any state exists so no
backup file is created.

This is OK, as the backup state the tests were checking for is from the
plan file, which already exists separate from the state.
Remove "expires" from lock info.
We are not going to handle lock expiration, at least at this time, so
remove the Expires fields to avoid any confusion.
Add separate program for locking state files
Depending on the implementation, local state locks may be reentrant
within the same process. Use a separate process to test locked state
files.
Add test for apply/refresh on locked state files
Verify that these operations fail when a state file is locked.
build the statelocker binary before running
this way we can signal it directly to make sure it exits cleanly.
Add taint/untaint tests with locked state
add missing lock-state flag to untaint

@jbardin jbardin requested a review from mitchellh Feb 3, 2017

Cleanup state file during Unlock
Close and remove the file descriptor from LocalState if we Unlock the
state. Also remove an empty state file if we created it and it was never
written to. This is mostly to clean up after tests, but doesn't hurt to
not leave empty files around.
@pchaganti


pchaganti commented Feb 4, 2017

👍

@mitchellh

Some minor changes, overall looks amazing.

@@ -22,7 +22,8 @@ type Backend interface {
 // State returns the current state for this environment. This state may
 // not be loaded locally: the proper APIs should be called on state.State
-// to load the state.
+// to load the state. If the state.State is a state.Locker, it's up to the
+// caller to call Lock and Unlock as needed.

@mitchellh

mitchellh Feb 5, 2017

Member

Thanks for updating the comment, this is the correct behavior I wanted!
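
The caller-side contract in the updated comment could look roughly like the sketch below. It assumes simplified `State` and `Locker` interfaces; the real `state.State` and `state.Locker` method sets may differ, and `memState` and `withLock` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// State and Locker are simplified stand-ins for state.State and
// state.Locker; the real interfaces may have different methods.
type State interface {
	RefreshState() error
}

type Locker interface {
	Lock(info string) error
	Unlock() error
}

// memState is a hypothetical in-memory state that also implements Locker.
type memState struct {
	locked bool
}

func (s *memState) RefreshState() error { return nil }

func (s *memState) Lock(info string) error {
	if s.locked {
		return errors.New("state already locked")
	}
	s.locked = true
	return nil
}

func (s *memState) Unlock() error {
	if !s.locked {
		return errors.New("state not locked")
	}
	s.locked = false
	return nil
}

// withLock runs fn while holding the lock if, and only if, the state
// implements Locker. The caller, not the backend, owns the lock.
func withLock(s State, info string, fn func() error) (err error) {
	if l, ok := s.(Locker); ok {
		if lockErr := l.Lock(info); lockErr != nil {
			return fmt.Errorf("locking state: %w", lockErr)
		}
		defer func() {
			// Report an unlock failure only if fn itself succeeded.
			if unlockErr := l.Unlock(); unlockErr != nil && err == nil {
				err = unlockErr
			}
		}()
	}
	return fn()
}

func main() {
	s := &memState{}
	err := withLock(s, "apply", func() error {
		// A second lock attempt fails while the first is still held.
		return withLock(s, "plan", func() error { return nil })
	})
	fmt.Println(err)
}
```

A state value that does not implement `Locker` simply runs `fn` without any locking, which matches the "as needed" wording in the comment.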

Outdated review comments on: backend/local/backend_apply.go, command/meta.go, command/apply.go
@@ -99,6 +103,10 @@ type Operation struct {
 // Input/output/control options.
 UIIn terraform.UIInput
 UIOut terraform.UIOutput
+// If LockState is true, the Operation must Lock any
+// state.Lockers for its duration, and Unlock when complete.

@mitchellh

mitchellh Feb 5, 2017

Member

Add to the comment: if using backend.Local, it is up to the caller to unlock the state.

@jbardin

jbardin Feb 6, 2017

Contributor

I don't think backend.Local is an exception here. The state is acquired and used solely within Enhanced.Operation, which is done for backend.Local too. I did note that Backend.State expects the caller to lock the state as needed, and that's called from within an Operation.

jbardin added some commits Feb 6, 2017

Update runningOp.Err with State.Unlock error
Have the deferred State.Unlock call append any error to the
RunningOperation.Err field. A local error would be rare and
self-correcting, but when backend.Local is using a remote state the
error may require user intervention.
@mitchellh

I think this is good. One thing I want to leave as a note here, though it shouldn't block this merge: we should think through the UX when locking takes a while (for whatever reason, e.g. the network).

In Otto I had created a package that was basically "do this, but if it takes longer than N (time.Duration) then show this message". I think bringing that in as a helper package here and using it for cases like this would be ideal. In the average case, state locking should be fast enough; if it's taking longer than 100ms or so, we should probably inform the user that we're trying to acquire a state lock. I could see `terraform <op>` hanging for some users and them wondering why.

@jbardin jbardin merged commit 9fbc5b1 into master Feb 6, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@jbardin jbardin deleted the jbardin/state-locking branch Feb 6, 2017
