Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable state locking for plan/apply/destroy/refresh/taint/untaint #11686

Merged
merged 18 commits into from Feb 6, 2017

Conversation

@jbardin
Copy link
Member

@jbardin jbardin commented Feb 3, 2017

This enables the locking of state through the command UI.

This does change the behavior of 2 tests. Previously when running a plan with no existing state, the plan would be written out and then backed up on the next WriteState by another BackupState instance. Since we now maintain a single State instance throughout an operation, the backup happens before any state exists so no backup file is created. This shouldn't be a problem, as there really was nothing that required backing up. Now those tests will create the state file before running.

The lock/unlock terraform commands will be added in another PR. Only local state is supported so far, so the commands aren't yet required.

jbardin added 14 commits Feb 2, 2017
Have the LocalBackend lock the state during operations, and enble this
for the apply comand.
This makes it more apparent that the information passed in isn't
required nor will it conform to any standard. There may be call sites
that can't provide good contextual info, and we don't want to count on
that value.
Previously when runnign a plan with no exitsing state, the plan would be
written out and then backed up on the next WriteState by another
BackupState instance. Since we now maintain a single State instance
thoughout an operation, the backup happens before any state exists so no
backup file is created.

This is OK, as the backup state the tests were checking for is from the
plan file, which already exists separate from the state.
We are not going to handle lock expiration, at least at this time, so
remove the Expires fields to avoid any confusion.
Depending on the implementation, local state locks may be reentrant
within the same process. Use a separate process to test locked state
files.
Verify that these operations fail when a state file is locked.
this way we can signal it directly to amke sure it exits cleanly.
add missing lock-state flag to untaint
Close and remove the file descriptor from LocalState if we Unlock the
state. Also remove an empty state file if we created it and it was never
written to. This is mostly to clean up after tests, but doesn't hurt to
not leave empty files around.
@pchaganti
Copy link

@pchaganti pchaganti commented Feb 4, 2017

👍

Copy link
Member

@mitchellh mitchellh left a comment

Some minor changes, overall looks amazing.

@@ -22,7 +22,8 @@ type Backend interface {

// State returns the current state for this environment. This state may
// not be loaded locally: the proper APIs should be called on state.State
// to load the state.
// to load the state. If the state.State is a state.Locker, it's up to the
// caller to call Lock and Unlock as needed.
Copy link
Member

@mitchellh mitchellh Feb 5, 2017

Thanks for updating the comment, this is the correct behavior I wanted!

defer func() {
if s, ok := opState.(state.Locker); op.LockState && ok {
if err := s.Unlock(); err != nil {
log.Printf("[ERROR]: %s", err)
Copy link
Member

@mitchellh mitchellh Feb 5, 2017

We should multierror append the error to runningOp.Err so that the error shows up to the end user. I would make a long const (like other error messages in the package) for it so that the user knows what to do: verify everything is okay, manually call terraform unlock

Copy link
Member Author

@jbardin jbardin Feb 6, 2017

Ah yes. I was thinking this would only apply to LocalState, where Unlock shouldn't error, and the lock is gone on exit anyway.

command/meta.go Outdated
statePath string
stateOutPath string
backupPath string
parallelism int
shadow bool
provider string
lockState bool
Copy link
Member

@mitchellh mitchellh Feb 5, 2017

Nitpick: let's name this stateLock just to match the other state-related fields above (statePath, stateOutPath)

command/apply.go Outdated
@@ -272,6 +274,8 @@ Options:
modifying. Defaults to the "-state-out" path with
".backup" extension. Set to "-" to disable backup.
-lock-state=true Lock the state file when locking is supported.
Copy link
Member

@mitchellh mitchellh Feb 5, 2017

Bike shed: Let's just use -lock. Maybe we'll lock more in the future maybe we won't but I think its clear regardless and I'd prefer the aesthetic of it.

@@ -99,6 +103,10 @@ type Operation struct {
// Input/output/control options.
UIIn terraform.UIInput
UIOut terraform.UIOutput

// If LockState is true, the Operation must Lock any
// state.Lockers for its duration, and Unlock when complete.
Copy link
Member

@mitchellh mitchellh Feb 5, 2017

Add to the comment: if using backend.Local, it is up to the caller to unlock the state.

Copy link
Member Author

@jbardin jbardin Feb 6, 2017

I don't think backend.Local is an exception here. The state is acquired and used solely within Enhanced.Operation, which is done for backend.Local too. I did note that Backend.State expects the caller to lock the state as needed, and that's called from within an Operation.

jbardin added 3 commits Feb 6, 2017
Have the defer'ed State.Unlock call append any error to the
RunningOperation.Err field. Local error would be rare and
self-correcting, but when the backend.Local is using a remote state the
error may require user intervention.
Copy link
Member

@mitchellh mitchellh left a comment

I think this is good. One thing I want to just leave as a note here but shouldn't block this merge: we should think through a UX if locking is taking awhile (for whatever reason, the network).

In Otto I had created a package that was basically "do this, but if it takes longer than N (time.Duration) then show this message". I think bringing that as a helper package here and using that for cases like this would be ideal. In the average case, state locking should be fast enough, if its taking longer than 100ms or something we should probably inform the user that we're trying to acquire a state lock. I could see some users terraform <op> hanging and being curious why.

@jbardin jbardin merged commit 9fbc5b1 into master Feb 6, 2017
1 check passed
@jbardin jbardin deleted the jbardin/state-locking branch Feb 6, 2017
@ghost
Copy link

@ghost ghost commented Apr 17, 2020

I'm going to lock this issue because it has been closed for 30 days . This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@hashicorp hashicorp locked and limited conversation to collaborators Apr 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants