raft: Fix race that leads to raft deadlock
PR #1310 ("Fix infinite election loop") solved a problem with election
loops on startup, by delaying new proposals until the leader has
committed all its existing entries. This ensures that the state machine
doesn't call ApplyStoreActions to commit a previous entry from the log
while a new proposal is in progress, since they both acquire a write
lock over the memory store, which would deadlock.
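
For intuition, here is a minimal, hypothetical Go sketch of that lock cycle (storeLock and the committed channel are illustrative stand-ins, not swarmkit's actual memory store API): the proposal path holds the store's write lock while waiting for its entry to commit, but applying the earlier entry needs the same lock.

package main

import "sync"

// storeLock stands in for the memory store's write lock.
var storeLock sync.Mutex

func main() {
	committed := make(chan struct{})

	// Proposal path: acquires the store's write lock, then waits for its
	// raft entry to be committed before releasing it.
	storeLock.Lock()

	// Apply path for an earlier log entry: needs the same write lock before
	// it can apply the entry and let later entries (the proposal) commit.
	go func() {
		storeLock.Lock()
		defer storeLock.Unlock()
		close(committed)
	}()

	<-committed // circular wait: neither side can make progress
	storeLock.Unlock()

	// Running this ends with:
	// fatal error: all goroutines are asleep - deadlock!
}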

Unfortunately, there is still a race condition which can lead to a
similar deadlock. processInternalRaftRequest makes sure that proposals
aren't started after the manager loses its status as the leader by
first registering a wait for the raft request, then checking the
leadership status. If the leadership status is lost before calling
register(), then the leadership check should fail, since it happens
afterwards. Conversely, if the leadership status is lost after calling
register(), then cancelAll() in Run() will make sure this wait gets
cancelled.

The problem with this is that the new code in PR #1310 calls cancelAll()
*before* setting the leadership status. So it's possible that first we
cancel all outstanding requests, then a new request is registered and
successfully checks that we are still the leader, then we set leader to
"false". This request never gets cancelled, so it causes a deadlock.
Nothing can be committed to the store until this request goes through,
but it can't go through if we're not the leader anymore.

To fix this, swap the order of cancelAll so it happens after we change
the leadership status variable. This means that no matter how the
goroutines are interleaved, a new request will either cancel itself or
be cancelled by Run when leadership is lost. I'm aware that this is ugly
and I'm open to suggestions for refactoring or abstracting.
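
To make the interleaving concrete, here is a minimal, self-contained Go sketch of both sides of the race. waitList, proposeValue, loseLeadership and the signalledLeadership parameter are simplified stand-ins for swarmkit's wait registry, processInternalRaftRequest and Run, not the actual implementation:

package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// waitList is a simplified stand-in for the raft wait registry.
type waitList struct {
	mu    sync.Mutex
	waits map[uint64]chan error
	next  uint64
}

func newWaitList() *waitList {
	return &waitList{waits: make(map[uint64]chan error)}
}

// register adds a cancellable wait and returns its id and result channel.
func (w *waitList) register() (uint64, chan error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.next++
	ch := make(chan error, 1)
	w.waits[w.next] = ch
	return w.next, ch
}

// cancel fails a single wait if it is still registered.
func (w *waitList) cancel(id uint64) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if ch, ok := w.waits[id]; ok {
		delete(w.waits, id)
		ch <- errors.New("wait cancelled")
	}
}

// cancelAll fails every outstanding wait; the Run loop does this on leadership loss.
func (w *waitList) cancelAll() {
	w.mu.Lock()
	defer w.mu.Unlock()
	for id, ch := range w.waits {
		delete(w.waits, id)
		ch <- errors.New("wait cancelled")
	}
}

// proposeValue mimics processInternalRaftRequest: register the wait first,
// then check leadership. If leadership was lost before register(), the check
// below fails; if it is lost after register(), cancelAll fails the wait.
func proposeValue(w *waitList, signalledLeadership *uint32) error {
	id, ch := w.register()

	if atomic.LoadUint32(signalledLeadership) != 1 {
		w.cancel(id)
		return errors.New("node lost leader status")
	}

	// ... the real code would submit the proposal to raft here ...
	return <-ch // blocks until the entry is committed or the wait is cancelled
}

// loseLeadership mirrors the fixed ordering in Run: clear the flag first, so
// any request that registers after this point cancels itself, then cancel the
// waits registered before it.
func loseLeadership(w *waitList, signalledLeadership *uint32) {
	atomic.StoreUint32(signalledLeadership, 0)
	w.cancelAll()
}

func main() {
	w := newWaitList()
	var signalledLeadership uint32 = 1

	errs := make(chan error, 1)
	go func() { errs <- proposeValue(w, &signalledLeadership) }()

	// Simulate losing leadership while a proposal is outstanding. No matter
	// how the goroutines interleave, proposeValue returns an error instead
	// of blocking forever.
	time.Sleep(10 * time.Millisecond)
	loseLeadership(w, &signalledLeadership)
	fmt.Println(<-errs)
}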

Also, out of extra caution, call cancelAll in the situation which would
lead to a deadlock if there were any outstanding raft requests.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
(cherry picked from commit 0324db7)
aaronlehmann committed Sep 21, 2016
1 parent d53622d commit a8c084c
Showing 1 changed file with 18 additions and 1 deletion: manager/state/raft/raft.go
@@ -376,11 +376,23 @@ func (n *Node) Run(ctx context.Context) error {
 if rd.SoftState != nil {
     if wasLeader && rd.SoftState.RaftState != raft.StateLeader {
         wasLeader = false
-        n.wait.cancelAll()
         if atomic.LoadUint32(&n.signalledLeadership) == 1 {
             atomic.StoreUint32(&n.signalledLeadership, 0)
             n.leadershipBroadcast.Write(IsFollower)
         }
+
+        // It is important that we set n.signalledLeadership to 0
+        // before calling n.wait.cancelAll. When a new raft
+        // request is registered, it checks n.signalledLeadership
+        // afterwards, and cancels the registration if it is 0.
+        // If cancelAll was called first, this call might run
+        // before the new request registers, but
+        // signalledLeadership would be set after the check.
+        // Setting signalledLeadership before calling cancelAll
+        // ensures that if a new request is registered during
+        // this transition, it will either be cancelled by
+        // cancelAll, or by its own check of signalledLeadership.
+        n.wait.cancelAll()
     } else if !wasLeader && rd.SoftState.RaftState == raft.StateLeader {
         wasLeader = true
     }
@@ -1242,6 +1254,11 @@ func (n *Node) processEntry(entry raftpb.Entry) error {
 // position and cancelling the transaction. Create a new
 // transaction to commit the data.

+// It should not be possible for processInternalRaftRequest
+// to be running in this situation, but out of caution we
+// cancel any current invocations to avoid a deadlock.
+n.wait.cancelAll()
+
 err := n.memoryStore.ApplyStoreActions(r.Action)
 if err != nil {
     log.G(context.Background()).Errorf("error applying actions from raft: %v", err)
