Skip to content

Commit

Permalink
ovsdb raft: Avoid unnecessary reconnecting during leader election.
Browse files Browse the repository at this point in the history
If a server claims itself as "disconnected", all clients connected
to that server will try to reconnect to a new server in the cluster.

However, currently a server would claim itself as disconnected even
when itself is the candidate and try to become the new leader (most
likely it will be), and all its clients will reconnect to another
node.

During a leader fail-over (e.g. due to a leader failure), it is
expected that all clients of the old leader will have to reconnect
to other nodes in the cluster, but it is unnecessary for all the
clients of a healthy node to reconnect, which could cause more
disturbance in a large scale environment.

This patch fixes the problem by slightly change the condition that
a server regards itself as disconnected: if its role is candidate,
it is regarded as disconnected only if the election didn't succeed
at the first attempt. Related failure test cases are also unskipped
and all passed with this patch.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
  • Loading branch information
hzhou8 authored and blp committed Apr 22, 2019
1 parent 9f39622 commit 4d9b28c
Show file tree
Hide file tree
Showing 2 changed files with 11 additions and 8 deletions.
13 changes: 11 additions & 2 deletions ovsdb/raft.c
Expand Up @@ -286,6 +286,8 @@ struct raft {

/* Candidates only. Reinitialized at start of election. */
int n_votes; /* Number of votes for me. */
bool candidate_retrying; /* The first round of election timed-out and it
is now retrying. */
};

/* All Raft structures. */
Expand Down Expand Up @@ -985,11 +987,17 @@ raft_get_sid(const struct raft *raft)
/* Returns true if 'raft' has completed joining its cluster, has not left or
* initiated leaving the cluster, does not have failed disk storage, and is
* apparently connected to the leader in a healthy way (or is itself the
* leader).*/
* leader).
*
* If 'raft' is candidate:
* a) if it is the first round of election, consider it as connected, hoping
* it will successfully elect a new leader soon.
* b) if it is already retrying, consider it as disconnected (so that clients
* may decide to reconnect to other members). */
bool
raft_is_connected(const struct raft *raft)
{
return (raft->role != RAFT_CANDIDATE
return (!(raft->role == RAFT_CANDIDATE && raft->candidate_retrying)
&& !raft->joining
&& !raft->leaving
&& !raft->left
Expand Down Expand Up @@ -1608,6 +1616,7 @@ raft_start_election(struct raft *raft, bool leadership_transfer)
}

ovs_assert(raft->role != RAFT_LEADER);
raft->candidate_retrying = (raft->role == RAFT_CANDIDATE);
raft->role = RAFT_CANDIDATE;

raft->n_votes = 0;
Expand Down
6 changes: 0 additions & 6 deletions tests/ovsdb-cluster.at
Expand Up @@ -148,8 +148,6 @@ AT_CLEANUP

AT_SETUP([OVSDB cluster - txn on follower-2, leader crash before sending execRep, follower-2 becomes leader])
AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
# XXX: fix bug before enabling this test
AT_CHECK([exit 77])
ovsdb_cluster_failure_test 2 3 1 crash-before-sending-execute-command-reply 2
AT_CLEANUP

Expand All @@ -160,8 +158,6 @@ AT_CLEANUP

AT_SETUP([OVSDB cluster - txn on follower-2, leader crash after sending execRep, follower-2 becomes leader])
AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
# XXX: fix bug before enabling this test
AT_CHECK([exit 77])
ovsdb_cluster_failure_test 2 3 1 crash-after-sending-execute-command-reply 2
AT_CLEANUP

Expand All @@ -172,8 +168,6 @@ AT_CLEANUP

AT_SETUP([OVSDB cluster - txn on leader, leader crash before sending appendReq, follower-2 becomes leader])
AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
# XXX: fix bug before enabling this test
AT_CHECK([exit 77])
ovsdb_cluster_failure_test 1 2 1 crash-before-sending-append-request 2
AT_CLEANUP

Expand Down

0 comments on commit 4d9b28c

Please sign in to comment.