ovsdb-server: Don't disconnect clients after raft install_snapshot.
When "schema" field is found in read_db(), there can be two cases:
1. There is a schema change in clustered DB and the "schema" is the new one.
2. There is a install_snapshot RPC happened, which caused log compaction on the
server and the next log is just the snapshot, which always constains "schema"
field, even though the schema hasn't been changed.

The current implementation doesn't handle case 2): it always assumes the schema
has changed and hence disconnects all clients of the server. This can cause
stability problems when a large number of clients are connected, as happens in
a large-scale environment.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
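
The fix below tells the two cases apart by comparing schema version strings, so
clients are only disconnected for a genuine schema change. A minimal sketch of
that check (the helper name schema_actually_changed is illustrative, not part
of the patch; it assumes struct ovsdb_schema's 'version' field is a string, as
declared in ovsdb.h):

    #include <stdbool.h>
    #include <string.h>
    #include "ovsdb.h"   /* struct ovsdb_schema, whose 'version' is a string. */

    /* Returns true only for a genuine schema change (case 1 above).  A schema
     * carried by a raft snapshot whose version is unchanged (case 2) returns
     * false, so existing client connections can be kept. */
    static bool
    schema_actually_changed(const struct ovsdb_schema *new_schema,
                            const struct ovsdb_schema *old_schema)
    {
        return !old_schema
               || strcmp(new_schema->version, old_schema->version) != 0;
    }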
hzhou8 authored and blp committed Mar 6, 2020
1 parent 017d89d commit 86834bf
Showing 2 changed files with 58 additions and 1 deletion.
3 changes: 2 additions & 1 deletion ovsdb/ovsdb-server.c
@@ -534,7 +534,8 @@ parse_txn(struct server_config *config, struct db *db,
           struct ovsdb_schema *schema, const struct json *txn_json,
           const struct uuid *txnid)
 {
-    if (schema) {
+    if (schema && (!db->db->schema || strcmp(schema->version,
+                                             db->db->schema->version))) {
         /* We're replacing the schema (and the data). Destroy the database
          * (first grabbing its storage), then replace it with the new schema.
          * The transaction must also include the replacement data.
56 changes: 56 additions & 0 deletions tests/ovsdb-cluster.at
@@ -273,6 +273,62 @@ OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s4 cluster/status $schema_name | grep "Ele

AT_CLEANUP


AT_BANNER([OVSDB cluster install snapshot RPC])

AT_SETUP([OVSDB cluster - install snapshot RPC])
AT_KEYWORDS([ovsdb server positive unix cluster snapshot])

n=3
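# Create a 3-server cluster from idltest.ovsschema: s1 creates the clustered
# database, then s2 and s3 join it via s1's raft address.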
schema_name=`ovsdb-tool schema-name $abs_srcdir/idltest.ovsschema`
ordinal_schema > schema
AT_CHECK([ovsdb-tool '-vPATTERN:console:%c|%p|%m' create-cluster s1.db $abs_srcdir/idltest.ovsschema unix:s1.raft], [0], [], [stderr])
cid=`ovsdb-tool db-cid s1.db`
schema_name=`ovsdb-tool schema-name $abs_srcdir/idltest.ovsschema`
for i in `seq 2 $n`; do
    AT_CHECK([ovsdb-tool join-cluster s$i.db $schema_name unix:s$i.raft unix:s1.raft])
done

on_exit 'kill `cat *.pid`'
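# Start all three servers and wait until a client can connect to each.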
for i in `seq $n`; do
    AT_CHECK([ovsdb-server -v -vconsole:off -vsyslog:off --detach --no-chdir --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i --remote=punix:s$i.ovsdb s$i.db])
done
for i in `seq $n`; do
    AT_CHECK([ovsdb_client_wait unix:s$i.ovsdb $schema_name connected])
done

# Kill one follower (s2) and write some data to the cluster, so that the follower falls behind.
printf "\ns2: stopping\n"
OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s2], [s2.pid])

AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
{"op": "insert",
"table": "simple",
"row": {"i": 1}}]]'], [0], [ignore], [ignore])

# Compact the leader online to generate a snapshot.
AT_CHECK([ovs-appctl -t "`pwd`"/s1 ovsdb-server/compact])

# Start the follower s2 again.
AT_CHECK([ovsdb-server -v -vconsole:off -vsyslog:off --detach --no-chdir --log-file=s2.log --pidfile=s2.pid --unixctl=s2 --remote=punix:s2.ovsdb s2.db])
AT_CHECK([ovsdb_client_wait unix:s2.ovsdb $schema_name connected])

# A client transaction through s2. During this transaction, there will be an
# install_snapshot RPC, because s2 detects that it is behind and s1 no longer
# has the prev_log_index requested by s2 (it has already been compacted away).
# After the install_snapshot RPC is processed, the transaction through s2
# should succeed.
AT_CHECK([ovsdb-client transact unix:s2.ovsdb '[["idltest",
{"op": "insert",
"table": "simple",
"row": {"i": 1}}]]'], [0], [ignore], [ignore])

for i in `seq $n`; do
    OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid])
done

AT_CLEANUP



OVS_START_SHELL_HELPERS
