replication: check rs uuid on subscribe process
The remote node doing the subscribe might be from a different replicaset. Before this patch the subscribe would be retried infinitely: the master couldn't find the node in _cluster, assumed it must have joined via another node, and expected its ID to arrive shortly (ER_TOO_EARLY_SUBSCRIBE). The ID would never arrive, because the node belongs to another replicaset.

The patch makes the master check whether the peer lives in the same replicaset. Since the peer is doing a subscribe, it must have joined already and should have a valid replicaset UUID, regardless of whether it is anonymous or not. The correct behaviour is to cut this peer off immediately, without retries.

Closes #6094
Part of #5613
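The decision described in the commit message can be sketched in plain Lua. This is only an illustration of the ordering of the two checks, not the actual patch: the function `check_subscribe`, the table fields (`replicaset_uuid`, `instance_uuid`, `anonymous`, `cluster`) and the three-value return convention are all hypothetical, and skipping the `_cluster` lookup for anonymous peers is an assumption.

```lua
-- Hypothetical sketch: none of these names come from the Tarantool source.
-- Returns: ok, error_code, retriable
function check_subscribe(master, peer)
    -- A subscribing peer has already joined somewhere, so it carries a
    -- replicaset UUID. A mismatch is final: retrying cannot fix it.
    if peer.replicaset_uuid ~= master.replicaset_uuid then
        return false, 'ER_REPLICASET_UUID_MISMATCH', false
    end
    -- Same replicaset, but the peer's record has not reached _cluster yet.
    -- Only here does "wait and retry" make sense.
    -- Assumption: anonymous peers are not registered, so skip the lookup.
    if not peer.anonymous and master.cluster[peer.instance_uuid] == nil then
        return false, 'ER_TOO_EARLY_SUBSCRIBE', true
    end
    return true, nil, false
end

-- Usage: a peer from a foreign replicaset is rejected without a retry hint.
local master = { replicaset_uuid = 'rs-A', cluster = { ['node-1'] = true } }
local foreign = { replicaset_uuid = 'rs-B', instance_uuid = 'node-9' }
print(check_subscribe(master, foreign))
```

Running the UUID comparison first is the point of the sketch: a foreign peer never reaches the `_cluster` lookup, so it cannot be mistaken for a peer whose registration is merely late.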
Showing 7 changed files with 121 additions and 101 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
## bugfix/replication | ||
|
||
* Fixed an error when a replica, at attempt to subscribe to a foreign cluster | ||
(with different replicaset UUID), didn't notice it is not possible, and | ||
instead was stuck in an infinite retry loop printing an error about "too | ||
early subscribe" (gh-6094). |
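Seen from the replica side, the changelog entry amounts to the difference between a transient and a fatal subscribe error. The following plain-Lua sketch illustrates that difference under stated assumptions: `subscribe_with_retry`, the `RETRIABLE` table and the `max_attempts` bound are hypothetical and exist only so the example terminates. It does not reproduce the real applier logic.

```lua
-- Hypothetical: error codes this sketch treats as transient.
RETRIABLE = { ER_TOO_EARLY_SUBSCRIBE = true }

-- try_subscribe is a caller-supplied function returning ok, error_code.
-- Returns: ok, last_error_code, attempts_made
function subscribe_with_retry(try_subscribe, max_attempts)
    local last_err = nil
    for attempt = 1, max_attempts do
        local ok, err = try_subscribe()
        if ok then
            return true, nil, attempt
        end
        last_err = err
        -- A non-retriable error ends the loop at once.
        if not RETRIABLE[err] then
            return false, err, attempt
        end
    end
    return false, last_err, max_attempts
end

-- Before the fix: a foreign master keeps answering "too early".
local function old_master() return false, 'ER_TOO_EARLY_SUBSCRIBE' end
-- After the fix: the same master reports the UUID mismatch.
local function new_master() return false, 'ER_REPLICASET_UUID_MISMATCH' end

print(subscribe_with_retry(old_master, 5))
print(subscribe_with_retry(new_master, 5))
```

With the old answer the loop uses up every attempt it is given; with the new one it stops after the first.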
68 changes: 0 additions & 68 deletions
test/replication/gh-3704-misc-replica-checks-cluster-id.result
This file was deleted.
25 changes: 0 additions & 25 deletions
test/replication/gh-3704-misc-replica-checks-cluster-id.test.lua
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
-- test-run result file version 2 | ||
test_run = require('test_run').new() | ||
| --- | ||
| ... | ||
|
||
-- | ||
-- gh-6094: master instance didn't check if the subscribed instance has the same | ||
-- replicaset UUID as its own. As a result, if the peer is from a different | ||
-- replicaset, the master couldn't find its record in _cluster, and assumed it | ||
-- simply needs to wait a bit more. This led to an infinite re-subscribe. | ||
-- | ||
box.schema.user.grant('guest', 'super') | ||
| --- | ||
| ... | ||
|
||
test_run:cmd('create server master2 with script="replication/master1.lua"') | ||
| --- | ||
| - true | ||
| ... | ||
test_run:cmd('start server master2') | ||
| --- | ||
| - true | ||
| ... | ||
test_run:switch('master2') | ||
| --- | ||
| - true | ||
| ... | ||
replication = test_run:eval('default', 'return box.cfg.listen')[1] | ||
| --- | ||
| ... | ||
box.cfg{replication = {replication}} | ||
| --- | ||
| ... | ||
assert(test_run:grep_log('master2', 'ER_REPLICASET_UUID_MISMATCH')) | ||
| --- | ||
| - ER_REPLICASET_UUID_MISMATCH | ||
| ... | ||
info = box.info | ||
| --- | ||
| ... | ||
repl_info = info.replication[1] | ||
| --- | ||
| ... | ||
assert(not repl_info.downstream and not repl_info.upstream) | ||
| --- | ||
| - true | ||
| ... | ||
assert(info.status == 'orphan') | ||
| --- | ||
| - true | ||
| ... | ||
|
||
test_run:switch('default') | ||
| --- | ||
| - true | ||
| ... | ||
test_run:cmd('stop server master2') | ||
| --- | ||
| - true | ||
| ... | ||
test_run:cmd('delete server master2') | ||
| --- | ||
| - true | ||
| ... | ||
box.schema.user.revoke('guest', 'super') | ||
| --- | ||
| ... |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
test_run = require('test_run').new() | ||
|
||
-- | ||
-- gh-6094: master instance didn't check if the subscribed instance has the same | ||
-- replicaset UUID as its own. As a result, if the peer is from a different | ||
-- replicaset, the master couldn't find its record in _cluster, and assumed it | ||
-- simply needs to wait a bit more. This led to an infinite re-subscribe. | ||
-- | ||
box.schema.user.grant('guest', 'super') | ||
|
||
test_run:cmd('create server master2 with script="replication/master1.lua"') | ||
test_run:cmd('start server master2') | ||
test_run:switch('master2') | ||
replication = test_run:eval('default', 'return box.cfg.listen')[1] | ||
box.cfg{replication = {replication}} | ||
assert(test_run:grep_log('master2', 'ER_REPLICASET_UUID_MISMATCH')) | ||
info = box.info | ||
repl_info = info.replication[1] | ||
assert(not repl_info.downstream and not repl_info.upstream) | ||
assert(info.status == 'orphan') | ||
|
||
test_run:switch('default') | ||
test_run:cmd('stop server master2') | ||
test_run:cmd('delete server master2') | ||
box.schema.user.revoke('guest', 'super') |