box.ctl.promote() #3055
Comments
box.cfg { read_only = true } ?
And what does that have to do with minimizing the time spent idling in r/o?
I have one for 1.6 as well.
* Mons <notifications@github.com> [18/01/24 19:53]:
> I have one for 1.6 as well.
Add it to the ticket.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.org - www.twitter.com/kostja_osipov
Basic algorithm
box.ctl.promote() is a function that makes the current replica the master of a
Various problems
Obviously, there are no problems if:
Assume the old master goes down while it is syncing in read-only mode. In such a
Assume the replica that called box.ctl.promote goes down when the old master already
Implementation
The host on which box.ctl.promote is called must communicate with the replica set
Consider how the communication can be done via replication connections. At first,
struct iproto_ctl_op {
	enum iproto_ctl_type type;
	/* Apply the operation; named do_op because `do` is a reserved word in C. */
	int (*do_op)(struct iproto_ctl_op *op);
	/* Undo a previously applied operation. */
	void (*rollback)(struct iproto_ctl_op *op);
	/* Release the operation's resources. */
	void (*destroy)(struct iproto_ctl_op *op);
};
A sequence of these operations is applied one by one via
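The operation structure above suggests a simple driver loop. The following is a minimal sketch, not the implementation from this ticket: it assumes that a callback returning nonzero means failure, that already applied operations are rolled back in reverse order, and that every operation is destroyed at the end. The `do` callback is spelled `do_op` because `do` is a reserved word in C. The enum values, `ctl_ops_apply`, and all `demo_*` names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation kinds; the original enum is not shown. */
enum iproto_ctl_type { CTL_DEMO_A, CTL_DEMO_B, CTL_DEMO_C };

struct iproto_ctl_op {
	enum iproto_ctl_type type;
	int (*do_op)(struct iproto_ctl_op *op);
	void (*rollback)(struct iproto_ctl_op *op);
	void (*destroy)(struct iproto_ctl_op *op);
};

/*
 * Apply ops[0..count) in order. On the first failure, roll back the
 * operations applied so far in reverse order. All operations are
 * destroyed before returning. Returns 0 on success, -1 on failure.
 */
static int
ctl_ops_apply(struct iproto_ctl_op **ops, size_t count)
{
	assert(ops != NULL || count == 0);
	int rc = 0;
	size_t done = 0;
	for (; done < count; done++) {
		if (ops[done]->do_op(ops[done]) != 0) {
			rc = -1;
			break;
		}
	}
	if (rc != 0) {
		while (done > 0) {
			done--;
			ops[done]->rollback(ops[done]);
		}
	}
	for (size_t i = 0; i < count; i++)
		ops[i]->destroy(ops[i]);
	return rc;
}

/* Demo callbacks: count calls and fail on one chosen operation type. */
static int demo_applied, demo_rolled_back, demo_destroyed;
static int demo_fail_type = -1;

static int
demo_do(struct iproto_ctl_op *op)
{
	if ((int)op->type == demo_fail_type)
		return -1;
	demo_applied++;
	return 0;
}

static void
demo_rollback(struct iproto_ctl_op *op)
{
	(void)op;
	demo_rolled_back++;
}

static void
demo_destroy(struct iproto_ctl_op *op)
{
	(void)op;
	demo_destroyed++;
}

/* Run three demo operations; fail_type < 0 means no failure. */
static int
demo_run(int fail_type)
{
	struct iproto_ctl_op a = {CTL_DEMO_A, demo_do, demo_rollback, demo_destroy};
	struct iproto_ctl_op b = {CTL_DEMO_B, demo_do, demo_rollback, demo_destroy};
	struct iproto_ctl_op c = {CTL_DEMO_C, demo_do, demo_rollback, demo_destroy};
	struct iproto_ctl_op *ops[] = {&a, &b, &c};
	demo_applied = demo_rolled_back = demo_destroyed = 0;
	demo_fail_type = fail_type;
	return ctl_ops_apply(ops, 3);
}
```

Calling `demo_run(CTL_DEMO_C)` makes the third operation fail, so the two operations applied before it are rolled back and all three are destroyed. Whether the real protocol can roll back every step this way is not stated in the text above.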
Final protocol of promotion:
Final box.ctl.promote API:
The ticket is mine.
Updated:
box/net.box_reconnect_after_gh-3164.test.lua gh-5081
replication/errinj.test.lua gh-3870
replication/qsync_basic.test.lua gh-5355
replication/anon.test.lua gh-5381
replication/status.test.lua gh-5409
replication/election_qsync.test.lua gh-5430
Added new:
box-py/iproto.test.py gh-qa-132
replication/gh-5435-qsync-clear-synchro-queue-co> gh-qa-129
replication/gh-5445-leader-inconsistency.test.lua gh-qa-129
replication/gh-3055-election-promote.test.lua gh-qa-127
replication/election_basic.test.lua gh-qa-133
Updated:
box/net.box_reconnect_after_gh-3164.test.lua gh-5081
replication/errinj.test.lua gh-3870
replication/qsync_basic.test.lua gh-5355
replication/anon.test.lua gh-5381
replication/status.test.lua gh-5409
replication/election_qsync.test.lua gh-5430
Added new:
box-py/iproto.test.py gh-qa-132
replication/gh-5435-qsync-clear-synchro-queue-co> gh-qa-129
replication/gh-5445-leader-inconsistency.test.lua gh-qa-129
replication/gh-3055-election-promote.test.lua gh-qa-127
replication/election_basic.test.lua gh-qa-133
(cherry picked from commit 75193e5)
Found the following error in our CI:
[001] Test failed! Result content mismatch:
[001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021
[001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021
[001] @@ -88,7 +88,7 @@
[001] | ...
[001] assert(not box.info.ro)
[001] | ---
[001] - | - true
[001] + | - error: assertion failed!
[001] | ...
[001] assert(box.info.election.term > term)
[001] | ---
[001]
The problem was the same as in recently fixed election_qsync.test (commit 096a0a7): PROMOTE is written to WAL asynchronously, and box.ctl.promote() returns earlier than this happens. Fix the issue by waiting for the instance to become writeable.
Follow-up #6034
Found the following error in our CI:
[001] Test failed! Result content mismatch:
[001] --- replication/gh-3055-election-promote.result Mon Aug 2 17:52:55 2021
[001] +++ var/rejects/replication/gh-3055-election-promote.reject Mon Aug 9 10:29:34 2021
[001] @@ -88,7 +88,7 @@
[001] | ...
[001] assert(not box.info.ro)
[001] | ---
[001] - | - true
[001] + | - error: assertion failed!
[001] | ...
[001] assert(box.info.election.term > term)
[001] | ---
[001]
The problem was the same as in recently fixed election_qsync.test (commit 096a0a7): PROMOTE is written to WAL asynchronously, and box.ctl.promote() returns earlier than this happens. Fix the issue by waiting for the instance to become writeable.
Follow-up #6034
(cherry picked from commit 1df9960)
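The fix described in the commit message, waiting for the instance to become writeable instead of asserting right after box.ctl.promote() returns, is a bounded polling loop. The sketch below shows only that pattern in plain C and does not use any Tarantool API: `wait_cond`, `struct fake_instance`, and `fake_is_writable` are hypothetical names, and the fake instance merely simulates a PROMOTE record reaching the WAL a few polls after the call returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Poll cond(arg) up to max_polls times and return true as soon as it
 * holds. The optional pause hook is where a real program would sleep
 * or yield between polls. Returns false if the condition never held.
 */
static bool
wait_cond(bool (*cond)(void *arg), void *arg, int max_polls,
	  void (*pause)(void))
{
	assert(cond != NULL);
	for (int i = 0; i < max_polls; i++) {
		if (cond(arg))
			return true;
		if (pause != NULL)
			pause();
	}
	return false;
}

/* Stand-in for an instance that becomes writeable after some polls. */
struct fake_instance {
	int polls_until_writable;
};

/* Plays the role of the `not box.info.ro` check from the test. */
static bool
fake_is_writable(void *arg)
{
	struct fake_instance *inst = arg;
	if (inst->polls_until_writable > 0) {
		inst->polls_until_writable--;
		return false;
	}
	return true;
}
```

With this shape, the test asserts on the result of `wait_cond(...)` rather than on the state observed immediately after the promote call, which removes the race with the asynchronous WAL write.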
Implement a built-in call which promotes a replica to a master in a replica set.
What it should do:
Open issues:
Now that we have before_replace triggers, we essentially have logical replication available as a vehicle for message passing between master and slave. Perhaps we should allow IPROTO_NOP in read-only mode, so that message passing can work in both directions: from the read-write master to the read-only one, and vice versa.
How we can achieve correctness of the algorithm without persisting its state:
More issues to consider: