Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raft: new leader should commit synchronous transactions from the previous leader #5339

Closed
Gerold103 opened this issue Sep 26, 2020 · 0 comments
Assignees
Labels

Comments

@Gerold103
Copy link
Collaborator

When a new leader is elected, it should try to commit the pending synchronous transactions created by the previous leader.

@Gerold103 Gerold103 added feature A new functionality replication qsync replication raft RAFT protocol labels Sep 26, 2020
@Gerold103 Gerold103 self-assigned this Oct 13, 2020
Gerold103 added a commit that referenced this issue Oct 13, 2020
The test is long, about 10 seconds. But its name is too general.
And it would be better used for a simpler more basic test. This is
going to happen in the next commits.

election_qsync.test.lua will check if the election and qsync work
fine together without any stress cases.

Needed for #5339
Gerold103 added a commit that referenced this issue Oct 13, 2020
Raft has a worker fiber to perform async tasks such as WAL write,
state broadcast.

The worker was created and woken up from 2 places, leading at
least to code duplication. The patch wraps it into a new function
raft_worker_wakeup(), and uses it.

The patch is not need for anything functional, but was created
while working on #5339 and trying ideas. The patch seems to be
good refactoring making the code simpler, and therefore it is
submitted.
Gerold103 added a commit that referenced this issue Oct 13, 2020
When a raft node was configured to be a candidate via
election_mode, it didn't do anything if there was an active
leader.

But it should have started monitoring its health in order to
initiate a new election round when it dies.

The patch fixes this bug. It does not contain a test, because will
be covered by a test for #5339.

Needed for #5339
Gerold103 added a commit that referenced this issue Oct 13, 2020
Raft state machine now has a trigger invoked each time when any of
the visible Raft attributes is changed: state, term, vote.

The trigger is needed to commit synchronous transactions of an old
leader, when a new leader is elected. This is done via a trigger
so as not to depend on box in raft code too much. That would make
it harder to extract it into a new module later.

The trigger is executed in the Raft worker fiber, so as not to
stop the state machine transitions anywhere, which currently don't
contain a single yield. And the synchronous transaction queue
clearance requires a yield, to write CONFIRM and ROLLBACK records
to WAL.

Part of #5339
Gerold103 added a commit that referenced this issue Oct 13, 2020
According to Raft, when a new leader is elected, it should finish
transactions of the old leader. In Raft this is done via adding a
new transaction originated from the new leader.

In case of Tarantool this can be done without a new transaction
due to WAL format specifics, and the function doing it is called
box_clear_synchro_queue().

Before the patch, when a node was elected as a leader, it didn't
finish the pending transactions. The queue clearance was expected
to be done by a user. There was no any issue with that, just
technical debt. The patch fixes it.

Now when a node becomes a leader, it finished synchronous
transactions of the old leader. This is done a bit differently
than in the public box.ctl.clear_synchro_queue().

The public box.ctl.clear_synchro_queue() tries to wait for CONFIRM
messages, which may be late. For replication_synchro_timeout * 2
time.

But when a new leader is elected, the leader will ignore all rows
from all the other nodes, as it thinks it is the only source of
truth. Therefore it makes no sense to wait for CONFIRMs here, and
the waiting is omitted.

Closes #5339
Gerold103 added a commit that referenced this issue Oct 14, 2020
The test is long, about 10 seconds. But its name is too general.
And it would be better used for a simpler more basic test. This is
going to happen in the next commits.

election_qsync.test.lua will check if the election and qsync work
fine together without any stress cases.

Needed for #5339
Gerold103 added a commit that referenced this issue Oct 14, 2020
Raft has a worker fiber to perform async tasks such as WAL write,
state broadcast.

The worker was created and woken up from 2 places, leading at
least to code duplication. The patch wraps it into a new function
raft_worker_wakeup(), and uses it.

The patch is not need for anything functional, but was created
while working on #5339 and trying ideas. The patch seems to be
good refactoring making the code simpler, and therefore it is
submitted.
Gerold103 added a commit that referenced this issue Oct 14, 2020
When a raft node was configured to be a candidate via
election_mode, it didn't do anything if there was an active
leader.

But it should have started monitoring its health in order to
initiate a new election round when it dies.

The patch fixes this bug. It does not contain a test, because will
be covered by a test for #5339.

Needed for #5339
Gerold103 added a commit that referenced this issue Oct 14, 2020
Raft state machine now has a trigger invoked each time when any of
the visible Raft attributes is changed: state, term, vote.

The trigger is needed to commit synchronous transactions of an old
leader, when a new leader is elected. This is done via a trigger
so as not to depend on box in raft code too much. That would make
it harder to extract it into a new module later.

The trigger is executed in the Raft worker fiber, so as not to
stop the state machine transitions anywhere, which currently don't
contain a single yield. And the synchronous transaction queue
clearance requires a yield, to write CONFIRM and ROLLBACK records
to WAL.

Part of #5339
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant