RFC: responsive leadership transfer #144

Open
BusyJay opened this issue Nov 22, 2018 · 4 comments
Labels
Feature Related to a major feature. Request for Comment A proposal to be considered. Analogous to an RFC in TiKV/Rust.

Comments

@BusyJay
Member

BusyJay commented Nov 22, 2018

Summary

This RFC proposes tagging each leadership transfer with an index, and having a follower send a response back to the leader if it fails to start an election.

Motivation

When a leader transfers leadership to a follower, the follower may or may not start an election. The leader cannot know precisely what happened, so it stops serving reads and writes, waits for an election timeout, and then tries to retain leadership if no one campaigns. We have observed unexpectedly high latency when a leadership transfer fails. Note that the failure does not have to be caused by the network; it can also be caused by slow application of logs. For example, a newly promoted voter may not start a campaign if the conf change has not yet been applied locally.

Detailed design

We can introduce an index to trace every leadership transfer. Every time a leadership transfer happens, the index increases by 1. The index is also sent with the transfer command. If a follower checks its own state and decides not to campaign, it should send back a TransferLeaderResponse to tell the leader its decision. When the leader finds that a rejected response's index matches its own latest transfer index, it aborts the leadership transfer immediately.
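The flow above could look roughly like the following. This is only a minimal sketch to make the proposal concrete: `Leader`, `TransferLeader`, `handle_transfer`, and the field names are hypothetical and do not exist in the current code base; only the `TransferLeaderResponse` name comes from this RFC.

```rust
// Hypothetical sketch of the proposed protocol. All types and functions here
// are illustrative assumptions, not existing raft-rs APIs.

/// Transfer command sent by the leader, carrying the transfer index.
#[derive(Debug, Clone, PartialEq)]
pub struct TransferLeader {
    pub to: u64,
    pub transfer_index: u64,
}

/// Reply sent by a follower that decided not to campaign.
#[derive(Debug, Clone, PartialEq)]
pub struct TransferLeaderResponse {
    pub from: u64,
    pub transfer_index: u64,
    pub reject: bool,
}

/// Leader-side state needed to trace transfers.
#[derive(Debug, Default)]
pub struct Leader {
    transfer_index: u64,
    transferee: Option<u64>,
}

impl Leader {
    /// Starts a transfer: bump the index and attach it to the command.
    pub fn transfer_leader(&mut self, to: u64) -> TransferLeader {
        self.transfer_index += 1;
        self.transferee = Some(to);
        TransferLeader { to, transfer_index: self.transfer_index }
    }

    /// Aborts the pending transfer if the rejection matches the latest
    /// transfer index. Returns true when the transfer was aborted.
    pub fn handle_response(&mut self, resp: &TransferLeaderResponse) -> bool {
        let matches_latest = self.transferee.is_some()
            && resp.reject
            && resp.transfer_index == self.transfer_index;
        if matches_latest {
            // The leader can resume serving reads and writes right away
            // instead of waiting for an election timeout.
            self.transferee = None;
        }
        matches_latest
    }

    pub fn is_transferring(&self) -> bool {
        self.transferee.is_some()
    }
}

/// Follower side: reply only when the follower will not campaign.
/// `can_campaign` stands in for whatever local state check the follower does.
pub fn handle_transfer(
    id: u64,
    can_campaign: bool,
    msg: &TransferLeader,
) -> Option<TransferLeaderResponse> {
    if can_campaign {
        None
    } else {
        Some(TransferLeaderResponse {
            from: id,
            transfer_index: msg.transfer_index,
            reject: true,
        })
    }
}
```

In this sketch a rejection that carries an older index is ignored, so a delayed response to an earlier transfer cannot cancel a newer one.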

Unresolved questions

What if the transfer command is dropped due to a network failure? It may be hard to handle every situation, but the mechanism should at least work as expected when the infrastructure works as expected.

@BusyJay BusyJay added Feature Related to a major feature. Request for Comment A proposal to be considered. Analogous to an RFC in TiKV/Rust. labels Nov 22, 2018
@BusyJay
Member Author

BusyJay commented Nov 22, 2018

/cc @siddontang @xiang90 Any thoughts?

@BusyJay
Member Author

BusyJay commented Nov 22, 2018

The index may not be easy to maintain; using a local timestamp can be a workaround.

@BusyJay
Member Author

BusyJay commented Nov 23, 2018

One difference between the current implementation and the thesis is that raft here switches to a new configuration only after the conf change is committed and applied. However, the thesis suggests always using the latest available configuration. In that case, the problem described here should not exist, as TimeoutNow is sent only after the follower's log is up to date.

@BusyJay
Member Author

BusyJay commented May 9, 2019

If #234 is solved, the index is no longer necessary. The only thing needed is to send a response back to the leader.
