Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft store: follower replication for write-flow #13533

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

LintianShi
Copy link
Contributor

@LintianShi LintianShi commented Sep 26, 2022

What is changed and how it works?

Issue Number: Close #13269

What's Changed:

With follower replication, the leader peer can only replicate log entries to one follower (called agent) in each availability zone, and the agent will help the leader forward log entries to other followers or learners in this availability zone.
Hence, TiDB will reduce considerable (about 40% in write scenario) cross-AZ traffic which reduces the cost of TiDB cloud deployment.

In implementation, the leader peer merges MsgAppends which are sent to the same availability zone when building raft messages.
After merging, the leader only sends one copy of log entries and several forward information including the range of log entries and target peer id. This messages is called MsgGroupBroadcast.
When the agent receives the message from the leader, it will first append log entries to its raft log. After appending, it will fetch log entries from its raft log according to the range recorded in MsgGroupBroadcast, and send MsgAppend to the target peer.

Main changes on code:
1. Record and maintain availability zone information in `PollContext`.
2. Merge MsgAppend into MsgGroupBroadcast in `fn build_raft_messages`.

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Manual test (end-to-end test)
  • No code

Side effects

  • Performance regression
    • A little regression on TPS for write scenarios (about 1 ~ 2%).

Release note

This feature introduces a new mechanism in Raft algorithm which allows a follower to replicate log entries to other followers and learners. This feature is mainly aimed to reduce cross-AZ (short for Availability Zones) traffic in data replication between TiKV instances and TiFlash instances.

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note contribution Type: PR - From contributors size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 26, 2022
@LintianShi
Copy link
Contributor Author

The changes on the raft-rs side is here tikv/raft-rs#485.

@LintianShi LintianShi force-pushed the dev-follower-replication branch 2 times, most recently from 17fd9a5 to 567ba83 Compare September 29, 2022 06:05
LintianShi added 2 commits October 8, 2022 23:13
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
@LintianShi LintianShi force-pushed the dev-follower-replication branch 2 times, most recently from 80968da to 4781751 Compare October 13, 2022 14:41
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 13, 2022
}
}
} else {
for msg in msgs {
Copy link
Contributor

@tonyxuqqi tonyxuqqi Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: maybe it's better to put else to if as it's the more common code path. Just for better readability.

if let Some(to_peer_zone) = ctx.zone_info.get(&to_peer_store_id) && to_peer_zone != leader_zone
{
// If batch_append is disabled, there may be several MsgAppend to a same peer in msgs.
// The agent should not forward more than one MsgAppend to a single peer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// The agent should not forward more than one MsgAppend to a single peer.

Could you elaborate the reason?

raft_msgs.push(m);
let leader_store_id = self.peer.get_store_id();

if self.raft_group.raft.follower_replication() && self.is_leader() && let Some(leader_zone) = ctx.zone_info.get(&leader_store_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you summarize the algorithm of the following code block?

LintianShi added 5 commits October 14, 2022 13:35
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
@ti-chi-bot
Copy link
Member

@LintianShi: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 6, 2022
Signed-off-by: LintianShi <lintian.shi@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution Type: PR - From contributors needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

raftstore: follower replication for write-flow
3 participants