Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raftstore-v2: implement local read for raftstore-v2 #13375

Merged
merged 48 commits into from Oct 11, 2022

Conversation

SpadeA-Tang
Copy link
Member

@SpadeA-Tang SpadeA-Tang commented Aug 30, 2022

What is changed and how it works?

Issue Number: Ref #12842

What's Changed:

This PR involves:

  • Refactor LocalReader of raftstore-v1 to make some code sharable with raftstore-v2
  • Implement LocalReader of raftstore-v2 and add relevant tests
  • Replace callback with ReadCallback trait

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test

Release note

None

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Aug 30, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • 5kbpers
  • tonyxuqqi

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
@@ -284,7 +284,7 @@ impl ServerCluster {
}
}

let local_reader = LocalReader::new(
let local_reader = LocalReaderCore::new(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use Core?

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
@SpadeA-Tang SpadeA-Tang changed the title WIP: LocalReader refactor and use ReadCallback in both v1 and v2 raftstore-v2: implement LocalReader for raftstore-v2 Sep 13, 2022
@@ -8,3 +8,5 @@ mod ready;
pub use command::{SimpleWriteDecoder, SimpleWriteEncoder};
pub use life::DestroyProgress;
pub use ready::AsyncWriter;

pub(crate) use self::query::LocalReader;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it could be used by other mods, using pub would be better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's supposed not to be used by other module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In v1's code, it's called in components/server as well.

@ti-chi-bot ti-chi-bot added status/LGT1 Status: PR - There is already 1 approval needs-rebase labels Sep 28, 2022
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
@SpadeA-Tang
Copy link
Member Author

/test

let (msg, sub) = PeerMsg::raft_query(req.clone());
let mut err = errorpb::Error::default();
match MsgRouter::send(&self.router, region_id, msg) {
Ok(()) => return Ok(sub.result().await),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can clone and cache sub here.
So when sub exists, we don't need to issue the MsgRouter::send again for later requests that are before sub.result().await return . Just wait on sub.result().await. And once await returns, clear the cache.
This is to dedup unnecessary renew lease requests to raftstore.
With this, it can even remove the read index amend logic in raftstore v2. Because redundant read index messages will be greatly reduced or in fact completely removed. @BusyJay What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is possible, I think we can propose another PR to implement it and see the improvements.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadIndex amend works for requests across threads.

let region_id = req.header.get_ref().region_id;
TLS_LOCAL_READ_METRICS.with(|m| m.borrow_mut().renew_lease_advance.inc());
// Send a read query which may renew the lease
let (msg, sub) = PeerMsg::raft_query(req.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. If we cache the sub after send successfully, it can reduce the unnecessary renew lease requests.

@@ -8,3 +8,5 @@ mod ready;
pub use command::{SimpleWriteDecoder, SimpleWriteEncoder};
pub use life::DestroyProgress;
pub use ready::AsyncWriter;

pub(crate) use self::query::LocalReader;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In v1's code, it's called in components/server as well.

@ti-chi-bot ti-chi-bot added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels Oct 11, 2022
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
@5kbpers
Copy link
Member

5kbpers commented Oct 11, 2022

/merge

@ti-chi-bot
Copy link
Member

@5kbpers: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 0ee256b

@ti-chi-bot ti-chi-bot added the status/can-merge Status: Can merge to base branch label Oct 11, 2022
@SpadeA-Tang
Copy link
Member Author

/test

@ti-chi-bot ti-chi-bot merged commit 94c8aa5 into tikv:master Oct 11, 2022
@ti-chi-bot ti-chi-bot added this to the Pool milestone Oct 11, 2022
@BusyJay
Copy link
Member

BusyJay commented Oct 11, 2022

Why merge this? I think it depends on #13568

@SpadeA-Tang
Copy link
Member Author

Why merge this? I think it depends on #13568

The circumstance here is much easier. It only has snap oepration and does not have cache.

@BusyJay
Copy link
Member

BusyJay commented Oct 11, 2022

The circumstance here is much easier. It only has snap oepration and does not have cache.

It doesn't have cross snapshot cache, but it may has cache within the same region. And this PR also lacks the check of last_valid_ts.

impl<'r> SnapRequestInspector<'r> {
fn inspect(&mut self, req: &RaftCmdRequest) -> Result<RequestPolicy> {
assert!(!req.has_admin_request());
if req.get_requests().len() != 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, maybe we don't need RaftCmdRequest at all.

self.router.send_raft_message(msg)
}

pub async fn get_snapshot(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub async fn get_snapshot(
pub async fn snapshot(

return Ok(RequestPolicy::ReadIndex);
}

// If applied index's term is differ from current raft's term, leader transfer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If applied index's term is differ from current raft's term, leader transfer
// If applied index's term is different from current raft's term, leader transfer

}
}

fn has_applied_to_current_term(&mut self) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better inline these functions.

let (msg, sub) = PeerMsg::raft_query(req.clone());
let mut err = errorpb::Error::default();
match MsgRouter::send(&self.router, region_id, msg) {
Ok(()) => return Ok(sub.result().await),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadIndex amend works for requests across threads.

@SpadeA-Tang
Copy link
Member Author

The circumstance here is much easier. It only has snap oepration and does not have cache.

It doesn't have cross snapshot cache, but it may has cache within the same region. And this PR also lacks the check of last_valid_ts.

Checking last_valid_ts means we have to acquire the system time twice? One before getting the snapshot and one after it.

@BusyJay
Copy link
Member

BusyJay commented Oct 11, 2022

Checking last_valid_ts means we have to acquire the system time twice? One before getting the snapshot and one after it.

Yes, for V1. For v2, only check if latest tablet is None is sufficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution Type: PR - From contributors release-note-none size/XXL status/can-merge Status: Can merge to base branch status/LGT2 Status: PR - There are already 2 approvals
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants