Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raftstore: ignore error when sending messages #4734

Merged
merged 11 commits into from May 29, 2019

Conversation

BusyJay
Copy link
Member

@BusyJay BusyJay commented May 20, 2019

What have you changed? (mandatory)

Fix #4717, I'm still trying to write a test case to reproduce it reliably.

What are the type of the changes? (mandatory)

  • Bug fix (change which fixes an issue)

How has this PR been tested? (mandatory)

manual tests

Does this PR affect documentation (docs) or release note? (mandatory)

No.

Does this PR affect tidb-ansible update? (mandatory)

No.

Refer to a related PR or issue link (optional)

#4717

Signed-off-by: Jay Lee <busyjaylee@gmail.com>
@BusyJay BusyJay added the type/bugfix Type: PR - Fix a bug label May 20, 2019
@zhangjinpeng87
Copy link
Member

/release

@sre-bot
Copy link
Contributor

sre-bot commented May 20, 2019

zhangjinpeng87
zhangjinpeng87 previously approved these changes May 20, 2019
Copy link
Member

@zhangjinpeng87 zhangjinpeng87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

overvenus
overvenus previously approved these changes May 20, 2019
@siddontang
Copy link
Contributor

@zhangjinpeng1987

have you ensured using this can avoid 1s latency jitter now?

@overvenus overvenus dismissed their stale review May 21, 2019 00:46

Please update the change log

Signed-off-by: Jay Lee <busyjaylee@gmail.com>
@BusyJay
Copy link
Member Author

BusyJay commented May 21, 2019

/release

@sre-bot
Copy link
Contributor

sre-bot commented May 21, 2019

@@ -31,10 +32,27 @@ pub fn initial_logger(config: &TiKvConfig) {
let log_rotation_timespan =
chrono::Duration::from_std(config.log_rotation_timespan.clone().into())
.expect("config.log_rotation_timespan is an invalid duration.");

// Collects following targets.
const ENABLED_TARGETS: &[&str] = &[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, seem now we have not filtered any message, right? @BusyJay

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. And it's expected not to filter any messages. Here is just for debug purpose, will remove later.

Signed-off-by: Jay Lee <busyjaylee@gmail.com>
@BusyJay
Copy link
Member Author

BusyJay commented May 24, 2019

/release

@sre-bot
Copy link
Contributor

sre-bot commented May 24, 2019

@@ -838,6 +837,15 @@ impl Peer {
"peer_id" => self.peer.get_id(),
"lease" => ?self.leader_lease,
);
// If predecessor read index during transferring leader and received quorum's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL @dcalvin

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If predecessor read index during transferring leader and received quorum's
// If the predecessor read index receives quorum's heart response during

@zhangjinpeng87
Copy link
Member

/release

@sre-bot
Copy link
Contributor

sre-bot commented May 26, 2019

@@ -911,6 +910,15 @@ impl Peer {
"peer_id" => self.peer.get_id(),
"lease" => ?self.leader_lease,
);
// If predecessor read index during transferring leader and received quorum's
// heartbeat response, it may waiting for advancing apply to current term to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may wait

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waiting for advancing apply to current term to apply the read.
What does "advancing apply to current term" mean? @BusyJay

// TODO: Maybe the predecessor should just drop all the read requests directly?
// All the requests need to be redirected in the end anyway and executing
// prewrites or commits will be just a waste.
self.last_urgent_proposal_idx = self.raft_group.raft.raft_log.last_index();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any test to cover it?

Signed-off-by: Jay Lee <busyjaylee@gmail.com>
Signed-off-by: Jay Lee <busyjaylee@gmail.com>
Signed-off-by: Jay Lee <busyjaylee@gmail.com>
Signed-off-by: Jay Lee <busyjaylee@gmail.com>
@BusyJay
Copy link
Member Author

BusyJay commented May 28, 2019

PTAL

if filtered_msg.is_none() {
*filtered_msg = Some(msg);
}
} else if msg.get_message().get_msg_type() == self.flush_type {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't not need to break here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following messages may still apply the rules, it may not be appropriate to break as a general filter.

@@ -2123,6 +2107,10 @@ impl Peer {
}

pub fn get_peer_from_cache(&self, peer_id: u64) -> Option<metapb::Peer> {
if peer_id == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is no leader, peer_id may be 0 as it represents leader id. Add a check to skip slow path.

@hicqu
Copy link
Contributor

hicqu commented May 28, 2019

LGTM.

@BusyJay
Copy link
Member Author

BusyJay commented May 29, 2019

/run-integration-tests

@BusyJay BusyJay merged commit 0c57e3c into tikv:master May 29, 2019
@BusyJay BusyJay deleted the ignore-sending-error branch May 29, 2019 05:15
hicqu pushed a commit to hicqu/tikv that referenced this pull request Jun 10, 2019
Signed-off-by: qupeng <qupeng@pingcap.com>
BusyJay added a commit to BusyJay/tikv that referenced this pull request Jun 13, 2019
BusyJay added a commit to BusyJay/tikv that referenced this pull request Jun 13, 2019
Signed-off-by: Jay <BusyJay@users.noreply.github.com>
BusyJay added a commit that referenced this pull request Jun 14, 2019
* raftstore: ignore error when sending messages (#4734)
* raftstore: skip empty callback (#4682)

Signed-off-by: Jay Lee <busyjaylee@gmail.com>
sticnarf pushed a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bugfix Type: PR - Fix a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sending messages should be more tolerant
7 participants