Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raftstore: do not update pending_cross_snap when peer is applying snapshot #3873

Merged
merged 15 commits into from Dec 18, 2018

Conversation

Projects
None yet
3 participants
@Connor1996
Copy link
Member

Connor1996 commented Dec 4, 2018

What have you changed? (mandatory)

In some extreme case, it may happen that a new snapshot is received whereas a snapshot is still in applying. If the snapshot under applying is generated before merge and the new snapshot is generated after merge, update pending_cross_snap may cause source peer destroys itself improperly and then when target peer is ready to merge and find a destroyed source peer which causes panic. So don't update pending_cross_snap in check_snapshot() when the peer is applying snapshot.

What are the type of the changes? (mandatory)

  • Bug fix (non-breaking change which fixes an issue)

How has this PR been tested? (mandatory)

unit-test

Does this PR affect documentation (docs/docs-cn) update? (mandatory)

No

Does this PR affect tidb-ansible update? (mandatory)

No

fix check snapshot
Signed-off-by: Connor1996 <zbk602423539@gmail.com>

@Connor1996 Connor1996 requested a review from BusyJay Dec 4, 2018

@Connor1996 Connor1996 added the C: Raft label Dec 4, 2018

Connor1996 added some commits Dec 4, 2018

fix format
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
fix clippy
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
add some comment
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
fix format
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Show resolved Hide resolved src/raftstore/store/fsm/peer.rs Outdated
Show resolved Hide resolved src/raftstore/store/fsm/peer.rs Outdated
Show resolved Hide resolved src/raftstore/store/fsm/peer.rs Outdated
Show resolved Hide resolved src/raftstore/store/fsm/peer.rs Outdated
address comment
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@hicqu

This comment has been minimized.

Copy link
Contributor

hicqu commented Dec 10, 2018

The test case can still pass after removing the bug fix code, but keep the new failpoint. It's a great catch for finding the bug, and I think the fix is correct. Please improve the test case, maybe we need more failpoints to assert it in code.

stable
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996

This comment has been minimized.

Copy link
Member Author

Connor1996 commented Dec 11, 2018

@hicqu Already stable it, and from my test when testing the case only, it can be reproduced every time. When running it with other cases, it can reproduce the issue 9 out of 10. I think it is acceptable.

@Connor1996

This comment has been minimized.

Copy link
Member Author

Connor1996 commented Dec 14, 2018

Show resolved Hide resolved tests/failpoints/cases/test_merge.rs Outdated
Show resolved Hide resolved tests/failpoints/cases/test_merge.rs Outdated

Connor1996 added some commits Dec 15, 2018

address comment
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
fix format
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996

This comment has been minimized.

Copy link
Member Author

Connor1996 commented Dec 16, 2018

Connor1996 added some commits Dec 18, 2018

change filter
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
tiny change
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996

This comment has been minimized.

Copy link
Member Author

Connor1996 commented Dec 18, 2018

PTAL again @BusyJay @hicqu

address comment
Signed-off-by: Connor1996 <zbk602423539@gmail.com>

@Connor1996 Connor1996 force-pushed the Connor1996:fix-check-snapshot branch from 90bbe1e to 7e3efcc Dec 18, 2018

@@ -600,7 +603,11 @@ impl Filter<StoreMsg> for LeadingDuplicatedSnapshotFilter {
if msg.get_message().get_msg_type() == MessageType::MsgSnapshot && !stale {
if last_msg.as_ref().map_or(false, |l| l != &msg) {
to_send.push(StoreMsg::RaftMessage(last_msg.take().unwrap()));
*last_msg = Some(msg);
if self.together {
to_send.push(StoreMsg::RaftMessage(msg));

This comment has been minimized.

Copy link
@BusyJay

BusyJay Dec 18, 2018

Contributor

So if the message sequence is snapshot[1], snapshot[2], snapshot[2], snapshot[2], after applying the filter, the sequence will not be changed?

This comment has been minimized.

Copy link
@Connor1996

Connor1996 Dec 18, 2018

Author Member

yes, this filter is just to ensure two different snapshots show up together once. It doesn't care the rest messages.

clean log
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
clean up
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@hicqu

This comment has been minimized.

Copy link
Contributor

hicqu commented Dec 18, 2018

LGTM.

@hicqu

hicqu approved these changes Dec 18, 2018

@Connor1996 Connor1996 merged commit 577a6f7 into tikv:master Dec 18, 2018

4 checks passed

DCO All commits are signed off!
Details
ci/circleci: test Your tests passed on CircleCI!
Details
idc-jenkins-ci/build Jenkins job succeeded.
Details
jenkins-ci-tikv/build Jenkins job succeeded.
Details

@Connor1996 Connor1996 deleted the Connor1996:fix-check-snapshot branch Dec 18, 2018

Connor1996 added a commit to Connor1996/tikv that referenced this pull request Dec 18, 2018

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot (tikv#3873)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Connor1996 added a commit that referenced this pull request Dec 19, 2018

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot (#3873) (#3949)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Connor1996 added a commit to Connor1996/tikv that referenced this pull request Jan 2, 2019

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot(tikv#3873)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Connor1996 added a commit to Connor1996/tikv that referenced this pull request Jan 2, 2019

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot(tikv#3873)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Connor1996 added a commit to Connor1996/tikv that referenced this pull request Jan 2, 2019

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot (tikv#3873)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Connor1996 added a commit to Connor1996/tikv that referenced this pull request Jan 2, 2019

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot (tikv#3873)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

overvenus added a commit that referenced this pull request Jan 2, 2019

raftstore: not always update pending_cross_snap (#3822) (#4003)
* raftstore: do not update pending_cross_snap when peer is applying snapshot(#3873)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

overvenus added a commit that referenced this pull request Jan 2, 2019

raftstore: do not update pending_cross_snap when peer is applying sna…
…pshot (#3873) (#4004)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.