Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

checker: fix remove pending peer by replacing peer #1607

Merged
merged 7 commits into from Sep 29, 2019

Conversation

@rleungx
Copy link
Member

commented Jul 2, 2019

What problem does this PR solve?

Currently, once we remove a pending peer when a store is offline or down, there might be a gap before making a replica, which may be very dangerous.

What is changed and how it works?

This PR fixes this problem by replacing the pending peer instead of removing the peer directly.

Check List

Tests

  • Unit test

Related changes

  • Need to cherry-pick to the release branch
  • Need to be included in the release notes
@rleungx

This comment has been minimized.

Copy link
Member Author

commented Jul 2, 2019

/rebuild

server/checker/replica_checker.go Outdated Show resolved Hide resolved
server/checker/replica_checker_test.go Outdated Show resolved Hide resolved
@rleungx rleungx force-pushed the rleungx:fix-remove-pending branch from 7a9ea4f to cce31da Jul 3, 2019
@rleungx rleungx force-pushed the rleungx:fix-remove-pending branch from 22a777c to 16a74c4 Jul 3, 2019
@rleungx rleungx added the DNM label Jul 3, 2019
@rleungx

This comment has been minimized.

Copy link
Member Author

commented Jul 19, 2019

I completely remove the pending logic when making replicas. When we have the pending peer, we still add another peer first to make the whole process safer. Is this reasonable? @disksing @nolouch

@disksing

This comment has been minimized.

Copy link
Member

commented Jul 24, 2019

The operator could be blocked for a very long time if the pending peer does not recover.

@rleungx

This comment has been minimized.

Copy link
Member Author

commented Jul 26, 2019

Since we already have learner, maybe it won't block anymore? @disksing

@disksing

This comment has been minimized.

Copy link
Member

commented Jul 26, 2019

em, sounds make sense.

@disksing

This comment has been minimized.

Copy link
Member

commented Jul 26, 2019

Should we remove the enable learner option?

@rleungx

This comment has been minimized.

Copy link
Member Author

commented Jul 26, 2019

@disksing Yes. I think we can remove it.
/cc @nolouch What your opinion?

@rleungx rleungx force-pushed the rleungx:fix-remove-pending branch from 32cd111 to 182c545 Jul 30, 2019
@rleungx rleungx force-pushed the rleungx:fix-remove-pending branch from 6850383 to 9a05f7e Aug 22, 2019
rleungx added 4 commits Jul 2, 2019
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: Ryan Leung <rleungx@gmail.com>
@rleungx rleungx force-pushed the rleungx:fix-remove-pending branch from 9a05f7e to 02b92fb Sep 24, 2019
@rleungx rleungx added backport/release-3.1 and removed DNM labels Sep 24, 2019
@codecov-io

This comment has been minimized.

Copy link

commented Sep 24, 2019

Codecov Report

Merging #1607 into master will increase coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1607      +/-   ##
==========================================
+ Coverage   77.44%   77.45%   +<.01%     
==========================================
  Files         163      163              
  Lines       16430    16422       -8     
==========================================
- Hits        12725    12720       -5     
+ Misses       2658     2656       -2     
+ Partials     1047     1046       -1
Impacted Files Coverage Δ
server/checker/replica_checker.go 82.09% <ø> (+3.27%) ⬆️
server/schedulers/shuffle_hot_region.go 58.97% <0%> (-6.42%) ⬇️
server/schedulers/random_merge.go 62.5% <0%> (-5%) ⬇️
server/grpc_service.go 57.91% <0%> (-0.66%) ⬇️
client/client.go 69.89% <0%> (-0.44%) ⬇️
server/handler.go 51.92% <0%> (+0.51%) ⬆️
server/tso/tso.go 79.81% <0%> (+2.75%) ⬆️
pkg/metricutil/metricutil.go 100% <0%> (+9.37%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c973515...69b45ce. Read the comment docs.

nolouch added 2 commits Sep 25, 2019
@nolouch nolouch added the CanMerge label Sep 27, 2019
@sre-bot

This comment has been minimized.

Copy link

commented Sep 27, 2019

/run-all-tests

@sre-bot

This comment has been minimized.

Copy link

commented Sep 27, 2019

@rleungx merge failed.

@CLAassistant

This comment has been minimized.

Copy link

commented Sep 29, 2019

CLA assistant check
All committers have signed the CLA.

@rleungx

This comment has been minimized.

Copy link
Member Author

commented Sep 29, 2019

/merge

@sre-bot

This comment has been minimized.

Copy link

commented Sep 29, 2019

/run-all-tests

@sre-bot

This comment has been minimized.

Copy link

commented Sep 29, 2019

@rleungx merge failed.

@rleungx

This comment has been minimized.

Copy link
Member Author

commented Sep 29, 2019

/merge

@sre-bot

This comment has been minimized.

Copy link

commented Sep 29, 2019

/run-all-tests

@sre-bot sre-bot merged commit e6c0609 into pingcap:master Sep 29, 2019
8 checks passed
8 checks passed
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
idc-jenkins-ci-pd/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci-pd/integration-compatibility-test Jenkins job succeeded.
Details
idc-jenkins-ci-pd/integration-ddl-test Jenkins job succeeded.
Details
idc-jenkins-ci/build Jenkins job succeeded.
Details
idc-jenkins-ci/test Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details
@sre-bot

This comment has been minimized.

Copy link

commented Sep 29, 2019

cherry pick to release-2.1 failed

@sre-bot

This comment has been minimized.

Copy link

commented Sep 29, 2019

cherry pick to release-3.0 failed

@sre-bot

This comment has been minimized.

Copy link

commented Sep 29, 2019

cherry pick to release-3.1 failed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.