improvement(lb): add add_node_list.sh to add nodes with copy_pri after all copy_sec done #528

Merged
merged 16 commits into from
May 9, 2020

foreverneverer
Contributor

@foreverneverer foreverneverer commented May 1, 2020

What problem does this PR solve?

When we add a new node to a cluster, the replica copying puts a very high load on the new node, which leaves it almost unavailable.

What is changed and how it works?

This PR changes the default balancing order from copying primaries first to copying secondaries first. It does so by setting only_move_primary = true when we add a new node, and restoring the setting after the copy has completed.

Until the total replica count is balanced, the new node won't hold any primary, because no primary is placed on it before the secondary copy completes. Since the new node therefore won't serve user requests, reads are not affected by the data migration of replicas.
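The ordering described above can be sketched as a dry-run plan. This is only an illustration, not the script added by this PR: `plan_add_nodes` is a hypothetical helper, the step descriptions are plain text rather than real Pegasus shell commands, and the previous value of only_move_primary is passed in by the caller instead of being assumed.

```shell
#!/usr/bin/env bash
# Dry-run sketch (hypothetical helper, not the PR's script): it prints the
# order of steps and runs nothing against a cluster.
#
# Usage: plan_add_nodes <previous_only_move_primary_value> <node>...
plan_add_nodes() {
    local previous_value="$1"
    shift
    local node

    # 1. Switch the balancer setting before any new node joins.
    echo "set only_move_primary=true"

    # 2. Bring up the new replica servers.
    for node in "$@"; do
        echo "start replica server on ${node}"
    done

    # 3. Secondaries are copied first; wait for the replica count to balance.
    echo "wait until replica counts are balanced"

    # 4. Restore the setting so primaries can be balanced afterwards.
    echo "set only_move_primary=${previous_value}"
}
```

Calling `plan_add_nodes false host1:34801 host2:34801` prints five steps, with the setting switched on first and restored last. Keeping the restore as the final step matters: if it is skipped, the new nodes would stay without primaries.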

Check List

Tests (this PR also refactors offline_node_list.sh, so two manual tests are needed)

  • Manual test for pegasus_online_node_list.sh (add detailed scripts or steps below)

    • Deploy a cluster with minos.
    • Add a new node to the cluster using the script.
    • Check the nodes with nodes -d and with falcon.
    • Result: the cluster copies secondaries first, and falcon shows that the new node does not serve reads.
    • Default version (the screenshot shows the P999 latency):
      [screenshot: 选区_003]
    • New version:
      [screenshot: 选区_002]
  • Manual test for pegasus_offline_node_list.sh (add detailed scripts or steps below)

    • Deploy a cluster with minos and confirm that the number of running replica server nodes is 5.
    • Run pegasus_offline_node_list.sh and confirm that the number of running replica server nodes is 4.
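
The two checks in these manual tests (the new node holds no primary while secondaries are being copied, and the count of running replica servers drops after an offline) can be sketched as small filters over a node listing. The column layout below (address, status, replica count, primary count, secondary count) and the sample rows are assumptions made for illustration. The real `nodes -d` output may be laid out differently, so the field numbers would need adjusting.

```shell
#!/usr/bin/env bash
# Hypothetical node listing used only for illustration. Assumed columns:
# address, status, replica count, primary count, secondary count.
sample_nodes() {
    cat <<'EOF'
10.0.0.1:34801 ALIVE 12 6 6
10.0.0.2:34801 ALIVE 12 6 6
10.0.0.3:34801 ALIVE 4 0 4
10.0.0.4:34801 UNALIVE 0 0 0
EOF
}

# count_alive: read a listing on stdin, print how many rows have status ALIVE.
count_alive() {
    awk '$2 == "ALIVE" { n++ } END { print n + 0 }'
}

# primary_count <address>: read a listing on stdin, print the primary count
# of the given node (prints nothing if the node is not listed).
primary_count() {
    awk -v addr="$1" '$1 == addr { print $4 }'
}
```

For example, `sample_nodes | primary_count 10.0.0.3:34801` reports the primary count of the newly added node in the sample, and `sample_nodes | count_alive` reports how many replica servers are running.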

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

acelyc111 previously approved these changes May 7, 2020
foreverneverer and others added 3 commits May 9, 2020 16:11
format

Co-authored-by: Wu Tao <wutao1@xiaomi.com>
@neverchanje neverchanje changed the title from "feat: add online_node_list script to add new nodes with coping secondary preferentially" to "improvement(lb): add add_node_list.sh to add nodes with copy_pri after all copy_sec done" May 9, 2020
@foreverneverer foreverneverer merged commit d67bc43 into apache:master May 9, 2020
@neverchanje neverchanje mentioned this pull request May 14, 2020