Improvements & cleanup in node operations tests, fix for finishing node operations #7862
Merged
mmaslankaprv
merged 7 commits into
redpanda-data:dev
from
mmaslankaprv:rebalancing-tests-follow-up
Dec 21, 2022
mmaslankaprv
changed the title
Improvements & cleanup in node operations tests
Improvements & cleanup in node operations tests, fix for finishing node operations
Dec 20, 2022
ci failure: |
6 tasks
ztlpn
reviewed
Dec 20, 2022
mmaslankaprv
force-pushed
the
rebalancing-tests-follow-up
branch
from
December 21, 2022 07:12
539a258
to
1bc52f2
Compare
Signed-off-by: Michal Maslanka <michal@redpanda.com>
Signed-off-by: Michal Maslanka <michal@redpanda.com>
Signed-off-by: Michal Maslanka <michal@redpanda.com>
Signed-off-by: Michal Maslanka <michal@redpanda.com>
Refactored the node operations fuzzy test to share logic with its smaller version, the random node operations test. Signed-off-by: Michal Maslanka <michal@redpanda.com>
The members backend reconciliation loop processes a single node update at a time. This limitation introduces a dependency between subsequent partition rebalance phases. After a node addition, some partitions may be moved to a node that is then requested to be decommissioned and shut down before the previous rebalancing phase has finished, so decommissioning must be prioritized over rebalancing. Introduced a change that always executes the node decommission operation first, before waiting for the rebalancing to finish. As part of the node decommissioning process, all reallocations that target the decommissioned node are canceled. The node addition rebalance operation is scheduled again after decommissioning finishes. Signed-off-by: Michal Maslanka <michal@redpanda.com>
Added learner recovery throttling to prevent a node from finishing decommissioning before it is recommissioned. Signed-off-by: Michal Maslanka <michal@redpanda.com>
mmaslankaprv
force-pushed
the
rebalancing-tests-follow-up
branch
from
December 21, 2022 09:31
1bc52f2
to
ff7f2a3
Compare
ztlpn
approved these changes
Dec 21, 2022
/backport v22.3.x
Failed to run cherry-pick command. I executed the below command:
dotnwat
reviewed
Dec 24, 2022
}
// sort updates to prioritize decommissions/recommissions over node
// additions, use stable sort to keep de/recommissions order
static auto is_de_or_recommission = [](const update_meta& meta) {
I don't think there is any point in making this static (it might even introduce thread-safety overhead), since the lambda captures no state.
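For illustration, here is a minimal sketch of the ordering step with the predicate as a plain local lambda rather than a `static` one. Only `update_meta` and `is_de_or_recommission` come from the snippet above; the `update_type` enum, the fields of `update_meta`, and the function name `prioritize_de_and_recommissions` are assumptions made for this example and may not match the actual Redpanda code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for the types used in the reviewed snippet.
enum class update_type { added, decommissioned, recommissioned };

struct update_meta {
    int node_id;
    update_type type;
};

// Moves decommission/recommission updates ahead of node additions while
// keeping the relative order inside each group (stable sort).
inline void prioritize_de_and_recommissions(std::vector<update_meta>& updates) {
    // The lambda captures nothing, so an ordinary local variable is enough.
    auto is_de_or_recommission = [](const update_meta& meta) {
        return meta.type == update_type::decommissioned
               || meta.type == update_type::recommissioned;
    };

    // lhs sorts before rhs only when lhs is a de/recommission and rhs is not.
    std::stable_sort(
      updates.begin(),
      updates.end(),
      [&](const update_meta& lhs, const update_meta& rhs) {
          return is_de_or_recommission(lhs) && !is_de_or_recommission(rhs);
      });
}
```

With updates queued as addition, decommission, addition, recommission, the call leaves the decommission and recommission first in their original order, followed by the two additions in their original order.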
The members backend reconciliation loop processes a single node update at
a time. This limitation introduces a dependency between subsequent
partition rebalance phases. After a node addition, some partitions may be
moved to a node that is then requested to be decommissioned and shut down
before the previous rebalancing phase has finished, so decommissioning
must be prioritized over rebalancing.
Introduced a change that always executes the node decommission operation
first, before waiting for the rebalancing to finish. As part of the
node decommissioning process, all reallocations that target the
decommissioned node are canceled. The node addition rebalance operation
is scheduled again after decommissioning finishes.
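The cancellation step described above can be pictured with a toy model. This is only a sketch under stated assumptions: the `reallocation` struct and the `cancel_moves_to` helper are hypothetical names invented for this example, and the real members backend cancels partition moves through cluster machinery that is not modeled here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model of an in-flight partition move; not a Redpanda type.
struct reallocation {
    int partition_id;
    int target_node;
};

// Removes every move that targets `decommissioned_node` from `in_flight`
// and returns the removed moves, so a caller could schedule them again
// once decommissioning has finished.
inline std::vector<reallocation>
cancel_moves_to(std::vector<reallocation>& in_flight, int decommissioned_node) {
    // Keep unaffected moves at the front, preserving their order.
    auto first_cancelled = std::stable_partition(
      in_flight.begin(),
      in_flight.end(),
      [decommissioned_node](const reallocation& r) {
          return r.target_node != decommissioned_node;
      });

    std::vector<reallocation> cancelled(first_cancelled, in_flight.end());
    in_flight.erase(first_cancelled, in_flight.end());
    return cancelled;
}
```

Returning the canceled moves instead of discarding them mirrors the description: the rebalance caused by the node addition is not lost, it is deferred until the decommission completes.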
Test improvements
Added failure injection to the random node operations test and refactored the node operations fuzzy test to share the same code.
Fixes: #7874
Backports Required
UX Changes
Release Notes
Improvements