cluster: publish total available reclaim size to balancer #16354
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
d8b6905 to 8d161e9
new failures in https://buildkite.com/redpanda/redpanda/builds/44451#018d57bd-426f-4946-94f7-cdf8271819d1:
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44451#018d57bd-426c-4bb5-bf55-002e20734efa
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44491#018d5c18-12a2-48fd-8858-2bf043653c7f
8d161e9 to 20e1063
With this change alone, is the balancer actually balancing any differently? Or does this just change the conditions under which the balancer will balance? I think the latter, but I just want to make sure I understand.
@andrwng the answer is both!
Got it, thanks for the explanation! That makes sense; I recalled that we don't try to balance space across nodes, but it makes sense that we gate individual moves based on whether there is available/reclaimable space.
/backport v23.3.x
/backport v23.2.x
Failed to create a backport PR to v23.2.x branch. I tried:
SM publishes reclaimable space up to the local retention level to the cluster balancer. However, a disk may be nearly full even at the local retention target, which leads the balancer to believe that no space is available for reclaim. This is a problem for decommission, which needs to find somewhere to create new replicas.
This change swaps the reclaimable space at the local retention level for the total reclaimable space. The balancer policy will be expanded further to also take the local retention targets into account and optimize its decisions. Those changes are in #16372.
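To illustrate the difference, here is a minimal sketch in Python rather than the actual Redpanda code. The names `DiskReport`, `reclaimable_to_local_retention`, `reclaimable_total`, and `can_host_replica` are hypothetical, and the rule "available = free + reclaimable" is an assumption made for this example, not the balancer's documented formula.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class DiskReport:
    """Hypothetical per-node disk report (all names are illustrative)."""
    free_bytes: int                      # space currently unused on the disk
    reclaimable_to_local_retention: int  # bytes freed by trimming down to the local retention target
    reclaimable_total: int               # bytes freed by removing everything eligible for reclaim


def available_before(report: DiskReport) -> int:
    # Assumed old behaviour: only count space reclaimable down to local retention.
    return report.free_bytes + report.reclaimable_to_local_retention


def available_after(report: DiskReport) -> int:
    # Assumed new behaviour: count the total reclaimable space.
    return report.free_bytes + report.reclaimable_total


def can_host_replica(available_bytes: int, replica_bytes: int) -> bool:
    # Assumed gate: a move is allowed only if the replica fits in the available space.
    return available_bytes >= replica_bytes


# A disk that is nearly full even at the local retention target.
report = DiskReport(
    free_bytes=1 * GIB,
    reclaimable_to_local_retention=0,
    reclaimable_total=40 * GIB,
)
replica = 10 * GIB

print(can_host_replica(available_before(report), replica))  # False
print(can_host_replica(available_after(report), replica))   # True
```

Under these assumptions, the node is rejected as a destination when only local-retention reclaim is counted, and accepted once the total reclaimable space is published. That matches the decommission scenario described above.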
Related https://github.com/redpanda-data/core-internal/issues/719
Backports Required
Release Notes
Improvements