2.25.1.0-b102

@spolitov spolitov tagged this 28 Dec 08:06
Summary:
The cluster balancer tries to pick tablets to move in a way that guarantees even disk load after the move.
But a move consists of 2 steps: in the first step we add the tablet to a new replica.
In the second step we remove the overreplicated tablet from some replica.
So there is no guarantee that the tablet will be removed from the replica where it resides on the most loaded disk.

Fixed by taking disk load into account when picking the replica to remove the tablet from.
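
A minimal sketch of the idea, not the actual balancer code. The Replica type, its fields, and pick_replica_to_remove are hypothetical names, and disk load is approximated here by the number of tablets on the drive, which is an assumption:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    ts_uuid: str           # hypothetical tablet server identifier
    drive: str             # hypothetical drive hosting this tablet replica
    tablets_on_drive: int  # assumed load metric: tablets stored on that drive


def pick_replica_to_remove(replicas):
    """Return the replica whose drive is the most loaded.

    On a tie, the first such replica in the input order is returned.
    """
    if not replicas:
        raise ValueError("tablet has no replicas to remove")
    return max(replicas, key=lambda r: r.tablets_on_drive)
```

With replicas on drives holding 3, 7 and 5 tablets, the sketch removes the tablet from the replica on the drive holding 7, so the second step of the move relieves the most loaded disk instead of an arbitrary one.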

Also, the test LoadBalancerMiniClusterTest.CheckLoadBalanceDriveAware could fail by timing out while waiting for the cluster to balance.
This happens because we cannot balance leaders: we have to wait 20s before a step down if the protege has already lost a leader election.
Fixed by decreasing this time to 0s in this test.
Jira: DB-14682

Test Plan: ./yb_build.sh release -n 800 --cxx-test integration-tests_load_balancer_mini_cluster-test --gtest_filter LoadBalancerMiniClusterTest.CheckLoadBalanceDriveAware -- -p 8

Reviewers: zdrudi, asrivastava

Reviewed By: asrivastava

Subscribers: ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D40922