[docdb] Leader-only tserver blacklisting mode for rolling upgrades #1748
Comments
Some pointers for this: We have a generic mechanism for persisting cluster configuration: It currently includes a blacklisting functionality that is used to drain data from the provided nodes. The blacklist needs to be persisted in case of master failover, as the drain could take a while. The blacklist is then used in the load balancer (see:
There's also a yb-admin command to change the blacklist by adding/removing nodes from it:
@rajukumaryb and I were discussing this issue -
Also, we need to move the leaders off the node quickly. Let's check the current rate-limiting mechanism and see how we can speed it up.
Summary: Load balancer will move leadership role for all tablet replicas on leader blacklisted tservers to other tservers with follower replicas. Prior leader load balancing mechanism is extended to treat leader blacklisted tserver as hosting infinite leader replicas to achieve this goal.

Usage:

    yb-admin -master_addresses ... change_leader_blacklist ADD 127.0.0.1:9100
    yb-admin -master_addresses ... change_leader_blacklist REMOVE 127.0.0.1:9100
    yb-admin -master_addresses ... get_leader_blacklist_completion

Caveats:
- Leader blacklisted tserver is not yet prevented from becoming a leader for some tablet. In this case, load balancer will again move leadership away from it.
- If all replicas of a tablet are hosted on leader blacklisted tservers, load balancer cannot (yet) move the leadership role to a non-leader blacklisted tserver.

Test Plan:

    ./build/debug-gcc-dynamic-ninja/tests-master/catalog_manager-test --gtest_filter=TestLoadBalancerCommunity.TestLoadBalancerAlgorithm
    ./yb_build.sh debug --scb --java-test org.yb.loadtester.TestClusterTserverRollingLeaderBlacklist#testClusterTserverRollingLeaderBlacklist

Reviewers: bogdan, rahuldesirazu
Reviewed By: rahuldesirazu
Subscribers: rao, ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D7145
Rolling upgrades should be done as follows:
etc.
This requires splitting the current node blacklisting logic into two parts: for leaders and for data.
The load balancer should be responsible for rebalancing the leader load between upgrades of individual nodes.
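As a rough illustration of how an operator script might use the leader blacklist during a rolling upgrade, the sketch below only builds the `yb-admin` argument lists quoted in the commit summary. The ordering (add to leader blacklist, check completion, upgrade, remove) is an assumption, not the procedure from this issue; `rolling_upgrade_plan` and the `<upgrade and restart>` placeholder are hypothetical; and the output format of `get_leader_blacklist_completion` is not assumed here.

```python
def leader_blacklist_commands(master_addresses, tserver):
    """Build the three yb-admin invocations for one tserver (argv lists)."""
    base = ["yb-admin", "-master_addresses", master_addresses]
    return {
        "drain": base + ["change_leader_blacklist", "ADD", tserver],
        "check": base + ["get_leader_blacklist_completion"],
        "restore": base + ["change_leader_blacklist", "REMOVE", tserver],
    }


def rolling_upgrade_plan(master_addresses, tservers):
    """Return an ordered list of steps, one tserver at a time.

    The step order is an assumption for illustration only.
    """
    plan = []
    for ts in tservers:
        cmds = leader_blacklist_commands(master_addresses, ts)
        plan.append(cmds["drain"])
        plan.append(cmds["check"])  # poll until leaders have moved off
        plan.append(["<upgrade and restart>", ts])  # hypothetical placeholder
        plan.append(cmds["restore"])
    return plan
```

Each returned step is a plain list of strings, so a caller can print the plan for review before running anything, for example with `subprocess.run` on the real `yb-admin` entries.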