Allow to lock a replicaset's buckets from rebalancing #71
Also allow locking a specific bucket, for example with: vshard.storage.bucket_pin(bucket_id)
Why?
How it works?
Replicaset lock is not the same as pinning all of its buckets. When a replicaset
Replicaset lock and rebalancing
When a replicaset is locked, it does not participate in rebalancing. It means,
Bucket pin and rebalancing
It is the much more complex case.
Pseudocode:
A locked replicaset can neither receive new buckets nor send its own. The replicaset and its buckets do not participate in rebalancing, and the non-locked replicasets are rebalanced independently of it.
For example, consider a cluster:
replicaset1: locked, weight = 1, bucket count = 1500
replicaset2: weight = 1, bucket count = 1500
When replicaset3 is added, only replicaset2 and replicaset3 participate in rebalancing (respecting their weights):
replicaset1: locked, weight = 1, bucket count = 1500
replicaset2: weight = 1, bucket count = 500
replicaset3: weight = 2, bucket count = 1000
The lock is useful, for example, to hold test data on a particular replicaset, or to keep on one replicaset special data that must be stored together.
Part of #71
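The arithmetic of the example can be checked with a small simulation. This is only a sketch, not vshard's rebalancer: the function name `rebalance_unlocked`, the dictionary layout, and the way integer leftovers are handed out are all assumptions made for illustration.

```python
def rebalance_unlocked(replicasets):
    """Return target bucket counts; locked replicasets keep what they have.

    `replicasets` maps a name to {"locked": bool, "weight": int, "buckets": int}.
    This layout is hypothetical and is not vshard's configuration format.
    """
    targets = {name: rs["buckets"] for name, rs in replicasets.items()}
    unlocked = {name: rs for name, rs in replicasets.items() if not rs["locked"]}
    total_weight = sum(rs["weight"] for rs in unlocked.values())
    if total_weight == 0:
        return targets  # nothing can be rebalanced

    # Only buckets that live on non-locked replicasets are redistributed.
    total_buckets = sum(rs["buckets"] for rs in unlocked.values())
    names = sorted(unlocked)
    for name in names:
        targets[name] = total_buckets * unlocked[name]["weight"] // total_weight

    # Integer division may leave a few buckets unassigned; hand them out
    # one by one (an arbitrary tie-breaking choice made for this sketch).
    leftover = total_buckets - sum(targets[name] for name in names)
    for name in names[:leftover]:
        targets[name] += 1
    return targets


cluster = {
    "replicaset1": {"locked": True, "weight": 1, "buckets": 1500},
    "replicaset2": {"locked": False, "weight": 1, "buckets": 1500},
    "replicaset3": {"locked": False, "weight": 2, "buckets": 0},
}
print(rebalance_unlocked(cluster))
```

With the cluster from the example, replicaset1 keeps its 1500 buckets, and the 1500 buckets of replicaset2 are split 500 to 1000 according to the weights 1 and 2.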
A pinned bucket is a bucket that cannot be sent out of its replicaset. Taking pinned buckets into account changes the rebalancer algorithm, because on some replicasets the perfect balance can no longer be reached.
An iterative algorithm is used to find the best balance in a cluster. On each step it calculates the perfect bucket count for each replicaset. If this count cannot be satisfied because of pinned buckets, the algorithm makes a best effort to approach the perfect balance: it ignores the replicasets that are disbalanced by pinning, together with their pinned buckets, and calculates a new balance. The new balance may turn out to be unsatisfiable as well, because ignoring pinned buckets in overpopulated replicasets decreases the perfect bucket count of the other replicasets, and the new values can become less than their pinned bucket counts.
Part of #71
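The iteration described above could look roughly like the following. This is a sketch under stated assumptions rather than the actual rebalancer code: the name `balance_with_pins`, the `(weight, pinned)` tuples, integer weights, and the choice to set an excluded replicaset's target to exactly its pinned bucket count are all illustrative.

```python
def balance_with_pins(replicasets, total_buckets):
    """Compute target bucket counts when some buckets are pinned.

    `replicasets` maps a name to (weight, pinned_bucket_count), with positive
    integer weights. The layout is hypothetical, chosen only for this sketch.
    """
    active = dict(replicasets)
    targets = {}
    buckets_left = total_buckets

    while active:
        total_weight = sum(weight for weight, _ in active.values())
        # A replicaset is blocked when it holds more pinned buckets than its
        # perfect share, buckets_left * weight / total_weight. The comparison
        # is cross-multiplied to stay in integer arithmetic.
        blocked = [name for name, (weight, pinned) in active.items()
                   if pinned * total_weight > buckets_left * weight]
        if not blocked:
            break
        # Ignore blocked replicasets and their pinned buckets, then retry:
        # the smaller pool can push other replicasets below their pin count.
        for name in blocked:
            _, pinned = active.pop(name)
            targets[name] = pinned  # assumption: it ends up with its pinned buckets
            buckets_left -= pinned

    if active:
        total_weight = sum(weight for weight, _ in active.values())
        names = sorted(active)
        for name in names:
            targets[name] = buckets_left * active[name][0] // total_weight
        # Hand out what integer division left over, one bucket at a time.
        leftover = buckets_left - sum(targets[name] for name in names)
        for name in names[:leftover]:
            targets[name] += 1
    return targets


pinned_cluster = {
    "rs1": (1, 70),  # blocked on the first step: 70 > 120 / 4
    "rs2": (1, 28),  # fine at first, blocked once rs1 is ignored: 28 > 50 / 3
    "rs3": (1, 0),
    "rs4": (1, 0),
}
print(balance_with_pins(pinned_cluster, 120))
```

In this example rs2 fits its perfect share of 30 on the first step, but after rs1 and its 70 pinned buckets are ignored the share drops below 28, so rs2 is excluded on the second step. That is the cascading case the text describes.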
Add a method like
This method must protect the active buckets of this replicaset from rebalancing and must forbid the replicaset from receiving new buckets. In other words, the rebalancer must consider such a replicaset balanced, even if that contradicts the weights.