Allow to send multiple buckets from one storage in parallel #161
Comments
Related to #167.
Gerold103 added a commit that referenced this issue on Aug 17, 2019:
The rebalancer algorithm is quite a complex thing, and it is better not to complicate it even more with the max receiving bucket count restriction. Now all of that control is done on the receiver's side. At the moment the complexity of the rebalancer algorithm is not big, but it relies on the fact that each replicaset can send only one bucket at a time. That won't be true after #161, and that is where the algorithm would become unbearably complex. Closes #167
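The receiver-side control described above can be sketched roughly as follows. This is a hypothetical, simplified illustration of the idea, not vshard's real internals; the function name and signature are invented for the example:

```lua
-- Sketch of receiver-side throttling: instead of the rebalancer
-- accounting for max receiving counts in its route calculation, the
-- receiving storage itself refuses a new bucket once it is already
-- receiving the configured maximum. The sender then retries later.
local function bucket_recv_check(receiving_count, max_receiving)
    if receiving_count >= max_receiving then
        return nil, 'Too many receiving buckets, retry later'
    end
    return true
end
```

This keeps the rebalancer's algorithm simple: senders just react to the receiver's refusal instead of the planner having to model receiver limits up front.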
Gerold103 added a commit that referenced this issue on Aug 28, 2019:
The rebalancer will become parallel soon: it will be able to send multiple buckets from one node simultaneously. For that it will create several worker fibers, but they need to somehow synchronize their work so as not to send too many buckets or too few, and to spread the load evenly among receivers. The dispenser is a container of routes received from the rebalancer. Its task is to hand out the routes to worker fibers in a round-robin manner, which solves all the problems listed above. Part of #161
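The round-robin hand-out described above can be sketched like this. This is a minimal illustrative model of the dispenser idea, not vshard's actual implementation; the names and the route representation are assumptions for the example:

```lua
-- Simplified dispenser: holds routes (destination -> number of buckets
-- to send there) and hands destinations out to worker fibers one at a
-- time, cycling through destinations so receivers get even load.
local Dispenser = {}
Dispenser.__index = Dispenser

function Dispenser.new(routes)
    -- routes: map of destination id -> bucket count to send there.
    local self = setmetatable({queue = {}, pos = 0}, Dispenser)
    for dst, count in pairs(routes) do
        table.insert(self.queue, {dst = dst, left = count})
    end
    return self
end

-- A worker fiber calls pop() to learn where to send its next bucket.
-- Returns nil when all routes are exhausted.
function Dispenser:pop()
    local n = #self.queue
    if n == 0 then
        return nil
    end
    self.pos = self.pos % n + 1
    local route = self.queue[self.pos]
    route.left = route.left - 1
    if route.left == 0 then
        table.remove(self.queue, self.pos)
        self.pos = self.pos - 1
    end
    return route.dst
end
```

Because successive pop() calls advance through the queue cyclically, no single receiver is hit with a long unbroken burst from one sender.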
Gerold103 added a commit that referenced this issue on Aug 30, 2019:
Closes #161

@TarantoolBot document
Title: vshard rebalancer_max_sending

From its first release VShard had quite a simple rebalancer: one process on one node calculates who should send how many buckets, and to whom. The nodes applied these so-called routes one by one, sequentially. Unfortunately, such a simple scheme does not work fast enough, especially for Vinyl, where the cost of reading from disk is comparable to the network cost. In fact, with Vinyl the rebalancer routes applier was sleeping most of the time.

Now each node can send multiple buckets in parallel, in a round-robin manner, to several destinations or even to one. To set the degree of parallelism a new option is added: rebalancer_max_sending. It can be specified in the root table of a storage configuration:

    cfg.rebalancer_max_sending = 5
    vshard.storage.cfg(cfg, box.info.uuid)

On routers the option is ignored. Note that max_sending = N will not necessarily give an N-fold speed-up; it depends on the network, the disk, and the number of other fibers in the system. The default value is 1; the maximal value is 15.

Another important thing: from this moment rebalancer_max_receiving is no longer useless. It can actually limit the load on one storage. Consider an example: you have 10 replicasets and a new one is added. All 10 replicasets will try to send buckets to the new one. Assume each of them has max sending = 5. Then the new replicaset experiences quite a high load of 50 buckets being downloaded at once. If the node needs to do other work, such a big load may be undesirable, and too many parallel buckets can lead to timeouts in the rebalancing process itself. In that case you can set a lower rebalancer_max_sending on the old replicasets, or decrease rebalancer_max_receiving on the new one. In the latter case some workers on the old nodes will be throttled, and you will see that in the logs.

rebalancer_max_sending is important if you have a restriction on how many buckets can be read-only at once in the cluster. Remember that while a bucket is being sent, it does not accept new write requests. For example, suppose you have 100000 buckets, each bucket stores ~0.001% of your data, the cluster has 10 replicasets, and you can never afford more than 0.1% of the data to be locked for writes. Then you should not set rebalancer_max_sending > 10 on these nodes: that guarantees the rebalancer won't send more than 100 buckets at once in the whole cluster.
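The bound in that example can be worked out explicitly. The numbers below are the illustrative figures from the text, not recommended defaults:

```lua
-- Worked example: derive the per-replicaset rebalancer_max_sending
-- bound from a cluster-wide write-lock budget.
local bucket_count = 100000       -- total buckets in the cluster
local replicaset_count = 10       -- replicasets that may send at once
local max_locked_share = 0.001    -- at most 0.1% of data write-locked

-- Buckets that may be read-only at once across the whole cluster:
local max_locked_buckets = bucket_count * max_locked_share  -- 100

-- Upper bound for rebalancer_max_sending on each replicaset:
local max_sending = math.floor(max_locked_buckets / replicaset_count)
-- max_sending is 10, matching the recommendation above.
```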