Allow to send multiple buckets from one storage in parallel #161

Closed
Gerold103 opened this issue Nov 22, 2018 · 1 comment

@Gerold103 (Collaborator)

No description provided.

Gerold103 added the "storage" and "feature" labels on Nov 22, 2018
Gerold103 added this to the 0.2 milestone on Nov 22, 2018
Gerold103 self-assigned this on Aug 14, 2019
@Gerold103 (Collaborator, Author)

Related to #167.

Gerold103 added a commit that referenced this issue Aug 17, 2019
The rebalancer algorithm is already quite complex, and it is better
not to complicate it further with the max receiving bucket count
restriction. All of that control is now done on the receiver's
side.

At the moment the rebalancer algorithm is not very complex, but it
relies on the fact that each replicaset can send only one bucket
at a time. That will no longer be true after #161, and that is
where the algorithm would become unbearably complex.

Closes #167
Gerold103 added a commit that referenced this issue Aug 28, 2019
The rebalancer will soon become parallel - it will be able to
send multiple buckets from one node simultaneously. For that it
will create several worker fibers. They need to somehow
synchronize their work so as not to send too many buckets, or too
few, and to spread the load evenly among receivers.

The dispenser is a container for the routes received from the
rebalancer. Its task is to hand out the routes to the worker
fibers in a round-robin manner. It solves all the problems listed
above.

Part of #161
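
Below is a minimal sketch of the round-robin hand-out idea described
in this commit message. It is an illustration only, not the actual
vshard internals; the names dispenser_new and dispenser_pop are
hypothetical.

    -- Illustrative round-robin dispenser; not the real vshard code.
    -- Each route is {uuid = destination replicaset, count = buckets
    -- still to send there}.
    local function dispenser_new(routes)
        return {routes = routes, pos = 0}
    end

    -- Hand out one bucket slot per call, cycling over destinations so
    -- that no single receiver gets all the parallel transfers at once.
    local function dispenser_pop(d)
        local n = #d.routes
        for _ = 1, n do
            d.pos = d.pos % n + 1
            local route = d.routes[d.pos]
            if route.count > 0 then
                route.count = route.count - 1
                return route.uuid
            end
        end
        return nil -- All routes are exhausted.
    end

Each worker fiber would repeatedly call dispenser_pop, send one
bucket to the returned destination, and stop once nil is returned.
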
Gerold103 added a commit that referenced this issue Aug 30, 2019
Closes #161

@TarantoolBot document
Title: vshard rebalancer_max_sending

Since its first release, VShard has had quite a simple rebalancer -
one process on one node calculates who should send how many buckets
and to whom. The nodes then applied these so-called routes one by
one, sequentially.

Unfortunately, such a simple scheme is not fast enough, especially
for Vinyl, where the cost of reading from disk is comparable to the
network cost. In fact, with Vinyl the rebalancer routes applier was
sleeping most of the time.

Now each node can send multiple buckets in parallel, in a
round-robin manner, to several destinations or even to one.

To set the degree of parallelism, a new option is added -
rebalancer_max_sending. It can be specified in the root table of a
storage configuration:

    cfg.rebalancer_max_sending = 5
    vshard.storage.cfg(cfg, box.info.uuid)

On routers the option is ignored. Note that max_sending = N will
not necessarily give an N-times speedup. The actual gain depends on
the network, the disk, and the number of other fibers in the
system.

The default value is 1. The maximal value is 15.

Another important point - from this moment rebalancer_max_receiving
is no longer useless. It can actually limit the load on a single
storage. Consider an example: you have 10 replicasets and a new one
is added. Now all 10 replicasets will try to send buckets to the
new one.

Assume each replicaset has max sending set to 5. In that case the
new replicaset will experience quite a high load of 50 buckets
being downloaded at once. If the node needs to do some other work,
such a big load may be undesirable. Also, too many parallel bucket
transfers can lead to timeouts in the rebalancing process itself.

In that case you can set a lower rebalancer_max_sending on the old
replicasets, or decrease rebalancer_max_receiving on the new one.
In the latter case some workers on the old nodes will be throttled,
and you will see that in the logs.
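
For instance, either side of the transfer can be tuned (the values
below are only illustrative, not recommendations):

    -- On each of the 10 old replicasets: send at most 2 buckets in
    -- parallel.
    cfg.rebalancer_max_sending = 2
    vshard.storage.cfg(cfg, box.info.uuid)

    -- Or, on the new replicaset: accept at most 10 buckets at once.
    -- Senders exceeding this limit are throttled, which shows up in
    -- their logs.
    cfg.rebalancer_max_receiving = 10
    vshard.storage.cfg(cfg, box.info.uuid)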

rebalancer_max_sending is also important if you have a restriction
on how many buckets may be read-only at once in the cluster.
Remember that while a bucket is being sent, it does not accept new
write requests. For example, suppose you have 100000 buckets, each
storing ~0.001% of your data, the cluster has 10 replicasets, and
you can never afford more than 0.1% of the data to be locked for
writes. Then you should not set rebalancer_max_sending > 10 on
these nodes. That guarantees the rebalancer will not send more than
100 buckets at once in the whole cluster.
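
The bound in that example follows from simple arithmetic; the
variable names below are purely illustrative:

    -- 100000 buckets, each holding 1/100000 (~0.001%) of the data;
    -- at most 0.1% of the data may be write-locked at once;
    -- 10 replicasets send in parallel.
    local total_buckets = 100000
    local max_locked_share = 0.001  -- 0.1% of the data
    local replicaset_count = 10
    local max_locked_buckets = math.floor(total_buckets * max_locked_share)
    local max_sending_limit = math.floor(max_locked_buckets / replicaset_count)
    -- max_locked_buckets = 100, max_sending_limit = 10: with
    -- rebalancer_max_sending <= 10 at most 100 buckets are
    -- write-locked cluster-wide at any moment.
    print(max_locked_buckets, max_sending_limit)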