Problems with rebalance-threshold #26012

Closed
patriknw opened this Issue Nov 30, 2018 · 12 comments

@patriknw
Member

patriknw commented Nov 30, 2018

Given the recommendation of using 10x as many shards as the number of nodes, the default rebalance-threshold=10 is too high.

It can result in one node hosting ~2 times the number of shards of other nodes. Example: 1000 shards on 100 nodes means 10 shards per node. One node may have 19 shards and others 10 without a rebalance occurring.
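As a sketch of why this happens (a simplified check with a hypothetical helper, not the actual Akka implementation): the strategy only rebalances when the spread between the most and least loaded regions exceeds the threshold.

```scala
// Simplified sketch of the threshold check; `shouldRebalance` is a
// hypothetical helper for illustration, not part of Akka's API.
def shouldRebalance(mostShards: Int, leastShards: Int, rebalanceThreshold: Int): Boolean =
  mostShards - leastShards > rebalanceThreshold

// With the default threshold of 10, one node holding 19 shards next to
// nodes holding 10 is left alone: 19 - 10 = 9, which is not > 10.
```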

There is also a bug that can cause continuous rebalancing when rebalance-threshold=1.
Here is a reproducer test that can be added to LeastShardAllocationStrategy:

    "not rebalance one" in {
      val allocations = Map(regionA → Vector("shard1"), regionB → Vector("shard2"), regionC → Vector.empty)
      val strategy = new LeastShardAllocationStrategy(rebalanceThreshold = 1, maxSimultaneousRebalance = 2)
      strategy.rebalance(allocations, Set.empty).futureValue should ===(Set.empty[String])

      val allocations2 = Map(regionA → Vector.empty, regionB → Vector("shard2"), regionC → Vector("shard1"))
      strategy.rebalance(allocations2, Set.empty).futureValue should ===(Set.empty[String])
    }
@patriknw

Member Author

patriknw commented Nov 30, 2018

Thanks @WadeWaldron for bringing this to our attention.

@patriknw

Member Author

patriknw commented Dec 7, 2018

The workaround for the rebalance-threshold=1 bug, until this is fixed, is to use a value of at least 2.
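In the application configuration that would look like the following (assuming the standard settings path from Akka's reference configuration):

```
akka.cluster.sharding.least-shard-allocation-strategy {
  rebalance-threshold = 2
}
```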

@eloots

Contributor

eloots commented Dec 7, 2018

Note that with rebalance-threshold=2, something odd happens.
Imagine the following scenario with that setting applied: we have one node (one region) that has two shards in it. Adding a second node will trigger a rebalance ending in each node holding one shard. This end state is correct.
When rebalancing the shards, the rebalancing process should stop one of the shards on the first node and re-start it on the second node, thereby leaving the second shard on the first node untouched.
What is weird though is that, instead, both shards are stopped on the first node, followed by the re-start of one shard on the second node and another on the first node. So there's an unnecessary stop/re-start of one shard.

@WadeWaldron

Contributor

WadeWaldron commented Dec 7, 2018

@eloots Do you know if it behaves that way with other rebalance-threshold values? I.e., is this a problem that only occurs with rebalance-threshold=2, or is this the behaviour no matter what value is used?

@eloots

Contributor

eloots commented Dec 7, 2018

I just tried it with rebalance-threshold=3. On the first rebalance (after adding a second node), all 3 shards are stopped on the first node, after which two are re-started on that node (and one shard is started on the second node).
When a third node is added, rebalancing takes place from the first node (2 shards) to the third node (0 shards). Again, on the first node, the two shards are stopped and one of these shards is re-started on that node.
So there seems to be a pattern here…

@WadeWaldron

Contributor

WadeWaldron commented Dec 7, 2018

It seems like the way the rebalance algorithm works is probably to choose a node to rebalance and then stop all shards on that node, regardless of whether or not they will be migrated.

It does seem like it would be more efficient to first decide which ones to rebalance, and then only stop those.
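Sketched with a hypothetical helper (not the actual Akka internals), the more efficient approach would compute the surplus first and stop only those shards:

```scala
// Hypothetical sketch: select only the shards that actually need to move
// from the donor region to the receiver, leaving the rest running.
def shardsToMigrate(donorShards: Vector[String], receiverCount: Int): Vector[String] = {
  // Move just enough shards to even out the two regions.
  val surplus = (donorShards.size - receiverCount) / 2
  donorShards.take(surplus)
}
```

For the scenario above (two shards on one node, an empty second node), this selects a single shard to move and leaves the other untouched.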

@eloots

Contributor

eloots commented Dec 7, 2018

@WadeWaldron Indeed. This is wasteful and can potentially have a bigger impact than strictly needed.

@WadeWaldron

Contributor

WadeWaldron commented Dec 7, 2018

@eloots I would suggest creating a separate issue around the efficiency of the sharding algorithm.

@patriknw

Member Author

patriknw commented Dec 8, 2018

I will look into this on Monday

@patriknw patriknw self-assigned this Dec 10, 2018

patriknw added a commit that referenced this issue Dec 11, 2018

Improve default shard rebalancing algorithm, #26012
* Use rebalance-threshold=1 because it will give the best distribution,
  and previous default could result in too large difference between nodes
* Off by one error, difference > threshold vs >=
* Added more unit tests
* Note that in some cases it may still not be optimal, stopping more
  shards than necessary, but a different strategy that looks at more
  than most and least is out of scope for this issue. In practice
  those cases shouldn't matter much.
* Also note that the rebalance interval is by default 10 seconds,
  so typically shards will start up before next rebalance tick.
  It's intentionally a slow process to not cause instabilities by
  moving too much at the same time.
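Roughly, the corrected selection can be sketched like this (a simplified illustration, not the exact Akka source):

```scala
// Sketch of the fixed logic: trigger when the spread reaches the threshold
// (>= instead of >), but cap the number of moved shards at difference - 1
// so the donor never drops below the receiver, which is what caused the
// continuous ping-pong rebalance seen with rebalance-threshold = 1.
def shardsToMove(most: Int, least: Int, threshold: Int): Int = {
  val difference = most - least
  if (difference >= threshold) math.min(threshold, difference - 1) else 0
}
```

With threshold = 1, a region holding one shard next to an empty region moves nothing (matching the "not rebalance one" reproducer), while any larger spread is evened out one shard per rebalance tick.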
@eloots

Contributor

eloots commented Dec 11, 2018

@patriknw I tested bbae1ec on akka-sample-cqrs-scala which depends on setting least-shard-allocation-strategy.rebalance-threshold = 1 and it now shows the expected behaviour:

  • No unnecessary stopping and restarting of shards when all shards are fully spread across nodes (1 shard per node) and additional nodes are added (which will not host any shard).
  • In the case of a rebalance, no unnecessary stopping and restarting of shards that stay on the same node.

Thanks!

patriknw added a commit that referenced this issue Dec 21, 2018

Improve default shard rebalancing algorithm, #26012 (#26101)
* Improve default shard rebalancing algorithm, #26012

* Use rebalance-threshold=1 because it will give the best distribution,
  and previous default could result in too large difference between nodes
* Off by one error, difference > threshold vs >=
* Added more unit tests
* Note that in some cases it may still not be optimal, stopping more
  shards than necessary, but a different strategy that looks at more
  than most and least is out of scope for this issue. In practice
  those cases shouldn't matter much.
* Also note that the rebalance interval is by default 10 seconds,
  so typically shards will start up before next rebalance tick.
  It's intentionally a slow process to not cause instabilities by
  moving too much at the same time.

@patriknw patriknw added this to the 2.5.20 milestone Dec 21, 2018

@patriknw patriknw closed this Dec 21, 2018

@t3hnar

Contributor

t3hnar commented Jan 29, 2019

@patriknw

Member Author

patriknw commented Jan 29, 2019

thanks
