Distribute leaders uniformly in the cluster in a best-effort way #7258
Labels: area/performance, kind/feature, scope/broker
In Zeebe, there is no way to control which node becomes the leader of which partition. Raft leader election is based on randomized timeout values, which cannot be controlled. As a result, leaders are frequently concentrated on a small number of nodes.
Because leaders typically do more work than followers, this concentration can easily become a performance bottleneck. It is also inefficient in terms of resource allocation: nodes must always be over-provisioned to get good performance. To improve the performance of the system and make better use of resources, the leaders should be distributed uniformly among the nodes.
We propose to use priority-based election in Raft. The solution and alternative approaches are described in zeebe-io/enhancements#15. The goal is not a strictly uniform distribution, but one that is achieved in a best-effort way.
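To make the idea concrete, here is a minimal sketch of how priorities could steer leadership. It is not the design from zeebe-io/enhancements#15. The round-robin priority assignment, the timeout scaling, and all names (`assign_priorities`, `preferred_leaders`, `election_timeout_ms`) are assumptions made for illustration only.

```python
from collections import Counter


def assign_priorities(partition_count, node_count, replication_factor):
    """Return {partition_id: {node_id: priority}}.

    Hypothetical scheme: replicas of partition p are placed round-robin,
    and the first replica gets the highest priority.
    """
    if replication_factor > node_count:
        raise ValueError("replication_factor must not exceed node_count")
    priorities = {}
    for partition in range(1, partition_count + 1):
        members = {}
        for offset in range(replication_factor):
            node = (partition - 1 + offset) % node_count
            # Highest priority for offset 0, decreasing afterwards.
            members[node] = replication_factor - offset
        priorities[partition] = members
    return priorities


def preferred_leaders(priorities):
    """Return {partition_id: node_id} for the highest-priority replica."""
    return {p: max(m, key=m.get) for p, m in priorities.items()}


def election_timeout_ms(base_timeout_ms, priority, max_priority):
    """Assumed scaling: lower-priority replicas wait longer before
    starting an election, so the preferred replica usually goes first."""
    return base_timeout_ms * (1 + (max_priority - priority))


if __name__ == "__main__":
    prio = assign_priorities(partition_count=6, node_count=3, replication_factor=3)
    leaders = preferred_leaders(prio)
    # Number of partitions each node is the preferred leader for.
    print(Counter(leaders.values()))
```

The distribution stays best-effort in this sketch: if the preferred replica is unavailable or its log is behind, a lower-priority replica can still time out and win the election. A real implementation would presumably also keep some randomization in the timeouts.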
A PoC has been evaluated, and the results are explained in #7223.