Add even Kafka partition distribution between all or set of indexers

Currently, QW tries to balance load across indexers using the number of pipelines as the measure of work.
Issue https://github.com/quickwit-oss/quickwit/issues/5833 discussed even pipeline distribution.
But even when pipelines are distributed evenly, the load across indexers can still be uneven.
Here is my case:
* Kafka source for **topic1** with 24 partitions and 4 pipelines. This topic is heavily loaded: it carries more than 10 times the traffic of the other topics.
* Kafka source for **topic2** with 24 partitions and also 4 pipelines. This topic carries very little data over time.

When you run these 2 sources on a QW cluster with 2 instances, you get the following outcome:
* Indexer1 gets all pipelines for **topic1**
* Indexer2 gets all pipelines for **topic2**

So pipelines are distributed evenly, but
Indexer1 gets all partitions of the Kafka topic that is 10 times more loaded than the topic assigned to Indexer2.
This causes Indexer1 to run at 100% CPU and lag in processing **topic1** messages.
At the same time, Indexer2 uses less than 10% CPU and sits almost idle.
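To make the imbalance concrete, here is a rough back-of-the-envelope sketch in Python. It is not QW code: the 10:1 load ratio comes from the numbers above, and it assumes (my assumption) that each of a source's 4 pipelines handles an equal share of that topic's traffic. The names `TOPIC_LOAD` and `indexer_load` are made up for illustration.

```python
# Relative ingest rate per topic. The 10:1 ratio is taken from the description
# above; the absolute units are arbitrary.
TOPIC_LOAD = {"topic1": 10.0, "topic2": 1.0}
PIPELINES_PER_SOURCE = 4


def indexer_load(placement):
    """Map {indexer: {topic: pipeline_count}} to {indexer: relative load}.

    Assumes every pipeline of a source carries an equal share of the topic.
    """
    return {
        indexer: sum(
            TOPIC_LOAD[topic] * count / PIPELINES_PER_SOURCE
            for topic, count in pipelines.items()
        )
        for indexer, pipelines in placement.items()
    }


# Placement observed in the cluster: 4 pipelines on each indexer.
observed = {"indexer1": {"topic1": 4}, "indexer2": {"topic2": 4}}
print(indexer_load(observed))
```

Both indexers hold the same number of pipelines, yet under these assumptions Indexer1 carries roughly ten times the work of Indexer2, which matches the CPU picture I see.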

I tried the following tricks to force QW to spread partitions across the indexers:
* Set `cpu_capacity` to 1m on all indexers. It didn't help.
* Tested the image from https://github.com/quickwit-oss/quickwit/issues/5833 , which is `quickwit/quickwit:qw-collocation-20250710`. It also didn't help with partition distribution, but under load I noticed a more even pipeline spread.
* Added a 3rd indexer. That also didn't affect partition distribution.

It looks like QW has some logic in the code that tries to put pipelines related to the same source/topic on the same indexer. I say this because even with random distribution I should at some point have seen a topic's partitions spread across multiple instances.
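For comparison, this is roughly the placement I would like to be possible, written as a small Python sketch. It is only an illustration of the requested behavior and not how the QW scheduler works; `spread_per_source` is a hypothetical helper that deals each source's pipelines out to the indexers round-robin.

```python
from itertools import cycle


def spread_per_source(sources, indexers):
    """Deal each source's pipelines out to the indexers in round-robin order.

    sources:  {source_name: num_pipelines}
    indexers: list of indexer names
    Returns {indexer: {source_name: pipeline_count}}.
    """
    placement = {indexer: {} for indexer in indexers}
    next_indexer = cycle(indexers)
    for source, num_pipelines in sources.items():
        for _ in range(num_pipelines):
            indexer = next(next_indexer)
            placement[indexer][source] = placement[indexer].get(source, 0) + 1
    return placement


placement = spread_per_source(
    {"topic1": 4, "topic2": 4},
    ["indexer1", "indexer2"],
)
print(placement)
```

With 2 indexers, each one ends up with 2 pipelines of **topic1** and 2 pipelines of **topic2**, so the heavy topic is shared instead of landing on a single node.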

Here is a screenshot of CPU load on the indexers:

<img width="2411" height="640" alt="Image" src="https://github.com/user-attachments/assets/afc84dc6-391c-4c85-b471-283fa90039ba" />

Let me know if you need more details. I am also ready to test custom builds if you want to try something.

Thanks,

Add even Kafka partition distribution between all or set of indexers #5924
