Skip to content

Limit to 3 pipelines per node per source #5792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: impr-shard-collocation
Choose a base branch
from

Conversation

rdettai
Copy link
Collaborator

@rdettai rdettai commented Jun 10, 2025

Description

Closes #4470
Closes #5747
Closes #4630

This addresses both the limitation that only 1 merge pipelines can run per indexer at any given time and the fact that nodes systematically end up with all pipelines of a source when using Kafka, even if the number of pipelines for that source is rather large.

How was this PR tested?

Added unit test, should add more

@fmassot
Copy link
Collaborator

fmassot commented Jun 10, 2025

why 3?

@rdettai
Copy link
Collaborator Author

rdettai commented Jun 10, 2025

why 3?

It's a ratio proposed by Paul here. The rational is that if you you start saturating the systems, merges being 3x faster than indexing, with this ratio merges would be able to keep up.

@rdettai rdettai requested a review from fulmicoton-dd June 10, 2025 13:50
@rdettai
Copy link
Collaborator Author

rdettai commented Jun 10, 2025

There are 2 issues with the logic so far:

  1. when you have 4 pipelines, they will be split into 3 and 1, which is not ideally balanced
  2. upon new iterations the extra pipeline might end up on other nodes (e.g 3 on a node, and 1 on each other nodes)

EDIT: 2) is solved in #5808

@rdettai rdettai marked this pull request as draft June 11, 2025 08:33
@rdettai rdettai force-pushed the explicit-scheduling-rescaling branch from c065f94 to 1b57434 Compare June 11, 2025 09:20
@rdettai
Copy link
Collaborator Author

rdettai commented Jun 11, 2025

Actually this is wrong for now, I thought shards in the simplified problem were indexing pipelines in the physical plan, but it's not true. Shards in the simplified problem are physical shards, and they are mapped to pipelines in

fn convert_scheduling_solution_to_physical_plan_single_node_single_source(

EDIT: solved now

Base automatically changed from explicit-scheduling-rescaling to main June 11, 2025 12:47
@rdettai rdettai force-pushed the limit-3-pipelines-per-node branch from 2c62834 to 0648c42 Compare June 11, 2025 12:47
@rdettai rdettai changed the base branch from main to test-solution-stability June 11, 2025 12:50
@rdettai rdettai force-pushed the limit-3-pipelines-per-node branch 2 times, most recently from e6c4ebd to 761073e Compare June 12, 2025 12:16
Comment on lines +359 to +414
// To ensure that merges can keep up, we try not to assign more than 3
// pipelines per indexer for a source (except if there aren't enough nodes).
let target_limit_num_shards_per_indexer_per_source =
3 * MAX_LOAD_PER_PIPELINE.cpu_millis() / source.load_per_shard.get();
Copy link
Collaborator Author

@rdettai rdettai Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is creating some undesired coupling with the rest of the code:

  • we rely on convert_scheduling_solution_to_physical_plan to use the same MAX_LOAD_PER_PIPELINE to create the right amount of pipelines
  • we rely on the default load_per_pipeline for non-ingest sources (e.g Kafka) to also use MAX_LOAD_PER_PIPELINE.

@rdettai rdettai force-pushed the test-solution-stability branch from b62e363 to 2aa62ec Compare June 19, 2025 07:53
@rdettai rdettai force-pushed the limit-3-pipelines-per-node branch from d1003c7 to 38cd299 Compare June 19, 2025 09:05
@rdettai rdettai changed the base branch from test-solution-stability to impr-shard-collocation June 19, 2025 09:06
@rdettai rdettai marked this pull request as ready for review June 19, 2025 09:14
@rdettai rdettai force-pushed the limit-3-pipelines-per-node branch from 1a870a8 to 9513e00 Compare June 19, 2025 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants