
Add WorkerManager extension points for customizing leaf-stage segment assignment #18645

Merged
KKcorps merged 1 commit into apache:master from KKcorps:workermanager-leaf-segment-filter-hooks on Jun 2, 2026

@KKcorps KKcorps commented Jun 1, 2026

Description

Subclasses of WorkerManager can already customize multi-stage leaf-stage worker selection by overriding
getCandidateServers / getCandidateServersForReplicatedLeaf, but the per-worker segment assignment is finalized
internally with no extension point to adjust it.

This PR adds two protected no-op extension points, invoked once the leaf-stage segment assignment has been built:

  • filterLeafStageSegments(DispatchablePlanContext, DispatchablePlanMetadata) — invoked from
    updateContextForLeafStage, which the non-partitioned, partitioned, and logical-table leaf paths all funnel through.
  • filterReplicatedLeafStageSegments(DispatchablePlanContext, DispatchablePlanMetadata) — invoked from
    setSegmentsForReplicatedLeafFragment for the replicated (broadcast) leaf path.
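
The call order described above can be sketched with stand-in types. This is a minimal illustration, not the Pinot source: `Metadata`, `BaseManager`, and `assignLeafStage` are hypothetical simplifications (the real hook also receives a `DispatchablePlanContext`), and only the "build the assignment, then invoke a no-op hook" shape is taken from the description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LeafHookOrderSketch {
  /** Hypothetical stand-in for DispatchablePlanMetadata: worker id -> segment names. */
  static class Metadata {
    Map<Integer, List<String>> workerIdToSegments = new HashMap<>();
  }

  /** Hypothetical stand-in for WorkerManager; only the hook call order is modeled. */
  static class BaseManager {
    // Models updateContextForLeafStage: finalize the assignment, then call the hook.
    final void assignLeafStage(Metadata metadata, Map<Integer, List<String>> assignment) {
      metadata.workerIdToSegments = new HashMap<>(assignment);
      filterLeafStageSegments(metadata);
    }

    // No-op by default, so the base class leaves the assignment untouched.
    protected void filterLeafStageSegments(Metadata metadata) {
    }
  }

  public static void main(String[] args) {
    BaseManager manager = new BaseManager();
    Metadata metadata = new Metadata();
    manager.assignLeafStage(metadata, Map.of(0, List.of("seg_0", "seg_1")));
    // With the default hook, the printed map equals the assignment passed in.
    System.out.println(metadata.workerIdToSegments);
  }
}
```

Because the hook runs after the assignment is stored, an override sees the complete per-worker map and can replace it before the plan is dispatched.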

A subclass can rewrite DispatchablePlanMetadata#getWorkerIdToSegmentsMap() / #getReplicatedSegments() through the
existing setters — for example to apply a custom segment-pruning policy — with access to the query options via
DispatchablePlanContext#getPlannerContext().getOptions().
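
As a rough sketch of such a subclass, the example below prunes segments by name prefix. Everything here is a stand-in so the snippet compiles on its own: `PlanContext`, `PlanMetadata`, and `BaseManager` are hypothetical simplifications of the Pinot classes, the setter name `setWorkerIdToSegmentsMap` and the worker id -> table type -> segments map shape are assumptions, and the `excludeSegmentPrefix` query option is invented for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SegmentFilterHookSketch {
  /** Hypothetical stand-in for DispatchablePlanContext; exposes only query options. */
  static class PlanContext {
    private final Map<String, String> options;
    PlanContext(Map<String, String> options) { this.options = options; }
    Map<String, String> getOptions() { return options; }
  }

  /** Hypothetical stand-in for DispatchablePlanMetadata (map shape is an assumption). */
  static class PlanMetadata {
    private Map<Integer, Map<String, List<String>>> workerIdToSegmentsMap = new HashMap<>();
    Map<Integer, Map<String, List<String>>> getWorkerIdToSegmentsMap() { return workerIdToSegmentsMap; }
    void setWorkerIdToSegmentsMap(Map<Integer, Map<String, List<String>>> map) { workerIdToSegmentsMap = map; }
  }

  /** Hypothetical stand-in for WorkerManager with the no-op hook. */
  static class BaseManager {
    protected void filterLeafStageSegments(PlanContext context, PlanMetadata metadata) {
    }
  }

  /** Custom pruning policy: drop segments whose name starts with a query-supplied prefix. */
  static class PrefixPruningManager extends BaseManager {
    static final String OPTION = "excludeSegmentPrefix"; // invented option name

    @Override
    protected void filterLeafStageSegments(PlanContext context, PlanMetadata metadata) {
      String prefix = context.getOptions().get(OPTION);
      if (prefix == null || prefix.isEmpty()) {
        return; // option absent: behave exactly like the base class
      }
      Map<Integer, Map<String, List<String>>> pruned = new HashMap<>();
      metadata.getWorkerIdToSegmentsMap().forEach((workerId, byType) -> {
        Map<String, List<String>> keptByType = new HashMap<>();
        byType.forEach((tableType, segments) -> keptByType.put(tableType,
            segments.stream().filter(s -> !s.startsWith(prefix)).collect(Collectors.toList())));
        pruned.put(workerId, keptByType);
      });
      // Rewrite the assignment through the setter instead of mutating it in place.
      metadata.setWorkerIdToSegmentsMap(pruned);
    }
  }

  public static void main(String[] args) {
    PlanMetadata metadata = new PlanMetadata();
    metadata.setWorkerIdToSegmentsMap(Map.of(0, Map.of("OFFLINE", List.of("tmp_1", "events_1"))));
    PlanContext context = new PlanContext(Map.of(PrefixPruningManager.OPTION, "tmp_"));
    new PrefixPruningManager().filterLeafStageSegments(context, metadata);
    System.out.println(metadata.getWorkerIdToSegmentsMap());
  }
}
```

The sketch keeps workers whose segment list becomes empty. Whether such workers should be dropped from the assignment depends on how the dispatcher treats them, which the description does not cover.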

Backward compatibility

Purely additive. Both hooks default to no-ops, so behavior is unchanged for the base WorkerManager and all existing
subclasses.

Testing

Existing multi-stage planning / WorkerManager tests continue to pass; the hooks are no-ops by default.

Release notes

No behavior change — new protected extension points only.

… assignment

Subclasses can already customize multi-stage worker selection by overriding getCandidateServers / getCandidateServersForReplicatedLeaf, but the per-worker segment assignment is finalized internally with no extension point. This adds two protected no-op hooks invoked once the assignment is built: filterLeafStageSegments (from updateContextForLeafStage, covering the non-partitioned, partitioned and logical-table paths) and filterReplicatedLeafStageSegments (from setSegmentsForReplicatedLeafFragment). A subclass can rewrite DispatchablePlanMetadata's worker/replicated segment maps via the existing setters. Both default to no-ops, so behavior is unchanged.
@KKcorps KKcorps force-pushed the workermanager-leaf-segment-filter-hooks branch from d656769 to 24d4d54 Compare June 1, 2026 13:55
@KKcorps KKcorps requested review from gortiz and yashmayya June 1, 2026 14:06

codecov-commenter commented Jun 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.40%. Comparing base (6ed151a) to head (24d4d54).

Additional details and impacted files
@@            Coverage Diff            @@
##             master   #18645   +/-   ##
=========================================
  Coverage     64.39%   64.40%           
  Complexity     1282     1282           
=========================================
  Files          3362     3362           
  Lines        207915   207919    +4     
  Branches      32463    32464    +1     
=========================================
+ Hits         133883   133900   +17     
+ Misses        63258    63245   -13     
  Partials      10774    10774           
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (ø) | |
| integration | 100.00% <ø> (ø) | |
| integration1 | 100.00% <ø> (ø) | |
| integration2 | 0.00% <ø> (ø) | |
| java-21 | 64.40% <100.00%> (+<0.01%) | ⬆️ |
| temurin | 64.40% <100.00%> (+<0.01%) | ⬆️ |
| unittests | 64.39% <100.00%> (+<0.01%) | ⬆️ |
| unittests1 | 56.79% <100.00%> (+<0.01%) | ⬆️ |
| unittests2 | 37.15% <50.00%> (+0.01%) | ⬆️ |

@yashmayya yashmayya added the multi-stage (Related to the multi-stage query engine) and extension-point (Adds or modifies an extension/SPI point) labels Jun 2, 2026
@KKcorps KKcorps merged commit 3693493 into apache:master Jun 2, 2026
11 checks passed
xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Jun 2, 2026

xiangfu0 commented Jun 2, 2026

Opened the docs follow-up PR for this change: pinot-contrib/pinot-docs#848

xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Jun 2, 2026
## Summary
- document the new post-assignment `WorkerManager` hooks for customizing
multi-stage leaf segment routing
- explain where subclasses can read query options and rewrite
`DispatchablePlanMetadata` segment maps
- position the hooks as an upgrade-sensitive code-level extension, not a
stable plugin family

## Cross-check
- verified the merged Apache Pinot behavior against
`apache/pinot#18645`, including the new `filterLeafStageSegments(...)`
and `filterReplicatedLeafStageSegments(...)` hooks and the existing
candidate-server hooks

## Validation
- `git diff --check`