Add RuleSetCustomizer SPI for multi-stage planner Calcite rules #18387

Merged: gortiz merged 16 commits into apache:master from gortiz:broker-rule-customizer-spi on May 22, 2026

@gortiz gortiz commented Apr 30, 2026

Introduces a ServiceLoader-discovered SPI that lets plugins add, remove, reorder, or replace the Calcite HEP rules used by the multi-stage query engine. The OSS defaults are themselves contributed by DefaultRuleSetCustomizer (registered via META-INF/services), and plugin customizers run on top.

  • New Phase enum covers every HEP phase QueryEnvironment runs: BASIC, FILTER_PUSHDOWN, PROJECT_PUSHDOWN, PRUNE, POST_LOGICAL, POST_LOGICAL_V2, POST_LOGICAL_ENRICHED_JOIN. Append-only contract.
  • New RuleSetCustomizer interface — plugins implement customize(Phase, List<RelOptRule>) and modify the per-phase list in place.
  • New PinotRuleSet owns the per-phase, immutable rule lists and is the single source of truth read by QueryEnvironment. The default instance is built lazily from ServiceLoader discovery.
  • DefaultRuleSetCustomizer replaces the static PinotQueryRuleSets lists; the deleted class is the rename source.
  • QueryEnvironment reads every phase's rules through PinotRuleSet via a new Config#getRuleSet() @Value.Default. The per-query sortExchangeCopyLimit override stays inside getTraitProgram (per-query copy of POST_LOGICAL with all PinotSortExchangeCopyRule instances replaced by the override).

Per-query usePlannerRules / skipPlannerRules filtering is unchanged (still applied on top of the rule lists in getOptProgram).
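
A minimal sketch of the pipeline described above, under stated assumptions: `Rule` is a stand-in for Calcite's `RelOptRule`, the `Phase` enum lists only two of the phases, and `DefaultsCustomizer`, `PluginCustomizer`, `RuleSets.build`, and all rule names are hypothetical. It only illustrates the shape of the contract (customizers mutate a per-phase list in place, defaults first, plugins on top, and the result is frozen); the real signatures are in the PR's source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Entry point first so the file can be run directly with `java RuleSetDemo.java`.
final class RuleSetDemo {
  public static void main(String[] args) {
    Map<Phase, List<Rule>> ruleSet =
        RuleSets.build(List.of(new DefaultsCustomizer(), new PluginCustomizer()));
    System.out.println(ruleSet);
  }
}

// Stand-in for org.apache.calcite.plan.RelOptRule (assumption: only a name matters here).
final class Rule {
  final String name;

  Rule(String name) {
    this.name = name;
  }

  @Override
  public String toString() {
    return name;
  }
}

// Subset of the phases named in the description, for illustration only.
enum Phase { BASIC, FILTER_PUSHDOWN }

// Same shape as the SPI method in the description: customize(Phase, List<rule>).
interface RuleSetCustomizer {
  void customize(Phase phase, List<Rule> rules);
}

// Plays the role of the customizer that contributes the built-in defaults.
final class DefaultsCustomizer implements RuleSetCustomizer {
  @Override
  public void customize(Phase phase, List<Rule> rules) {
    if (phase == Phase.BASIC) {
      rules.add(new Rule("DefaultBasicRule"));
    } else if (phase == Phase.FILTER_PUSHDOWN) {
      rules.add(new Rule("DefaultFilterRule"));
    }
  }
}

// Hypothetical plugin: replaces the default filter rule, leaves other phases alone.
final class PluginCustomizer implements RuleSetCustomizer {
  @Override
  public void customize(Phase phase, List<Rule> rules) {
    if (phase == Phase.FILTER_PUSHDOWN) {
      rules.removeIf(r -> r.name.equals("DefaultFilterRule"));
      rules.add(0, new Rule("PluginFilterRule"));
    }
  }
}

final class RuleSets {
  // Runs every customizer, in order, over a fresh mutable list per phase,
  // then exposes the result through unmodifiable views.
  static Map<Phase, List<Rule>> build(List<RuleSetCustomizer> customizers) {
    Map<Phase, List<Rule>> byPhase = new EnumMap<>(Phase.class);
    for (Phase phase : Phase.values()) {
      List<Rule> rules = new ArrayList<>();
      for (RuleSetCustomizer customizer : customizers) {
        customizer.customize(phase, rules);
      }
      byPhase.put(phase, Collections.unmodifiableList(rules));
    }
    return Collections.unmodifiableMap(byPhase);
  }
}
```

In this sketch the order of the customizer list decides who sees whose changes, which is why the defaults are passed first.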

@gortiz gortiz added the labels extension-point (Adds or modifies an extension/SPI point) and multi-stage (Related to the multi-stage query engine) on Apr 30, 2026

@xiangfu0 xiangfu0 left a comment

Found a few high-signal SPI/plugin issues; see inline comments.

gortiz and others added 6 commits May 11, 2026 15:33
Three issues raised in PR review:

1. Restore deprecated PinotQueryRuleSets bridge. The original class
   org.apache.pinot.calcite.rel.rules.PinotQueryRuleSets had a public API
   (five static rule lists + getPinotPostRules). Deleting it was a backward-
   compat break for out-of-tree code. The bridge forwards to the canonical
   constants in DefaultRuleSetCustomizer; per-query sortExchangeCopyLimit
   overrides are now handled by QueryEnvironment.getTraitProgram.
   DefaultRuleSetCustomizer rule-list fields made public to support this.

2. Enumerate plugin classloaders in loadFromServiceLoader. ServiceLoader.load
   with the context classloader only finds application-classpath customizers;
   plugin JARs in isolated ClassRealm/PluginClassLoader realms are invisible.
   PinotRuleSet.loadFromServiceLoader now iterates
   PluginManager.getPluginClassLoaders() and does a per-classloader
   ServiceLoader.load. Duplicates
   (same class name discovered via both classloaders) are de-duped by class
   name. RuleSetCustomizer Javadoc explains the pinot-plugin.properties
   importFrom.pinot requirement for plugins implementing this interface.

3. Add getPluginClassLoaders() to PluginManager. Returns the set of all
   loaded plugin classloaders (old-style PluginClassLoader registry and
   new-style ClassRealm realms), excluding the DEFAULT slot. Used by
   loadFromServiceLoader for plugin discovery. Covered by two new tests in
   PluginManagerTest.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
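
The discovery loop from item 2 can be sketched with the JDK's `ServiceLoader` alone. This is an illustration under assumptions, not the PR's code: `Discovery.loadAll` is a hypothetical helper, the system class loader listed twice stands in for the classloaders that `PluginManager.getPluginClassLoaders()` would return, and `java.util.spi.ToolProvider` is used only because it is a JDK service that usually has providers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.spi.ToolProvider;

// Entry point first so the file can be run directly with `java DiscoveryDemo.java`.
final class DiscoveryDemo {
  public static void main(String[] args) {
    ClassLoader sys = ClassLoader.getSystemClassLoader();
    // Listing the same loader twice imitates a provider visible through two classloaders.
    List<ToolProvider> found = Discovery.loadAll(ToolProvider.class, List.of(sys, sys));
    for (ToolProvider provider : found) {
      System.out.println(provider.getClass().getName());
    }
  }
}

final class Discovery {
  // One ServiceLoader.load per classloader; the first instance seen for a
  // given implementation class name wins, later ones are dropped.
  static <T> List<T> loadAll(Class<T> service, List<ClassLoader> loaders) {
    Map<String, T> byClassName = new LinkedHashMap<>();
    for (ClassLoader loader : loaders) {
      for (T provider : ServiceLoader.load(service, loader)) {
        byClassName.putIfAbsent(provider.getClass().getName(), provider);
      }
    }
    return new ArrayList<>(byClassName.values());
  }
}
```

Keying on the class name rather than the `Class` object matters in this sketch because the same implementation loaded by two different classloaders yields two distinct `Class` objects with the same name.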
Phase and RuleSetCustomizer (and their ServiceLoader registration) move from
pinot-query-planner to a new pinot-query-planner-spi module that only depends
on calcite-core. This lets plugins implement the broker rule-customization SPI
without pulling in the full query planner on their compile classpath.

PluginManager now exports org.apache.pinot.query.planner.rules and
org.apache.calcite.plan from the pinot realm into every plugin realm so
plugins can link RuleSetCustomizer and RelOptRule without bundling or shading
those packages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
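
The effect of exporting a package from the host realm into a plugin realm can be imitated with a plain JDK `ClassLoader`. This is a sketch under assumptions: it does not use PluginManager or its realm classes, and `ImportingClassLoader`, `SharedApi`, and the prefix list are hypothetical names. The point it shows is that an otherwise isolated loader resolves an imported name to the host's own `Class` object, so the plugin does not need to bundle or shade it.

```java
import java.util.List;

// Entry point first so the file can be run directly with `java RealmExportDemo.java`.
final class RealmExportDemo {
  public static void main(String[] args) throws Exception {
    ClassLoader host = SharedApi.class.getClassLoader();
    String apiName = SharedApi.class.getName();
    ClassLoader plugin = new ImportingClassLoader(host, List.of(apiName));
    // The plugin loader hands back the host's Class object for the imported name.
    System.out.println(plugin.loadClass(apiName) == SharedApi.class);
  }
}

// Hypothetical SPI type that the host wants plugins to link against.
interface SharedApi { }

// Isolated loader: sees bootstrap classes plus explicitly imported name prefixes.
final class ImportingClassLoader extends ClassLoader {
  private final ClassLoader exporter;
  private final List<String> importedPrefixes;

  ImportingClassLoader(ClassLoader exporter, List<String> importedPrefixes) {
    super(null); // null parent: delegate only to the bootstrap loader
    this.exporter = exporter;
    this.importedPrefixes = importedPrefixes;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    for (String prefix : importedPrefixes) {
      if (name.startsWith(prefix)) {
        // Imported names are resolved by the exporting (host) loader.
        return exporter.loadClass(name);
      }
    }
    // Everything else stays isolated; unknown names end in ClassNotFoundException.
    return super.loadClass(name, resolve);
  }
}
```

Without the import entry, the same loader cannot see `SharedApi` at all, which mirrors why a plugin could not link against the SPI types before the packages were exported.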
…_LOGICAL_V2, add realm test

- Move META-INF/services from pinot-query-planner-spi to pinot-query-planner:
  the SPI jar should not register DefaultRuleSetCustomizer which lives in the
  implementation module.

- Rename Phase.POST_LOGICAL_V2 → Phase.POST_LOGICAL_PHYSICAL: the V2 suffix
  leaked an internal versioning convention into a permanent SPI enum name.
  The new name encodes the semantic (physical optimizer enabled). All call
  sites updated (DefaultRuleSetCustomizer, QueryEnvironment, PinotRuleSetTest,
  PinotQueryRuleSets bridge).

- Add PluginRealmExportTest in pinot-query-planner-spi: verifies that
  PluginManager exports org.apache.pinot.query.planner.rules and
  org.apache.calcite.plan from the pinotRealm so plugin classloaders can load
  RuleSetCustomizer and RelOptRule without bundling those packages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The package-info Maven plugin generates a package-info.java annotated with
@javax.annotation.ParametersAreNonnullByDefault for every module. Unlike
other modules, pinot-query-planner-spi has no transitive dependency that
pulls in javax.annotation-api, causing a compilation failure in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… delegates

Reverses the movement of rule lists out of PinotQueryRuleSets into
DefaultRuleSetCustomizer, reducing the diff of this PR. PinotQueryRuleSets
keeps BASIC_RULES, FILTER_PUSHDOWN_RULES, PROJECT_PUSHDOWN_RULES, PRUNE_RULES,
PINOT_POST_RULES_V2, and the new POST_LOGICAL_RULES (extracted from the old
getPinotPostRules). DefaultRuleSetCustomizer.customize() simply delegates to
those lists. A TODO comment notes the rules may be consolidated into
DefaultRuleSetCustomizer once the RuleSetCustomizer SPI is the established
extension point.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ParametersAreNonnullByDefault (used in generated package-info.java) is from
com.google.code.findbugs:jsr305, not javax.annotation:javax.annotation-api.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@yashmayya yashmayya left a comment

It looks like there are a number of CI failures that need to be addressed.

…ginManager realm exports

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

codecov-commenter commented May 13, 2026

Codecov Report

❌ Patch coverage is 88.88889% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.25%. Comparing base (a0cba7b) to head (fc6db34).
⚠️ Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...apache/pinot/query/planner/rules/PinotRuleSet.java | 76.66% | 5 Missing and 2 partials ⚠️ |
| .../query/planner/rules/DefaultRuleSetCustomizer.java | 87.50% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18387      +/-   ##
============================================
- Coverage     64.29%   64.25%   -0.04%     
- Complexity     1126     1127       +1     
============================================
  Files          3311     3314       +3     
  Lines        203827   203890      +63     
  Branches      31721    31733      +12     
============================================
- Hits         131048   131009      -39     
- Misses        62252    62374     +122     
+ Partials      10527    10507      -20     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-21 | 64.25% <88.88%> (-0.04%) ⬇️ |
| temurin | 64.25% <88.88%> (-0.04%) ⬇️ |
| unittests | 64.25% <88.88%> (-0.04%) ⬇️ |
| unittests1 | 56.71% <88.88%> (+0.01%) ⬆️ |
| unittests2 | 35.50% <64.19%> (-0.05%) ⬇️ |

gortiz and others added 4 commits May 13, 2026 12:59

@yashmayya yashmayya left a comment

Lint issues need to be resolved and I have some minor comments, but this looks good overall

gortiz and others added 2 commits May 20, 2026 17:37
…iched-join phase, fix stale comment

- Rename Phase.POST_LOGICAL_PHYSICAL → POST_LOGICAL_PHYSICAL_OPT for clarity
- Remove Phase.POST_LOGICAL_ENRICHED_JOIN and its QueryEnvironment wiring
  (enriched join was experimental and didn't work well; likely to be removed)
- Fix stale package name in PluginRealmExportTest comment
- Fix pre-existing blank-line-before-closing-brace checkstyle violation in PinotQueryRuleSets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gortiz and others added 2 commits May 21, 2026 14:51
… enum value

The previous commit removed Phase.POST_LOGICAL_ENRICHED_JOIN from the SPI, but
also dropped the QueryEnvironment wiring that applies PinotEnrichedJoinRule when
usePlannerRules='JoinToEnrichedJoin'. This broke ResourceBasedQueryPlansTest (6
failures). The rule collection is disabled by default (JOIN_TO_ENRICHED_JOIN is
in DEFAULT_DISABLED_RULES) and is applied directly, without going through a Phase.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
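
How a default-disabled rule interacts with the per-query `usePlannerRules` / `skipPlannerRules` options can be sketched as follows. The precedence shown here is an assumption made for illustration (default-disabled rules are opt-in, all others are opt-out), and `RuleToggles.isEnabled` is a hypothetical helper rather than code from `QueryEnvironment`.

```java
import java.util.Set;

// Entry point first so the file can be run directly with `java RuleToggleDemo.java`.
final class RuleToggleDemo {
  public static void main(String[] args) {
    Set<String> defaultDisabled = Set.of("JoinToEnrichedJoin");
    // Not requested by the query: the default-disabled rule stays off.
    System.out.println(
        RuleToggles.isEnabled("JoinToEnrichedJoin", Set.of(), Set.of(), defaultDisabled));
    // Requested via the use-set: the rule is switched on for this query.
    System.out.println(
        RuleToggles.isEnabled(
            "JoinToEnrichedJoin", Set.of("JoinToEnrichedJoin"), Set.of(), defaultDisabled));
  }
}

final class RuleToggles {
  // Assumed semantics: default-disabled rules run only when explicitly used;
  // every other rule runs unless explicitly skipped.
  static boolean isEnabled(
      String rule, Set<String> use, Set<String> skip, Set<String> defaultDisabled) {
    if (defaultDisabled.contains(rule)) {
      return use.contains(rule);
    }
    return !skip.contains(rule);
  }
}
```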
@gortiz gortiz merged commit bc2f955 into apache:master May 22, 2026
13 checks passed
@xiangfu0

Docs PR: pinot-contrib/pinot-docs#821

xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request May 22, 2026
## Summary

Documents the broker-side `RuleSetCustomizer` SPI added in
[apache/pinot#18387](apache/pinot#18387).

## Changes

- add a planner-rule customizers subsection to the plugin architecture
  page
- explain ServiceLoader discovery across the broker classpath and plugin
  classloaders
- call out one-time initialization and restart requirements
- note the upgrade-sensitive nature of this SPI

## Validation

- cross-checked against the local `apache/pinot` source in
  `pinot-query-planner-spi`
- ran `git diff --check`