Add RuleSetCustomizer SPI for multi-stage planner Calcite rules#18387
Conversation
Introduces a ServiceLoader-discovered SPI that lets plugins add, remove, reorder, or replace the Calcite HEP rules used by the multi-stage query engine. The OSS defaults are themselves contributed by DefaultRuleSetCustomizer (registered via META-INF/services), and plugin customizers run on top. - New `Phase` enum covers every HEP phase QueryEnvironment runs: BASIC, FILTER_PUSHDOWN, PROJECT_PUSHDOWN, PRUNE, POST_LOGICAL, POST_LOGICAL_V2, POST_LOGICAL_ENRICHED_JOIN. Append-only contract. - New `RuleSetCustomizer` interface — plugins implement `customize(Phase, List<RelOptRule>)` and modify the per-phase list in place. - New `PinotRuleSet` owns the per-phase, immutable rule lists and is the single source of truth read by `QueryEnvironment`. The default instance is built lazily from `ServiceLoader` discovery. - `DefaultRuleSetCustomizer` replaces the static `PinotQueryRuleSets` lists; the deleted class is the rename source. - `QueryEnvironment` reads every phase's rules through `PinotRuleSet` via a new `Config#getRuleSet()` `@Value.Default`. The per-query `sortExchangeCopyLimit` override stays inside `getTraitProgram` (per-query copy of POST_LOGICAL with all `PinotSortExchangeCopyRule` instances replaced by the override). Per-query `usePlannerRules` / `skipPlannerRules` filtering is unchanged (still applied on top of the rule lists in `getOptProgram`).
xiangfu0
left a comment
There was a problem hiding this comment.
Found a few high-signal SPI/plugin issues; see inline comments.
Three issues raised in PR review: 1. Restore deprecated PinotQueryRuleSets bridge. The original class org.apache.pinot.calcite.rel.rules.PinotQueryRuleSets had a public API (five static rule lists + getPinotPostRules). Deleting it was a backward- compat break for out-of-tree code. The bridge forwards to the canonical constants in DefaultRuleSetCustomizer; per-query sortExchangeCopyLimit overrides are now handled by QueryEnvironment.getTraitProgram. DefaultRuleSetCustomizer rule-list fields made public to support this. 2. Enumerate plugin classloaders in loadFromServiceLoader. ServiceLoader.load with the context classloader only finds application-classpath customizers; plugin JARs in isolated ClassRealm/PluginClassLoader realms are invisible. PinotRuleSet.loadFromServiceLoader now iterates PluginManager.getPlugin ClassLoaders() and does a per-classloader ServiceLoader.load. Duplicates (same class name discovered via both classloaders) are de-duped by class name. RuleSetCustomizer Javadoc explains the pinot-plugin.properties importFrom.pinot requirement for plugins implementing this interface. 3. Add getPluginClassLoaders() to PluginManager. Returns the set of all loaded plugin classloaders (old-style PluginClassLoader registry and new-style ClassRealm realms), excluding the DEFAULT slot. Used by loadFromServiceLoader for plugin discovery. Covered by two new tests in PluginManagerTest. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase and RuleSetCustomizer (and their ServiceLoader registration) move from pinot-query-planner to a new pinot-query-planner-spi module that only depends on calcite-core. This lets plugins implement the broker rule-customization SPI without pulling in the full query planner on their compile classpath. PluginManager now exports org.apache.pinot.query.planner.rules and org.apache.calcite.plan from the pinot realm into every plugin realm so plugins can link RuleSetCustomizer and RelOptRule without bundling or shading those packages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_LOGICAL_V2, add realm test - Move META-INF/services from pinot-query-planner-spi to pinot-query-planner: the SPI jar should not register DefaultRuleSetCustomizer which lives in the implementation module. - Rename Phase.POST_LOGICAL_V2 → Phase.POST_LOGICAL_PHYSICAL: the V2 suffix leaked an internal versioning convention into a permanent SPI enum name. The new name encodes the semantic (physical optimizer enabled). All call sites updated (DefaultRuleSetCustomizer, QueryEnvironment, PinotRuleSetTest, PinotQueryRuleSets bridge). - Add PluginRealmExportTest in pinot-query-planner-spi: verifies that PluginManager exports org.apache.pinot.query.planner.rules and org.apache.calcite.plan from the pinotRealm so plugin classloaders can load RuleSetCustomizer and RelOptRule without bundling those packages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The package-info Maven plugin generates a package-info.java annotated with @javax.annotation.ParametersAreNonnullByDefault for every module. Unlike other modules, pinot-query-planner-spi has no transitive dependency that pulls in javax.annotation-api, causing a compilation failure in CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… delegates Reverses the movement of rule lists out of PinotQueryRuleSets into DefaultRuleSetCustomizer, reducing the diff of this PR. PinotQueryRuleSets keeps BASIC_RULES, FILTER_PUSHDOWN_RULES, PROJECT_PUSHDOWN_RULES, PRUNE_RULES, PINOT_POST_RULES_V2, and the new POST_LOGICAL_RULES (extracted from the old getPinotPostRules). DefaultRuleSetCustomizer.customize() simply delegates to those lists. A TODO comment notes the rules may be consolidated into DefaultRuleSetCustomizer once the RuleSetCustomizer SPI is the established extension point. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ParametersAreNonnullByDefault (used in generated package-info.java) is from com.google.code.findbugs:jsr305, not javax.annotation:javax.annotation-api. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
yashmayya
left a comment
There was a problem hiding this comment.
Looks like there's a number of CI failures that need to be addressed
…ginManager realm exports Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #18387 +/- ##
============================================
- Coverage 64.29% 64.25% -0.04%
- Complexity 1126 1127 +1
============================================
Files 3311 3314 +3
Lines 203827 203890 +63
Branches 31721 31733 +12
============================================
- Hits 131048 131009 -39
- Misses 62252 62374 +122
+ Partials 10527 10507 -20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…age; deprecate getPinotPostRules(int) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tree Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
yashmayya
left a comment
There was a problem hiding this comment.
Lint issues need to be resolved and I have some minor comments, but this looks good overall
…iched-join phase, fix stale comment - Rename Phase.POST_LOGICAL_PHYSICAL → POST_LOGICAL_PHYSICAL_OPT for clarity - Remove Phase.POST_LOGICAL_ENRICHED_JOIN and its QueryEnvironment wiring (enriched join was experimental and didn't work well; likely to be removed) - Fix stale package name in PluginRealmExportTest comment - Fix pre-existing blank-line-before-closing-brace checkstyle violation in PinotQueryRuleSets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… enum value The previous commit removed Phase.POST_LOGICAL_ENRICHED_JOIN from the SPI, but also dropped the QueryEnvironment wiring that applies PinotEnrichedJoinRule when usePlannerRules='JoinToEnrichedJoin'. This broke ResourceBasedQueryPlansTest (6 failures). The rule collection is disabled by default (JOIN_TO_ENRICHED_JOIN is in DEFAULT_DISABLED_RULES) and is applied directly, without going through a Phase. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Docs PR: pinot-contrib/pinot-docs#821 |
## Summary Documents the broker-side `RuleSetCustomizer` SPI added in [apache/pinot#18387](apache/pinot#18387). ## Changes - add a planner-rule customizers subsection to the plugin architecture page - explain ServiceLoader discovery across the broker classpath and plugin classloaders - call out one-time initialization and restart requirements - note the upgrade-sensitive nature of this SPI ## Validation - cross-checked against the local `apache/pinot` source in `pinot-query-planner-spi` - ran `git diff --check`
Introduces a ServiceLoader-discovered SPI that lets plugins add, remove, reorder, or replace the Calcite HEP rules used by the multi-stage query engine. The OSS defaults are themselves contributed by DefaultRuleSetCustomizer (registered via META-INF/services), and plugin customizers run on top.
Phaseenum covers every HEP phase QueryEnvironment runs: BASIC, FILTER_PUSHDOWN, PROJECT_PUSHDOWN, PRUNE, POST_LOGICAL, POST_LOGICAL_V2, POST_LOGICAL_ENRICHED_JOIN. Append-only contract.RuleSetCustomizerinterface — plugins implementcustomize(Phase, List<RelOptRule>)and modify the per-phase list in place.PinotRuleSetowns the per-phase, immutable rule lists and is the single source of truth read byQueryEnvironment. The default instance is built lazily fromServiceLoaderdiscovery.DefaultRuleSetCustomizerreplaces the staticPinotQueryRuleSetslists; the deleted class is the rename source.QueryEnvironmentreads every phase's rules throughPinotRuleSetvia a newConfig#getRuleSet()@Value.Default. The per-querysortExchangeCopyLimitoverride stays insidegetTraitProgram(per-query copy of POST_LOGICAL with allPinotSortExchangeCopyRuleinstances replaced by the override).Per-query
usePlannerRules/skipPlannerRulesfiltering is unchanged (still applied on top of the rule lists ingetOptProgram).