Make CREATE MATERIALIZED VIEW DDL pluggable via MaterializedViewDdlHandler #18639

Merged: xiangfu0 merged 1 commit into master from mse-mv-ddl-extensibility on Jun 1, 2026

@xiangfu0
Contributor

Summary

Makes CREATE MATERIALIZED VIEW ... AS <query> DDL pluggable via a new MaterializedViewDdlHandler extension point, so downstream distributions can support materialized views that are materialized by a different engine / minion task type (e.g. a multi-stage-engine MV whose AS clause is a JOIN) without forking the DDL compiler.

OSS behavior is byte-for-byte unchanged when no alternative handler is registered: a JOIN in the AS clause is still rejected, and the MV is still routed under the built-in MaterializedViewTask.

Motivation

Today the single-source / single-stage assumption is hardcoded in DdlCompiler.compileCreateMaterializedView:

  • a JOIN in the AS clause is rejected unconditionally, and
  • the MV task config is always stamped under MaterializedViewTask.

A distribution that wants a richer MV (e.g. materialize the result of a JOIN through the multi-stage engine via its own minion task type) has no clean seam — it cannot reuse the DDL compile path. Because the controller validates the resulting TableConfig (running the task generator's validation) before persisting, simply "allowing the JOIN at compile" is insufficient: the handler must be able to stamp a task type whose generator can validate that definition.

What changed (pinot-sql-ddl)

  • New SPI MaterializedViewDdlHandler with two methods: validateDefinedQuery(queryNode, properties) (query/join policy, run before column resolution) and applyTaskConfig(properties, definedSql, schedule, builder) (routes the MV task config and returns the task type stamped). Includes a shared static containsJoin(SqlNode) helper.
  • DefaultMaterializedViewDdlHandler — preserves current behavior exactly (reject JOINs, route under MaterializedViewTask).
  • MaterializedViewDdlHandlerRegistry — process-wide registry, defaults to the single-source handler; a distribution registers its handler once at controller startup (mirrors existing pluggable-component registry patterns).
  • DdlCompiler.compileCreateMaterializedView delegates query validation + task routing to the registered handler, and validates MV consistency (bucketTimePeriod present) against the task type the handler stamped.
  • MaterializedViewPropertyRouter.apply gains a taskType-parameterized overload so a handler can route the MV task config under an alternative task type (the existing form without a taskType argument delegates to it with MaterializedViewTask).
  • verifyDefinedSqlIsParseable now does a syntactic parse (compileToSqlNodeAndOptions) rather than full single-stage compilation (compileToPinotQuery). The slicing-bug guard only needs to confirm the extracted text is well-formed; the syntactic parse also accepts multi-stage shapes (e.g. JOINs) a handler may permit. Engine-specific validity is still enforced later by the MV analyzer / task generator for whichever task type the handler stamped. JOINs remain rejected by default (the default handler rejects them earlier, before this check).
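
The flow described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual pinot-sql-ddl code: every `Sketch*` type, the `HypotheticalMseMvTask` task type name, and the string/map-based signatures are assumptions made for this example (the description says the real methods take a parsed `SqlNode` and a builder). The textual JOIN check is only a stand-in for the shared `containsJoin(SqlNode)` helper.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

/** Simplified stand-in for the SPI; signatures here are assumptions, not the real API. */
interface SketchMvDdlHandler {
  /** Query/join policy; throws if the AS-clause query is not acceptable. */
  void validateDefinedQuery(String definedSql, Map<String, String> properties);

  /** Writes the MV task config and returns the task type it stamped. */
  String applyTaskConfig(Map<String, String> properties, String definedSql, String schedule,
      Map<String, Map<String, String>> taskConfigs);

  /** Crude textual check for illustration only; the real helper inspects a SqlNode. */
  static boolean containsJoin(String sql) {
    return sql.toUpperCase(Locale.ROOT).matches("(?s).*\\bJOIN\\b.*");
  }

  /** Shared helper for the sketch: stores the config under the given task type. */
  static String stamp(String taskType, Map<String, String> properties, String definedSql,
      String schedule, Map<String, Map<String, String>> taskConfigs) {
    Map<String, String> config = new HashMap<>(properties);
    config.put("definedSql", definedSql);
    config.put("schedule", schedule);
    taskConfigs.put(taskType, config);
    return taskType;
  }
}

/** Mirrors the described default: reject JOINs, route under MaterializedViewTask. */
class SketchDefaultHandler implements SketchMvDdlHandler {
  @Override
  public void validateDefinedQuery(String definedSql, Map<String, String> properties) {
    if (SketchMvDdlHandler.containsJoin(definedSql)) {
      throw new IllegalArgumentException("JOIN is not supported in the AS clause");
    }
  }

  @Override
  public String applyTaskConfig(Map<String, String> properties, String definedSql,
      String schedule, Map<String, Map<String, String>> taskConfigs) {
    return SketchMvDdlHandler.stamp("MaterializedViewTask", properties, definedSql, schedule, taskConfigs);
  }
}

/** Hypothetical downstream handler: permits JOINs and stamps its own task type. */
class SketchJoinHandler implements SketchMvDdlHandler {
  @Override
  public void validateDefinedQuery(String definedSql, Map<String, String> properties) {
    // Accepts JOINs; engine-specific validation would happen later for the stamped task type.
  }

  @Override
  public String applyTaskConfig(Map<String, String> properties, String definedSql,
      String schedule, Map<String, Map<String, String>> taskConfigs) {
    return SketchMvDdlHandler.stamp("HypotheticalMseMvTask", properties, definedSql, schedule, taskConfigs);
  }
}

/** Process-wide holder that defaults to the single-source handler. */
final class SketchHandlerRegistry {
  private static volatile SketchMvDdlHandler handler = new SketchDefaultHandler();

  private SketchHandlerRegistry() { }

  static void register(SketchMvDdlHandler h) {
    handler = Objects.requireNonNull(h, "handler");
  }

  static SketchMvDdlHandler get() {
    return handler;
  }
}

/** Delegates validation and task routing to the registered handler. */
final class SketchDdlCompiler {
  private SketchDdlCompiler() { }

  static String compileCreateMaterializedView(String definedSql, Map<String, String> properties,
      String schedule, Map<String, Map<String, String>> taskConfigs) {
    SketchMvDdlHandler h = SketchHandlerRegistry.get();
    h.validateDefinedQuery(definedSql, properties);
    String taskType = h.applyTaskConfig(properties, definedSql, schedule, taskConfigs);
    if (taskType == null || !taskConfigs.containsKey(taskType)) {
      throw new IllegalStateException("Handler returned a null or unstamped task type");
    }
    return taskType;
  }
}

class MvDdlHandlerSketchDemo {
  public static void main(String[] args) {
    String joinSql = "SELECT o.id FROM orders o JOIN users u ON o.uid = u.id";
    SketchHandlerRegistry.register(new SketchJoinHandler());
    System.out.println(SketchDdlCompiler.compileCreateMaterializedView(
        joinSql, new HashMap<>(), "0 0 * * * ?", new HashMap<>()));
  }
}
```

With nothing registered, the sketch rejects the JOIN query and stamps `MaterializedViewTask` for single-source queries. Registering the hypothetical handler changes both the join policy and the stamped task type without touching the compiler.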

Testing

  • All existing pinot-sql-ddl tests pass (169), including DdlCompilerMaterializedViewTest (40) — default behavior (single-source, JOIN rejected, MaterializedViewTask routing) is unchanged.

Backward compatibility

No config, wire-format, or default-behavior changes. The extension point is opt-in via the registry; absent any registration, the default handler reproduces the prior code path.

🤖 Generated with Claude Code

@codecov-commenter

codecov-commenter commented May 30, 2026

Codecov Report

❌ Patch coverage is 77.19298% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.40%. Comparing base (5a6cd7b) to head (c91cccd).
⚠️ Report is 6 commits behind head on master.

Files with missing lines | Patch % | Lines
...ntroller/helix/core/PinotHelixResourceManager.java | 0.00% | 8 Missing ⚠️
...ddl/compile/DefaultMaterializedViewDdlHandler.java | 72.72% | 3 Missing ⚠️
...ot/sql/ddl/compile/MaterializedViewDdlHandler.java | 87.50% | 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18639      +/-   ##
============================================
+ Coverage     64.38%   64.40%   +0.01%     
- Complexity     1282     1291       +9     
============================================
  Files          3359     3364       +5     
  Lines        207825   207935     +110     
  Branches      32447    32467      +20     
============================================
+ Hits         133818   133919     +101     
- Misses        63244    63249       +5     
- Partials      10763    10767       +4     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.40% <77.19%> (+0.01%) ⬆️
temurin 64.40% <77.19%> (+0.01%) ⬆️
unittests 64.40% <77.19%> (+0.01%) ⬆️
unittests1 56.80% <ø> (+0.03%) ⬆️
unittests2 37.14% <77.19%> (+<0.01%) ⬆️

@xiangfu0 xiangfu0 force-pushed the mse-mv-ddl-extensibility branch 2 times, most recently from 81a00ee to 90c91f3 Compare May 31, 2026 06:16
Contributor Author

@xiangfu0 xiangfu0 left a comment

Found a high-signal issue; see inline comment.

@xiangfu0 xiangfu0 force-pushed the mse-mv-ddl-extensibility branch from 90c91f3 to 69f4e4a Compare May 31, 2026 21:19
@Jackie-Jiang added the plugins label (Related to the plugin system) Jun 1, 2026
@xiangfu0 xiangfu0 force-pushed the mse-mv-ddl-extensibility branch 2 times, most recently from c00300f to 5f534c0 Compare June 1, 2026 06:07
Contributor Author

@xiangfu0 xiangfu0 left a comment

Found a high-signal issue; see inline comment.

…ndler

Adds a MaterializedViewDdlHandler extension point so CREATE MATERIALIZED VIEW
can target an alternative engine / minion task type. The default handler targets
the single-stage engine (re-compiles the AS-clause as a single-stage Pinot query,
rejecting JOIN / multi-source, routes under MaterializedViewTask); a downstream
distribution can register an MSE handler that accepts joins and stamps its own
task type. The handler — installed via DdlCompiler.setMaterializedViewDdlHandler —
owns engine-specific verification (validateDefinedQuery) and whether projection
schema inference is allowed (supportsSchemaInference). DdlCompiler fails fast with
a clear message if a handler returns a null/unstamped task type.

MaterializedViewPropertyRouter.apply gains a taskType-parameterized overload, and
the "MV's own task type" matching is generalized from the hard-coded
MaterializedViewTask to that task type so task.<taskType>.* knobs route correctly
(and are not dropped) for custom task types.
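
As a rough illustration of the generalized matching described above, the sketch below collects `task.<taskType>.*` knobs for whichever task type is passed in, with the shorter form delegating with MaterializedViewTask. The class name `SketchPropertyRouter`, the map-based signature, the `HypotheticalMseMvTask` task type, and the `maxNumRecords` knob are assumptions for this example only; the real MaterializedViewPropertyRouter.apply signature is not shown in this PR text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-in for the router; not the real pinot-sql-ddl class. */
final class SketchPropertyRouter {
  static final String BUILT_IN_TASK_TYPE = "MaterializedViewTask";

  private SketchPropertyRouter() { }

  /** Form without a task type: delegates with the built-in task type. */
  static Map<String, String> apply(Map<String, String> properties) {
    return apply(properties, BUILT_IN_TASK_TYPE);
  }

  /** Collects task.<taskType>.* knobs for the given task type, stripping the prefix. */
  static Map<String, String> apply(Map<String, String> properties, String taskType) {
    String prefix = "task." + taskType + ".";
    Map<String, String> routed = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : properties.entrySet()) {
      if (entry.getKey().startsWith(prefix)) {
        routed.put(entry.getKey().substring(prefix.length()), entry.getValue());
      }
    }
    return routed;
  }
}

class SketchPropertyRouterDemo {
  public static void main(String[] args) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("task.MaterializedViewTask.bucketTimePeriod", "1d");
    props.put("task.HypotheticalMseMvTask.maxNumRecords", "1000");
    System.out.println(SketchPropertyRouter.apply(props));
    System.out.println(SketchPropertyRouter.apply(props, "HypotheticalMseMvTask"));
  }
}
```

If the prefix match were hard-coded to the built-in task type, the `task.HypotheticalMseMvTask.*` entry in this sketch would never be picked up, which is the dropped-knob problem the commit message describes.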

Documents the extension-point contract: a handler stamping a non-built-in task type
owns that type's complete runtime including definition-metadata persistence and
consistency tracking. The built-in controller-side MV machinery
(MaterializedViewDefinitionMetadata persistence + MaterializedViewConsistencyManager)
keys on MaterializedViewTask and now intentionally and explicitly skips MVs stamped
with a different task type (logged at INFO, not WARN), leaving freshness/consistency
to the owning task type.

Behavior is unchanged when no alternative handler is registered. All pinot-sql-ddl
tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@xiangfu0 xiangfu0 force-pushed the mse-mv-ddl-extensibility branch from 5f534c0 to c91cccd Compare June 1, 2026 18:13
@xiangfu0 xiangfu0 merged commit edfbf69 into master Jun 1, 2026
11 checks passed
@xiangfu0 xiangfu0 deleted the mse-mv-ddl-extensibility branch June 1, 2026 19:48
@xiangfu0
Contributor Author

xiangfu0 commented Jun 2, 2026

Opened the docs follow-up PR for this change: pinot-contrib/pinot-docs#847

xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Jun 2, 2026
## Summary
- document that the built-in OSS materialized-view path is still the
single-source `MaterializedViewTask` flow by default
- add plugin-architecture guidance for the new
`MaterializedViewDdlHandler` extension point
- clarify that custom handlers can require explicit MV column lists and
own alternate task/runtime wiring

## Cross-check
- verified the merged Apache Pinot behavior against
`apache/pinot#18639`, including `MaterializedViewDdlHandler`,
`DdlCompiler`, and the default built-in behavior

## Validation
- `git diff --check`

Labels

materialized-view, plugins (Related to the plugin system)

3 participants