[SPARK-56920][SQL][FOLLOWUP] Add CreateMetricView logical plan and pre-parse inputColumns#56010
Closed
cloud-fan wants to merge 2 commits into
### What changes were proposed in this pull request?
Two refactors on top of SPARK-56920 that make the metric-view plan shape
more amenable to downstream extension (e.g., TEMPORARY/MATERIALIZED
metric views) and simpler for resolvers.
**1. Introduce `CreateMetricView` logical plan as the parser's return type.**
- Previously `CreateMetricViewCommand` doubled as both the parser output
and the V1 runnable command. The V2 strategy pattern-matched on it for
non-session catalogs, while the V1 path executed via `.run()`.
- Now the parser returns `CreateMetricView` (a `UnaryCommand`); for the
session catalog `ResolveSessionCatalog` rewrites it to
`CreateMetricViewCommand` (V1 runnable); for non-session v2 catalogs
`DataSourceV2Strategy` continues to dispatch to `CreateV2MetricViewExec`.
- This gives the parser a single, v1/v2-agnostic logical shape and frees
`CreateMetricViewCommand` to be V1-execution-only.
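
The dispatch described above can be sketched with a toy model. This is a minimal illustration only: the case classes below reuse the names from this description but are simplified stand-ins, not Spark's real classes or signatures, and the `catalog`/`name`/`yaml` fields and the `execute` helper are assumptions made for the example.

```scala
// Toy model only: simplified stand-ins for the plan nodes named in this PR,
// NOT Spark's actual classes, fields, or rule signatures.
object MetricViewDispatchSketch {
  sealed trait Plan

  // Parser output: a single logical shape that is agnostic to v1 vs. v2.
  final case class CreateMetricView(catalog: String, name: String, yaml: String) extends Plan

  // V1 runnable command: only ever produced for the session catalog.
  final case class CreateMetricViewCommand(name: String, yaml: String) extends Plan {
    def run(): String = s"v1: created metric view $name"
  }

  // Stand-in for the physical node used with non-session v2 catalogs.
  final case class CreateV2MetricViewExec(catalog: String, name: String, yaml: String) {
    def run(): String = s"v2: created metric view $name in $catalog"
  }

  // Name of Spark's built-in session catalog.
  val SessionCatalog = "spark_catalog"

  // Analogue of the ResolveSessionCatalog rewrite: only session-catalog
  // plans become the V1 runnable command; everything else is left alone.
  def resolveSessionCatalog(plan: Plan): Plan = plan match {
    case CreateMetricView(SessionCatalog, name, yaml) => CreateMetricViewCommand(name, yaml)
    case other => other
  }

  // Analogue of the strategy dispatch: whatever is still a CreateMetricView
  // after the rewrite targets a v2 catalog and goes to the v2 exec node.
  def execute(plan: Plan): String = resolveSessionCatalog(plan) match {
    case cmd: CreateMetricViewCommand => cmd.run()
    case CreateMetricView(catalog, name, yaml) => CreateV2MetricViewExec(catalog, name, yaml).run()
  }

  def main(args: Array[String]): Unit = {
    println(execute(CreateMetricView("spark_catalog", "sales", "version: 0.1")))
    println(execute(CreateMetricView("my_catalog", "sales", "version: 0.1")))
  }
}
```

In this sketch a new field added to the logical `CreateMetricView` node would not change the shape of the runnable command, which is the separation the refactor is after.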
**2. Pre-parse YAML expressions into `inputColumns` on `MetricViewPlaceholder`.**
- `MetricViewPlaceholder.desc: MetricView` is replaced with
`inputColumns: Seq[InputColumn]`. `MetricViewPlanner.parseYAML` now
populates parsed `Expression` and column `Metadata` for each
dimension/measure column. `ResolveMetricView` reads pre-parsed
expressions directly instead of re-parsing from `desc.select`.
- `MetricViewPlanner.planWrite` returns the descriptor alongside the
placeholder (used only for property emission at CREATE time), so
callers that need it don't have to recover it from the placeholder.
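
The pre-parsing idea can be illustrated with a small self-contained sketch. Everything here is hypothetical: `Expr`, the field layout of `InputColumn`, the `MetricViewDescriptor` type, and the tiny `parseExpr` function are assumptions standing in for Catalyst's `Expression`, column `Metadata`, the real descriptor, and the SQL parser. The point is only that parsing happens once at planning time and the resolver never touches a parser.

```scala
// Toy model only: hypothetical stand-ins, not the real Spark types.
object InputColumnsSketch {
  // Stand-in for a parsed Catalyst Expression.
  sealed trait Expr
  final case class Col(name: String) extends Expr
  final case class Agg(fn: String, child: Expr) extends Expr

  // Assumed shape: a column name, its already-parsed expression, and metadata.
  final case class InputColumn(name: String, expr: Expr, metadata: Map[String, String])

  // Assumed descriptor as it might come out of the YAML: raw expression strings.
  final case class MetricViewDescriptor(
      source: String,
      dimensions: Seq[(String, String)],
      measures: Seq[(String, String)])

  // The placeholder carries parsed columns instead of the descriptor.
  final case class MetricViewPlaceholder(source: String, inputColumns: Seq[InputColumn])

  // Deliberately tiny parser: understands only `col` and `fn(col)`.
  private val Call = """(\w+)\((\w+)\)""".r
  def parseExpr(sql: String): Expr = sql.trim match {
    case Call(fn, arg) => Agg(fn.toLowerCase, Col(arg))
    case name => Col(name)
  }

  // Planning step: parse every expression once, and return the descriptor
  // alongside the placeholder for callers that still need it.
  def planWrite(desc: MetricViewDescriptor): (MetricViewPlaceholder, MetricViewDescriptor) = {
    val dims = desc.dimensions.map { case (n, e) =>
      InputColumn(n, parseExpr(e), Map("type" -> "dimension"))
    }
    val measures = desc.measures.map { case (n, e) =>
      InputColumn(n, parseExpr(e), Map("type" -> "measure"))
    }
    (MetricViewPlaceholder(desc.source, dims ++ measures), desc)
  }

  // Resolver step: reads the pre-parsed expressions directly, no parser needed.
  def resolve(placeholder: MetricViewPlaceholder): Seq[(String, Expr)] =
    placeholder.inputColumns.map(c => c.name -> c.expr)

  def main(args: Array[String]): Unit = {
    val desc = MetricViewDescriptor(
      "orders", Seq("region" -> "region"), Seq("total" -> "SUM(amount)"))
    val (placeholder, _) = planWrite(desc)
    resolve(placeholder).foreach(println)
  }
}
```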
### Why are the changes needed?
- Splitting the parser-output logical plan from the runnable command is a
standard Spark pattern (cf. `CreateView` -> `CreateViewCommand`) and lets
future extensions (e.g., schema modes, temp/materialized variants) add
fields to the logical plan without changing the runnable's shape.
- Carrying pre-parsed `inputColumns` on the placeholder gives a stable,
analyzer-friendly representation and decouples the resolver from the
YAML serde. The resolver no longer needs a `ParserInterface` field for
re-parsing expressions, and the per-column metadata conversion happens
once at planning time.
### Does this PR introduce _any_ user-facing change?
No. Internal refactor only.
### How was this patch tested?
Existing test suites pass locally:
- `MetricViewV2CatalogSuite` (31/31)
- `SimpleMetricViewSuite` (19/19)
- `MetricViewFactorySuite` (16/16)
### Was this patch authored or co-authored using generative AI tooling?
Co-authored using Claude Code.
2314f36 to 8d4dfc4
8d4dfc4 to 2371d30
zhengruifeng approved these changes May 21, 2026
Thanks for the review, merging to master/4.x/4.2!
cloud-fan added a commit that referenced this pull request May 21, 2026

…e-parse inputColumns

### What changes were proposed in this pull request?
Two refactors on top of SPARK-56920 that make the metric-view plan shape
more amenable to downstream extension and simpler for resolvers.
**1. Introduce `CreateMetricView` logical plan as the parser's return type.**
- Previously `CreateMetricViewCommand` doubled as both the parser output
and the V1 runnable command. The V2 strategy pattern-matched on it for
non-session catalogs, while the V1 path executed via `.run()`.
- Now the parser returns `CreateMetricView` (a `UnaryCommand`); for the
session catalog `ResolveSessionCatalog` rewrites it to
`CreateMetricViewCommand` (V1 runnable); for non-session v2 catalogs
`DataSourceV2Strategy` continues to dispatch to `CreateV2MetricViewExec`.
- This gives the parser a single, v1/v2-agnostic logical shape and frees
`CreateMetricViewCommand` to be V1-execution-only.
**2. Pre-parse YAML expressions into `inputColumns` on `MetricViewPlaceholder`.**
- `MetricViewPlaceholder.desc: MetricView` is replaced with
`inputColumns: Seq[InputColumn]`. `MetricViewPlanner.parseYAML` now
populates parsed `Expression` and column `Metadata` for each
dimension/measure column. `ResolveMetricView` reads pre-parsed
expressions directly instead of re-parsing from `desc.select`.
- `MetricViewPlanner.planWrite` returns the descriptor alongside the
placeholder (used only for property emission at CREATE time), so
callers that need it don't have to recover it from the placeholder.
### Why are the changes needed?
- Splitting the parser-output logical plan from the runnable command is a
widely adopted pattern in Spark: `CreateView` → `CreateViewCommand`,
`CreateTable` → V1/V2 runnable commands, `DropTable` →
`DropTableCommand`/V2 drop, etc. Aligning metric views with this pattern
lets future extensions (e.g., schema modes, temp/materialized variants)
add fields to the logical plan without changing the runnable's shape,
and gives downstream rules a single match target to dispatch from.
- Carrying pre-parsed `inputColumns` on the placeholder gives a stable,
analyzer-friendly representation and decouples the resolver from the
YAML serde. The resolver no longer needs a `ParserInterface` field for
re-parsing expressions, and the per-column metadata conversion happens
once at planning time.
### Does this PR introduce _any_ user-facing change?
No. Internal refactor only.
### How was this patch tested?
Existing test suites pass locally:
- `MetricViewV2CatalogSuite` (31/31)
- `SimpleMetricViewSuite` (19/19)
- `MetricViewFactorySuite` (16/16)
### Was this patch authored or co-authored using generative AI tooling?
Co-authored using Claude Code.

Closes #56010 from cloud-fan/SPARK-54119-followup.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 37a442c)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>