[SPARK-51337][SQL] Add maxRows to CTERelationDef and CTERelationRef #50104

vladimirg-db · 2025-02-27T11:20:48Z

What changes were proposed in this pull request?

Add maxRows to CTERelationDef and CTERelationRef.

Why are the changes needed?

The Analyzer validates a scalar subquery by checking, via maxRows, whether it outputs at most one row: https://github.com/vladimirg-db/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ValidateSubqueryExpression.scala#L144.

This works in the fixed-point Analyzer, because CTEs are inlined before CheckAnalysis. In the single-pass Analyzer, however, the subquery is validated right after it is resolved, while the CTE reference is still in the plan. CTERelationRef must therefore report a correct maxRows as well, based on the maxRows of the related CTERelationDef.
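The propagation described above can be sketched with a toy plan model. This is only an illustration: `Plan`, `Table`, `GlobalAggregate`, `CteDef`, `CteRef`, `resolveRef` and `returnsAtMostOneRow` are hypothetical stand-ins invented for this example, not Catalyst classes or the code in this PR. The sketch assumes the definition derives its bound from its child, and that the reference, being a leaf, has to receive that bound when it is resolved.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait Plan {
  // Upper bound on the number of rows this node can produce, if known.
  def maxRows: Option[Long]
}

// A relation whose row count is unknown.
case class Table(name: String) extends Plan {
  val maxRows: Option[Long] = None
}

// A global aggregate (no grouping keys) yields exactly one row.
case class GlobalAggregate(child: Plan) extends Plan {
  val maxRows: Option[Long] = Some(1L)
}

// CTE definition: its bound is whatever its child guarantees.
case class CteDef(id: Long, child: Plan) extends Plan {
  val maxRows: Option[Long] = child.maxRows
}

// CTE reference: a leaf with no child to ask, so the bound must be stored on it.
case class CteRef(cteId: Long, maxRows: Option[Long] = None) extends Plan

object CteMaxRowsSketch {
  // Resolve a reference against its definition, carrying the bound over.
  def resolveRef(cteDef: CteDef): CteRef = CteRef(cteDef.id, cteDef.maxRows)

  // Simplified stand-in for the scalar-subquery check: provably at most one row.
  def returnsAtMostOneRow(plan: Plan): Boolean = plan.maxRows.exists(_ <= 1)

  def main(args: Array[String]): Unit = {
    val cteDef = CteDef(0L, GlobalAggregate(Table("t")))
    val withBound = resolveRef(cteDef)    // bound copied from the definition
    val withoutBound = CteRef(cteDef.id)  // bound missing on the reference
    println(returnsAtMostOneRow(withBound))    // true
    println(returnsAtMostOneRow(withoutBound)) // false
  }
}
```

In this model, a reference that lacks the bound fails the check even though its definition is a single-row aggregate. That is the situation a validator faces when it runs before CTEs are inlined.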

Does this PR introduce any user-facing change?

There should be no changes to existing Catalyst behavior.
A flag, spark.sql.cteRelationDefMaxRows.enabled, was added to mitigate potential edge cases.

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

cloud-fan · 2025-02-28T01:06:15Z

thanks, merging to master!

github-actions bot added the SQL label Feb 27, 2025

vladimirg-db force-pushed the vladimir-golubev_data/add-max-rows-field-to-cte-relation-ref branch from 3b1bb8c to c77e7fe Compare February 27, 2025 11:26

vladimirg-db changed the title ~~[SPARK-51337][SQL] Add maxRows to CTERelationRef~~ [SPARK-51337][SQL] Add maxRows to CTERelationDef and CTERelationRef Feb 27, 2025

Add maxRows to CTERelationRef

463c1a9

vladimirg-db force-pushed the vladimir-golubev_data/add-max-rows-field-to-cte-relation-ref branch from c77e7fe to 463c1a9 Compare February 27, 2025 14:14

cloud-fan approved these changes Feb 28, 2025

cloud-fan closed this in 88addf4 Feb 28, 2025
