Skip to content

[SPARK-56921][SQL] Fix CTE ID normalization for nested CTEs#55985

Open
puneetdixit200 wants to merge 2 commits into
apache:masterfrom
puneetdixit200:fix-spark-56921-normalize-nested-cte
Open

[SPARK-56921][SQL] Fix CTE ID normalization for nested CTEs#55985
puneetdixit200 wants to merge 2 commits into
apache:masterfrom
puneetdixit200:fix-spark-56921-normalize-nested-cte

Conversation

@puneetdixit200
Copy link
Copy Markdown

@puneetdixit200 puneetdixit200 commented May 19, 2026

What changes were proposed in this pull request?

This updates NormalizeCTEIds so normalization for the current WithCTE scope does not cross into nested WithCTE nodes. Nested WithCTEs are normalized separately by the normal top-down traversal, including when they are reached through subquery expressions.

A regression test covers the reported temp-view + nested CTE + union pattern.

Why are the changes needed?

The previous traversal could double-remap refs because cteIdToNewId is shared across the rule run. When sibling WithCTEs sit under a Union, the left side can map the inner CTE id to a new id, the right side can rewrite the same nested ref through that map, and the outer traversal can then rewrite that new id again if it is also a key in the map. The second rewrite can point the ref at the wrong outer CTE definition, which later lets InlineCTE substitute the wrong body.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • git diff --check origin/master HEAD -- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/normalizer/NormalizeCTEIds.scala sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
  • JAVA_HOME=/opt/homebrew/opt/openjdk@17 build/sbt 'sql/testOnly *CTEInlineSuiteAEOff *CTEInlineSuiteAEOn -- -z SPARK-56921'

Was this patch authored or co-authored using generative AI tooling?

Generated-by: OpenAI GPT-5

@hkeller7kfw
Copy link
Copy Markdown

@cloud-fan can you please have a look, as you reviewed the original code?

Copy link
Copy Markdown
Contributor

@cloud-fan cloud-fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed the bug by re-running the new test against HEAD^: fails with PLAN_VALIDATION_FAILED_RULE_IN_BATCH from CombineUnions on the old code, passes on the new code. All 59 CTEInlineSuite{AEOff,AEOn} tests pass under the fix, including the SPARK-46741 case that exercises EXISTS (WITH cte AS ...).

A few suggestions inline. The main one is to hoist the nested-WithCTE skip to a single guard at the top of canonicalizeCTE — this makes the invariant provable from one line ("canonicalizeCTE never crosses a WithCTE boundary"), incidentally closes a small asymmetric hole in the subquery path, and supersedes the now-stale comment that used to describe the old safeguard.

One thing worth adding to the PR description: a sentence on why the old code double-remaps. The mechanism is a "swap" — cteIdToNewId is shared across the whole rule run, so when sibling WithCTEs land under a Union, processing of LEFT populates the map with the inner CTE's id mapped to a small new id (e.g. 0→1). When RIGHT's canonicalizeCTE then descends through the shared inner WithCTE, it rewrites Ref(0) → Ref(1). Then applyInternal's outer traversal reaches that same inner WithCTE and rewrites again — but now Ref(1) matches a key (1→0), so it gets flipped back to Ref(0), pointing at the outer cte's def. That's the swap that causes InlineCTE to substitute cte_a's body for cte_common. Worth a sentence in the description so future readers don't have to reconstruct it from the test.

Comment thread sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala Outdated
@puneetdixit200 puneetdixit200 force-pushed the fix-spark-56921-normalize-nested-cte branch from af06f0e to 92d50c1 Compare May 23, 2026 11:26
@puneetdixit200
Copy link
Copy Markdown
Author

Resolved the two stale review threads after rechecking the current head. The nested WithCTE guard now covers both child and subquery traversal paths, and the test no longer includes the redundant explicit normalization call.\n\nVerified locally with Java 17:\n- git diff --check origin/master HEAD -- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/normalizer/NormalizeCTEIds.scala sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala\n- build/sbt 'sql/testOnly *CTEInlineSuiteAEOff *CTEInlineSuiteAEOn -- -z SPARK-56921'\n\nGenerated-by: OpenAI GPT-5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants