[SPARK-56921][SQL] Fix CTE ID normalization for nested CTEs#55985
[SPARK-56921][SQL] Fix CTE ID normalization for nested CTEs#55985puneetdixit200 wants to merge 2 commits into
Conversation
|
@cloud-fan can you please have a look, as you reviewed the original code? |
cloud-fan
left a comment
There was a problem hiding this comment.
Confirmed the bug by re-running the new test against HEAD^: fails with PLAN_VALIDATION_FAILED_RULE_IN_BATCH from CombineUnions on the old code, passes on the new code. All 59 CTEInlineSuite{AEOff,AEOn} tests pass under the fix, including the SPARK-46741 case that exercises EXISTS (WITH cte AS ...).
A few suggestions inline. The main one is to hoist the nested-WithCTE skip to a single guard at the top of canonicalizeCTE — this makes the invariant provable from one line ("canonicalizeCTE never crosses a WithCTE boundary"), incidentally closes a small asymmetric hole in the subquery path, and supersedes the now-stale comment that used to describe the old safeguard.
One thing worth adding to the PR description: a sentence on why the old code double-remaps. The mechanism is a "swap" — cteIdToNewId is shared across the whole rule run, so when sibling WithCTEs land under a Union, processing of LEFT populates the map with the inner CTE's id mapped to a small new id (e.g. 0→1). When RIGHT's canonicalizeCTE then descends through the shared inner WithCTE, it rewrites Ref(0) → Ref(1). Then applyInternal's outer traversal reaches that same inner WithCTE and rewrites again — but now Ref(1) matches a key (1→0), so it gets flipped back to Ref(0), pointing at the outer cte's def. That's the swap that causes InlineCTE to substitute cte_a's body for cte_common. Worth a sentence in the description so future readers don't have to reconstruct it from the test.
af06f0e to
92d50c1
Compare
|
Resolved the two stale review threads after rechecking the current head. The nested |
What changes were proposed in this pull request?
This updates
NormalizeCTEIdsso normalization for the currentWithCTEscope does not cross into nestedWithCTEnodes. NestedWithCTEs are normalized separately by the normal top-down traversal, including when they are reached through subquery expressions.A regression test covers the reported temp-view + nested CTE + union pattern.
Why are the changes needed?
The previous traversal could double-remap refs because
cteIdToNewIdis shared across the rule run. When siblingWithCTEs sit under aUnion, the left side can map the inner CTE id to a new id, the right side can rewrite the same nested ref through that map, and the outer traversal can then rewrite that new id again if it is also a key in the map. The second rewrite can point the ref at the wrong outer CTE definition, which later letsInlineCTEsubstitute the wrong body.Does this PR introduce any user-facing change?
No.
How was this patch tested?
git diff --check origin/master HEAD -- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/normalizer/NormalizeCTEIds.scala sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scalaJAVA_HOME=/opt/homebrew/opt/openjdk@17 build/sbt 'sql/testOnly *CTEInlineSuiteAEOff *CTEInlineSuiteAEOn -- -z SPARK-56921'Was this patch authored or co-authored using generative AI tooling?
Generated-by: OpenAI GPT-5