planner: lateral join quality updates#67482
Conversation
|
Review Complete Findings: 0 issues ℹ️ Learn more details on Pantheon AI. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughDetect nested LATERAL occurrences in deeper table-source structures, adjust when CTE plans are marked "in apply" after full-schema propagation, defer restore of Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Command failed Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @terry1purcell. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/planner/core/operator/logicalop/logical_apply.go (1)
217-223: Keep Semi/AntiColNDVsaligned with the new lateral row-count scaling.On
Line 222, lateral Semi/Anti Apply now scalesRowCount, butColNDVsare still copied unscaled later (Line 229-Line 231). That diverges fromLogicalJoin.DeriveStatsbehavior and can skew downstream selectivity/costing.♻️ Suggested adjustment
- } else if la.IsLateral && (la.JoinType == base.SemiJoin || la.JoinType == base.AntiSemiJoin) { + semiAntiLateral := la.IsLateral && (la.JoinType == base.SemiJoin || la.JoinType == base.AntiSemiJoin) + if semiAntiLateral { // For LATERAL SemiJoin/AntiSemiJoin Apply operators, apply SelectionFactor // to the row count estimate, consistent with LogicalJoin.DeriveStats. // Non-lateral Apply (correlated subqueries) keeps the original left row count // to avoid changing existing plan estimates. rowCount *= cost.SelectionFactor } @@ - for id, c := range leftProfile.ColNDVs { - la.StatsInfo().ColNDVs[id] = c + for id, c := range leftProfile.ColNDVs { + if semiAntiLateral { + c *= cost.SelectionFactor + } + la.StatsInfo().ColNDVs[id] = c }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/operator/logicalop/logical_apply.go` around lines 217 - 223, LogicalApply currently scales rowCount for lateral SemiJoin/AntiSemiJoin (la.IsLateral && la.JoinType == base.SemiJoin/AntiSemiJoin) but does not scale per-column NDV estimates (ColNDVs) when they are copied later, causing divergence from LogicalJoin.DeriveStats; update the copy/assignment of ColNDVs in the LogicalApply instance to scale NDV values by the same SelectionFactor when la.IsLateral and JoinType is SemiJoin or AntiSemiJoin so ColNDVs remain consistent with the adjusted RowCount (reference symbols: la, IsLateral, JoinType, RowCount, ColNDVs, cost.SelectionFactor, LogicalJoin.DeriveStats).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/logical_plan_builder.go`:
- Around line 688-696: The helper containsLateralTableSource currently only
recurses into n.Source when it's an ast.ResultSetNode but doesn't handle wrapper
nodes like *ast.SelectStmt and *ast.SetOprStmt, so LATERAL inside derived
SELECTs/set-ops isn't detected and buildJoin's isLateral logic is wrong; update
containsLateralTableSource to type-switch on wrapper types (e.g.,
*ast.SelectStmt and *ast.SetOprStmt) and descend into their inner
result-set/operands (recursing into the contained ResultSetNode(s)) so that
nested LATERALs are found and buildJoin's left-schema push / join-reordering
behavior is correct.
---
Nitpick comments:
In `@pkg/planner/core/operator/logicalop/logical_apply.go`:
- Around line 217-223: LogicalApply currently scales rowCount for lateral
SemiJoin/AntiSemiJoin (la.IsLateral && la.JoinType ==
base.SemiJoin/AntiSemiJoin) but does not scale per-column NDV estimates
(ColNDVs) when they are copied later, causing divergence from
LogicalJoin.DeriveStats; update the copy/assignment of ColNDVs in the
LogicalApply instance to scale NDV values by the same SelectionFactor when
la.IsLateral and JoinType is SemiJoin or AntiSemiJoin so ColNDVs remain
consistent with the adjusted RowCount (reference symbols: la, IsLateral,
JoinType, RowCount, ColNDVs, cost.SelectionFactor, LogicalJoin.DeriveStats).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 04607bb4-c3e4-4da5-8c33-4cbdf3f7e66d
📒 Files selected for processing (4)
pkg/planner/core/logical_plan_builder.gopkg/planner/core/operator/logicalop/logical_apply.gopkg/planner/core/operator/logicalop/logical_cte.gopkg/planner/core/rule_decorrelate.go
|
/ok-to-test |
…rStmt Remove redundant type assertion on n.Source (already ast.ResultSetNode) and add cases for *ast.SelectStmt and *ast.SetOprStmt so containsLateralTableSource descends into derived subqueries and set-ops. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/planner/core/logical_plan_builder.go (1)
707-716:⚠️ Potential issue | 🟠 MajorParenthesized set-op operands still bypass this LATERAL walk.
SetOprStmt.SelectList.Selectsis not limited toast.ResultSetNode; the same file’s set-op builder later handles*ast.SetOprSelectListexplicitly for parenthesized operands. Those still fall through here, so a shape likeJOIN ((SELECT ...) UNION ALL ((SELECT ... JOIN LATERAL ...))) dtcan keepisLateral == falseand skip the outer-schema push for the nested branch.🛠️ Suggested fix
+func containsLateralNode(node ast.Node) bool { + switch n := node.(type) { + case ast.ResultSetNode: + return containsLateralTableSource(n) + case *ast.SetOprSelectList: + for _, sel := range n.Selects { + if containsLateralNode(sel) { + return true + } + } + } + return false +} + case *ast.SetOprStmt: // Check each operand in the UNION/INTERSECT/EXCEPT list. if n.SelectList != nil { for _, sel := range n.SelectList.Selects { - if rs, ok := sel.(ast.ResultSetNode); ok && containsLateralTableSource(rs) { + if containsLateralNode(sel) { return true } } } return falsePlease add a regression for a parenthesized set-op operand that contains
LATERAL.Based on learnings, "For planner rule or logical/physical plan changes, perform targeted planner unit tests and update rule testdata when needed."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/logical_plan_builder.go` around lines 707 - 716, The SetOprStmt branch only inspects SelectList.Selects assuming each is an ast.ResultSetNode, so parenthesized operands represented by *ast.SetOprSelectList are skipped and can hide LATERALs; update the check in the SetOprStmt case in logical_plan_builder.go to also handle ast.SetOprSelectList by descending into its inner Selects (reusing containsLateralTableSource or similar traversal) so parenthesized operands are scanned, ensure isLateral can become true for those cases, and add a planner unit regression that builds a join over a set-op where a parenthesized operand contains a LATERAL to validate the outer-schema push occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/planner/core/logical_plan_builder.go`:
- Around line 707-716: The SetOprStmt branch only inspects SelectList.Selects
assuming each is an ast.ResultSetNode, so parenthesized operands represented by
*ast.SetOprSelectList are skipped and can hide LATERALs; update the check in the
SetOprStmt case in logical_plan_builder.go to also handle ast.SetOprSelectList
by descending into its inner Selects (reusing containsLateralTableSource or
similar traversal) so parenthesized operands are scanned, ensure isLateral can
become true for those cases, and add a planner unit regression that builds a
join over a set-op where a parenthesized operand contains a LATERAL to validate
the outer-schema push occurs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8df46d6a-40e8-48ac-bfd3-8a6e38814e17
📒 Files selected for processing (1)
pkg/planner/core/logical_plan_builder.go
Apply SelectionFactor to all SemiJoin/AntiSemiJoin Apply operators, not just LATERAL ones, for consistency with LogicalJoin.DeriveStats. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #67482 +/- ##
================================================
+ Coverage 77.8123% 78.5339% +0.7216%
================================================
Files 2023 1972 -51
Lines 556615 550884 -5731
================================================
- Hits 433115 432631 -484
+ Misses 121754 117817 -3937
+ Partials 1746 436 -1310
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/planner/core/logical_plan_builder.go (1)
690-721:⚠️ Potential issue | 🟠 MajorStill descend into nested
SetOprSelectListoperands here.Line 716 only recurses into
ast.ResultSetNode, but parenthesized UNION/INTERSECT/EXCEPT branches are represented later in this file as*ast.SetOprSelectList. A right side likeJOIN ((SELECT ...) UNION ((SELECT ... FROM t2 JOIN LATERAL (...) l))) AS dtstill returnsfalsehere, sobuildJoinwill skip the outer-schema push and can leave join reordering enabled for a lateral-dependent subtree.♻️ Suggested fix
+func containsLateralSetOprSelectList(list *ast.SetOprSelectList) bool { + if list == nil { + return false + } + for _, sel := range list.Selects { + switch x := sel.(type) { + case ast.ResultSetNode: + if containsLateralTableSource(x) { + return true + } + case *ast.SetOprSelectList: + if containsLateralSetOprSelectList(x) { + return true + } + } + } + return false +} + func containsLateralTableSource(node ast.ResultSetNode) bool { switch n := node.(type) { ... case *ast.SetOprStmt: - // Check each operand in the UNION/INTERSECT/EXCEPT list. - if n.SelectList != nil { - for _, sel := range n.SelectList.Selects { - if rs, ok := sel.(ast.ResultSetNode); ok && containsLateralTableSource(rs) { - return true - } - } - } - return false + return containsLateralSetOprSelectList(n.SelectList) default: return false } }Please add a planner regression covering a parenthesized set-op operand with nested
LATERALas well. Based on learnings: "Applies to pkg/planner/** : For planner rule or logical/physical plan changes, perform targeted planner unit tests and update rule testdata when needed."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/logical_plan_builder.go` around lines 690 - 721, The containsLateralTableSource function currently skips operands represented as *ast.SetOprSelectList, so add a case for *ast.SetOprSelectList that iterates its Selects (similar to the *ast.SetOprStmt case) and recurses into each operand via containsLateralTableSource; update the switch in containsLateralTableSource to handle both *ast.SetOprStmt and *ast.SetOprSelectList, returning true if any operand contains a LATERAL, and false otherwise; then add a planner regression test exercising a parenthesized set-op operand (e.g., JOIN ((SELECT ...) UNION ((SELECT ... FROM t2 JOIN LATERAL (...) l))) AS dt) to ensure buildJoin sees the nested LATERAL and outer-schema push / join-reordering behavior is correct.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/planner/core/logical_plan_builder.go`:
- Around line 690-721: The containsLateralTableSource function currently skips
operands represented as *ast.SetOprSelectList, so add a case for
*ast.SetOprSelectList that iterates its Selects (similar to the *ast.SetOprStmt
case) and recurses into each operand via containsLateralTableSource; update the
switch in containsLateralTableSource to handle both *ast.SetOprStmt and
*ast.SetOprSelectList, returning true if any operand contains a LATERAL, and
false otherwise; then add a planner regression test exercising a parenthesized
set-op operand (e.g., JOIN ((SELECT ...) UNION ((SELECT ... FROM t2 JOIN LATERAL
(...) l))) AS dt) to ensure buildJoin sees the nested LATERAL and outer-schema
push / join-reordering behavior is correct.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9813e52e-0de6-4484-b2c4-13df6772e53c
📒 Files selected for processing (1)
pkg/planner/core/logical_plan_builder.go
|
/retest-required |
| vars.EnableParallelApply = false | ||
| defer func() { vars.EnableParallelApply = savedParallelApply }() | ||
| _, p.Cte.RecursivePartPhysicalPlan, _, err = utilfuncp.DoOptimize(context.TODO(), p.SCtx(), p.Cte.OptFlag, p.Cte.RecursivePartLogicalPlan) | ||
| vars.EnableParallelApply = savedParallelApply |
There was a problem hiding this comment.
Wow, will the original code cause any issues?
There was a problem hiding this comment.
From a deeper analysis - the original code works correctly in a normal execution. However, it is not panic-safe. If DoOptimize panics, then the original code would leak EnableParallelApply = false for the rest of the session. It is not clear to me how many users enable parallel Apply currently. And this fix is already in the cherry pick branch (where this bug was found). And it is unlikely that anyone else has this issue - since lateral join fix was recently merged to master.
|
/retest-required |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hawkingrei, qw4990, winoros The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
|
/test all |
|
/retest-required |
2 similar comments
|
/retest-required |
|
/retest-required |
|
/test all |
|
/test mysql-test |
|
@qw4990: The specified target(s) for Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/retest |
1 similar comment
|
/retest |
|
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: ref #40328
Problem Summary:
What changed and how does it work?
During a cherry pick of lateral join to a prior branch - new AI reviews found additional issues with the original implementation. These are addressed in this PR.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit