[fix](fe) Prune empty and cascading CTE plans#62828
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
|
OpenCode automated review failed and did not complete. Error: Review step was failure (possibly timeout or cancelled) Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
I found 2 blocking issues.
Critical checkpoint conclusions:
- Goal of this PR: partially accomplished. The change does address empty/cascading CTE pruning in the optimizer path, and the new regression suite covers two representative plan shapes, but one fixpoint bug in the new normalization loop means cascading cleanup can still stop too early.
- Scope/minimality: mostly focused, but the
SessionVariableedit also removes a legacy variable alias unrelated to the optimizer fix. - Concurrency: no concurrency or locking concerns are introduced in these touched FE optimizer/session-variable paths.
- Lifecycle/static-init: no special lifecycle or static initialization issues are involved here.
- Configuration items: no new config was added, but an existing session variable declaration was changed in a user-visible way.
- Compatibility: not preserved. Removing
cbo_cte_inline_modebreaks existingSET/SHOW VARIABLESclients that still use the old alias. - Parallel code paths: the new optimizer-side normalization path is distinct from the main rewrite path, and its fixpoint detection currently relies on shallow
equals()semantics that do not reflect subtree changes. - Special conditions/checks: the new
currentPlan.equals(normalizedPlan)stop condition is not strong enough for Nereids plans. - Test coverage: improved for empty/cascade pruning, but still missing coverage for the shallow-equality fixpoint case and for preserving the legacy session-variable alias.
- Test result updates: one new regression test file was added; I did not run the test suite locally in this runner.
- Observability: no additional observability appears necessary for this change.
- Transaction/persistence/data-write/FE-BE variable passing: not applicable for this PR.
- Performance: no blocking performance issue identified beyond the correctness problems above.
- Other issues: none beyond the two findings below.
User focus: /tmp/review_focus.txt contains no additional focus items, so there were no extra focus-specific findings to report.
|
/review |
|
run buildall |
There was a problem hiding this comment.
I did not add duplicate inline comments because the remaining blocking issue is already covered by the existing review thread on SessionVariable.java: removing the cbo_cte_inline_mode alias is a user-visible compatibility break and should be restored or intentionally documented with a migration plan. The earlier Plan.equals() fixpoint issue appears addressed by the explicit changed flag in the current head.
Critical checkpoint conclusions:
- Goal/test: The PR targets CTE pruning and cascading inline normalization, with regression coverage added for empty and cascading CTE cases.
- Scope: The optimizer changes are focused, but the alias removal is unrelated to the optimizer fix and should not be included.
- Concurrency/lifecycle: No new concurrent mutable state, locks, or special lifecycle hazards identified.
- Configuration/compatibility: Compatibility issue remains for the removed session-variable alias.
- Parallel code paths: The new normalization is applied to the base rewrite plan and CBO inline alternatives; no distinct missing path found.
- Testing: Regression coverage exists for the main CTE pruning behavior; no additional user focus points were provided.
- Observability/persistence/data writes: Not applicable for this optimizer-only change.
- Performance: No blocking performance issue found in the changed code; the normalization loop is bounded by CTE removal/inlining progress.
Focus response: review_focus.txt had no additional focus points.
|
run buildall |
1 similar comment
|
run buildall |
TPC-H: Total hot run time: 29234 ms |
TPC-H: Total hot run time: 29546 ms |
TPC-DS: Total hot run time: 170284 ms |
TPC-DS: Total hot run time: 170993 ms |
FE UT Coverage ReportIncrement line coverage |
1 similar comment
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29539 ms |
TPC-DS: Total hot run time: 171356 ms |
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29596 ms |
TPC-DS: Total hot run time: 170170 ms |
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29687 ms |
TPC-DS: Total hot run time: 170622 ms |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29297 ms |
TPC-DS: Total hot run time: 168555 ms |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29355 ms |
TPC-DS: Total hot run time: 171694 ms |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
run cloud_p0 |
|
/review |
FE Regression Coverage ReportIncrement line coverage |
There was a problem hiding this comment.
I found one additional issue beyond the existing review threads. The existing alias-removal compatibility concern in SessionVariable also still appears unresolved in the current diff, but I did not duplicate that inline comment.
Critical checkpoint conclusions:
- Goal/test: the PR targets CTE empty-prune and cascading inline normalization, and adds focused regression coverage, but the implementation leaves optimizer CTE bookkeeping stale after mutating the rewrite plan.
- Scope: the change is generally focused, but the new normalization hook needs to maintain existing
StatementContextinvariants. - Concurrency/lifecycle: no new concurrent execution path or special lifecycle issue found.
- Config/compatibility: no new config is added; the removed
cbo_cte_inline_modealias is an existing user-visible compatibility issue already raised in a prior thread. - Parallel paths: CTE consumer/producer bookkeeping paths used by costing and runtime filter generation are not updated after this new normalization path.
- Tests: regression tests cover plan-shape normalization, but they do not cover stale CTE consumer metadata effects on costing/runtime-filter decisions.
- Observability/transactions/persistence: not applicable.
- Performance: stale consumer counts can mis-cost remaining CTE producers/consumers and can suppress runtime-filter pushdown.
User focus: no additional user-provided review focus was supplied.
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: CTE normalization can remove empty branches or inline/remove anchors after StatementContext has already recorded producer and consumer metadata. Refresh the CTE context from the normalized rewrite plan before memo construction so CBO costing and runtime-filter pushdown use the live CTE shape.
### Release note
None
### Check List (For Author)
- Test: Unit Test
- org.apache.doris.nereids.rules.rewrite.CTEInlineTest#refreshCteConsumersAfterNormalizeEliminatesEmptyBranch
- Maven FE validate
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 31424 ms |
TPC-DS: Total hot run time: 169824 ms |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
Review conclusion: request changes. I did not add a duplicate inline comment because the remaining blocking issue is already covered by the existing thread on fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java: removing alias = "cbo_cte_inline_mode" is still present in the current PR diff and remains a user-visible compatibility break for existing SET cbo_cte_inline_mode = ... usage. Please keep the alias unless there is an intentional deprecation/migration plan.
Critical checkpoint conclusions:
- Goal/test: The CTE normalization goal is addressed by fixpoint normalization and added FE/regression coverage for empty/cascading CTE pruning. The existing alias compatibility regression is unrelated to the stated optimizer fix and remains unaddressed.
- Scope: The CTE changes are focused, but the session variable alias removal is not justified by the PR goal.
- Concurrency/lifecycle: No new FE locking/concurrency or special lifecycle issue found in the reviewed CTE path.
- Configuration/compatibility: Blocking issue remains for the removed legacy session-variable alias.
- Parallel paths: Existing CTE context refresh paths were reviewed; no new distinct issue beyond prior threads found.
- Tests/results: New CTE tests cover the main normalization behavior; I did not run tests in this review runner.
- Observability/transactions/persistence/data writes: Not applicable to this PR.
- Performance: No additional performance blocker found in the current CTE normalization logic.
User focus: no additional user-provided review focus was specified.
|
PR approved by at least one committer and no changes requested. |
What problem does this PR solve?
Issue Number: None
Related PR: #60601
Problem Summary: Fix optimizer-side CTE pruning so empty-relation producers, zero-consumer anchors, and cascading inline opportunities are normalized to a fixpoint before memoization, and add regression coverage for empty and cascading CTE elimination.
Release note
None
Check List (For Author)
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)