[fix](nereids) Guard UniqueFunction in multiple filter/topn pushdown rules#62742
[fix](nereids) Guard UniqueFunction in multiple filter/topn pushdown rules#62742yujun777 wants to merge 7 commits into
Conversation
|
run buildall |
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
I found two remaining wrong-result paths, so I can't approve this batch yet.
PushDownJoinOtherConditionstill rewrites unique-function join predicates to one side. Because it only checkspredicate.getInputSlots()(fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherCondition.java:77-84,110-111),ON rand() > 0.5andON t1.id + rand() > 5are treated as pushable. That changes evaluation from once per joined row to once per input row.PushDownFilterThroughSetOperationstill pushes unique predicates below set operators. It rewrites every conjunct into every child with nocontainsUniqueFunction()guard (fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java:146-149). OnUNION DISTINCT/INTERSECT/EXCEPT, movingrand()below the set op changes semantics.
Critical checkpoints:
- Goal of current task: partially achieved. The touched rules are guarded, but the overall unique-function pushdown issue is not closed because the two parallel rewrites above remain unsafe.
- Scope/simplicity: the local edits are small and focused.
- Concurrency: not applicable.
- Lifecycle/static init: not applicable.
- Config: no new config.
- Compatibility: no FE/BE protocol or storage compatibility impact.
- Parallel code paths: not fully covered;
PushDownJoinOtherConditionandPushDownFilterThroughSetOperationstill need the same treatment. - Special conditions/comments: the added
containsUniqueFunction()guards are appropriate where added. - Test coverage: good for the touched rules, but still missing coverage for the remaining join-other-condition and set-operation cases.
- Test result files: the added shape files look consistent with the touched rewrites.
- Observability/transactions/data writes/FE-BE variable passing: not applicable.
- Performance: negligible impact.
User focus points: none provided.
| Set<Slot> conjunctSlots = conjunct.getInputSlots(); | ||
| if (commonGroupingSetExpressions.containsAll(conjunctSlots)) { | ||
| if (commonGroupingSetExpressions.containsAll(conjunctSlots) | ||
| && !conjunct.containsUniqueFunction()) { |
There was a problem hiding this comment.
Same non-idempotent gap still exists in PushDownJoinOtherCondition. That rule only checks predicate.getInputSlots(), so ON rand() > 0.5 (empty slot set) and ON t1.id + rand() > 5 are still pushed to one child for inner/cross joins. That changes evaluation from once per joined row to once per left/right input row: with one left row joining two right rows, the original plan can keep one pair and drop the other, while the rewritten plan can only keep both or none. Since this PR is fixing exactly this class of unique-function pushdown bug, I think we need the same containsUniqueFunction() guard there as well.
There was a problem hiding this comment.
See main PR comment — the three raised concerns (PushDownJoinOtherCondition, PushDownFilterThroughSetOperation, normalized-slot NLJ path) are respectively already fixed in this PR, already fixed in the follow-up #62754, and a pre-existing BETWEEN-expansion trade-off out of scope for this PR.
| return ImmutableList.of( | ||
| logicalTopN(logicalUnion().when(union -> union.getQualifier() == Qualifier.ALL)) | ||
| .when(topN -> topN.getOrderKeys().stream() | ||
| .noneMatch(key -> key.getExpr().containsUniqueFunction())) |
There was a problem hiding this comment.
There's still a parallel set-operation rewrite with the same issue: PushDownFilterThroughSetOperation copies every conjunct into every child without checking containsUniqueFunction(). For SELECT * FROM (SELECT 1 UNION SELECT 1) u WHERE rand() > 0.5, the original plan evaluates rand() once after UNION DISTINCT (keep probability 0.5), but the rewritten plan evaluates it once per branch and the row survives if either branch passes (probability 0.75). INTERSECT and EXCEPT have the same problem, so this batch still leaves a wrong-result path open.
There was a problem hiding this comment.
See main PR comment — the three raised concerns (PushDownJoinOtherCondition, PushDownFilterThroughSetOperation, normalized-slot NLJ path) are respectively already fixed in this PR, already fixed in the follow-up #62754, and a pre-existing BETWEEN-expansion trade-off out of scope for this PR.
de43dd6 to
12d07aa
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
Requesting changes.
Findings:
- I am not repeating the already-open inline thread on
PushDownFilterThroughSetOperation, but that wrong-result path still appears unresolved.RuleSet.PUSH_DOWN_FILTERSstill runsPushDownFilterThroughSetOperationbeforeAddProjectForUniqueFunction, sorand()/uuid()/random_bytes()predicates aboveUNION DISTINCT/INTERSECT/EXCEPTcan still be copied into each child and change semantics. - I left one new inline comment for a distinct nested-loop join path: after a one-sided unique
ONpredicate is normalized behind slots, a laterPUSH_DOWN_FILTERSpass can still push it into a child, so it is evaluated once per input row instead of once per joined pair.
Critical checkpoints:
- Goal: Partially met. The direct Window / PartitionTopN / Join-other-conjunct cases are guarded, but the PR still leaves wrong-result paths for set operations and normalized nested-loop join predicates.
- Minimality/focus: The patch is small and focused on the intended rules.
- Concurrency: No concurrency or locking changes are involved.
- Lifecycle/static initialization: No special lifecycle or initialization changes are involved.
- Config: No new configuration is added.
- Compatibility: No FE/BE protocol, symbol, or storage compatibility impact is involved.
- Parallel code paths: Not fully covered;
PushDownFilterThroughSetOperationand the later nested-loop join projection/pushdown path still bypass the new guard. - Special conditions/comments: The new
containsUniqueFunction()checks are correct for direct AST forms, but they are not sufficient once unique expressions have been rewritten behind slots. - Test coverage: Improved, but still incomplete. The added FE UTs and regression suites cover the direct cases; they do not cover the remaining set-operation case or the slot-normalized nested-loop join case.
- Test result files: The new
.outfiles look auto-generated. The updatedadd_project_for_unique_function.outalso exposes the remaining nested-loop join issue. - Observability: Not applicable for these optimizer rewrites.
- Transaction/persistence/data writes/FE-BE variable passing: Not applicable.
- Performance: The new guards are cheap; the remaining bad nested-loop plan also introduces redundant random evaluation/work on child filters.
- Other issues: No additional user-provided review focus was provided.
| // In addition, a predicate like `rand() > 0.5` has empty input slots, | ||
| // so `allCoveredBy(..., child.getOutputSet())` would otherwise wrongly | ||
| // return true because `Set.containsAll(emptySet) == true`. | ||
| if (otherConjunct.containsUniqueFunction()) { |
There was a problem hiding this comment.
This fixes the direct AST case, but I think a normalized-slot path is still open. ProjectOtherJoinConditionForNestedLoopJoin can later rewrite a one-sided predicate like t2.id * random(1, 100) between 100 and 200 into a child-local boolean slot; the final PUSH_DOWN_FILTERS pass then sees only that slot and pushes it into the child because containsUniqueFunction() is now false. The updated regression-test/data/nereids_rules_p0/unique_function/add_project_for_unique_function.out still shows exactly that for join_1 (and likewise pushes random(1, 100) between 1 and 10 to the left child), which changes evaluation from once per joined pair to once per input row. This is distinct from the already-raised direct containsUniqueFunction() thread because the unique function has already been hidden behind a slot by the time the later pushdown runs.
There was a problem hiding this comment.
See main PR comment — the three raised concerns (PushDownJoinOtherCondition, PushDownFilterThroughSetOperation, normalized-slot NLJ path) are respectively already fixed in this PR, already fixed in the follow-up #62754, and a pre-existing BETWEEN-expansion trade-off out of scope for this PR.
…ewrites ### What problem does this PR solve? Issue Number: N/A Related PR: apache#62742 Problem Summary: Two additional Nereids rewrite rules were not aware of `UniqueFunction` (e.g. `rand()`, `uuid()`, `random()`) and could silently change the semantics of SQL that uses such functions. This PR is batch 2 of the `UniqueFunction` pushdown audit. The previously-reviewed batch 1 is in apache#62742. 1. `PushDownFilterThroughSetOperation` Pushing a filter that references a `UniqueFunction` through `UNION DISTINCT` / `INTERSECT` / `EXCEPT` duplicates each unique-fn call per child, which changes result semantics (e.g. `rand() > 0.5` would be evaluated independently for each branch instead of once above the set op). `UNION ALL` remains safe, because the set op is a pure concatenation and every row evaluates the filter exactly once below or above. The rewrite now splits conjuncts: pushable conjuncts (`UNION ALL`, or not containing a unique function) go through the child branches; non-pushable conjuncts stay above the set op. 2. `ProjectOtherJoinConditionForNestedLoopJoin` The rule extracts sub-expressions of NLJ other conditions into child projects as aliases. An expression containing a unique function must not be materialized once above the scan and then referenced in the join, because that separates its evaluation from the join pair. Expressions that contain a unique function are now left inline in the other condition. Both fixes follow the same pattern used in batch 1: an early no-op branch when `expression.containsUniqueFunction()` is true. ### Release note None ### Check List (For Author) - Test: Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ewrites ### What problem does this PR solve? Issue Number: N/A Related PR: apache#62742 Problem Summary: Two additional Nereids rewrite rules were not aware of `UniqueFunction` (e.g. `rand()`, `uuid()`, `random()`) and could silently change the semantics of SQL that uses such functions. This PR is batch 2 of the `UniqueFunction` pushdown audit. The previously-reviewed batch 1 is in apache#62742. 1. `PushDownFilterThroughSetOperation` Pushing a filter that references a `UniqueFunction` through `UNION DISTINCT` / `INTERSECT` / `EXCEPT` duplicates each unique-fn call per child, which changes result semantics (e.g. `rand() > 0.5` would be evaluated independently for each branch instead of once above the set op). `UNION ALL` remains safe, because the set op is a pure concatenation and every row evaluates the filter exactly once below or above. The rewrite now splits conjuncts: pushable conjuncts (`UNION ALL`, or not containing a unique function) go through the child branches; non-pushable conjuncts stay above the set op. 2. `ProjectOtherJoinConditionForNestedLoopJoin` The rule extracts sub-expressions of NLJ other conditions into child projects as aliases. An expression containing a unique function must not be materialized once above the scan and then referenced in the join, because that separates its evaluation from the join pair. Expressions that contain a unique function are now left inline in the other condition. Both fixes follow the same pattern used in batch 1: an early no-op branch when `expression.containsUniqueFunction()` is true. ### Release note None ### Check List (For Author) - Test: Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… SetOp predicate inference rewrites ### What problem does this PR solve? Issue Number: N/A Related PR: apache#62742 Problem Summary: Three Nereids rewrite rules were not aware of `UniqueFunction` (e.g. `rand()`, `uuid()`, `random()`) and could silently change the semantics of SQL that uses such functions. This PR is batch 2 of the `UniqueFunction` pushdown audit. The previously-reviewed batch 1 is in apache#62742. 1. `PushDownFilterThroughSetOperation` Pushing a filter that references a `UniqueFunction` through `UNION DISTINCT` / `INTERSECT` / `EXCEPT` duplicates each unique-fn call per child, which changes result semantics (e.g. `rand() > 0.5` would be evaluated independently for each branch instead of once above the set op). `UNION ALL` remains safe, because the set op is a pure concatenation and every row evaluates the filter exactly once below or above. The rewrite now splits conjuncts: pushable conjuncts (`UNION ALL`, or not containing a unique function) go through the child branches; non-pushable conjuncts stay above the set op. 2. `ProjectOtherJoinConditionForNestedLoopJoin` The rule extracts sub-expressions of NLJ other conditions into child projects as aliases. An expression containing a unique function must not be materialized once above the scan and then referenced in the join, because that separates its evaluation from the join pair. Expressions that contain a unique function are now left inline in the other condition. 3. `InferPredicates` (EXCEPT / INTERSECT branches) `visitLogicalExcept` / `visitLogicalIntersect` substitute slots of the first child through the set-op output to sibling children so that predicates on the first child can be inferred onto sibling children. When the pulled-up predicate contains a unique function (e.g. `t1.id + rand() > 5`), the substitution rewrites it into a predicate on the sibling (`t2.id + rand() > 5`) and re-adds it, which re-evaluates `rand()` on a different set of rows. The JOIN path does not do this substitution, so its pre-existing slot subset check is sufficient. We now skip any inferred expression that contains a unique function inside the universal inference loop, which protects SetOp as well as any future consumer that does similar slot substitution. All three fixes follow the same pattern used in batch 1: an early no-op branch when `expression.containsUniqueFunction()` is true. ### Release note None ### Check List (For Author) - Test: Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Thanks for the reviews. Consolidated response:
|
FE Regression Coverage ReportIncrement line coverage |
12d07aa to
310308d
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
Requesting changes because one blocking wrong-result issue remains.
Finding
1.
This new narrowing is not semantically safe. A one-sided unique predicate such as is still evaluated once per joined row in the original plan, but after this rewrite it is evaluated once per row and then reused for every matching left row. If one right row joins two left rows, the original query can keep one pair and drop the other, while the rewritten plan can only keep both or none. This is distinct from the existing thread because it goes through the path, and the updated regression outputs now explicitly expect that unsafe pushdown.
Critical Checkpoints
- Goal / correctness: Partially met. The window, partition-topn, and empty-slot join fixes address the empty-input-slot bug, but still has a wrong-result path for one-sided unique predicates.
- Scope / minimality: Not fully minimal. Narrowing the guard from to broadens behavior beyond the empty-slot fix.
- Concurrency: Not applicable; these FE rewrite rules do not introduce new shared-state or locking behavior.
- Lifecycle / static init: Not applicable.
- Config changes: None.
- Compatibility / persistence / FE-BE protocol: None.
- Parallel code paths: Not fully covered. This filter-above-join path remains unsafe even after the fix.
- Special conditions / comments: The new condition documents only the empty-slot case, but one-sided unique predicates are not safe either.
- Test coverage: Added tests for window / partition top-N / join-other-condition, but the join-filter regression now asserts the unsafe pushed plan instead of catching it.
- Test result files: New files are internally consistent with the current planner output, but encodes incorrect behavior for this case.
- Observability: No new observability needed for these planner-only changes.
- Data writes / transactions: Not applicable.
- Other issues: No additional blocking issue beyond the above one.
User Focus
- No additional user-provided focus points.
| for (Expression p : filterPredicates) { | ||
| if (p.containsUniqueFunction()) { | ||
| Set<Slot> slots = p.collect(SlotReference.class::isInstance); | ||
| if (p.containsUniqueFunction() && slots.isEmpty()) { |
There was a problem hiding this comment.
This guard is still too narrow. A one-sided predicate like above a is evaluated once per joined row in the original plan, but after pushing it into the right child it is evaluated once per row and then reused for every matching left row. With one right row joining two left rows, the original plan can keep one pair and drop the other, while the rewritten plan can only keep both or none. That makes this a wrong-result path distinct from the already-discussed thread, and the updated regression outputs now lock in that unsafe plan.
There was a problem hiding this comment.
This is addressed after the rebase to current upstream/master. PushDownFilterThroughJoin now checks p.containsVolatileExpression() before slot-based pushdown, so both empty-slot predicates like rand() > 0.1 and one-sided predicates like t2.id + rand() > 0.2 are kept in remainingPredicates above the join. I also updated the regression case comment to reflect that volatile predicates should stay above the join even when they reference only one side.
There was a problem hiding this comment.
Requesting changes because one blocking wrong-result issue remains.
Findings
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughJoin.java:124
The new&& slots.isEmpty()narrowing is not semantically safe. A one-sided unique predicate such asWHERE t2.id + rand() > 0.2is evaluated once per joined row in the originalFilter(Join(...))plan, but after this rewrite it is evaluated once pert2row and then reused for every matching left row. If one right row joins two left rows, the original query can keep one pair and drop the other, while the rewritten plan can only keep both or none. This is distinct from the existingPushDownJoinOtherConditionreview threads because it goes through the separateLogicalFilter(logicalJoin())path, and the updatedpush_down_filter_through_join_3/4regression outputs now explicitly expect that unsafe pushdown.
Open Questions / Assumptions
- This review assumes Doris keeps the usual row-by-row semantics for non-deterministic predicates above a join. The existing guards in
PushFilterInsideJoin,PushDownFilterThroughProject, and the rest of this PR all point to that same invariant.
Critical Checkpoints
- Goal of the task: Partially accomplished. The window, partition-topn, and empty-slot join fixes address the empty-input-slot bug, but
PushDownFilterThroughJoinstill leaves a wrong-result path for one-sided unique predicates. - Small / clear / focused: Not fully. Narrowing the guard from
p.containsUniqueFunction()top.containsUniqueFunction() && slots.isEmpty()broadens behavior beyond the stated empty-slot fix. - Concurrency: Not applicable. These FE rewrite rules do not introduce new shared state, threads, or locks.
- Lifecycle / static initialization: Not applicable.
- Configuration changes: None.
- Compatibility / persistence / FE-BE variable passing: None.
- Functionally parallel code paths: Not fully covered.
PushDownJoinOtherConditiongot an empty-slot guard, but the separate filter-above-join path remains unsafe. - Special conditional checks: The new condition only documents the empty-slot case, while one-sided unique predicates are still not safe to move below the join.
- Test coverage: Improved for window, partition-topn, and join-other-condition. Missing the corresponding negative coverage for
PushDownFilterThroughJoin; the join regression now asserts the unsafe pushed plan instead. - Test result files: The new and updated
.outfiles are internally consistent with current planner output, butpush_down_filter_through_join_with_unique_function.outnow encodes incorrect behavior for this case. - Observability: No additional observability is needed for these planner-only changes.
- Transactions / data writes / persistence: Not applicable.
- Performance: The pushed plan may reduce join work, but correctness has to win here because row-granularity evaluation changes.
- Other issues: No additional blocking issue found beyond the one above.
User Focus
- No additional user-provided focus points.
| for (Expression p : filterPredicates) { | ||
| if (p.containsUniqueFunction()) { | ||
| Set<Slot> slots = p.collect(SlotReference.class::isInstance); | ||
| if (p.containsUniqueFunction() && slots.isEmpty()) { |
There was a problem hiding this comment.
This new && slots.isEmpty() guard is still too narrow. A one-sided predicate like WHERE t2.id + rand() > 0.2 above a LogicalJoin is evaluated once per joined row in the original plan, but after pushing it into the right child it is evaluated once per t2 row and then reused for every matching left row. With one right row joining two left rows, the original plan can keep one pair and drop the other, while the rewritten plan can only keep both or none.
That makes this a wrong-result path distinct from the already-discussed PushDownJoinOtherCondition thread, and the updated push_down_filter_through_join_3/4 regression outputs now lock in that unsafe plan.
| if (p.containsUniqueFunction() && slots.isEmpty()) { | |
| if (p.containsUniqueFunction()) { |
There was a problem hiding this comment.
This has been fixed by the current rebased code as well. The old narrowed guard no longer exists on this branch: PushDownFilterThroughJoin first routes any containsVolatileExpression() predicate to remainingPredicates, before checking input slots. Therefore a one-sided volatile predicate such as WHERE t2.id + rand() > 0.2 is not pushed into the right child and remains evaluated per joined row. I updated the test comment accordingly.
Superseded by corrected review 4161915500 after shell-escaping error in the automation payload.
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
…in pushdown Issue Number: N/A Related PR: apache#62705 Problem Summary: Following up on apache#62705, several Nereids rewrite rules still moved predicates / conjuncts containing non-idempotent (unique) functions such as rand() / uuid() / random_bytes() across operator boundaries in ways that changed query semantics. The common root cause is that a predicate like `rand() > 0.5` has an empty input-slot set, so the `containsAll(emptySet)` / `allMatch(emptyStream)` guards used by these rules silently returned true and allowed the push down. This PR adds guards to the reachable cases: 1. PushDownFilterThroughWindow: skip conjuncts with unique functions. Pushing a unique predicate below a window operator re-samples the base rows and changes every window-function value. 2. PushDownFilterThroughPartitionTopN: same as Window — skip unique conjuncts in the split loop. 3. PushDownJoinOtherCondition: keep conjuncts with unique functions AND empty input slots in the join's otherJoinConjuncts. `rand() > 0.5` with no slots would otherwise be pushed arbitrarily to the left child because `leftOutput.containsAll(emptySet)` is always true. When the unique conjunct has side-specific input slots (e.g. `t1.id + rand() > 0.5`), push-down is still allowed — output cardinality expectation is preserved and pre-join evaluation is what users typically want for sampling predicates. 4. PushDownFilterThroughJoin: same as (3) for the filter-through-join path. Also protects the "no slots → duplicate to both children" branch, which would otherwise produce two independent random draws per row. Other candidates that were initially explored (Repeat / Sort elimination / TopN-through-Union) were empirically verified to be unreachable due to upstream normalization (PushDownFilterThroughAggregation already blocks unique conjuncts before they reach Repeat; NormalizeSort wraps rand() in a Project; TopN order keys are already materialized Slots), so no dead guards were added for those. Fix wrong results when `rand() > 0.5`-style predicates with empty input slots were pushed below Window / PartitionTopN, or moved out of a Join's other-conjuncts / above-Join filter onto one of the join inputs. - Test: - Regression Test: 3 suites under regression-test/suites/nereids_rules_p0/unique_function/ (window, partition_topn, join_other_condition) plus new cases in push_down_filter_through_join_with_unique_function for the side-specific unique-slot vs empty-slot distinction. - Behavior changed: Yes (queries that previously returned incorrect results due to unsafe pushdown of unique-function predicates across Window / PartitionTopN / Join boundaries will now return correct results). - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adapt the unique function pushdown guards to the upstream volatile expression API and update CastException to the current AnalysisException constructors. Key changes: - Replace removed containsUniqueFunction checks with containsVolatileExpression in pushdown rules. - Update CastException to use the current AnalysisException constructor. Unit Test: - FE Core Maven compile without build cache - build.sh --fe
310308d to
abb6587
Compare
Update the regression case comment to match the current PushDownFilterThroughJoin behavior after rebasing to upstream master. Key changes: - Document that volatile predicates stay above joins even when they reference one side. Unit Test: - Not run; comment-only change.
… SetOp predicate inference rewrites Issue Number: N/A Related PR: apache#62742 Problem Summary: Three Nereids rewrite rules were not aware of `UniqueFunction` (e.g. `rand()`, `uuid()`, `random()`) and could silently change the semantics of SQL that uses such functions. This PR is batch 2 of the `UniqueFunction` pushdown audit. The previously-reviewed batch 1 is in 1. `PushDownFilterThroughSetOperation` Pushing a filter that references a `UniqueFunction` through `UNION DISTINCT` / `INTERSECT` / `EXCEPT` duplicates each unique-fn call per child, which changes result semantics (e.g. `rand() > 0.5` would be evaluated independently for each branch instead of once above the set op). `UNION ALL` remains safe, because the set op is a pure concatenation and every row evaluates the filter exactly once below or above. The rewrite now splits conjuncts: pushable conjuncts (`UNION ALL`, or not containing a unique function) go through the child branches; non-pushable conjuncts stay above the set op. 2. `ProjectOtherJoinConditionForNestedLoopJoin` The rule extracts sub-expressions of NLJ other conditions into child projects as aliases. An expression containing a unique function must not be materialized once above the scan and then referenced in the join, because that separates its evaluation from the join pair. Expressions that contain a unique function are now left inline in the other condition. 3. `InferPredicates` (EXCEPT / INTERSECT branches) `visitLogicalExcept` / `visitLogicalIntersect` substitute slots of the first child through the set-op output to sibling children so that predicates on the first child can be inferred onto sibling children. When the pulled-up predicate contains a unique function (e.g. `t1.id + rand() > 5`), the substitution rewrites it into a predicate on the sibling (`t2.id + rand() > 5`) and re-adds it, which re-evaluates `rand()` on a different set of rows. The JOIN path does not do this substitution, so its pre-existing slot subset check is sufficient. We now skip any inferred expression that contains a unique function inside the universal inference loop, which protects SetOp as well as any future consumer that does similar slot substitution. All three fixes follow the same pattern used in batch 1: an early no-op branch when `expression.containsUniqueFunction()` is true. None - Test: Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore the original volatile-predicate handling for join other conditions while keeping slot-free cases from being pushed to an arbitrary child. Key changes: - Keep only slot-free volatile join predicates in join otherCondition. - Preserve performance-oriented pushdown for side-specific volatile ON predicates. - Update FE unit test and regression expectations to match the restored behavior. Unit Test: - bash ./run-fe-ut.sh --run PushDownJoinOtherConditionTest - MVN_OPT=-Dmaven.build.cache.enabled=false ./build.sh --fe - ./run-regression-test.sh --run -d nereids_rules_p0/unique_function -s push_down_filter_through_join_with_unique_function,push_down_filter_through_partition_topn_with_unique_function,push_down_filter_through_window_with_unique_function,push_down_join_other_condition_with_unique_function,infer_predicates_set_op_with_unique_function,project_other_join_condition_for_nlj_with_unique_function,push_down_filter_through_set_operation_with_unique_function
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
I reviewed the current GitHub PR diff and did not find a new blocking issue distinct from the existing review threads.
Critical checkpoint conclusions:
- Goal/test coverage: The PR addresses volatile expression pushdown/inference hazards in Nereids rules, with FE unit/regression coverage for the affected paths.
- Scope: The current GitHub diff is focused on Nereids volatile-expression handling and related tests/comments.
- Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, configuration, or serialization compatibility issue found.
- Parallel code paths: Join filter pushdown, join other-condition pushdown, set-operation filter pushdown, NLJ project extraction, and set-op predicate inference were checked. No additional unreported instance found beyond the existing threads.
- Data correctness: The current guarded paths preserve volatile predicate evaluation granularity according to the behavior documented by this PR; the known side-specific ON-predicate concern is already covered by existing review discussion and was not duplicated.
- Tests/results: Added regression outputs appear consistent with the changed plans; no handwritten-result concern was identified from the review context.
- Observability/performance: No additional observability or performance blocker found.
User focus: no additional user-provided review focus was supplied.
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29246 ms |
TPC-DS: Total hot run time: 170126 ms |
ProjectOtherJoinConditionForNestedLoopJoin skipped the whole expression when a join condition contained a volatile function. This avoided changing volatile evaluation granularity, but also prevented deterministic child expressions from being extracted, which changed existing join case-when plans. Key changes: - Keep volatile join expressions inline while still visiting deterministic child expressions. - Update the unique-function regression expectation for the resulting project shape. Unit Test: - Regression test: add_project_for_unique_function, join_extract_or_from_case_when - FE checkstyle: fe-core validate with dependent modules
|
run buildall |
TPC-H: Total hot run time: 28968 ms |
TPC-DS: Total hot run time: 170725 ms |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
| // children. Side-specific volatile predicates are still pushed down for | ||
| // performance, matching PostgreSQL's treatment of single-relation INNER | ||
| // JOIN ON quals as base restrictions. | ||
| if (otherConjunct.containsVolatileExpression() |
There was a problem hiding this comment.
This guard is still too narrow for volatile ON predicates. A one-sided condition such as t1.id + rand() > 0.5 has non-empty input slots, so it falls through to allCoveredBy(...) and is pushed into one child. For an INNER/CROSS JOIN, though, the ON predicate is evaluated per joined pair; after this rewrite the random value is evaluated once per input row and reused for every matching row from the other side. With one left row joining two right rows, the original plan can keep one pair and drop the other, while the pushed plan can only keep both or none. PushDownFilterThroughJoin already keeps all containsVolatileExpression() predicates above the join for this reason, so this path should not push side-specific volatile ON conjuncts either.
There was a problem hiding this comment.
Thanks, addressed in 0e52387. PushDownJoinOtherCondition now keeps every containsVolatileExpression() ON conjunct in otherJoinConjuncts, including one-sided predicates such as t1.id + rand() > 0.5, so the join-pair evaluation granularity is preserved. For BETWEEN-expanded repeated volatile expressions, AddProjectForVolatileExpression materializes one value later: right-only expressions are projected on the right child, otherwise it falls back to the left child, and volatile expressions whose own slots span both join children are skipped because join has no pair-scope project. I also added FE unit tests and explain-shape regression cases for the one-sided volatile ON predicate and left/right/slot-free BETWEEN cases.
Keep volatile ON predicates at join scope so their evaluation granularity is not changed by predicate pushdown. Repeated volatile expressions are still materialized by AddProjectForVolatileExpression, using the right child when the expression is right-only and the left child as the default fallback. Key changes: - Keep all volatile ON predicates in otherJoinConjuncts. - Materialize repeated join volatile expressions on the legal child side. - Skip volatile expressions whose own input slots span both join children. - Add FE unit tests and explain shape regression coverage for volatile JOIN cases. Unit Test: - AddProjectForVolatileExpressionTest - PushDownJoinOtherConditionTest - push_down_join_other_condition_with_unique_function regression - mvn validate
|
run buildall |
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #62705
Problem Summary:
Following up on #62705, several Nereids rewrite rules still moved predicates
containing non-idempotent (unique) functions such as rand() / uuid() /
random_bytes() across operator boundaries in ways that changed query
semantics. The common root cause is that a predicate like
rand() > 0.5has an empty input-slot set, so the
containsAll(emptySet)/allMatchguards used by these rules silently returned true and allowed the push
down / elimination.
This PR adds
containsUniqueFunction()guards to the following rules:PushDownFilterThroughRepeat: skip conjuncts with unique functions.
Pushing
rand() > xbelow Repeat changes which rows feed eachgrouping set and alters aggregate results.
PushDownFilterThroughWindow: skip conjuncts with unique functions.
Pushing a unique predicate below a window operator re-samples the
base rows and changes every window-function value.
PushDownFilterThroughPartitionTopN: same as Window — skip unique
conjuncts in the split loop.
PushDownFilterThroughSort (sort-elimination branch): do not drop the
sort when any order key contains a unique function.
ORDER BY rand()has empty input slots, so the old
allMatchon an empty streamreturned true and wrongly eliminated the sort.
PushDownTopNThroughUnion / PushDownTopNDistinctThroughUnion: do not
push a TopN whose order keys contain unique functions below Union.
With non-idempotent ordering each branch would be sorted by a
different random draw than the global one.
Release note
Fix wrong results when predicates / order keys containing rand(), uuid(),
random_bytes() or uuid_numeric() were pushed across Repeat, Window,
PartitionTopN, Sort, or Union boundaries.
Check List (For Author)
PushDownFilterThroughSortTest (+1 test for unique order key),
PushDownFilterThroughWindowTest (+1 test for unique conjunct).
regression-test/suites/nereids_rules_p0/unique_function/.
results due to unsafe pushdown of unique-function predicates / order
keys will now return correct results).
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com