[fix](fe) Backport runtime filter outer join fix to 4.1 #64162
BiteTheDDDDt wants to merge 1 commit
…pache#64102)

related with: apache#57425

Runtime filters from a parent inner join could be pushed through an outer join into the null-generating child even when the probe expression was not null-propagating for that child.

The problem can be reproduced with this SQL shape:

```sql
create table rf_outer_join_nullable_a (pk int)
duplicate key(pk)
distributed by hash(pk) buckets 1
properties("replication_num" = "1");

create table rf_outer_join_nullable_b (pk int)
duplicate key(pk)
distributed by hash(pk) buckets 1
properties("replication_num" = "1");

create table rf_outer_join_nullable_c (pk int)
duplicate key(pk)
distributed by hash(pk) buckets 1
properties("replication_num" = "1");

insert into rf_outer_join_nullable_a values (1);
insert into rf_outer_join_nullable_b values (1);
insert into rf_outer_join_nullable_c values (0);

set disable_join_reorder = true;

select coalesce(b.pk, 0) as k, count(*) as cnt
from rf_outer_join_nullable_a a
left join rf_outer_join_nullable_b b on a.pk = b.pk
inner join rf_outer_join_nullable_c c on coalesce(b.pk, 0) = c.pk
group by 1
order by 1;
```

The correct result is empty. `a.pk = 1` matches `b.pk = 1` in the left outer join, then the parent inner join evaluates `coalesce(1, 0) = 0`, which is false.

The wrong plan generated a runtime filter from the parent inner join, effectively `c.pk -> coalesce(b.pk, 0)`, and pushed it through the lower `LEFT OUTER JOIN` into the right side scan of `b`. If `b.pk = 1` is pre-filtered before the left outer join, the join emits a NULL-extended row for `b`; then `coalesce(NULL, 0) = 0` becomes true and incorrectly returns `(0, 1)`. Therefore the runtime filter `c.pk -> coalesce(b.pk, 0)` must not be planned on the null-generating side of the lower outer join.

This PR blocks runtime filter pushdown through an outer join's null-generating child unless the probe expression preserves NULL semantics for slots from that child. Normal pushdown through preserved sides and null-propagating expressions is kept unchanged.

The bug became observable after apache#57425 changed the target lookup for expression runtime filters from `ctx.probeExpr` to `ctx.probeSlot`. Before that change, an expression such as `coalesce(b.pk, 0)` could not resolve the target relation in this path and the unsafe pushdown was not generated.

None

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

Added regression case with `disable_join_reorder`, `qt_shape`, and empty result verification: `regression-test/suites/correctness_p0/test_runtime_filter_outer_join_nullable_side.groovy`

Unit test: `./run-fe-ut.sh --run org.apache.doris.nereids.postprocess.RuntimeFilterTest#testDoNotPushDownNonNullPropagatingRuntimeFilterThroughOuterJoin,org.apache.doris.nereids.postprocess.RuntimeFilterTest#testPushDownNullPropagatingRuntimeFilterThroughOuterJoin`

The SQL regression case was not run locally against the available 9333 cluster because that cluster was the unpatched repro cluster.

- Behavior changed:
    - [ ] No.
    - [x] Yes. Runtime filters are no longer pushed through an outer join into its null-generating child when the probe expression can convert NULL to a non-NULL value.
- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

(cherry picked from commit 691189a)
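The wrong-result mechanism described in the commit message can be checked with a toy in-memory model of the repro query. This is only a sketch: the class `OuterJoinRuntimeFilterDemo` and all of its methods are hypothetical stand-ins written for illustration, not Doris code, and the runtime filter is modeled as a plain membership test on the values of `c.pk`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the repro query; hypothetical names, not Doris code. */
final class OuterJoinRuntimeFilterDemo {

    /** a LEFT JOIN b ON a.pk = b.pk; returns the b.pk column of the join output. */
    static List<Integer> leftJoinBSide(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        for (Integer aPk : a) {
            boolean matched = false;
            for (Integer bPk : b) {
                if (aPk.equals(bPk)) {
                    out.add(bPk);
                    matched = true;
                }
            }
            if (!matched) {
                out.add(null); // NULL-extended row for the unmatched a row
            }
        }
        return out;
    }

    /** Counts rows that survive INNER JOIN c ON coalesce(b.pk, 0) = c.pk. */
    static int countAfterInnerJoin(List<Integer> bSide, List<Integer> c) {
        int count = 0;
        for (Integer bPk : bSide) {
            int key = (bPk == null) ? 0 : bPk; // coalesce(b.pk, 0)
            for (Integer cPk : c) {
                if (cPk == key) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Models the unsafe runtime filter applied directly on the scan of b. */
    static List<Integer> applyRuntimeFilterOnScan(List<Integer> b, List<Integer> c) {
        List<Integer> kept = new ArrayList<>();
        for (Integer bPk : b) {
            int key = (bPk == null) ? 0 : bPk; // coalesce(b.pk, 0)
            if (c.contains(key)) {
                kept.add(bPk);
            }
        }
        return kept;
    }

    /** Plan without the runtime filter on b. */
    static int correctCount() {
        List<Integer> a = Arrays.asList(1), b = Arrays.asList(1), c = Arrays.asList(0);
        return countAfterInnerJoin(leftJoinBSide(a, b), c);
    }

    /** Plan with the runtime filter pushed below the left outer join. */
    static int wrongCount() {
        List<Integer> a = Arrays.asList(1), b = Arrays.asList(1), c = Arrays.asList(0);
        List<Integer> filteredB = applyRuntimeFilterOnScan(b, c);
        return countAfterInnerJoin(leftJoinBSide(a, filteredB), c);
    }

    public static void main(String[] args) {
        System.out.println("rows without pushed filter: " + correctCount());
        System.out.println("rows with pushed filter:    " + wrongCount());
    }
}
```

In this model the unfiltered plan yields no rows, while the plan that filters `b` first yields one row with key 0, which matches the `(0, 1)` wrong result described above.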
run buildall
Pull request overview
Backports the fix from #64102 to branch-4.1 to prevent unsafe runtime filter pushdown through the null-generating side of outer joins, specifically when the probe expression is not NULL-propagating (e.g., coalesce(b.pk, 0)), which can change query semantics.
Changes:
- Add outer-join-aware pushdown gating in `RuntimeFilterPushDownVisitor` to block pushing runtime filters into null-generating children unless the probe expression is NULL-propagating.
- Add FE unit tests covering both the blocked (`coalesce`) and allowed (slot-equality) cases.
- Add a regression test reproducer that validates both plan shape and the empty-result correctness.
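One way to picture the NULL-propagation check named in the first item is a conservative recursive walk over the probe expression. The sketch below uses stand-in expression classes (`Slot`, `Literal`, `Cast`, `Call`) and a hypothetical `isNullPropagatingFor` helper; none of these are the Doris Nereids classes, and the actual rule in `RuntimeFilterPushDownVisitor` may differ in detail.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-in expression model; these are NOT the Doris Nereids classes. */
final class NullPropagationSketch {
    interface Expr {}

    static final class Slot implements Expr {
        final String relation;
        final String name;
        Slot(String relation, String name) {
            this.relation = relation;
            this.name = name;
        }
    }

    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) {
            this.value = value;
        }
    }

    static final class Cast implements Expr {
        final Expr child;
        Cast(Expr child) {
            this.child = child;
        }
    }

    /** nullIfAnyArgNull marks functions that return NULL when any argument is NULL. */
    static final class Call implements Expr {
        final String name;
        final boolean nullIfAnyArgNull;
        final List<Expr> args;
        Call(String name, boolean nullIfAnyArgNull, Expr... args) {
            this.name = name;
            this.nullIfAnyArgNull = nullIfAnyArgNull;
            this.args = Arrays.asList(args);
        }
    }

    /**
     * Returns true only when expr is guaranteed to be NULL whenever the
     * slots of the given relation are NULL. Anything unknown counts as unsafe.
     */
    static boolean isNullPropagatingFor(Expr expr, String relation) {
        if (expr instanceof Slot) {
            return ((Slot) expr).relation.equals(relation);
        }
        if (expr instanceof Cast) {
            return isNullPropagatingFor(((Cast) expr).child, relation);
        }
        if (expr instanceof Call) {
            Call call = (Call) expr;
            if (!call.nullIfAnyArgNull) {
                return false; // e.g. coalesce can turn NULL into a value
            }
            for (Expr arg : call.args) {
                if (isNullPropagatingFor(arg, relation)) {
                    return true;
                }
            }
            return false;
        }
        return false; // literals and unrecognized nodes
    }
}
```

Under these assumptions, a bare slot `b.pk` or a cast of it is treated as safe for relation `b`, while `coalesce(b.pk, 0)` is rejected because the call is not marked as returning NULL on NULL input.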
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/RuntimeFilterPushDownVisitor.java | Adds null-generating-side checks for join-child pushdown and NULL-propagation detection via PropagateNullable/Cast/Slot. |
| fe/fe-core/src/test/java/org/apache/doris/nereids/postprocess/RuntimeFilterTest.java | Adds unit tests validating runtime filter presence/absence across the outer-join boundary depending on NULL-propagation. |
| regression-test/suites/correctness_p0/test_runtime_filter_outer_join_nullable_side.groovy | Introduces a regression suite that reproduces the original wrong-results scenario and asserts stable plan shape + correct result. |
| regression-test/data/correctness_p0/test_runtime_filter_outer_join_nullable_side.out | Golden output for the new regression test (shape + empty result). |
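The null-generating-side check summarized in the table can be sketched as a small decision function over the join type and the child being targeted. The names `OuterJoinPushDownGate`, `JoinType`, `Side`, and `canPushInto` are hypothetical and chosen for illustration; semi and anti joins are left out, and the real visitor works on Doris plan nodes rather than enums.

```java
/** Hypothetical gate for runtime filter pushdown into a join child; not Doris code. */
final class OuterJoinPushDownGate {
    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }
    enum Side { LEFT, RIGHT }

    /** A side is null-generating when the join may emit NULL-extended rows for it. */
    static boolean isNullGeneratingSide(JoinType type, Side side) {
        switch (type) {
            case LEFT_OUTER:
                return side == Side.RIGHT;
            case RIGHT_OUTER:
                return side == Side.LEFT;
            case FULL_OUTER:
                return true;
            default:
                return false; // INNER: neither side is NULL-extended
        }
    }

    /**
     * Preserved sides always accept the filter. A null-generating side accepts it
     * only when the probe expression stays NULL for NULL-extended rows.
     */
    static boolean canPushInto(JoinType type, Side side, boolean probeIsNullPropagating) {
        return !isNullGeneratingSide(type, side) || probeIsNullPropagating;
    }
}
```

For the repro query, the target is the right child of a `LEFT OUTER JOIN` and the probe `coalesce(b.pk, 0)` is not NULL-propagating, so this gate would reject the pushdown, while a plain `b.pk` probe would still be allowed.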
FE UT Coverage Report (increment line coverage)
FE Regression Coverage Report (increment line coverage)
What problem does this PR solve?
Issue Number: N/A
Related PR: #64102
Problem Summary:
Backport #64102 to branch-4.1. The fix prevents unsafe runtime filter pushdown through the null-generating side of outer joins. The implementation was adapted to the branch-4.1 RuntimeFilterPushDownVisitor structure.
Release note
None
Check List (For Author)