branch-4.1: [fix](mtmv) Infer null-reject from INNER JoinEdge for multi-hop outer join MV rewrite #62492 #62635
Open
github-actions[bot] wants to merge 1 commit into
… join MV rewrite (#62492)

### What problem does this PR solve?

Related PR: #30374

Problem Summary:

In multi-hop LEFT JOIN materialized view transparent rewrite (e.g., `fact LEFT JOIN dim1 LEFT JOIN dim2`), when the query has a WHERE clause that null-rejects only the outermost dimension table (e.g., `WHERE dim2.col = 'value'`), the MV rewrite fails with "Predicate compensate fail".

**Root cause:** In `AbstractMaterializedViewRule.containsNullRejectSlot()`, the original code only checked **filter predicates** (`queryPredicates`) for NOT NULL evidence. After the Nereids rewrite pipeline runs:

1. `EliminateOuterJoin` converts all eligible LEFT JOINs → INNER (cascading through `InferJoinNotNull` across multiple passes)
2. `EliminateNotNull` **unconditionally removes** all generated NOT NULL predicates (`isGeneratedIsNotNull=true`)

By the time MV rewrite (exploration phase) runs, the query plan has INNER JOINs but **zero NOT NULL filter predicates**. The only surviving predicate is the user's WHERE clause (e.g., `dim2.region_name = 'West'`), which can only prove NOT NULL for outermost dim2 slots, leaving intermediate dim1 slots uncovered.

**Fix:** Read INNER JoinEdge conditions directly from the query HyperGraph. After `EliminateOuterJoin` converts LEFT→INNER, JoinEdge objects retain their INNER type and join condition expressions even though `EliminateNotNull` removes filter-level NOT NULL predicates. `ExpressionUtils.inferNotNullSlots()` extracts NOT NULL slots from these INNER join conditions, covering all intermediate join tables.

| File | Change Description |
|------|-------------------|
| `AbstractMaterializedViewRule.java` | `containsNullRejectSlot()`: Add loop over INNER JoinEdges to collect NOT NULL slots from join conditions via `inferNotNullSlots`. Also add `shuttleExpressionWithLineage` for correct slot-level mapping. |
| `NullRejectInferenceTest.java` (new) | FE unit test: query=2-hop INNER JOIN vs view=2-hop LEFT JOIN, verifies `predicatesCompensate` succeeds |
| `outer_join_two_hop_null_reject.groovy` (new) | Regression test: 3 tables, async MV with 2-hop LEFT JOIN + WHERE + aggregate rollup, verifies rewrite success and result correctness |

**2-hop example walkthrough:**

```
Query HyperGraph (after EliminateOuterJoin):
  JoinEdge 1 (INNER): o.store_id = d.id        → {o.store_id, d.id} NOT NULL
  JoinEdge 2 (INNER): d.id = r.store_id        → {d.id, r.store_id} NOT NULL
  FilterEdge:         r.region_name = 'West'   → {r.region_name} NOT NULL

queryNullRejectSlots = {o.store_id, d.id, r.store_id, r.region_name}

requireNoNullableViewSlot (view has LEFT JOINs):
  Set 1: {d.id, d.store_name} ∩ queryNullRejectSlots → {d.id} ≠ ∅ ✓
  Set 2: {r.store_id, r.region_name} ∩ queryNullRejectSlots → {r.store_id, r.region_name} ≠ ∅ ✓
```

### Release note

Fix multi-hop LEFT JOIN materialized view transparent rewrite failure when the WHERE clause only references the outermost dimension table.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
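The 2-hop walkthrough in the description can be reproduced with a small standalone model. This is only a sketch of the set logic, not the Doris code: `NullRejectSketch`, `Edge`, `collectNullRejectSlots`, and `coversAll` are hypothetical names, slots are plain strings, and each join condition is assumed to be a simple equality that null-rejects both of its sides.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NullRejectSketch {
    enum JoinType { INNER, LEFT_OUTER }

    /** Hypothetical stand-in for a HyperGraph join edge: its type and the slots its condition references. */
    static final class Edge {
        final JoinType type;
        final Set<String> conditionSlots;

        Edge(JoinType type, String... slots) {
            this.type = type;
            this.conditionSlots = new HashSet<>(Arrays.asList(slots));
        }
    }

    static Set<String> setOf(String... slots) {
        return new HashSet<>(Arrays.asList(slots));
    }

    /** Union of the filter-derived slots and the slots referenced by INNER join conditions. */
    static Set<String> collectNullRejectSlots(List<Edge> joinEdges, Set<String> filterSlots) {
        Set<String> result = new HashSet<>(filterSlots);
        for (Edge edge : joinEdges) {
            if (edge.type == JoinType.INNER) {
                result.addAll(edge.conditionSlots);
            }
        }
        return result;
    }

    /** Every required view slot set must share at least one slot with the null-reject set. */
    static boolean coversAll(List<Set<String>> requiredSets, Set<String> nullRejectSlots) {
        for (Set<String> required : requiredSets) {
            if (Collections.disjoint(required, nullRejectSlots)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
                new Edge(JoinType.INNER, "o.store_id", "d.id"),
                new Edge(JoinType.INNER, "d.id", "r.store_id"));
        Set<String> filterSlots = setOf("r.region_name");
        List<Set<String>> required = Arrays.asList(
                setOf("d.id", "d.store_name"),
                setOf("r.store_id", "r.region_name"));

        // Filter predicates only: Set 1 (the intermediate table) is not covered.
        System.out.println(coversAll(required, filterSlots));
        // Filter predicates plus INNER join conditions: both sets are covered.
        System.out.println(coversAll(required, collectNullRejectSlots(edges, filterSlots)));
    }
}
```

With filter predicates alone the check fails on the intermediate table's slot set, which matches the reported "Predicate compensate fail"; adding the INNER join condition slots makes both sets intersect, as in the walkthrough.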
Contributor
run buildall
Contributor
FE UT Coverage Report: Increment line coverage
Contributor
FE Regression Coverage Report: Increment line coverage
Cherry-picked from #62492