
[fix](mtmv) Infer null-reject from INNER JoinEdge for multi-hop outer join MV rewrite#62492

Merged: morrySnow merged 12 commits into apache:master from seawinde:fix-mtmv-2hop-null-reject on Apr 20, 2026

@seawinde
Member

What problem does this PR solve?

Issue Number: N/A

Related PR: #30374

Problem Summary:

In multi-hop LEFT JOIN materialized view transparent rewrite (e.g., fact LEFT JOIN dim1 LEFT JOIN dim2), when the query has a WHERE clause that null-rejects only the outermost dimension table (e.g., WHERE dim2.col = 'value'), the MV rewrite fails with "Predicate compensate fail".

Root cause: In AbstractMaterializedViewRule.containsNullRejectSlot(), the original code only checked filter predicates (queryPredicates) for NOT NULL evidence. After the Nereids rewrite pipeline runs:

  1. EliminateOuterJoin converts all eligible LEFT JOINs → INNER (cascading through InferJoinNotNull across multiple passes)
  2. EliminateNotNull unconditionally removes all generated NOT NULL predicates (isGeneratedIsNotNull=true)

By the time MV rewrite (exploration phase) runs, the query plan has INNER JOINs but zero NOT NULL filter predicates. The only surviving predicate is the user's WHERE clause (e.g., dim2.region_name = 'West'), which can only prove NOT NULL for outermost dim2 slots — leaving intermediate dim1 slots uncovered.
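As a rough illustration of why the filter-only check comes up empty, the toy Python model below mimics step 2 of the pipeline. The `Predicate` class and `eliminate_generated_not_null` function are hypothetical names invented for this sketch and are not Doris code; only the `isGeneratedIsNotNull` flag is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """Hypothetical stand-in for a filter predicate in the query plan."""
    text: str
    generated_not_null: bool = False  # models the isGeneratedIsNotNull flag


def eliminate_generated_not_null(predicates):
    """Drop every predicate that the optimizer generated itself."""
    return [p for p in predicates if not p.generated_not_null]


filters = [
    Predicate("r.region_name = 'West'"),                      # written by the user
    Predicate("d.id IS NOT NULL", generated_not_null=True),   # inferred earlier
    Predicate("r.store_id IS NOT NULL", generated_not_null=True),
]

surviving = eliminate_generated_not_null(filters)
print([p.text for p in surviving])
```

In this model only the user's predicate on `r` survives, so nothing at filter level mentions a slot of the intermediate table `d`.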

Fix: Read INNER JoinEdge conditions directly from the query HyperGraph. After EliminateOuterJoin converts LEFT→INNER, JoinEdge objects retain their INNER type and join condition expressions even though EliminateNotNull removes filter-level NOT NULL predicates. ExpressionUtils.inferNotNullSlots() extracts NOT NULL slots from these INNER join conditions, covering all intermediate join tables.
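The idea behind the fix can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `JoinEdge` here is a hypothetical dataclass, and `infer_not_null_slots` is only a stand-in for what `ExpressionUtils.inferNotNullSlots()` is described as doing; it handles plain equality pairs and nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinEdge:
    """Hypothetical stand-in for a HyperGraph join edge."""
    join_type: str      # e.g. "INNER" or "LEFT_OUTER"
    equal_pairs: tuple  # ((left_slot, right_slot), ...)


def infer_not_null_slots(edges):
    """Collect slots that an INNER equi-join condition proves NOT NULL."""
    slots = set()
    for edge in edges:
        if edge.join_type != "INNER":
            # Outer joins can emit null-padded rows, so they prove nothing.
            continue
        for left, right in edge.equal_pairs:
            # A row only survives an INNER equi-join if both sides are non-null.
            slots.update((left, right))
    return slots


edges = [
    JoinEdge("INNER", (("o.store_id", "d.id"),)),
    JoinEdge("INNER", (("d.id", "r.store_id"),)),
]
join_slots = infer_not_null_slots(edges)
print(sorted(join_slots))
```

Because the evidence comes from the join edges rather than from filters, it covers the intermediate table `d` even when no filter mentions it.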

| File | Change Description |
|------|--------------------|
| AbstractMaterializedViewRule.java | containsNullRejectSlot(): Add loop over INNER JoinEdges to collect NOT NULL slots from join conditions via inferNotNullSlots. Also add shuttleExpressionWithLineage for correct slot-level mapping. |
| NullRejectInferenceTest.java (new) | FE unit test: query=2-hop INNER JOIN vs view=2-hop LEFT JOIN, verifies predicatesCompensate succeeds |
| outer_join_two_hop_null_reject.groovy (new) | Regression test: 3 tables, async MV with 2-hop LEFT JOIN + WHERE + aggregate rollup, verifies rewrite success and result correctness |

2-hop example walkthrough:

Query HyperGraph (after EliminateOuterJoin):
  JoinEdge 1 (INNER): o.store_id = d.id    → {o.store_id, d.id} NOT NULL
  JoinEdge 2 (INNER): d.id = r.store_id    → {d.id, r.store_id} NOT NULL
  FilterEdge:         r.region_name = 'West' → {r.region_name} NOT NULL

queryNullRejectSlots = {o.store_id, d.id, r.store_id, r.region_name}

requireNoNullableViewSlot (view has LEFT JOINs):
  Set 1: {d.id, d.store_name} ∩ queryNullRejectSlots → {d.id} ≠ ∅ ✓
  Set 2: {r.store_id, r.region_name} ∩ queryNullRejectSlots → {r.store_id, r.region_name} ≠ ∅ ✓
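
The check in the walkthrough above amounts to requiring a non-empty intersection for every nullable-side slot set of the view. A minimal Python sketch using the same slot names follows; the function name `covers_all_nullable_sides` is made up for illustration and does not exist in Doris.

```python
def covers_all_nullable_sides(view_nullable_slot_sets, query_null_reject_slots):
    """Every nullable side of the view needs at least one null-rejected slot."""
    return all(slot_set & query_null_reject_slots
               for slot_set in view_nullable_slot_sets)


view_sets = [
    {"d.id", "d.store_name"},          # Set 1: nullable side of the first LEFT JOIN
    {"r.store_id", "r.region_name"},   # Set 2: nullable side of the second LEFT JOIN
]

# Filter predicates only: just the user's WHERE clause on r.
filter_only = {"r.region_name"}
# Filter predicates plus slots inferred from the two INNER JoinEdges.
with_join_edges = {"o.store_id", "d.id", "r.store_id", "r.region_name"}

before = covers_all_nullable_sides(view_sets, filter_only)
after = covers_all_nullable_sides(view_sets, with_join_edges)
print(before, after)
```

With filter predicates alone, Set 1 has no overlap and the check fails; adding the join-edge slots makes both sets overlap.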

Release note

Fix multi-hop LEFT JOIN materialized view transparent rewrite failure when the WHERE clause only references the outermost dimension table.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
  • Behavior changed:

    • No.
  • Does this need documentation?

    • No.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@seawinde
Member Author

run buildall

@seawinde seawinde changed the title from "[fix](fe) Infer null-reject from INNER JoinEdge for multi-hop outer join MV rewrite" to "[fix](mtmv) Infer null-reject from INNER JoinEdge for multi-hop outer join MV rewrite" on Apr 14, 2026
…oin MV rewrite

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: In multi-hop LEFT JOIN MV rewrite (e.g.,
fact LEFT JOIN dim1 LEFT JOIN dim2), when the query has a WHERE clause
that null-rejects the outermost table (dim2), EliminateOuterJoin
converts all LEFT JOINs to INNER. However, containsNullRejectSlot only
checked filter predicates for NOT NULL proof, which only covers the
outermost table slots. The intermediate table (dim1) slots had no
NOT NULL evidence, causing "Predicate compensate fail".

The fix reads INNER JoinEdge conditions from the query HyperGraph.
After EliminateOuterJoin converts LEFT→INNER, JoinEdge objects retain
their INNER type and join condition expressions even though
EliminateNotNull removes filter-level NOT NULL predicates.
ExpressionUtils.inferNotNullSlots extracts NOT NULL slots from these
INNER join conditions, covering all intermediate join tables.

### Release note

Fix multi-hop LEFT JOIN materialized view transparent rewrite failure
when WHERE clause only references the outermost dimension table.

### Check List (For Author)

- Test: Unit Test (NullRejectInferenceTest) / Regression test (outer_join_two_hop_null_reject)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@seawinde seawinde force-pushed the fix-mtmv-2hop-null-reject branch from f2f6c8a to 488c34d Compare April 14, 2026 14:45
@seawinde
Member Author

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (20/20) 🎉
Increment coverage report
Complete coverage report

seawinde and others added 11 commits April 15, 2026 14:44
…ll-reject inference

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: The fix in containsNullRejectSlot that infers null-rejection
from INNER JoinEdge conditions enables 4 new MV rewrite success cases in the
dimension_2_left_join test:

- (i=0,j=8) and (i=2,j=8): View LEFT_OUTER(lineitem,orders) with query
  "orders LEFT JOIN lineitem WHERE l_shipdate" -> INNER. Previously, filter
  only null-rejected lineitem, missing orders. Now INNER JOIN condition
  l_orderkey=o_orderkey proves o_orderkey IS NOT NULL, null-rejecting orders.

- (i=7,j=3) and (i=9,j=3): View LEFT_OUTER(orders,lineitem) with query
  "lineitem LEFT JOIN orders WHERE o_orderdate" -> INNER. Previously, filter
  only null-rejected orders, missing lineitem. Now INNER JOIN condition
  proves l_orderkey IS NOT NULL, null-rejecting lineitem.

### Release note

None

### Check List (For Author)

- Test: Regression test (dimension_2_left_join expectations updated)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ctations

### What problem does this PR solve?

Problem Summary: INNER JoinEdge null-reject inference now allows more MV rewrites to succeed.
Update success lists: i=0,2 add j=8; i=7,9 add j=3.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tions

### What problem does this PR solve?

Problem Summary: INNER JoinEdge null-reject inference allows INNER query to rewrite with LEFT/RIGHT/FULL MV.
Update join_type else block: add success when j=1 (INNER) and i in [0,2,3] (LEFT/RIGHT/FULL MV).

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: Same pattern as dimension_self_conn - INNER query now succeeds with OUTER MV.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ow succeed

### What problem does this PR solve?

Problem Summary: All 14 INNER queries now succeed against LEFT/RIGHT/FULL OUTER MVs.
INNER JoinEdge provides null-reject on both sides, resolving all type mismatches.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…,6,8,11]->[1,6]

### What problem does this PR solve?

Problem Summary: WHERE clause null-rejects nullable side, converting LEFT->INNER.
INNER JoinEdge then provides null-reject for FULL MV type mismatch resolution.
Remaining failures j=1,6: couldNotPulledUp predicate mismatch.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…5,7,9,10]->[0,7]

### What problem does this PR solve?

Problem Summary: Mirror of left_join_infer_and_derive changes.
Remaining failures j=0,7: couldNotPulledUp predicate mismatch.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: i=2 add j=7 success; i=8 add j=3 success.
INNER JoinEdge null-reject resolves LEFT MV type mismatch.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: i=3 add j=8 success; i=7 add j=2 success.
Mirror of left_join_filter changes.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… gain success

### What problem does this PR solve?

Problem Summary: Star topology (lineitem center -> orders, partsupp). When query has INNER JOIN
on an edge where MV has LEFT JOIN, INNER JoinEdge null-reject now resolves the type mismatch.
All 8 LEFT MV sections updated with additional successful query indices.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… gain success

### What problem does this PR solve?

Problem Summary: Line topology (orders->lineitem->partsupp). Section 6 and 16 LEFT MVs:
after partsupp elimination, orders-lineitem INNER vs LEFT mismatch resolved by null-reject.
j=2,4,6 added to success lists.

### Release note
None

### Check List (For Author)
- Test: Regression test expectation update
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@seawinde
Member Author

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (20/20) 🎉
Increment coverage report
Complete coverage report

@morrySnow
Contributor

/review

Contributor

@github-actions github-actions Bot left a comment

Findings:

  1. Blocking: containsNullRejectSlot() now harvests null-reject evidence from JoinEdge, but only for INNER_JOIN. This rule already supports ASOF_LEFT/RIGHT_{INNER,OUTER}_JOIN, EliminateOuterJoin can convert ASOF outer joins to ASOF inner joins, and HyperGraphComparator already models that inference. As written, the same root cause remains for the parallel ASOF rewrite path.
  2. Test gap: the new end-to-end regression uses mv_rewrite_success_without_check_chosen(...), so it can pass even when the memo plan says .mv_name not chose and both executions read the base tables. That means the only new regression case does not prove the fixed rewrite is actually selected.
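
To make finding 1 concrete, the Python sketch below contrasts a join-type check that accepts only `INNER_JOIN` with one that also accepts the ASOF inner variants. The join-type strings are expanded from the `ASOF_LEFT/RIGHT_{INNER,OUTER}_JOIN` shorthand above; the helper `edges_used_for_null_reject` is hypothetical, and whether widening the check is the right change is the reviewer's claim, not something this sketch verifies.

```python
# Accepted join types for null-reject inference (illustrative sets only).
NARROW = {"INNER_JOIN"}
WIDENED = NARROW | {"ASOF_LEFT_INNER_JOIN", "ASOF_RIGHT_INNER_JOIN"}


def edges_used_for_null_reject(join_types, accepted):
    """Return the join edges whose conditions would be harvested."""
    return [t for t in join_types if t in accepted]


query_edges = ["INNER_JOIN", "ASOF_LEFT_INNER_JOIN", "ASOF_RIGHT_OUTER_JOIN"]

narrow = edges_used_for_null_reject(query_edges, NARROW)
widened = edges_used_for_null_reject(query_edges, WIDENED)
print(narrow)
print(widened)
```

Outer variants stay excluded in both sets, since an outer join can still produce null-padded rows.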

Checkpoint conclusions:

  • Goal of the task: Fix multi-hop outer-join MV rewrite null-reject inference after join elimination. The ordinary INNER path is addressed, but the supported ASOF path is still missing, so the goal is only partially met.
  • Scope/focus: The code change is small and focused on MV null-reject inference plus tests.
  • Concurrency/locking: No new concurrency, locking, or lifecycle risks were introduced here.
  • Config/compatibility/persistence/data writes: None involved.
  • Parallel code paths: Not fully covered; ASOF join inference is the missing sibling path.
  • Special conditions: The new join-type check is too narrow (isInnerJoin() only).
  • Test coverage: There is positive coverage for the ordinary INNER case, but no ASOF coverage, and the new regression does not assert that the MV is actually chosen.
  • Observability/performance: No obvious new observability or performance concerns in this path.

I did not run FE unit/regression suites in this review environment.

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Apr 20, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morrySnow morrySnow merged commit 673b028 into apache:master Apr 20, 2026
50 checks passed
github-actions Bot pushed a commit that referenced this pull request Apr 20, 2026
… join MV rewrite (#62492)

github-actions Bot pushed a commit that referenced this pull request Apr 20, 2026
… join MV rewrite (#62492)

seawinde pushed a commit to seawinde/doris that referenced this pull request May 15, 2026
### What problem does this PR solve?

Issue Number: N/A

Related PR: apache#62492

Problem Summary:

INNER JoinEdge null-reject inference can validate rewriting an INNER JOIN query by an OUTER JOIN materialized view without adding the required non-null compensation predicate. The rewritten plan can keep null-padded rows from the MV side that should be rejected by the original query.

Root cause: In AbstractMaterializedViewRule.predicatesCompensate(), the previous check treated INNER JoinEdge null-reject inference as proof that an OUTER JOIN MV rewrite was valid, but the proof was not materialized as a real IS NOT NULL predicate in the rewritten query.

Change Summary:

| File | Change Description |
|------|--------------------|
| AbstractMaterializedViewRule.java | Split predicate-based null-reject proof from INNER JoinEdge proof and add query-based IS NOT NULL compensation when only JoinEdge proof covers required MV nullable sides. Fail rewrite if no safe MV output slot can carry the compensation predicate. |
| NullRejectInferenceTest.java | Add unit coverage for LEFT/FULL OUTER JOIN MV rewrites that require INNER JoinEdge null-reject compensation on both sides. |
| inner_join_null_reject_compensation.groovy | Add regression coverage with unmatched OUTER JOIN MV rows, including the LEFT JOIN MV to INNER JOIN query repro with nullable join keys. |

Design rationale: Existing query predicates already flow through normal predicate compensation, so they do not need extra filters. INNER JoinEdge proof is only logical evidence; when it is needed to reject null-generated MV rows, the rewrite must add a real IS NOT NULL predicate on an MV output slot. If no such slot is available, the rewrite is rejected conservatively.
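
The effect described above can be reproduced with a small Python model. Everything in it is illustrative: the two tables, their rows, and the helper functions are invented for this sketch, and the choice to filter on the MV-side `id` column is an assumption about which output slot carries the compensation predicate.

```python
orders = [(1, 10), (2, None), (3, 30)]  # (order_id, store_id); store_id is nullable
stores = [(10, "North")]                # (id, store_name)


def matches(order, store):
    # SQL equality is never true when either side is NULL.
    return order[1] is not None and order[1] == store[0]


def inner_join(left, right):
    return [o + s for o in left for s in right if matches(o, s)]


def left_join(left, right):
    rows = []
    for o in left:
        hits = [o + s for s in right if matches(o, s)]
        rows.extend(hits or [o + (None, None)])  # null-pad unmatched left rows
    return rows


query_rows = inner_join(orders, stores)  # what the INNER JOIN query must return
mv_rows = left_join(orders, stores)      # what a LEFT JOIN MV would store

# Compensation: a real IS NOT NULL filter on a slot from the MV's nullable side.
compensated = [row for row in mv_rows if row[2] is not None]
print(len(mv_rows), compensated == query_rows)
```

Reading the MV without the filter returns the two null-padded rows as well. Note that filtering on `store_id` from the preserved side would not be enough here, because order 3 has a non-null key that simply finds no match.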

### Release note

Fixed an issue where OUTER JOIN materialized view rewrite could return extra null-padded rows for INNER JOIN queries.

### Check List (For Author)

- Test

    - [x] Regression test

    - [x] Unit Test

    - [ ] Manual test (add detailed scripts or steps below)

    - [ ] No need to test or manual test. Explain why:

        - [ ] This is a refactor/code format and no logic has been changed.

        - [ ] Previous test can cover this change.

        - [ ] No code files have been changed.

        - [ ] Other reason

Unit tests / checks:

- Added NullRejectInferenceTest coverage for INNER/FULL join null-reject compensation on both sides.

- Ran git diff --check.

- Tried ./run-fe-ut.sh --run org.apache.doris.nereids.rules.exploration.mv.NullRejectInferenceTest, but FE core compilation failed before the tests ran because the generated cloud proto classes are missing Cloud.CreateMetaSyncPointRequest/Response in MetaServiceClient and MetaServiceProxy.

Regression test:

- Added inner_join_null_reject_compensation.groovy for FULL/LEFT OUTER JOIN MV rewrites with unmatched null-padded rows.

- Not run locally; the local FE UT build is currently blocked by the cloud proto compilation issue above.

- Behavior changed:

    - [x] Yes. OUTER JOIN MV rewrite now adds real IS NOT NULL compensation when INNER JoinEdge null-reject inference is required, or rejects the rewrite if no safe MV output slot can carry that predicate.

    - [ ] No.

- Does this need documentation?

    - [x] No.

    - [ ] Yes.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note

- [ ] Confirm test cases

- [ ] Confirm document

- [ ] Add branch pick label

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
morrySnow pushed a commit that referenced this pull request May 18, 2026
Co-authored-by: seawinde <seawinde@selectdb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.x, dev/4.1.x, reviewed
