[fix](mv) unblock extra-join elimination in mv rewrite and invalidate#61176
[fix](mv) unblock extra-join elimination in mv rewrite and invalidate#61176foxtail463 wants to merge 1 commit intoapache:masterfrom
Conversation
dependent mtmv caches on constraint changes MV rewrite could fail when the query used fewer joins than the MV If the eliminable extra join node carried a filter edge (system delete-sign filter or an ON-clause lifted filter), rewrite was rejected early with graph-node mismatch before semantic elimination checks. Remove that hard precheck so join elimination is decided by canEliminateViewEdge semantics. Also clear caches of MTMVs that depend on the altered base table when ADD/DROP CONSTRAINT runs, and fail the DDL if cache invalidation fails. This ensures newly added or removed unique/fk traits are immediately visible to rewrite and avoids stale StructInfo/DataTrait decisions.
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
Code Review Summary
This PR has two logical changes: (1) removing a premature filter-edge rejection in HyperGraphComparator.tryEliminateNodesAndEdge() to unblock MV rewrite when the MV has extra eliminable joins with filter edges, and (2) invalidating MTMV plan caches when ADD/DROP CONSTRAINT DDL runs. The typo fix (rigthPlan -> rightPlan) is a nice cleanup.
Critical Checkpoint Conclusions
Goal & Correctness: The HyperGraphComparator change is sound. The removed check was an over-conservative early rejection. Filter edges on eliminated nodes are already correctly handled downstream: isLogicCompatible() (lines 163-166) filters out view edges whose referenceNodes overlap shouldEliminateViewNodesMap, and the two final checks in tryEliminateNodesAndEdge() (containAll + getCardinality == 0) ensure completeness. The cache invalidation intent is correct but has a gap for foreign key constraints (see issue below).
Modification minimality: Yes, the changes are focused and minimal.
Concurrency: MTMV.clearCache() correctly acquires the write lock on mvRwLock, consistent with how getOrGenerateCache() and addTaskResult() access the cache fields under the same lock. No concurrency issue.
Lifecycle: No special lifecycle concerns. Cache fields are non-persistent and rebuilt lazily.
Configuration items: None added.
Incompatible changes: None.
Parallel code paths: Issue found. For foreign key constraints, addForeignConstraint() also mutates the referenced (primary) table via primaryKeyConstraint.addForeignTable(this), and dropConstraint() -> dropConstraintRef() mutates it via primaryKeyConstraint.removeForeignTable(...). MTMVs depending on the referenced table should also have their caches cleared. See inline comments.
Special conditional checks: The removed filter-edge check had a comment saying "Right now, just reject it" — this was a known TODO/limitation, and removing it is the correct resolution.
Test coverage: No tests are included in this PR. The PR description's checklist has no test box checked. There should be at least a regression test covering the motivating scenario (MV with extra LEFT JOIN + filter edge, queried with fewer joins), and a test verifying that ADD/DROP CONSTRAINT invalidates MTMV caches.
Observability: No concerns.
EditLog/persistence: Cache invalidation is in-memory only (not persisted), which is correct — the caches are transient. On follower FEs, caches are rebuilt independently. No EditLog changes needed.
Transaction/data writes: Not applicable.
FE-BE variable passing: Not applicable.
Performance: clearDependentMtmvCaches iterates over dependent MTMVs (typically few) and nulls out in-memory cache references — negligible overhead. No concerns.
Other issues: The clearDependentMtmvCaches call is placed after lock release in both commands, and wrapping exceptions into DdlException will fail the DDL command even though the constraint change has already been persisted via EditLog. This means the constraint is committed but the user sees an error. Consider making cache clearing best-effort (log and continue) rather than failing the DDL. See inline comment.
| import org.apache.doris.common.Pair; | ||
| import org.apache.doris.common.util.MetaLockUtils; | ||
| import org.apache.doris.mtmv.MTMVUtil; | ||
| import org.apache.doris.nereids.NereidsPlanner; |
There was a problem hiding this comment.
Issue [Medium]: Missing cache invalidation for the referenced table in FK constraints.
For foreign key constraints, addForeignConstraint() also mutates the referenced (primary) table's PrimaryKeyConstraint via primaryKeyConstraint.addForeignTable(this). MTMVs that depend on the referenced table may use FK metadata for rewrite decisions (e.g., PK-FK join elimination). Their caches should also be invalidated.
Suggested fix:
MTMVUtil.clearDependentMtmvCaches(columnsAndTable.second);
if (referencedColumnsAndTable != null) {
MTMVUtil.clearDependentMtmvCaches(referencedColumnsAndTable.second);
}| import org.apache.doris.catalog.constraint.ForeignKeyConstraint; | ||
| import org.apache.doris.catalog.constraint.PrimaryKeyConstraint; | ||
| import org.apache.doris.mtmv.MTMVUtil; | ||
| import org.apache.doris.nereids.NereidsPlanner; |
There was a problem hiding this comment.
Issue [Medium]: Missing cache invalidation for the referenced table when dropping FK constraints.
When dropping a FK constraint, dropConstraintRef() also modifies the referenced (primary) table via primaryKeyConstraint.removeForeignTable(...). Similarly, when dropping a PK constraint, the FK tables' constraints may be affected. MTMVs depending on those related tables should also have their caches cleared.
Note: getConstraintRelatedTables() (lines 62-70) already exists and identifies these related tables but is never called. It should be used here. The constraint type must be captured before the drop (since dropConstraint removes it from the map).
Suggested approach:
// Before dropping, under the read lock or at start of write lock block:
Constraint constraint = table.getConstraintsMapUnsafe().get(name);
List<TableIf> relatedTables = getConstraintRelatedTables(constraint);
// After dropping and unlocking:
MTMVUtil.clearDependentMtmvCaches(table);
for (TableIf related : relatedTables) {
MTMVUtil.clearDependentMtmvCaches(related);
}| mtmv.clearCache(); | ||
| } | ||
| } catch (Exception e) { | ||
| throw new DdlException(String.format( |
There was a problem hiding this comment.
Issue [Low]: Consider making cache clearing best-effort rather than failing the DDL.
This method is called after the constraint has already been persisted via EditLog. If clearDependentMtmvCaches throws (e.g., MTMV was concurrently dropped, metadata lookup fails), the user sees a DdlException even though the constraint change is already committed and will be replayed on followers. This is confusing — the DDL succeeded but appears to have failed.
Consider logging the error and continuing instead:
} catch (Exception e) {
LOG.warn("failed to clear dependent mtmv caches for base table {}", tableInfo, e);
}The stale cache will be rebuilt on the next MTMV refresh or FE restart anyway, so a transient failure here is harmless.
dependent mtmv caches on constraint changes
MV rewrite could fail when the query used fewer joins than the MV If the eliminable extra join node carried a filter edge (system delete-sign filter or an ON-clause lifted filter), rewrite was rejected early with graph-node mismatch before semantic elimination checks. Remove that hard precheck so join elimination is decided by canEliminateViewEdge semantics.
Also clear caches of MTMVs that depend on the altered base table when ADD/DROP CONSTRAINT runs, and fail the DDL if cache invalidation fails. This ensures newly added or removed unique/fk traits are immediately visible to rewrite and avoids stale StructInfo/DataTrait decisions.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
create database if not exists test;
use test;
drop table if exists t1;
CREATE TABLE t1 (
id INT,
name VARCHAR(50)
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES("replication_num" = "1");
drop table if exists t2;
CREATE TABLE t2 (
id INT,
value INT
)
UNIQUE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES(
"replication_num" = "1",
"enable_unique_key_merge_on_write" = "true"
);
ALTER TABLE t2 ADD CONSTRAINT uk_id UNIQUE (id);
INSERT INTO t1 (id, name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie');
INSERT INTO t2 (id, value) VALUES
(1, 10), -- 满足 t1.id = t2.id 且 id > 0
(2, -5), -- 满足 t1.id = t2.id,且 id > 0
(4, 20); -- t2 独有的数据,t1 中没有
DROP MATERIALIZED VIEW IF EXISTS mv_join_elimination;
CREATE MATERIALIZED VIEW mv_join_elimination
BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES ('replication_num' = '1')
AS
SELECT
t1.id,
t1.name
FROM t1
LEFT JOIN t2
ON t1.id = t2.id AND t2.id = 1;
explain SELECT id, name FROM t1;
Release note
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)