branch-4.1 [opt](fe) Bound not-null inference cost #63318 #63914
englefly wants to merge 1 commit into
I found one optimizer regression in the new not-null inference budget logic. Existing inline review context was empty, and there were no additional user focus points.
Critical checkpoints:
- AGENTS.md Nereids expression checkpoint: not applicable; this PR does not add or change overloaded checkLegalityBeforeTypeCoercion methods.
- Correctness: needs attention because cheap predicates can now be skipped based on unrelated predicate iteration order.
- Tests: I attempted to run the focused FE tests with ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.InferAggNotNullTest,org.apache.doris.nereids.rules.rewrite.InferJoinNotNullTest,org.apache.doris.nereids.rules.rewrite.InferFilterNotNullTest,org.apache.doris.nereids.rules.rewrite.EliminateNotNullTest, but the runner failed during generated-code setup because thirdparty/installed/bin/protoc is missing.
User focus response: no additional user-provided review focus was supplied.
    mergedInputSlots.addAll(predicateInputSlots);
    if (mergedInputSlots.size() > MAX_INFER_NOT_NULL_INPUT_SLOTS) {
        return Optional.empty();
    }
This global slot budget makes inference depend on the iteration order of the predicate collection. If an earlier cheap predicate uses 32 input slots, a later simple predicate like x = 1 reaches this branch and is skipped only because x would be the 33rd accumulated slot, so InferFilterNotNull will not infer x IS NOT NULL and EliminateNotNull will not remove an explicit redundant x IS NOT NULL. Several callers pass sets, so two equivalent predicate sets can get different not-null inference depending on iteration order. The budget should be applied per predicate, or the cumulative cap should not prevent independently cheap predicates from being considered.
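The order dependence described above can be reproduced with a small standalone model. This is only a sketch, not the Doris code: predicates are reduced to a name plus a set of input slot names, the class `SlotBudgetSketch` and its helper methods are hypothetical, and the cap value of 32 is assumed from the example in this comment rather than read from the PR. `cumulativeBudget` mimics the merge-then-check shape of the diff, while `perPredicateBudget` shows one possible form of the suggested alternative.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SlotBudgetSketch {
    // Name mirrors the diff; the value 32 is an assumption taken from the review example.
    static final int MAX_INFER_NOT_NULL_INPUT_SLOTS = 32;

    // Cumulative cap: slots of all predicates seen so far are merged before the check,
    // so whether a predicate is considered depends on what was iterated before it.
    static List<String> cumulativeBudget(Map<String, Set<String>> predicates) {
        Set<String> mergedInputSlots = new HashSet<>();
        List<String> considered = new ArrayList<>();
        for (Map.Entry<String, Set<String>> predicate : predicates.entrySet()) {
            mergedInputSlots.addAll(predicate.getValue());
            if (mergedInputSlots.size() > MAX_INFER_NOT_NULL_INPUT_SLOTS) {
                continue; // stands in for the Optional.empty() branch in the diff
            }
            considered.add(predicate.getKey());
        }
        return considered;
    }

    // Per-predicate cap: each predicate is judged only by its own slot count.
    static List<String> perPredicateBudget(Map<String, Set<String>> predicates) {
        List<String> considered = new ArrayList<>();
        for (Map.Entry<String, Set<String>> predicate : predicates.entrySet()) {
            if (predicate.getValue().size() <= MAX_INFER_NOT_NULL_INPUT_SLOTS) {
                considered.add(predicate.getKey());
            }
        }
        return considered;
    }

    // Builds the same two predicates in either iteration order (hypothetical test data).
    static Map<String, Set<String>> predicates(boolean wideFirst) {
        Set<String> wideSlots = new HashSet<>();
        for (int i = 0; i < 32; i++) {
            wideSlots.add("s" + i); // a predicate that uses exactly 32 input slots
        }
        Set<String> narrowSlots = new HashSet<>();
        narrowSlots.add("x");

        Map<String, Set<String>> ordered = new LinkedHashMap<>();
        if (wideFirst) {
            ordered.put("wide", wideSlots);
            ordered.put("x = 1", narrowSlots);
        } else {
            ordered.put("x = 1", narrowSlots);
            ordered.put("wide", wideSlots);
        }
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(cumulativeBudget(predicates(true)));    // [wide]
        System.out.println(cumulativeBudget(predicates(false)));   // [x = 1]
        System.out.println(perPredicateBudget(predicates(true)));  // [wide, x = 1]
        System.out.println(perPredicateBudget(predicates(false))); // [x = 1, wide]
    }
}
```

In this model the cumulative cap considers a different predicate depending on which one is iterated first, whereas the per-predicate cap considers both in either order. Whether a per-predicate cap still bounds the total inference cost well enough for the PR's goal is a separate question that this sketch does not address.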
pick #63318