[Fix](nereids) Fix nullable date literal binding and datediff folding#64127
[Fix](nereids) Fix nullable date literal binding and datediff folding#64127linrrzqqq wants to merge 2 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one blocking correctness issue in FE constant folding for days_diff. The new implementation diverges from BE/runtime semantics for DateTimeV2 cases that cross a date boundary but are less than a full day apart.
Critical checkpoint conclusions:
- Goal/test coverage: The PR addresses nullable date literal binding and datediff folding, and adds regression coverage, but the new tests do not cover partial-day
days_diffboundaries where folding must match runtime. - Scope/clarity: The change is mostly focused, but the
days_difffolding change reusesto_dayssemantics where runtime uses full-day rounded-to-zero semantics. - Concurrency/lifecycle/config/compatibility/persistence: Not applicable; this is FE expression folding and tests only.
- Parallel paths: BE runtime
days_diffremains based on rounded full-day differences, so FE folding must stay aligned with it. - Tests/results: Added results are deterministic, but missing the boundary case that exposes this regression.
- Observability/performance: Not applicable; no new runtime observability or hot-path performance concern found.
- User focus: No additional user-provided focus points were supplied.
Please fix the folding implementation and add a regression case such as days_diff('2021-01-02 00:00:00', '2021-01-01 23:59:59'), which should fold to the same value as runtime evaluation.
| @@ -1177,29 +1207,43 @@ public static Expression hoursDiff(DateV2Literal t1, DateV2Literal t2) { | |||
There was a problem hiding this comment.
This changes days_diff folding for DateTimeV2 from whole-day duration semantics to calendar-day semantics, which no longer matches BE runtime. BE evaluates days_diff via datetime_diff<DAY> and date_diff_in_days_round_to_zero_by_time, so days_diff('2021-01-02 00:00:00', '2021-01-01 23:59:59') returns 0 at runtime because the difference is less than one full day. With this new toDays(t1) - toDays(t2) folding it returns 1, so enabling constant folding changes query results. The same issue applies to the overloads below where either argument is DateTimeV2Literal; only the pure DateV2 case is safe to compute from day numbers.
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29344 ms |
TPC-DS: Total hot run time: 169204 ms |
4f8307f to
d00c6ad
Compare
d00c6ad to
60668cb
Compare
|
run buildall |
|
/review |
There was a problem hiding this comment.
Review summary:
No additional blocking issue found in this pass. I also considered the existing inline thread about days_diff DateTimeV2 folding and did not submit a duplicate comment.
Critical checkpoint conclusions:
- Goal/test coverage: The PR targets FE constant folding and nullable literal signature binding for date/time diff functions. The added regression cases cover nullable string literals, timestamp-tz literals, time zone changes, and fold-constant checks.
- Scope/clarity: The change is focused on FE Nereids expression folding/signature matching plus the corresponding regression outputs.
- Concurrency/lifecycle/config: No new concurrency, lifecycle-sensitive static initialization, or configuration item is introduced.
- Compatibility/protocol: No FE-BE protocol, storage format, or persisted metadata change is involved.
- Parallel paths: I compared the FE folding paths with BE FunctionTimeDiff behavior for date/datetime/timestamptz. The current helper logic matches BE round-toward-zero behavior for day/week/hour/minute/second/millisecond/microsecond paths, and timestamp-tz overloads use the UTC-normalized literal representation.
- Error handling/data correctness: No unchecked Status-like path applies in this FE Java code. I did not find a distinct constant-folding result mismatch beyond the already-known review context.
- Performance/observability: The added work is constant-time literal evaluation only; no new observability appears necessary.
- Tests: Regression tests are added in nereids_function_p0/scalar_function/D.groovy with output updates in D.out. I did not run the regression suite in this Actions review pass.
Focus points: The focus file says there were no additional user-provided review focus points.
TPC-H: Total hot run time: 29271 ms |
TPC-DS: Total hot run time: 169382 ms |
FE Regression Coverage ReportIncrement line coverage |
| } | ||
|
|
||
| private static int dateDiff(LocalDateTime date1, LocalDateTime date2) { | ||
| return ((int) ChronoUnit.DAYS.between(date2.toLocalDate(), date1.toLocalDate())); |
There was a problem hiding this comment.
ChronoUnit 认为0000 年 2 月是闰年,但是我们内部的行为应该是认为0000年是平年, 所以常量折叠如果覆盖到了 0000.2, 就会多一天出来
| if (!argument.isNullLiteral() && argument.isLiteral() && realType.isStringLikeType()) { | ||
| realType = TypeCoercionUtils.characterLiteralTypeCoercion(((Literal) argument).getStringValue(), | ||
| Optional<Literal> literalAfterUnwrapNullable = ExpressionUtils.getLiteralAfterUnwrapNullable(argument); | ||
| if (!argument.isNullLiteral() && literalAfterUnwrapNullable.isPresent() && realType.isStringLikeType()) { |
|
run buildall |
TPC-H: Total hot run time: 29364 ms |
TPC-DS: Total hot run time: 170016 ms |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
Problem Summary:
nullable()/non_nullable()when checking whether an argument is a literal duringsignature search, so non-TIMESTAMPTZ date/time literals keep DATE/DATETIMEV2 semantics. Timezone-bearing literals can still bind to TIMESTAMPTZ.before:
now:
0000in theDATE_DIFFconstant folding.before:
now: