[fix](datetime) Replace legacy from_date_str with cast function#61682
[fix](datetime) Replace legacy from_date_str with cast function#61682zclllyybb wants to merge 3 commits intoapache:masterfrom
from_date_str with cast function#61682Conversation
and split individual `CastToTimestampTz` class
|
run buildall |
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
Code Review Summary
This PR replaces legacy from_date_str member functions on VecDateTimeValue and DateV2Value<T> with static cast function calls (CastToDateOrDatetime, CastToDateV2, CastToDatetimeV2, CastToTimestampTz). It also refactors CastToDatetimeV2 internals to separate public API from private implementation details via _internal suffix methods, and extracts a new CastToTimestampTz wrapper in cast_to_timestamptz_impl.hpp.
Critical Checkpoint Conclusions
Goal & correctness: The PR achieves its goal of removing from_date_str by routing all callers through the unified cast functions. All 38 files are consistent in the replacement pattern. The from_date_str methods are fully removed with no remaining callers.
Focused & minimal: The change is relatively large (38 files, +572/-260) but mechanical in nature. Each call site follows a consistent pattern.
Concurrency: No concurrency issues introduced. The partition_transformers.h non-atomic static bool initialized pattern is pre-existing and not introduced by this PR.
Lifecycle / static initialization: No new SIOF risks. The CastToTimestampTz forward declaration in cast_to_datetimev2_impl.hpp (line 85) and subsequent definition in cast_to_timestamptz_impl.hpp are well-structured.
Configuration items: None added.
Incompatible changes: The typo fix CastToTimstampTz → CastToTimestampTz is a symbol rename. Since the old symbol was only used internally and the new symbol is defined identically, this is safe.
Parallel code paths: All date type variants (DATE, DATETIME, DATEV2, DATETIMEV2, TIMESTAMPTZ) are handled. The ES scroll parser and runtime filter wrapper correctly dispatch to the appropriate cast function based on type.
Test coverage: Tests are updated consistently across 12 test files. No new tests added, but existing tests verify the new paths.
Observability: No observability changes needed for this refactoring.
Performance: No performance regression expected. The new code paths are functionally equivalent. Moving implementations from headers to .cpp files may slightly improve compilation time.
Issues Found
-
[Minor] Whitespace issue in
runtime_state.cpp— extra leading space on_nano_seconds = 0;line. -
[Design concern] ODR risk — Two different
struct CastToTimestampTzdefinitions exist in the codebase (cast_to_timestamptz.handcast_to_timestamptz_impl.hpp). They have the same name in the same namespace but different member functions and parameter types. While no translation unit currently includes both, this is a latent ODR violation that could cause undefined behavior if any future include chain pulls in both. Consider renaming the impl version (e.g.,CastToTimestampTzImpl) to avoid confusion and prevent accidental ODR violations.
TPC-H: Total hot run time: 26945 ms |
TPC-DS: Total hot run time: 169719 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
…pes in cast functions Replace `IS_STRICT` and `IS_DATETIME` boolean template parameters with strongly-typed enums `DatelikeParseMode` and `DatelikeTargetType` across all datetime cast operations. This improves code readability and type safety by making template parameters self-documenting.
There was a problem hiding this comment.
Code Review Summary
This is a well-executed refactoring PR that replaces legacy from_date_str methods with strongly-typed cast functions and replaces boolean template parameters (IS_STRICT, IS_DATETIME) with DatelikeParseMode and DatelikeTargetType enums. The new CastToTimestampTz class is cleanly factored out. Overall the changes are mechanical and correct, with one inconsistency noted below.
Critical Checkpoints (Part 1.3)
- Goal / correctness: The goal is to replace legacy
from_date_strwith cast functions and use strongly-typed enums. The code accomplishes this goal. Existing regression tests cover basic paths but no new dedicated tests for the enum-based API were added. - Minimal and focused: Yes. The PR is large (46 files) but purely mechanical — each change follows a consistent pattern.
- Concurrency: Not applicable. No concurrent data structures or threading changes.
- Lifecycle / static init: Not applicable. No new static variables or lifecycle changes.
- Configuration: Not applicable. No new config items.
- Incompatible changes: No. This is an internal API refactoring with no user-visible or wire-format changes.
- Parallel code paths: One inconsistency found (see inline comment) —
from_decimal_batchindata_type_date_or_datetime_serde.cppusesSTRICTwhere datev2/datetimev2 serdes useNON_STRICT. This preserves the old behavior (the old code also used<true>in the non-strict path) but is inconsistent with the v2 serde files where the PR corrected this. - Special conditional checks: The ternary pattern
IsDatetime ? DatelikeTargetType::DATE_TIME : DatelikeTargetType::DATEused in templates is correct and consistent. - Test coverage: Existing tests are mechanically adapted. No new test cases specifically for the enum-based API, but the refactoring is behavior-preserving so existing coverage is adequate.
- Observability: Not applicable. No new features requiring observability.
- Transaction / persistence: Not applicable.
- Data writes: Not applicable.
- FE-BE variables: Not applicable.
- Performance: The
from_floatforCastToDateOrDatetimenow takes an additionalto_scaleparameter that is explicitly discarded viastatic_cast<void>(to_scale)— no performance impact. Thefrom_decimal_batchNON_STRICT change for v2 types is actually a minor performance improvement (avoids wasteful error-status writes). - Other issues: The typo fix
CastToTimstampTz→CastToTimestampTzis a nice cleanup. Themutable_utc_dt()accessor added toTimestampTzValueis appropriate for theCastToTimestampTzintegration.
be/src/core/data_type_serde/data_type_date_or_datetime_serde.cpp
Outdated
Show resolved
Hide resolved
8ec75dc to
35ecbcb
Compare
…HECK guards for strict mode consistency
35ecbcb to
a4a5960
Compare
|
/review |
Code Review — No Blocking Issues FoundReviewed the full diff (47 files, +1349/-425). This is a well-executed mechanical refactoring with several embedded bug fixes. Critical Checkpoint Conclusions1. Goal & Correctness: The PR replaces legacy 2. Modification Scope: Focused on the enum migration and API consolidation. Also fixes several pre-existing bugs:
3. Concurrency: N/A — all changes are in stateless static functions. 4. Lifecycle Management: N/A. 5. Configuration Items: None added. 6. Incompatible Changes: 7. Parallel Code Paths: Enum migration applied uniformly across all datelike types (Date, DateTime, DateV2, DateTimeV2, TimeV2, TimestampTz). No paths missed. 8. Special Conditional Checks: 9. Test Coverage: New 10. Observability: N/A for this refactoring. 11. Performance: Zero runtime cost — enum-to-bool conversions are 12. Other Observations:
LGTM. |
|
run buildall |
TPC-H: Total hot run time: 26914 ms |
TPC-DS: Total hot run time: 169177 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
and split individual
CastToTimestampTzclassand Replace
IS_STRICTandIS_DATETIMEboolean template parameters with strongly-typed enumsDatelikeParseModeandDatelikeTargetTypeacross all datetime cast operations. This improves code readability and type safety by making template parameters self-documenting.