DX-105463: [C++][Gandiva] Add TimestampIR for unit-aware timestamp[us/ns] support #137
Merged
lriggs merged 7 commits into dremio:dremio_27.0_23_19 on Apr 24, 2026
…/ns] support Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tor, and test files
…o#135)
next_day(timestamp, utf8) was registered in the C++ function registry for timestamp inputs but was not included in any TimestampIR table, so no _us/_ns IR wrappers were generated. With the relaxed DataTypeEquals (ignoring TimeUnit), calls like next_day(timestamp[us], 'MO') pass validation and are routed to Gandiva, but BuildFunctionCall silently falls through to the precompiled millis function, which interprets microseconds as milliseconds and produces dates ~51,000 years in the future (e.g. +53425-02-28 for a 2021 input).
This adds a BuildNextDayWrapper that scales the timestamp input to millis via FloorDiv and calls the precompiled next_day_from_timestamp(context, millis, day_str, day_len) function. The pattern follows the cast-from-timestamp wrapper shape with the additional (context, string, length) args.
No remainder recombination is needed: next_day returns date64 (midnight of the next weekday), so sub-millisecond input precision is not meaningful in the result.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
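To make the failure mode in this commit message concrete, here is a minimal plain C++ sketch of the scaling step (not the LLVM IR the wrapper actually emits). The names `MicrosToMillis` and `ApproxYear` are hypothetical helpers invented for illustration, and the year estimate is deliberately rough.

```cpp
#include <cstdint>

// Floor division: rounds toward negative infinity, unlike C++ '/',
// which truncates toward zero. Assumes divisor > 0.
inline int64_t FloorDiv(int64_t value, int64_t divisor) {
  return value / divisor - ((value % divisor < 0) ? 1 : 0);
}

// Hypothetical helper: scale a timestamp[us] value to millis before it is
// handed to a function that only understands millis.
inline int64_t MicrosToMillis(int64_t micros) {
  return FloorDiv(micros, 1000);
}

// Hypothetical helper: rough calendar year for a millis-since-epoch value,
// using an average year of 365.2425 days. For illustration only.
inline int64_t ApproxYear(int64_t millis) {
  const int64_t kMillisPerAvgYear = 31556952000LL;  // 365.2425 * 86400 * 1000
  return 1970 + FloorDiv(millis, kMillisPerAvgYear);
}
```

With a 2021 input expressed in microseconds, `ApproxYear(MicrosToMillis(us))` lands in 2021, while passing the raw microsecond value straight to `ApproxYear` lands tens of thousands of years in the future, which is the symptom described above.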
akravchukdremio approved these changes on Apr 24, 2026 and left a comment:
Looks good. Thanks for taking into account the fixes for the observed SIGSEGV and checking for the seconds unit, and also for cherry-picking my next_day changes!
Summary
This is a cherry-pick from dremio_27.0_20.
Adds TimestampIR — an LLVM IR builder class modeled after DecimalIR — that generates unit-aware wrapper functions at module-load time, enabling Gandiva functions registered for timestamp[ms] to automatically handle timestamp[us] and timestamp[ns] inputs without explicit per-unit registry entries.
What changes are included in this PR?
timestamp_ir.h/cc (new): TimestampIR class with wrapper patterns: pure IR arithmetic, calendar split/recombine, extract, trunc, diff, cast, and timezone wrappers. Includes FloorDiv/FloorDivRem helpers for correct floor-toward-negative-infinity semantics on pre-epoch timestamps.
CMakeLists.txt + engine.cc: Wire TimestampIR::AddFunctions() at module load alongside DecimalIR.
function_signature.cc: DataTypeEquals for TIMESTAMP ignores TimeUnit; only timezone is significant for matching. This allows the existing timestamp[ms] registry entries to match calls with timestamp[us]/timestamp[ns] parameters.
llvm_generator.cc/h: BuildFunctionCall inspects function descriptor for non-ms timestamp params and remaps to _us/_ns suffixed IR functions. Propagates Status errors from the visitor back to the caller (previously silently dropped).
precompiled/time.cc: Floor-division fix in DATE_TRUNC_FIXED_UNIT for pre-epoch (negative) timestamps; fix sub-second millis sign in castVARCHAR_timestamp_int64 for negative timestamps.
tests/date_time_test.cc: End-to-end C++ tests for timestamp[us] and timestamp[ns] through extract, trunc, arithmetic, and cast functions.
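The floor-division helpers and the calendar split/recombine pattern listed above can be sketched in plain C++ as follows. This is an illustration of the arithmetic only, not the IR emitted by TimestampIR; `DivRem`, `AddDaysMillis`, `AddDaysMicros`, and `TruncToUnit` are hypothetical names, and the real helpers may differ in signature.

```cpp
#include <cstdint>

struct DivRem {
  int64_t quot;
  int64_t rem;
};

// Quotient rounded toward negative infinity, remainder in [0, divisor).
// Assumes divisor > 0.
inline DivRem FloorDivRem(int64_t value, int64_t divisor) {
  int64_t q = value / divisor;  // truncates toward zero
  int64_t r = value % divisor;  // takes the sign of 'value'
  if (r < 0) {
    --q;
    r += divisor;
  }
  return DivRem{q, r};
}

// Hypothetical stand-in for a precompiled millis-only calendar function.
inline int64_t AddDaysMillis(int64_t millis, int64_t days) {
  return millis + days * 86400000LL;
}

// Split/recombine: run the millis function on the floored quotient, then
// scale back up and restore the sub-millisecond remainder.
inline int64_t AddDaysMicros(int64_t micros, int64_t days) {
  DivRem parts = FloorDivRem(micros, 1000);
  return AddDaysMillis(parts.quot, days) * 1000 + parts.rem;
}

// Truncate to a fixed-size unit using floor semantics, so pre-epoch
// (negative) values round down to the start of their unit.
inline int64_t TruncToUnit(int64_t value, int64_t unit) {
  return FloorDivRem(value, unit).quot * unit;
}
```

For a value one millisecond before the epoch, `TruncToUnit(-1, 86400000)` yields the previous midnight, whereas truncating division (`-1 / 86400000 * 86400000`) would yield the epoch itself, which is the kind of pre-epoch error the floor-division fix targets.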
Are these changes tested?
Yes — tests/date_time_test.cc covers the new unit-aware paths. Also validated end-to-end through arrow-java ProjectorTest and Dremio integration tests (separate PRs).
Are there any user-facing changes?
No API changes. Gandiva functions that previously only accepted timestamp[ms] now transparently accept timestamp[us] and timestamp[ns].
Safe degradation: If a caller uses an old native lib (pre-TimestampIR), DataTypeEquals falls back to strict matching, function lookup fails at validation time, and the caller gets a clean error rather than silent wrong results.
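As a rough picture of how unit-agnostic matching and `_us`/`_ns` remapping fit together, here is a self-contained C++ sketch. It does not use the Arrow or Gandiva APIs: `TimeUnit`, `TimestampType`, `TimestampTypesMatch`, and `RemapForUnit` are hypothetical stand-ins, and the handling of units other than us/ns is an assumption, since the PR text does not spell it out.

```cpp
#include <string>

// Hypothetical stand-ins for the timestamp type and its unit.
enum class TimeUnit { kSecond, kMilli, kMicro, kNano };

struct TimestampType {
  TimeUnit unit;
  std::string timezone;
};

// Relaxed matching as described in the PR: the unit is ignored and only
// the timezone is significant.
inline bool TimestampTypesMatch(const TimestampType& a, const TimestampType& b) {
  return a.timezone == b.timezone;
}

// Pick the IR function name for a given parameter unit. Assumption: any
// unit other than us/ns keeps the base (millis) name in this sketch.
inline std::string RemapForUnit(const std::string& base_name, TimeUnit unit) {
  switch (unit) {
    case TimeUnit::kMicro:
      return base_name + "_us";
    case TimeUnit::kNano:
      return base_name + "_ns";
    default:
      return base_name;
  }
}
```

Under these assumptions, a registry entry declared for `timestamp[ms]` matches a `timestamp[us]` argument with the same timezone, and the call is then redirected to the `_us` variant of the function rather than the millis one.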