DX-105463: [C++][Gandiva] Add TimestampIR for unit-aware timestamp[us/ns] support#137

Merged
lriggs merged 7 commits into dremio:dremio_27.0_23_19 from lriggs:nanos_dremio_27.0_23_19
Apr 24, 2026

@lriggs lriggs commented Apr 24, 2026

Summary

This is a cherry-pick from dremio_27.0_20.

Adds TimestampIR — an LLVM IR builder class modeled after DecimalIR — that generates unit-aware wrapper functions at module-load time, enabling Gandiva functions registered for timestamp[ms] to automatically handle timestamp[us] and timestamp[ns] inputs without explicit per-unit registry entries.

What changes are included in this PR?
timestamp_ir.h/cc (new): TimestampIR class with wrapper patterns: pure IR arithmetic, calendar split/recombine, extract, trunc, diff, cast, and timezone wrappers. Includes FloorDiv/FloorDivRem helpers for correct floor-toward-negative-infinity semantics on pre-epoch timestamps.
CMakeLists.txt + engine.cc: Wire TimestampIR::AddFunctions() at module load alongside DecimalIR.
function_signature.cc: DataTypeEquals for TIMESTAMP ignores TimeUnit; only timezone is significant for matching. This allows the existing timestamp[ms] registry entries to match calls with timestamp[us]/timestamp[ns] parameters.
llvm_generator.cc/h: BuildFunctionCall inspects function descriptor for non-ms timestamp params and remaps to _us/_ns suffixed IR functions. Propagates Status errors from the visitor back to the caller (previously silently dropped).
precompiled/time.cc: Floor-division fix in DATE_TRUNC_FIXED_UNIT for pre-epoch (negative) timestamps; fix sub-second millis sign in castVARCHAR_timestamp_int64 for negative timestamps.
tests/date_time_test.cc: End-to-end C++ tests for timestamp[us] and timestamp[ns] through extract, trunc, arithmetic, and cast functions.
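
The floor-division behavior mentioned for FloorDiv/FloorDivRem and the DATE_TRUNC_FIXED_UNIT fix can be illustrated in plain C++. This is only a sketch: the actual TimestampIR emits LLVM IR, and the struct `DivRem`, the function `TruncToUnit`, and the signatures below are assumptions made for illustration, not code from this PR. It shows why C++ `/` and `%` (which truncate toward zero) give the wrong bucket and a negative sub-second remainder for pre-epoch values.

```cpp
#include <cstdint>

// Hypothetical result type: quotient plus a remainder that is never negative.
struct DivRem {
  int64_t quot;  // floor(value / divisor)
  int64_t rem;   // in [0, divisor) for a positive divisor
};

// Floor division for a positive divisor: rounds toward negative infinity,
// whereas the built-in '/' truncates toward zero.
constexpr int64_t FloorDiv(int64_t value, int64_t divisor) {
  return value / divisor - ((value % divisor < 0) ? 1 : 0);
}

// Quotient and non-negative remainder in one step.
constexpr DivRem FloorDivRem(int64_t value, int64_t divisor) {
  return DivRem{FloorDiv(value, divisor),
                (value % divisor < 0) ? value % divisor + divisor
                                      : value % divisor};
}

// Truncate a timestamp down to a multiple of a fixed-width unit
// (for example 1000 to go from milliseconds to whole seconds).
constexpr int64_t TruncToUnit(int64_t ts, int64_t unit) {
  return FloorDiv(ts, unit) * unit;
}

// 1500 microseconds before the epoch:
static_assert(-1500 / 1000 == -1, "built-in division truncates toward zero");
static_assert(FloorDiv(-1500, 1000) == -2, "floor picks the earlier millisecond");
static_assert(FloorDivRem(-1500, 1000).rem == 500, "remainder stays non-negative");
// Splitting and recombining is lossless:
static_assert(FloorDivRem(-1500, 1000).quot * 1000 +
                  FloorDivRem(-1500, 1000).rem == -1500,
              "quot * divisor + rem reproduces the input");
// One tick before the epoch truncates to the previous unit boundary, not to 0:
static_assert(TruncToUnit(-1, 1000) == -1000, "pre-epoch trunc rounds down");
```

The non-negative remainder is also what a sub-second field needs when formatting a negative timestamp: `-1500 % 1000` is `-500` in C++, which would print as a negative fraction.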

Are these changes tested?

Yes — tests/date_time_test.cc covers the new unit-aware paths. Also validated end-to-end through arrow-java ProjectorTest and Dremio integration tests (separate PRs).

Are there any user-facing changes?

No API changes. Gandiva functions that previously only accepted timestamp[ms] now transparently accept timestamp[us] and timestamp[ns].
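
As a rough mental model of how a timestamp[us] or timestamp[ns] call can reuse a timestamp[ms] registry entry, the two steps described above (unit-insensitive matching, then remapping to a suffixed IR function) might look like the sketch below. Everything here is hypothetical: `TimestampType`, `TimestampTypesMatch`, `RemapForUnit`, and the example function name are stand-ins, not the Gandiva or Arrow API. The PR description does not say how timestamp[s] is handled, so this sketch simply rejects it.

```cpp
#include <string>

// Hypothetical stand-ins for the Arrow time unit and timestamp type.
enum class TimeUnit { kSecond, kMilli, kMicro, kNano };

struct TimestampType {
  TimeUnit unit;
  std::string timezone;
};

// Signature matching: the unit is ignored, only the timezone must agree.
inline bool TimestampTypesMatch(const TimestampType& a, const TimestampType& b) {
  return a.timezone == b.timezone;
}

// Pick the function name to call for a given input unit. Returns false for
// units this sketch does not cover (an assumption, not the PR's behavior).
inline bool RemapForUnit(const std::string& base_name, TimeUnit unit,
                         std::string* out) {
  switch (unit) {
    case TimeUnit::kMilli:
      *out = base_name;  // registered millisecond function, unchanged
      return true;
    case TimeUnit::kMicro:
      *out = base_name + "_us";  // generated microsecond wrapper
      return true;
    case TimeUnit::kNano:
      *out = base_name + "_ns";  // generated nanosecond wrapper
      return true;
    default:
      return false;  // e.g. seconds: not covered here
  }
}
```

With these definitions, a registry entry declared for `TimestampType{TimeUnit::kMilli, ""}` matches an argument of `TimestampType{TimeUnit::kMicro, ""}`, and an illustrative name such as `extractYear_timestamp` would be remapped to `extractYear_timestamp_us`.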

Safe degradation: If a caller uses an old native lib (pre-TimestampIR), DataTypeEquals falls back to strict matching, function lookup fails at validation time, and the caller gets a clean error rather than silent wrong results.

lriggs and others added 7 commits April 23, 2026 14:44
…/ns] support

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o#135)

next_day(timestamp, utf8) was registered in the C++ function registry for
timestamp inputs but was not included in any TimestampIR table, so no _us/_ns
IR wrappers were generated. With the relaxed DataTypeEquals (ignoring TimeUnit),
calls like next_day(timestamp[us], 'MO') pass validation and are routed to
Gandiva, but BuildFunctionCall silently falls through to the precompiled
millis function, which interprets microseconds as milliseconds — producing
dates ~51,000 years in the future (e.g. +53425-02-28 for a 2021 input).
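
The size of that error can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: `ApproxYearFromMillis` is a hypothetical helper using an average Gregorian year, and the mid-2021 instant is an arbitrary example rather than the input from the bug report, so the resulting year differs slightly from the one quoted above.

```cpp
#include <cstdint>

constexpr int64_t kMillisPerSecond = 1000;
constexpr int64_t kSecondsPerYear = 31556952;  // average Gregorian year

// Approximate calendar year for a count of milliseconds since the epoch.
// Only meant for order-of-magnitude checks on non-negative inputs.
constexpr int64_t ApproxYearFromMillis(int64_t millis) {
  return 1970 + millis / kMillisPerSecond / kSecondsPerYear;
}

// An arbitrary instant in mid-2021, expressed in microseconds since the epoch.
constexpr int64_t kMicros2021 = 1624000000LL * 1000000LL;

// Correct handling: scale microseconds down to milliseconds first.
static_assert(ApproxYearFromMillis(kMicros2021 / 1000) == 2021,
              "scaled input lands in 2021");

// The bug: the raw microsecond count is passed to a millisecond function,
// so the elapsed time is inflated by a factor of 1000.
static_assert(ApproxYearFromMillis(kMicros2021) > 53000 &&
                  ApproxYearFromMillis(kMicros2021) < 54000,
              "unscaled input lands tens of thousands of years ahead");
```

About 51 years since 1970 become about 51,000 years once the value is read in the wrong unit, which matches the magnitude reported above.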

This adds a BuildNextDayWrapper that scales the timestamp input to millis
via FloorDiv and calls the precompiled next_day_from_timestamp(context,
millis, day_str, day_len) function. Pattern follows the cast-from-timestamp
wrapper shape with the additional (context, string, length) args.

No remainder recombination is needed: next_day returns date64 (midnight of
the next weekday), so sub-millisecond input precision is not meaningful in
the result.
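
The wrapper shape described in this commit (scale down with floor division, call the millisecond kernel, return its result unchanged) can be sketched in plain C++. This is not the generated IR: `NextDayFromMillis` is a simplified stand-in for the precompiled function that takes a weekday number instead of a context and day-name string, and it assumes next_day returns the first matching weekday strictly after the input date.

```cpp
#include <cstdint>

constexpr int64_t kMicrosPerMilli = 1000;
constexpr int64_t kMillisPerDay = 86400000;

// Floor division for a positive divisor (rounds toward negative infinity).
constexpr int64_t FloorDiv(int64_t value, int64_t divisor) {
  return value / divisor - ((value % divisor < 0) ? 1 : 0);
}

// Hypothetical stand-in for the millisecond kernel.
// target_dow: 0 = Sunday ... 6 = Saturday. Returns midnight (in epoch millis)
// of the first such weekday strictly after the input's date.
constexpr int64_t NextDayFromMillis(int64_t millis, int target_dow) {
  int64_t day = FloorDiv(millis, kMillisPerDay);
  int64_t dow = ((day % 7) + 7 + 4) % 7;  // 1970-01-01 was a Thursday (4)
  int64_t delta = (target_dow - dow + 7) % 7;
  if (delta == 0) delta = 7;  // same weekday: jump a full week ahead
  return (day + delta) * kMillisPerDay;
}

// Microsecond wrapper: scale the input to millis, then delegate. The result
// is a midnight value, so the discarded sub-millisecond part is not added back.
constexpr int64_t NextDayFromMicros(int64_t micros, int target_dow) {
  return NextDayFromMillis(FloorDiv(micros, kMicrosPerMilli), target_dow);
}

// Epoch (a Thursday), asking for Monday (1): 1970-01-05.
static_assert(NextDayFromMicros(0, 1) == 4 * kMillisPerDay, "next Monday");
// One microsecond before the epoch is still Wednesday 1969-12-31, so the
// next Thursday (4) is 1970-01-01, i.e. 0.
static_assert(NextDayFromMicros(-1, 4) == 0, "floor keeps the pre-epoch date");
```

Floor division matters in the wrapper itself: truncating `-1` microseconds toward zero would yield millisecond `0`, move the input onto Thursday 1970-01-01, and return the following Thursday instead.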

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@akravchukdremio akravchukdremio left a comment

Looks good. Thanks for incorporating the fixes for the observed SIGSEGV, for checking for the seconds unit, and for cherry-picking my next_day changes!

Comment thread cpp/src/gandiva/precompiled/time.cc
@lriggs lriggs merged commit 8a16367 into dremio:dremio_27.0_23_19 Apr 24, 2026
24 of 41 checks passed
3 participants