fix(sql): return correct precision from arg_max/arg_min on TIMESTAMP_NS columns#7078
Conversation
Fixes #7076. The six arg_max/arg_min variants whose first argument is a TIMESTAMP hardcoded their result type to ColumnType.TIMESTAMP (microseconds) in the function constructor. When the input was a TIMESTAMP_NS column, the underlying long was a nanosecond value but the output formatter treated it as microseconds, producing dates thousands of years in the future. This change makes each function accept the timestamp type as a constructor parameter. The factory derives it from the value argument via ColumnType.getTimestampType(), the same pattern used by MaxTimestampGroupByFunction and FirstTimestampGroupByFunction. The result column now reports TIMESTAMP_NS when the input is TIMESTAMP_NS and TIMESTAMP otherwise. Affected variants: arg_max(ND), arg_max(NL), arg_max(NZ), arg_min(ND), arg_min(NL), arg_min(NZ). Each of the six test classes gains three new methods covering: end-to-end correctness with a timestamp_ns column, the typeOf() of the result, and a GROUP BY case. The Double variant also gains a parallel test that exercises the worker-pool merge path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Wrap the new TIMESTAMP_NS test methods in assertMemoryLeak() so any native allocations leaked during a regression are caught immediately. Switch the simple cases from assertSql() to assertQueryNoLeakCheck(), which additionally validates supportsRandomAccess and expectSize on the resulting factory. Replace the parallel test's self-comparing assertSqlCursors(sql, sql) - which only catches non-determinism - with a deterministic 100k-row dataset and an exact (sym, type, max-value) row assertion. The expected nanosecond values would shift by 1000x under any wrong- precision regression, so this catches the original bug end to end through the WorkerPool merge path. Restore the trailing whitespace on the empty line in the testArgMaxWithNullValue text block; the IntelliJ formatter step in CI rewrites it back, so leaving it stripped breaks the "git diff --exit-code" lint check.
jovfer
left a comment
There was a problem hiding this comment.
LGTM: visual check + run new tests on master and got fails as expected.
[PR Coverage check]😍 pass : 18 / 18 (100.00%) file detail
|
Fixes #7076.
Summary
When
arg_maxorarg_minwas applied to aTIMESTAMP_NScolumn, thereturned value was rendered with microsecond precision, producing dates
thousands of years in the future (e.g.
54977-04-25T06:29:47.654321Zinstead of
2023-01-03T12:34:56.987654321Z). The underlying long was ananosecond value but the function reported its result type as
TIMESTAMP(microseconds), so the formatter scaled it incorrectly.Root cause
The six
arg_max/arg_minvariants whose first argument is aTIMESTAMPhardcodedsuper(ColumnType.TIMESTAMP)in the functionconstructor:
arg_max(ND),arg_max(NL),arg_max(NZ)arg_min(ND),arg_min(NL),arg_min(NZ)Comparable functions such as
max(N)andfirst(N)already derive thetype from the input via
ColumnType.getTimestampType(args.getQuick(0).getType()).Change
Each affected function now accepts the timestamp type as a constructor
parameter. The factory passes
ColumnType.getTimestampType(valueArg.getType()), mirroring theexisting pattern in
MaxTimestampGroupByFunctionFactoryandFirstTimestampGroupByFunctionFactory. The internal storage layout isunchanged (long); only the reported column type and resulting display
formatting differ.
There is no behavioural change for
TIMESTAMP(micros) inputs. Thereare no performance implications: the type is resolved once at
factory time.
Test plan
Arg{Max,Min}Timestamp{Double,Long,Uuid}GroupByFunctionFactoryTestclasses, each covering:
timestamp_nsvalue column,asserting the full nine-digit nano string
typeOf(arg_max(...))returnsTIMESTAMP_NSGROUP BYover atimestamp_nsvalue columnArgMaxTimestampDoubleGroupByFunctionFactoryTestadditionallyexercises the parallel worker-pool merge path with
typeOfassertionarg_max/arg_minfactory tests pass; all 220 timestampgroup-by factory tests pass
(off-by-1000 timestamp display and wrong reported type)
🤖 Generated with Claude Code