Skip to content

Preserve millisecond precision for TIMESTAMP comparisons in the multi-stage engine#18883

Merged
yashmayya merged 2 commits into
apache:masterfrom
yashmayya:fix-mse-timestamp-millis-precision
Jun 29, 2026
Merged

Preserve millisecond precision for TIMESTAMP comparisons in the multi-stage engine#18883
yashmayya merged 2 commits into
apache:masterfrom
yashmayya:fix-mse-timestamp-millis-precision

Conversation

@yashmayya

@yashmayya yashmayya commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

In the multi-stage engine, comparing a TIMESTAMP column to a sub-second epoch-millis literal (e.g. WHERE ts = 1761667561482) silently returned no rows. This is a regression surfaced by #18396 and tracked in #18881. The single-stage engine is unaffected (it treats TIMESTAMP as a plain long).

Root cause

#18396 changed PinotTypeCoercion#binaryComparisonCoercion so that for tsColumn <op> bigintLiteral, the implicit cast is placed on the literal (tsColumn <op> CAST(literal AS TIMESTAMP)) instead of on the column, to keep the column unwrapped for index applicability. The cast target was built via factory.createSqlType(SqlTypeName.TIMESTAMP) with no precision.

TypeSystem.getDefaultPrecision(...) only overrode DECIMAL, so TIMESTAMP fell through to Calcite's default precision of 0 (whole seconds). When the literal cast is constant-folded, RexBuilder.clean rounds the TimestampString to the type precision via TimestampString.round(0), truncating the sub-second component:

1761667561482  ->  TIMESTAMP '... :01'  ->  1761667561000

The stored value is ...561482; the folded literal is ...561000, so the predicate never matches.

Fix

Override TypeSystem.getDefaultPrecision(TIMESTAMP) to return 3 (millisecond precision), matching Pinot's epoch-millis LONG storage. 3 is also Calcite's MAX_DATETIME_PRECISION, so it is simultaneously the default and the max for TIMESTAMP — no clamping. This keeps the column unwrapped (preserving the #18396 improvement) and preserves milliseconds on the folded literal.

The precision is planner-only metadata and is not on the wire: the broker→server boundary maps SqlTypeName.TIMESTAMP to ColumnDataType.TIMESTAMP (backed by LONG, no precision), TIMESTAMP literals are serialized to the server as raw epoch-millis, and no execution path reads the SQL precision — so the change is rolling-upgrade safe and does not alter results beyond the intended millisecond preservation.

Tests

  • PinotTypeCoercionTest#testTimestampColumnVsSubSecondLiteralPreservesMillis — planner-level guard that the folded literal keeps its millis. Fails without the fix (literal truncates to whole seconds).
  • TimestampTest#testSubSecondTimestampEqualityQueries — end-to-end on both engines: a row with a sub-second TIMESTAMP is matched by WHERE tsCol = <epoch-millis>. On the multi-stage engine this returned 0 rows before the fix (expected [1] but found [0]).
  • Updated LiteralEvaluationPlans.json and two PinotTypeCoercionTest assertions for the TIMESTAMP(0)TIMESTAMP(3) plan rendering (no behavioral change; the folded epoch values are identical for whole-second literals).

Backward compatibility

No wire-format or segment-format change. Affects only multi-stage planner-emitted TIMESTAMP types. After upgrade, multi-stage queries comparing a TIMESTAMP column to a sub-second epoch-millis literal return the correct rows.

Related behavior change: because the precision is now millisecond, multi-stage queries that fold a TIMESTAMP-returning function to a sub-second value (e.g. CAST(now() AS BIGINT)) now preserve milliseconds instead of truncating to whole seconds — which is what the single-stage engine already did (now() is a long there). The testLiteralOnlyFunc/testLiteralOnlyFuncV2 integration-test assertions are updated to match the single-stage bounds.

Closes #18881

@yashmayya yashmayya added bug Something is not working as expected multi-stage Related to the multi-stage query engine labels Jun 29, 2026
@codecov-commenter

codecov-commenter commented Jun 29, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.81%. Comparing base (3be6236) to head (3fac119).
⚠️ Report is 9 commits behind head on master.

Additional details and impacted files
@@              Coverage Diff              @@
##             master   #18883       +/-   ##
=============================================
+ Coverage     37.16%   64.81%   +27.64%     
- Complexity     1321     1322        +1     
=============================================
  Files          3393     3393               
  Lines        211305   211336       +31     
  Branches      33226    33235        +9     
=============================================
+ Hits          78541   136980    +58439     
+ Misses       125557    63296    -62261     
- Partials       7207    11060     +3853     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.81% <100.00%> (+27.64%) ⬆️
temurin 64.81% <100.00%> (+27.64%) ⬆️
unittests 64.81% <100.00%> (+27.64%) ⬆️
unittests1 57.02% <100.00%> (?)
unittests2 37.17% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core fix looks right to me. The planner now uses millisecond TIMESTAMP precision, preserving sub-second literals while keeping the TIMESTAMP column unwrapped, and the regression coverage exercises both engines. CI is green.

@yashmayya yashmayya merged commit 2103aa2 into apache:master Jun 29, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something is not working as expected multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Equality comparison between TIMESTAMP column and epoch_ms literal silently returns no rows in multi-stage engine

3 participants