Skip to content

test: add date_trunc DST regression test in non-UTC session timezone#4040

Merged
parthchandra merged 2 commits intoapache:mainfrom
andygrove:andygrove/dst-trunc-test
Apr 23, 2026
Merged

test: add date_trunc DST regression test in non-UTC session timezone#4040
parthchandra merged 2 commits intoapache:mainfrom
andygrove:andygrove/dst-trunc-test

Conversation

@andygrove
Copy link
Copy Markdown
Member

@andygrove andygrove commented Apr 23, 2026

Which issue does this PR close?

Rationale for this change

Comet's native date_trunc is currently marked Incompatible for non-UTC session timezones and falls back to Spark by default. When a user opts into the native path via spark.comet.expression.TruncTimestamp.allowIncompatible=true, there is no SQL-level regression coverage that exercises DST transition boundaries. This gap makes it easy for well-intentioned DST-related changes in the Rust kernel to introduce subtle correctness regressions that CI will not catch.

What changes are included in this PR?

Adds one SQL file-based test: spark/src/test/resources/sql-tests/expressions/datetime/trunc_timestamp_dst.sql.

The file sets the session timezone to America/Los_Angeles, enables allowIncompatible for TruncTimestamp, and runs date_trunc at every supported level (DAY, HOUR, WEEK, MONTH, QUARTER, YEAR) against a small table that includes the 2024 spring-forward day (March 10), the 2024 fall-back day (November 3), and baseline non-DST days.

How are these changes tested?

The test itself is the change. Run it with:

./mvnw test -Dsuites='org.apache.comet.CometSqlFileTestSuite trunc_timestamp_dst' -Dtest=none

On current main, the test passes, verifying that the existing native date_trunc implementation matches Spark across the full DST matrix.

Exercises CometTruncTimestamp natively (via allowIncompatible=true) with a
session timezone that observes DST. Covers spring-forward and fall-back
transition days across DAY, HOUR, WEEK, MONTH, QUARTER, and YEAR levels,
plus baseline non-DST days.
CREATE TABLE test_trunc_dst(ts timestamp) USING parquet

statement
INSERT INTO test_trunc_dst VALUES
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude suggests adding '2024-11-03 01:30:00' as a test date. The suspicion is that truncating to HOUR produces 1:00 AM, which is ambiguous on Nov 3 (it occurs in both PDT and PST). chrono's .single() returns None, causing the unwrap() to panic.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a separate test file for this case, confirming the current panic

…k time

Adds a separate SQL test file exercising date_trunc with
timestamp('2024-11-03 01:30:00') in America/Los_Angeles. Truncating this
timestamp produces an ambiguous local time (1:00 AM occurs in both PDT
and PST), causing chrono to return None and the native kernel to panic.
All queries are marked ignore until the Rust code is fixed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@parthchandra
Copy link
Copy Markdown
Contributor

Merged. Thank you @andygrove

@parthchandra parthchandra merged commit 0c2fa63 into apache:main Apr 23, 2026
73 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants