Date-part extraction functions missing timezone handling for Timestamp inputs #2155
Description
Describe the bug
Five date-part extraction functions in NativeConverters.scala use buildExtScalarFunction, which does not pass the session timezone to the native Rust implementation.
By contrast, Hour, Minute, Second, and WeekOfYear correctly use buildTimePartExt, which passes sessionLocalTimeZone for TimestampType inputs.
This inconsistency can cause incorrect results for timestamp inputs near date boundaries in non-UTC timezones.
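As a rough illustration of the boundary problem, the sketch below takes the timestamp from the reproduction steps (2021-01-04 04:30:00 UTC, epoch second 1609734600) and computes the calendar day with and without a UTC offset. The function names are hypothetical and not taken from the project, and the fixed offset of -5 hours (New York standard time in January) is a simplifying assumption; a real implementation would resolve the offset from a timezone database to account for daylight saving time.

```rust
const SECS_PER_DAY: i64 = 86_400;

/// Days since 1970-01-01 for an instant shifted by a fixed UTC offset.
/// (Hypothetical helper for illustration; a fixed offset ignores DST.)
fn day_index(epoch_secs: i64, utc_offset_secs: i64) -> i64 {
    // div_euclid floors toward negative infinity, so pre-1970 instants work too.
    (epoch_secs + utc_offset_secs).div_euclid(SECS_PER_DAY)
}

/// Spark-style day of week (1 = Sunday ... 7 = Saturday) for a day index.
/// 1970-01-01 (day index 0) was a Thursday, which maps to 5.
fn spark_day_of_week(day_index: i64) -> i64 {
    (day_index + 4).rem_euclid(7) + 1
}
```

With an offset of 0 the timestamp falls on day index 18631 (Monday, 2021-01-04), while with an offset of -18000 seconds it falls on day index 18630 (Sunday, 2021-01-03). A function that ignores the session timezone would therefore report the 4th and Monday where Spark reports the 3rd and Sunday.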
Affected functions:
- Year (Spark_Year) — not timezone-aware
- Month (Spark_Month) — not timezone-aware
- DayOfMonth (Spark_Day) — not timezone-aware
- DayOfWeek (Spark_DayOfWeek) — not timezone-aware
- Quarter (Spark_Quarter) — not timezone-aware
To Reproduce
- Set spark.sql.session.timeZone to America/New_York.
- Create a table with a timestamp column containing 2021-01-04 04:30:00 UTC (equivalent to 2021-01-03 23:30:00 in New York).
- Run: SELECT dayofmonth(ts), dayofweek(ts) FROM t1
Expected behavior
All date-part extraction functions should interpret timestamp inputs in the session local timezone before extracting the date component, matching Spark's behavior.
All 5 functions should use buildTimePartExt (or equivalent), and their corresponding Rust implementations should accept and handle an optional timezone argument.
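One possible shape for a native function that accepts an optional timezone argument is sketched below. This is not the project's actual code: the names DatePartsSketch and date_parts_sketch are hypothetical, the timezone is modelled as an optional fixed offset in seconds rather than a zone name, and the input is assumed to be microseconds since the Unix epoch, as Spark timestamps are. The day-to-date conversion uses the well-known civil-from-days algorithm for the proleptic Gregorian calendar.

```rust
/// Date parts extracted from a timestamp (hypothetical struct for illustration).
#[derive(Debug, PartialEq)]
struct DatePartsSketch {
    year: i64,
    quarter: u32,
    month: u32,
    day: u32,
    day_of_week: u32, // Spark convention: 1 = Sunday ... 7 = Saturday
}

/// Convert days since 1970-01-01 to (year, month, day).
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = if mp < 10 { (mp + 3) as u32 } else { (mp - 9) as u32 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Extract date parts from epoch microseconds. `tz_offset_secs` stands in for
/// the session timezone; `None` means the timestamp is interpreted as UTC.
/// Assumption: a fixed offset, so daylight saving transitions are not handled.
fn date_parts_sketch(epoch_micros: i64, tz_offset_secs: Option<i32>) -> DatePartsSketch {
    let offset = i64::from(tz_offset_secs.unwrap_or(0));
    let local_secs = epoch_micros.div_euclid(1_000_000) + offset;
    let days = local_secs.div_euclid(86_400);
    let (year, month, day) = civil_from_days(days);
    DatePartsSketch {
        year,
        quarter: (month - 1) / 3 + 1,
        month,
        day,
        // 1970-01-01 was a Thursday, which is 5 in Spark's numbering.
        day_of_week: ((days + 4).rem_euclid(7) + 1) as u32,
    }
}
```

Under these assumptions, calling date_parts_sketch(1_609_734_600_000_000, Some(-18_000)) yields day 3 and day_of_week 1 for the reproduction timestamp, whereas passing None yields day 4 and day_of_week 2. Computing all parts from a single localized day count keeps Year, Quarter, Month, DayOfMonth, and DayOfWeek consistent with one another near year and month boundaries.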
Screenshots
Additional context