[lance] Support timestamp_ltz by normalizing timezone strings for Arrow-Rust compatibility #7703
0dunay0 wants to merge 1 commit into
Closing this PR. While investigating, I found that the root cause of #6648 goes deeper than the timezone string format. Lance v0.39.0 has a bug in its type serialization. It serializes timestamp types as colon-delimited strings like IANA timezone names like I'll open a separate PR for the |
Summary
Fixes #6648.
When using Lance format with `TIMESTAMP WITH LOCAL TIME ZONE`, Lance's Rust-side schema parser fails on timezone strings like `GMT-10:00` with `Unsupported timestamp type: timestamp:us:GMT-10:00`. This happens because Arrow-Rust only accepts IANA timezone names (e.g. `America/New_York`) or standard UTC offset format (e.g. `+05:30`, `-10:00`, `Z`), but Java's `ZoneId.systemDefault().toString()` can produce the GMT-prefixed form (`GMT-10:00`).

The fix calls `ZoneId.normalized()` before converting to string. This converts `GMT-10:00` to `-10:00`, `GMT+05:30` to `+05:30`, `UTC` to `Z`, and leaves IANA names like `America/New_York` unchanged. All of these are formats that Arrow-Rust accepts.

Changes:
- In `ArrowFieldTypeConversion`, call `.normalized()` on the `ZoneId` before `.toString()` when creating Arrow Timestamp types
- In `LanceFileFormat`, remove the `UnsupportedOperationException` that blocked `LOCAL_ZONED_TIMESTAMP` entirely
- Add `timestamp_ltz` coverage to `LanceFileFormatTest` (validation) and `LanceFileFormatReadWriteTest` (round-trip read/write)
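
For reviewers who want to check the normalization behavior described above without building the project, here is a minimal standalone sketch using only `java.time`. The class name `ZoneIdNormalizationSketch` and the helper `normalizedZoneString` are hypothetical names made up for illustration; they are not part of this PR or of `ArrowFieldTypeConversion`.

```java
import java.time.ZoneId;
import java.util.List;

// Hypothetical demo class, not part of the PR.
class ZoneIdNormalizationSketch {

    // Hypothetical helper: mirrors the idea of calling normalized()
    // on the ZoneId before converting it to a string.
    static String normalizedZoneString(ZoneId zone) {
        // normalized() returns a ZoneOffset when the zone has a fixed offset,
        // and returns the zone unchanged otherwise (e.g. region IDs with DST rules).
        return zone.normalized().toString();
    }

    public static void main(String[] args) {
        List<String> ids = List.of("GMT-10:00", "GMT+05:30", "UTC", "America/New_York");
        for (String id : ids) {
            ZoneId zone = ZoneId.of(id);
            // Print the raw ID next to its normalized form.
            System.out.println(zone + " -> " + normalizedZoneString(zone));
        }
        // Prints:
        // GMT-10:00 -> -10:00
        // GMT+05:30 -> +05:30
        // UTC -> Z
        // America/New_York -> America/New_York
    }
}
```

Whether Lance's Rust-side parser accepts each normalized string still has to be confirmed by the round-trip tests; this sketch only covers the Java side of the conversion.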