HIVE-27199: Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom formats #4170
…enrich tests
Before the changes, only a space was allowed between the date and time parts:
2022-05-29 12:23:43
After the changes, the following literals become valid (they were not before):
2022-05-29T 12:23:43
2022-05-29T12:23:43
2022-05-2912:23:43
Regarding the TZL parser, there are still a few inconsistencies:
Valid: 2022-05-29T 12:23:43
Invalid: 2022-05-29 T12:23:43
Invalid: 2022-05-29 T 12:23:43
The general observation is that the TZL parser is much more lenient than the TZ parser; not sure we want that.
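The accepted and rejected literals listed above can be reproduced with optional sections in plain java.time. This is a minimal sketch, not the formatter Hive actually builds: the class name SeparatorSketch and the pattern uuuu-MM-dd['T'][ ]HH:mm:ss are assumptions chosen only because they show the same asymmetry (an optional T followed by an optional space accepts "T " but rejects " T").

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, for illustration only; not part of Hive.
final class SeparatorSketch {
    // Assumed pattern: an optional 'T', then an optional space, between date and time.
    static final DateTimeFormatter LENIENT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd['T'][ ]HH:mm:ss");

    // Returns true when the whole text parses with the lenient pattern.
    static boolean accepts(String text) {
        try {
            LocalDateTime.parse(text, LENIENT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

Because the optional sections are tried in a fixed order, a space that comes before the T is consumed by nothing the formatter still expects, which is one plausible way such an inconsistency can arise.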
The change fixes some inconsistencies which were introduced by the previous change:
2016-05-03T 12:26:34 -> NULL
2016-05-0312:26:34 -> NULL
but introduces some new ones:
# Case I
Valid TZ: 2016-05-03T12:26:34
Invalid TZL: 2016-05-03T12:26:34
# Case II
Valid TZL: 2016-05-03 12:26:34
Invalid TZL: 2016-05-03T12:26:34
@check-spelling-bot Report
🔴 Please review
See the files view or the action log for details.
Unrecognized words (1): TZL
Previously acknowledged words that are now absent: aarry, timestamplocal, yyyy
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands... in a clone of the git@github.com:zabetak/hive.git repository
…es using custom formats
1. Support parsing TimestampTZ using the TimestampParser, which accepts multiple DateTimeFormatters.
2. Pass timestamp.formats in Lazy inspector handling TIMESTAMP WITH LOCAL TIME ZONE and instantiate a TimestampParser.
3. Refactor TimestampTZUtil to allow passing different DateTimeFormatters.
4. Add tests covering timestamps with 3 different formats (built-in, plus 2 more not covered by the default).
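The idea in item 1, a parser that accepts multiple DateTimeFormatters, can be sketched as a loop that tries each formatter in order and keeps the first successful result. The code below is an illustration built on java.time only; MultiFormatSketch and its parse method are hypothetical names, not Hive's TimestampParser or TimestampTZUtil API, and attaching a default zone to zone-less input is an assumption about how a TIMESTAMP WITH LOCAL TIME ZONE value might be completed.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, for illustration only; not part of Hive.
final class MultiFormatSketch {
    // Tries every formatter in order and returns the first successful parse.
    // Text without zone information is interpreted in defaultZone (an assumption).
    static Optional<ZonedDateTime> parse(
            String text, List<DateTimeFormatter> formatters, ZoneId defaultZone) {
        for (DateTimeFormatter formatter : formatters) {
            try {
                TemporalAccessor parsed =
                    formatter.parseBest(text, ZonedDateTime::from, LocalDateTime::from);
                if (parsed instanceof ZonedDateTime) {
                    return Optional.of((ZonedDateTime) parsed);
                }
                return Optional.of(((LocalDateTime) parsed).atZone(defaultZone));
            } catch (DateTimeException e) {
                // This formatter does not match; fall through to the next one.
            }
        }
        return Optional.empty();
    }
}
```

For example, passing DateTimeFormatter.ISO_LOCAL_DATE_TIME followed by DateTimeFormatter.ofPattern("dd/MM/uuuu HH:mm") lets both 2022-05-29T12:23:43 and 29/05/2022 12:23 parse, while unrelated text yields an empty result.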
I took a look - nothing stands out to me, not even nits :) LGTM pending tests. This is a nice approach to handle the general problem.
LGTM
Kudos, SonarCloud Quality Gate passed!
…es using custom formats (Stamatis Zampetakis reviewed by Ayush Saxena, John Sherman, Attila Turóczy)
1. Support parsing TimestampTZ using the TimestampParser, which accepts multiple DateTimeFormatters.
2. Pass timestamp.formats in Lazy inspector handling TIMESTAMP WITH LOCAL TIME ZONE and instantiate a TimestampParser.
3. Refactor TimestampTZUtil to allow passing different DateTimeFormatters.
4. Add tests covering timestamps with 3 different formats (built-in, plus 2 more not covered by the default).
These changes give more flexibility to users reading timestamps from text files and also align the way TIMESTAMP and TIMESTAMP WITH LOCAL TIME ZONE behave when a custom format is provided.
Closes apache#4170
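To make the effect of a custom format concrete, the sketch below turns a timestamp.formats-style value into formatters and returns null when nothing matches, mirroring the NULL a query would show. It rests on several assumptions: the property is treated as a plain comma-separated list of java.time patterns (so a pattern that itself contains a comma is not handled), and FormatsPropertySketch, fromProperty, and parseOrNull are hypothetical names rather than Hive code.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hive.
final class FormatsPropertySketch {
    // Splits an assumed comma-separated property value into one formatter per pattern.
    static List<DateTimeFormatter> fromProperty(String property) {
        List<DateTimeFormatter> formatters = new ArrayList<>();
        for (String pattern : property.split(",")) {
            String trimmed = pattern.trim();
            if (!trimmed.isEmpty()) {
                formatters.add(DateTimeFormatter.ofPattern(trimmed));
            }
        }
        return formatters;
    }

    // Returns the instant for the first matching pattern, read in localZone,
    // or null when no pattern matches (standing in for SQL NULL).
    static Instant parseOrNull(String text, List<DateTimeFormatter> formatters, ZoneId localZone) {
        for (DateTimeFormatter formatter : formatters) {
            try {
                return LocalDateTime.parse(text, formatter).atZone(localZone).toInstant();
            } catch (DateTimeParseException e) {
                // Not this pattern; keep trying.
            }
        }
        return null;
    }
}
```

With only uuuu-MM-dd HH:mm:ss configured, the text 29/05/2022 12:23 comes back as null; adding dd/MM/uuuu HH:mm to the list makes the same text resolve to a concrete instant.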
What changes were proposed in this pull request?
1. Support parsing TimestampTZ using the TimestampParser, which accepts multiple DateTimeFormatters.
2. Pass timestamp.formats in Lazy inspector handling TIMESTAMP WITH LOCAL TIME ZONE and instantiate a TimestampParser.
3. Refactor TimestampTZUtil to allow passing different DateTimeFormatters.
Why are the changes needed?
Align the behavior of TIMESTAMP and TIMESTAMP WITH LOCAL TIME ZONE as far as custom format support is concerned.
Does this PR introduce any user-facing change?
Yes, users can now specify custom timestamp.formats for both TIMESTAMP and TIMESTAMP WITH LOCAL TIME ZONE data types.
For existing tables that define timestamp.formats and contain TIMESTAMP WITH LOCAL TIME ZONE columns, results may change from NULL to a "valid" timestamp value.
How was this patch tested?