Use kernel utility for parsing timestamps in csv reader. #832
Codecov Report
@@            Coverage Diff             @@
##           master     #832      +/-   ##
==========================================
+ Coverage   82.60%   82.64%   +0.03%
==========================================
  Files         168      168
  Lines       48040    48088      +48
==========================================
+ Hits        39685    39742      +57
+ Misses       8355     8346       -9
Looks good to me -- thank you @novemberkilo -- this change will ensure that the timestamp formats supported by the csv parser are the same as the timestamp formats supported by the cast kernel (utf8 -> timestamp) 👍
I think this PR will pass CI checks once it is rebased against master to pick up #839
arrow/src/csv/reader.rs
    }
    _ => panic!(
        "Unexpected failure converting {} to local datetime",
        stringify! { naive_datetime }
I don't think this stringify is needed here: it converts the identifier naive_datetime into the literal string "naive_datetime" rather than formatting its value -- so this message always looks like:
thread 'main' panicked at 'Unexpected failure converting naive_datetime to local datetime', src/main.rs:2:3
Which you can see at the playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9f6fbabade9152946ea5a84db486c2de
Since it is in test code, this is likely not a critical problem, but I noticed it so figured I would point it out.
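For anyone following along, here is a minimal standalone sketch of the difference. It is not the code from reader.rs: the helper names with_stringify and with_value are made up for illustration, and a plain &str stands in for the real datetime value.

```rust
// Hypothetical helpers, not part of arrow: they only contrast the two
// ways of building the panic message discussed above.

// stringify! turns the tokens `naive_datetime` into the literal text
// "naive_datetime"; the value of the variable is never looked at.
#[allow(unused_variables)]
fn with_stringify(naive_datetime: &str) -> String {
    format!(
        "Unexpected failure converting {} to local datetime",
        stringify! { naive_datetime }
    )
}

// Passing the variable itself formats its value into the message.
fn with_value(naive_datetime: &str) -> String {
    format!(
        "Unexpected failure converting {} to local datetime",
        naive_datetime
    )
}

fn main() {
    // A &str stands in for the datetime here (an assumption for the sketch).
    let naive_datetime = "2021-10-14T10:20:30";
    println!("{}", with_stringify(naive_datetime));
    println!("{}", with_value(naive_datetime));
}
```

The first helper prints the same text regardless of input, which is why dropping stringify makes the failure message useful.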
Thanks for pointing that out Andrew. Will fix.
Thanks again @novemberkilo!
* Use kernel utility for parsing timestamps in csvs.
* Remove cruft.
* Cleanup.
* Lint.
* Remove erroneous stringify.
Which issue does this PR close?
This is related to a datafusion issue -- apache/datafusion#958
Rationale for this change
apache/datafusion#958 (comment)
What changes are included in this PR?
So as to support more timestamp formats, this PR changes the implementation of parse for Microsecond and Nanosecond type Timestamps to use kernels::cast_utils::string_to_timestamp_nanos
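As a rough illustration of the shape of this change (not the actual arrow code), the sketch below routes both the nanosecond and the microsecond parse through one shared helper, so the two types cannot drift apart in the formats they accept. Everything in it is an assumption made for the example: toy_string_to_timestamp_nanos is a simplified stand-in that only understands UTC strings like 2021-01-01T00:00:00.5Z, whereas the real string_to_timestamp_nanos is a separate implementation that accepts more formats, and the function names parse_timestamp_nanos and parse_timestamp_micros are hypothetical.

```rust
/// Days between 1970-01-01 and a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y }; // treat Jan/Feb as end of previous year
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of (March-based) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Hypothetical, simplified stand-in for the shared kernel utility.
/// Accepts "YYYY-MM-DD[T| ]HH:MM:SS[.fraction][Z]" and assumes UTC.
fn toy_string_to_timestamp_nanos(s: &str) -> Option<i64> {
    let s = s.strip_suffix('Z').unwrap_or(s);
    let (date, time) = s.split_once(|c: char| c == 'T' || c == ' ')?;
    let mut d = date.splitn(3, '-').map(|p| p.parse::<i64>().ok());
    let (y, mo, day) = (d.next()??, d.next()??, d.next()??);
    let (hms, frac) = time.split_once('.').unwrap_or((time, ""));
    let mut t = hms.splitn(3, ':').map(|p| p.parse::<i64>().ok());
    let (h, mi, sec) = (t.next()??, t.next()??, t.next()??);
    let in_range = (1..=9999).contains(&y)
        && (1..=12).contains(&mo)
        && (1..=31).contains(&day)
        && (0..24).contains(&h)
        && (0..60).contains(&mi)
        && (0..60).contains(&sec);
    if !in_range || frac.len() > 9 || !frac.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Right-pad the fraction to 9 digits so ".5" becomes 500_000_000 ns.
    let frac_nanos: i64 = format!("{:0<9}", frac).parse().ok()?;
    let secs = days_from_civil(y, mo, day) * 86_400 + h * 3_600 + mi * 60 + sec;
    // Dates too far from the epoch do not fit in i64 nanoseconds.
    secs.checked_mul(1_000_000_000)?.checked_add(frac_nanos)
}

/// Nanosecond parse: delegate directly to the shared helper.
fn parse_timestamp_nanos(s: &str) -> Option<i64> {
    toy_string_to_timestamp_nanos(s)
}

/// Microsecond parse: same helper, then scale down from nanoseconds.
fn parse_timestamp_micros(s: &str) -> Option<i64> {
    toy_string_to_timestamp_nanos(s).map(|nanos| nanos / 1_000)
}

fn main() {
    for s in ["2021-01-01T00:00:00Z", "1970-01-01 00:00:00.5", "not a timestamp"] {
        println!(
            "{:?} -> nanos {:?}, micros {:?}",
            s,
            parse_timestamp_nanos(s),
            parse_timestamp_micros(s)
        );
    }
}
```

The point of the design is that any format added to the shared helper becomes available to both timestamp types at once.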
Are there any user-facing changes?
I've checked that a user can now parse timestamps in CSV files as described in the referenced datafusion issue. Specifically, you can now do:
// @alamb