[SPARK-8995][SQL] cast date strings like '2015-01-01 12:15:31' to date #7353
This was intentional, as Hive doesn't support it. However, our new design for date and time says we should support cast string
Okay, yes, I'm going to extend it. What Hive does support is parsing the minute from
@cloud-fan (String -> Timestamp)
ok to test
Test build #37124 has finished for PR 7353 at commit
Test build #1051 has finished for PR 7353 at commit
Test build #37129 has finished for PR 7353 at commit
try DateTimeUtils.fromJavaTimestamp(Timestamp.valueOf(n))
catch { case _: java.lang.IllegalArgumentException => null }
val parsedDateString = DateTimeUtils.stringToTimestamp(utfs)
if (parsedDateString == null) null else DateTimeUtils.fromJavaTimestamp(parsedDateString)
This is a Timestamp object, should we have a better name? parsedTime?
I'm going to adjust this.
@tarekauel Thanks for working on this, it's in good shape, just a few comments. As you mentioned in the description,
@davies thanks for all your good comments. I'm going to incorporate your suggestions. I don't accept
Test build #37168 has finished for PR 7353 at commit
Test build #37171 has finished for PR 7353 at commit
@davies How should we deal with this? I don't know the value of Update: My assumption is that value is a numeric value (see links below). Because of that 238 could https://github.com/apache/spark/blob/master/sql/hive/src/test/resources/data/scripts/q_test_init.sql#L4-L8
Test build #37177 has finished for PR 7353 at commit
In master, we can't cast
I think you should check that the year is larger than
@davies somehow Jenkins wasn't able to fetch from GitHub. Could you trigger Jenkins again?
Test build #1075 has finished for PR 7353 at commit
Test build #37415 has finished for PR 7353 at commit
Test build #37454 has finished for PR 7353 at commit
c.getTimeInMillis * 1000)
c.set(2015, 0, 1, 0, 0, 0)
c.set(Calendar.MILLISECOND, 0)
assert(DateTimeUtils.stringToTimestamp(UTF8String.fromString("2015")).get ==
It's better to use === in tests (for a better failure message).
LGTM, just some minor comments.
Test build #37476 has finished for PR 7353 at commit
Thanks for the update, merging this into master!
A follow-up of #7353:
1. We should use `Calendar.HOUR_OF_DAY` instead of `Calendar.HOUR` (which is for AM/PM).
2. We should call `c.set(Calendar.MILLISECOND, 0)` after `Calendar.getInstance`.
I'm not sure why the tests didn't fail in Jenkins, but I ran the latest Spark master branch locally and `DateTimeUtilsSuite` failed.
Author: Wenchen Fan <cloud0fan@outlook.com>
Closes #7473 from cloud-fan/datetime and squashes the following commits:
66cdaf2 [Wenchen Fan] fix several bugs in DateTimeUtils.stringToTimestamp
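The two Calendar pitfalls named in that follow-up can be reproduced outside Spark. The following is a minimal sketch using only `java.util.Calendar`; the object and method names (`CalendarFieldsSketch`, `hourAfterSet`, `startOf2015Millis`) are hypothetical and are not part of `DateTimeUtils`.

```scala
import java.util.{Calendar, TimeZone}

// Hypothetical illustration; not code from the PR.
object CalendarFieldsSketch {
  private val utc = TimeZone.getTimeZone("UTC")

  // Start from 13:00 UTC on 1970-01-01 (a PM time), set `field` to `value`,
  // and report the resulting hour on the 24-hour clock.
  def hourAfterSet(field: Int, value: Int): Int = {
    val c = Calendar.getInstance(utc)
    c.setTimeInMillis(13L * 60 * 60 * 1000)
    c.set(field, value)
    c.get(Calendar.HOUR_OF_DAY)
  }

  // Epoch milliseconds for 2015-01-01 00:00:00 UTC.
  def startOf2015Millis(): Long = {
    val c = Calendar.getInstance(utc)
    // Calendar.getInstance starts at the current instant, so the
    // millisecond field must be cleared explicitly after setting the rest.
    c.set(2015, 0, 1, 0, 0, 0)
    c.set(Calendar.MILLISECOND, 0)
    c.getTimeInMillis
  }
}
```

With `Calendar.HOUR` the calendar keeps its existing AM/PM marker, so setting the hour to 3 on a PM calendar yields 15 on the 24-hour clock, while `Calendar.HOUR_OF_DAY` yields 3. Without the `MILLISECOND` reset, `startOf2015Millis` would carry over whatever millisecond value the clock had when the calendar was created.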
Fix 2 bugs introduced in #7353:
1. We should use a UTC Calendar when casting string to date. Before #7353, we used `DateTimeUtils.fromJavaDate(Date.valueOf(s.toString))` to cast string to date, and `fromJavaDate` calls `millisToDays` to avoid the time zone issue. Now that we use `DateTimeUtils.stringToDate(s)`, we should create a Calendar with UTC in the beginning.
2. We should not change the default time zone in test cases. The `threadLocalLocalTimeZone` and `threadLocalTimestampFormat` in `DateTimeUtils` will only be evaluated once for each thread, so we can't set the default time zone back anymore.
Author: Wenchen Fan <cloud0fan@outlook.com>
Closes #7488 from cloud-fan/datetime and squashes the following commits:
9cd6005 [Wenchen Fan] address comments
21ef293 [Wenchen Fan] fix 2 bugs in datetime
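To see why the Calendar's time zone matters when a date is stored as a day count, here is a rough sketch. It assumes dates are represented as days since 1970-01-01; `UtcDateSketch` and `daysSinceEpoch` are hypothetical names, and this is not the actual `stringToDate` implementation.

```scala
import java.util.{Calendar, TimeZone}

// Hypothetical illustration; not code from the PR.
object UtcDateSketch {
  private val MillisPerDay = 24L * 60 * 60 * 1000

  // Days since 1970-01-01 for the calendar date year-month-day,
  // where midnight is interpreted in the time zone `tz`.
  def daysSinceEpoch(year: Int, month: Int, day: Int, tz: TimeZone): Int = {
    val c = Calendar.getInstance(tz)
    c.clear() // reset every field, including milliseconds
    c.set(year, month - 1, day, 0, 0, 0) // Calendar months are 0-based
    Math.floorDiv(c.getTimeInMillis, MillisPerDay).toInt
  }
}
```

For 2015-01-01, a UTC calendar gives day 16436. A calendar in `GMT+08:00` gives 16435, because local midnight there is still 2014-12-31 in UTC, so the naive division lands on the previous day.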
Jira https://issues.apache.org/jira/browse/SPARK-8995
In PR #6981 we noticed that we cannot cast date strings that contain a time, like '2015-03-18 12:39:40', to date. Besides, it's not possible to cast a string like '18:03:20' to a timestamp.
If a time is passed without a date, today is inferred as date.
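The intended behaviour can be sketched with `java.time`. This is only an illustration under simplifying assumptions: it accepts just `yyyy-MM-dd`, `yyyy-MM-dd HH:mm:ss` and `HH:mm:ss`, ignores time zone suffixes, and takes "today" as a parameter. `CastSketch`, `toDate` and `toTimestamp` are hypothetical names; the PR itself adds `DateTimeUtils.stringToDate` and `DateTimeUtils.stringToTimestamp`, whose parsing is not reproduced here.

```scala
import java.time.{LocalDate, LocalDateTime, LocalTime}
import java.time.format.DateTimeParseException

// Hypothetical illustration; not code from the PR.
object CastSketch {
  // Cast to date: keep the date part and ignore any trailing time.
  def toDate(s: String): Option[LocalDate] = {
    val datePart = s.trim.split("[ T]", 2)(0)
    try Some(LocalDate.parse(datePart))
    catch { case _: DateTimeParseException => None }
  }

  // Cast to timestamp: if only a time is given, use `today` as the date.
  def toTimestamp(s: String, today: LocalDate): Option[LocalDateTime] = {
    val t = s.trim
    try {
      if (t.contains("-")) {
        val parts = t.split("[ T]", 2)
        val date = LocalDate.parse(parts(0))
        val time = if (parts.length > 1) LocalTime.parse(parts(1)) else LocalTime.MIDNIGHT
        Some(date.atTime(time))
      } else {
        Some(today.atTime(LocalTime.parse(t)))
      }
    } catch { case _: DateTimeParseException => None }
  }
}
```

Under these assumptions, `CastSketch.toDate("2015-03-18 12:39:40")` returns the date 2015-03-18, and `CastSketch.toTimestamp("18:03:20", today)` returns 18:03:20 on whatever date is passed as `today`. Unparseable input yields `None`, mirroring the null result of a failed cast.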