[SPARK-32424][SQL][3.0] Fix silent data change for timestamp parsing if overflow happens #29267
Conversation
@@ -146,7 +146,3 @@ select from_json('{"time":"26/October/2015"}', 'time Timestamp', map('timestampF
select from_json('{"date":"26/October/2015"}', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy'));
select from_csv('26/October/2015', 'time Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'));
select from_csv('26/October/2015', 'date Date', map('dateFormat', 'dd/MMMMM/yyyy'));

select from_unixtime(1, 'yyyyyyyyyyy-MM-dd');
I checked these tests, which were added to cover patterns exceeding 10 'y' letters. It's safe to remove them now because such patterns will already fail starting from 7 'y' letters, and we have covered these cases in both datetime-parsing.sql and datetime-formatting.sql.

cc @cloud-fan
Test build #126686 has finished for PR 29267 at commit
Test build #126690 has finished for PR 29267 at commit
retest this please

github action passes, thanks, merging to 3.0!
…if overflow happens

This PR backports d315ebf to branch-3.0

### What changes were proposed in this pull request?

When using the `Seconds.toMicros` API to convert epoch seconds to microseconds, overflow is silently clamped, as its documentation states:

```scala
/**
 * Equivalent to
 * {@link #convert(long, TimeUnit) MICROSECONDS.convert(duration, this)}.
 * @param duration the duration
 * @return the converted duration,
 * or {@code Long.MIN_VALUE} if conversion would negatively
 * overflow, or {@code Long.MAX_VALUE} if it would positively overflow.
 */
```

This PR changes it to `Math.multiplyExact(epochSeconds, MICROS_PER_SECOND)`.

### Why are the changes needed?

Fix silent data change between 3.x and 2.x:

```
~/Downloads/spark/spark-3.1.0-SNAPSHOT-bin-20200722 bin/spark-sql -S -e "select to_timestamp('300000', 'y');"
+294247-01-10 12:00:54.775807
```

```
kentyaohulk ~/Downloads/spark/spark-2.4.5-bin-hadoop2.7 bin/spark-sql -S -e "select to_timestamp('300000', 'y');"
284550-10-19 15:58:1010.448384
```

### Does this PR introduce _any_ user-facing change?

Yes, we will raise `ArithmeticException` instead of giving the wrong answer if overflow happens.

### How was this patch tested?

add unit test

Closes #29267 from yaooqinn/SPARK-32424-30.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
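The difference between the two conversion APIs can be sketched in plain Java. This is a minimal illustration, not Spark code; the input value is chosen only so that multiplying by 1,000,000 overflows a signed 64-bit long. Note that `Long.MAX_VALUE` microseconds lands around the year +294247, which matches the clamped timestamp shown in the 3.x output above.

```java
import java.util.concurrent.TimeUnit;

public class OverflowDemo {
    private static final long MICROS_PER_SECOND = 1_000_000L;

    public static void main(String[] args) {
        // Any value above Long.MAX_VALUE / 1_000_000 (about 9.22e12 seconds)
        // overflows when converted to microseconds.
        long epochSeconds = 10_000_000_000_000L;

        // TimeUnit silently clamps to Long.MAX_VALUE on positive overflow,
        // producing a wrong-but-plausible timestamp instead of an error.
        long clamped = TimeUnit.SECONDS.toMicros(epochSeconds);
        System.out.println(clamped == Long.MAX_VALUE); // prints true

        // Math.multiplyExact throws instead, surfacing the overflow to the caller.
        try {
            Math.multiplyExact(epochSeconds, MICROS_PER_SECOND);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

This is the behavioral contract the patch relies on: `TimeUnit.convert`-family methods saturate at `Long.MIN_VALUE`/`Long.MAX_VALUE`, while `Math.multiplyExact` raises `ArithmeticException` on any long overflow.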
Test build #126703 has finished for PR 29267 at commit