[SPARK-17018][SQL] literals.sql for testing literal parsing #14598
-- !query 5
select 32768 S
Shouldn't this throw an exception?
cc @cloud-fan / @rxin / @hvanhovell
Why? This will create Integer literal 32768 aliased as S. select 32768S does throw an exception.
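The difference between the two forms comes down to tokenization. The following is a minimal sketch in Java, assuming a simplified lexer in which digits immediately followed by S form a single smallint token, while a space splits the input into an integer and an identifier. The class, method, and token names are hypothetical, and this is not Spark's actual grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralTokenSketch {
    // Hypothetical token shapes: digits directly followed by S, plain digits, identifier.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<smallint>\\d+[sS])\\b|(?<integer>\\d+)\\b|(?<ident>[A-Za-z_][A-Za-z0-9_]*)");

    // Splits the input into labelled tokens; whitespace is skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group("smallint") != null) {
                tokens.add("SMALLINT(" + m.group() + ")");
            } else if (m.group("integer") != null) {
                tokens.add("INTEGER(" + m.group() + ")");
            } else {
                tokens.add("IDENT(" + m.group() + ")");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("32768S"));   // one suffixed literal
        System.out.println(tokenize("32768 S"));  // integer literal, then an identifier
    }
}
```

Under these assumed rules, 32768S is one smallint token that can then fail a range check, whereas 32768 S is an integer followed by an identifier that a parser can treat as an alias.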
-- !query 13
select 1D, 1 D, 1.2D, 1e10, 1.5e5, .10 D, 0.10 D, .1e5
There is a bug here too.
I would expect .10 D to be parsed as double, not decimal.
D is a double literal (like in Hive). So this checks out.
-- !query 7 schema
struct<>
-- !query 7 output
org.apache.spark.sql.catalyst.parser.ParseException
This exception message can be better. It doesn't actually say out of range.
Hmmmm - this is funny. The exception/message is produced by java.lang.Long.parseLong(...), but that doesn't seem to produce something sensible. I was expecting something similar to the error that java.lang.Short.parseShort(...) produces.
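The two JDK messages can be compared directly. The following is a small sketch that uses only java.lang.Long.parseLong and java.lang.Short.parseShort. The class and helper names are made up for illustration, and the exact wording of the messages may differ between JDK versions.

```java
final class ParseMessageDemo {
    // Runs a parse attempt and returns the NumberFormatException message, or null on success.
    static String failureMessage(Runnable parse) {
        try {
            parse.run();
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    static String longOverflowMessage() {
        // One past Long.MAX_VALUE.
        return failureMessage(() -> Long.parseLong("9223372036854775808"));
    }

    static String shortOverflowMessage() {
        // One past Short.MAX_VALUE.
        return failureMessage(() -> Short.parseShort("32768"));
    }

    public static void main(String[] args) {
        System.out.println("parseLong:  " + longOverflowMessage());
        System.out.println("parseShort: " + shortOverflowMessage());
    }
}
```

On common JDK builds, the parseShort message states that the value is out of range, while the parseLong message only echoes the input string.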
Should we parse integral literals as BigInteger, and then turn them into appropriate types? That way we have more control.
We do that already for 'untyped' (without a suffix) integral literals. I like your suggestion (this means we can also control the exception for Short better). Could you open a PR for this?
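The BigInteger idea can be sketched as follows. This is an illustration under stated assumptions, not the code that ended up in Spark: the class name, the method names, and the error message wording are all hypothetical. The literal is parsed into a BigInteger first, then range-checked against the target type so that the error message is under the parser's control.

```java
import java.math.BigInteger;

final class IntegralLiteralSketch {
    // Parses the digits with arbitrary precision, then checks them against [min, max].
    // A non-numeric string still fails with NumberFormatException from BigInteger.
    static BigInteger parseInRange(String digits, long min, long max, String typeName) {
        BigInteger value = new BigInteger(digits);
        if (value.compareTo(BigInteger.valueOf(min)) < 0
                || value.compareTo(BigInteger.valueOf(max)) > 0) {
            throw new IllegalArgumentException(
                "Numeric literal " + digits + " is out of range for " + typeName
                    + " [" + min + ", " + max + "]");
        }
        return value;
    }

    // Hypothetical handler for a literal with the S suffix (suffix already stripped).
    static short parseSmallInt(String digits) {
        return parseInRange(digits, Short.MIN_VALUE, Short.MAX_VALUE, "smallint")
            .shortValueExact();
    }

    // Hypothetical handler for a literal with the L suffix (suffix already stripped).
    static long parseBigInt(String digits) {
        return parseInRange(digits, Long.MIN_VALUE, Long.MAX_VALUE, "bigint")
            .longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(parseSmallInt("32767"));
        try {
            parseBigInt("9223372036854775808");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the range check happens before any narrowing conversion, every integral type gets the same explicit out-of-range message instead of whatever the JDK parse method happens to report.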
Sure. Will do.
Never mind this one. We have someone working on it.
Test build #63593 has finished for PR 14598 at commit
Test build #63596 has finished for PR 14598 at commit
Test build #63598 has finished for PR 14598 at commit
Test build #63597 has finished for PR 14598 at commit
select 1234567890123456789012345678901234567890.0;

-- super large scientific notation numbers should still be valid doubles
select 123456789012345678901234567890123456789e10, 123456789012345678901234567890123456789.1e10;
Shouldn't we also add a really large double, 1E309 for instance (that will actually evaluate to positive infinity)?
Let me add that.
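For reference, plain Java double parsing behaves the way the comment describes. The following is a minimal sketch using Double.parseDouble. Whether Spark's parser goes through this exact call is an assumption, and the class and method names are hypothetical.

```java
final class DoubleLiteralDemo {
    // Thin wrapper so the behaviour of each literal can be inspected in one place.
    static double parse(String literal) {
        return Double.parseDouble(literal);
    }

    public static void main(String[] args) {
        // Larger than Double.MAX_VALUE (about 1.8E308): overflows instead of failing.
        System.out.println(Double.isInfinite(parse("1E309")));
        // Many digits, but still within the finite double range.
        System.out.println(Double.isFinite(parse("123456789012345678901234567890123456789e10")));
    }
}
```

Both calls succeed without an exception: the first overflows to positive infinity, and the second loses precision but stays finite.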
This is pretty cool :) I am comparing this to the
I have updated this to include more string literals and added timestamp/date/interval parsing. That said, I didn't add all the test cases for interval because there were a large number, and I felt those are best left for parser unit tests.
I also didn't include \b and \0 parsing. Otherwise GitHub shows the result file as binary and refuses to display the diff, which makes it more difficult to review.
-- invalid timestamp
select timestamp '2016-33-11 20:54:00.000';

-- internal
NIT: interval? :)
Test build #63625 has finished for PR 14598 at commit
Test build #63626 has finished for PR 14598 at commit
I'm going to merge this in master/2.0.
## What changes were proposed in this pull request?
This patch adds literals.sql for testing literal parsing end-to-end in SQL.

## How was this patch tested?
The patch itself is only about adding test cases.

Author: petermaxlee <petermaxlee@gmail.com>

Closes #14598 from petermaxlee/SPARK-17018-2.

(cherry picked from commit cf93678)
Signed-off-by: Reynold Xin <rxin@databricks.com>