Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17018][SQL] literals.sql for testing literal parsing #14598

Closed
wants to merge 9 commits into from

Conversation

petermaxlee
Copy link
Contributor

What changes were proposed in this pull request?

This patch adds literals.sql for testing literal parsing end-to-end in SQL.

How was this patch tested?

The patch itself is only about adding test cases.



-- !query 5
select 32768 S
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this throw an exception?

cc @cloud-fan / @rxin / @hvanhovell

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? This will create Integer literal 32768 aliased as S. select 32768S does throw an exception.



-- !query 13
select 1D, 1 D, 1.2D, 1e10, 1.5e5, .10 D, 0.10 D, .1e5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a bug here too.

I would expect .10 D to be parsed as double, not decimal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D is a double literal (like in Hive). So this checks out.

-- !query 7 schema
struct<>
-- !query 7 output
org.apache.spark.sql.catalyst.parser.ParseException
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception message can be better. It doesn't actually say out of range.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm - this is funny. The exception/message is produced by java.lang.Long.parseLong(...), but that doesn't seem to produce something sensible. I was expecting the something similar to the error java.lang.Short.parseShort(...) produces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we parse integral literals as BigInteger, and then turn them into appropriate types? That way we have more control.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do that already for 'untyped' (without a suffix) integral literals. I like your suggestion (this means we can also control the exception for Short better), could you open an PR for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind this one. We have someone working on it.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63593 has finished for PR 14598 at commit 15d2eec.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63596 has finished for PR 14598 at commit c40c957.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63598 has finished for PR 14598 at commit d60b2bb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63597 has finished for PR 14598 at commit 243cd39.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

select 1234567890123456789012345678901234567890.0;

-- super large scientific notation numbers should still be valid doubles
select 123456789012345678901234567890123456789e10, 123456789012345678901234567890123456789.1e10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also add really large double, 1E309 for instance (that will actually evaluate to positive infinity).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me add that.

@hvanhovell
Copy link
Contributor

This is pretty cool :)

I am comparing this to the ExpressionParserSuite. Shouldn't we add support for more complex string cases, intervals and a type constructors, see: https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala#L331-L498. If we do we can move these tests out of the ExpressionParserSuite.

@petermaxlee
Copy link
Contributor Author

I have updated this to include more string literals and added timestmap/date/interval parsing. That said, I didn't add all the test cases for interval because there were a large number, and I felt those are best left for parser unit tests.

@petermaxlee
Copy link
Contributor Author

I also didn't include \b and \0 parsing. Otherwise github shows the result file as binary and refuse to display the diff, which makes it more difficult to review.

-- invalid timestamp
select timestamp '2016-33-11 20:54:00.000';

-- internal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: interval? :)

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63625 has finished for PR 14598 at commit 00bc63a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 11, 2016

Test build #63626 has finished for PR 14598 at commit 5565144.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Aug 11, 2016

I'm going to merge this in master/2.0.

@asfgit asfgit closed this in cf93678 Aug 11, 2016
asfgit pushed a commit that referenced this pull request Aug 11, 2016
## What changes were proposed in this pull request?
This patch adds literals.sql for testing literal parsing end-to-end in SQL.

## How was this patch tested?
The patch itself is only about adding test cases.

Author: petermaxlee <petermaxlee@gmail.com>

Closes #14598 from petermaxlee/SPARK-17018-2.

(cherry picked from commit cf93678)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants