[SPARK-32840][SQL][3.0] Invalid interval value can happen to be just adhesive with the unit

This PR backports apache#29708 to 3.0.

### What changes were proposed in this pull request?
In this PR, we add a check on STRING-form interval values before parsing multi-unit intervals, and fail directly if the value contains letters. This prevents correctness issues such as `interval '1 day 2' day` evaluating to `3 days`.
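
A minimal standalone sketch of the check, for illustration only. `IntervalValueCheck` and `checkValue` are hypothetical names, and `IllegalArgumentException` stands in for the `ParseException` thrown by the patch, which needs the parser context and is not available outside the parser.

```scala
import scala.util.Try

object IntervalValueCheck {
  // Hypothetical helper mirroring the condition added in this patch:
  // reject the quoted value if it contains any letter.
  // Assumption: IllegalArgumentException replaces ParseException here.
  def checkValue(value: String): String = {
    if (value.exists(Character.isLetter)) {
      throw new IllegalArgumentException(
        "Can only use numbers in the interval value part for" +
          s" multiple unit value pairs interval form, but got invalid value: $value")
    }
    value
  }

  def main(args: Array[String]): Unit = {
    // The three values used by the new test queries in this patch.
    val inputs = Seq("1 day 2", "interval 1", "-\t 1")
    inputs.foreach { in =>
      val outcome = if (Try(checkValue(in)).isSuccess) "accepted" else "rejected"
      println(s"'$in' -> $outcome")
    }
  }
}
```

The first two values contain letters and are rejected. The third contains only a sign, whitespace, and a digit, so it passes the check and is left to the regular interval parser.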

### Why are the changes needed?

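To fix a correctness issue: an invalid interval value can end up adjacent to the trailing unit and be parsed as a valid multi-unit interval, silently producing a wrong result.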
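
The sketch below shows how the problem can arise. It assumes the quoted value and the trailing unit are joined with a single space before parsing, which matches the `'1 day 2 hour'` example in the patch's code comment. `ConcatenationSketch` and `joinValueAndUnit` are hypothetical names, not parser APIs.

```scala
object ConcatenationSketch {
  // Hypothetical stand-in for the step that pairs a value with its unit.
  // Assumption: the two parts are joined with one space.
  def joinValueAndUnit(value: String, unit: String): String = s"$value $unit"

  def main(args: Array[String]): Unit = {
    // INTERVAL '1 day 2' day: the value '1 day 2' is invalid on its own,
    // but joining it with the unit yields a well-formed multi-unit string.
    println(joinValueAndUnit("1 day 2", "day")) // prints: 1 day 2 day
  }
}
```

Read as a multi-unit interval string, `1 day 2 day` is well formed, which would explain the `3 days` result reported for Spark 3.0.0.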

### Does this PR introduce _any_ user-facing change?

Yes. In Spark 3.0.0, `interval '1 day 2' day` evaluates to `3 days`; with this change it fails with a `ParseException`.

### How was this patch tested?

Added three test queries to `interval.sql`.

Closes apache#29716 from yaooqinn/SPARK-32840-30.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
yaooqinn authored and dongjoon-hyun committed Sep 16, 2020
1 parent 29ea6b4 commit 78dc478
Showing 4 changed files with 88 additions and 3 deletions.
@@ -2107,7 +2107,16 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
         val kvs = units.indices.map { i =>
           val u = units(i).getText
           val v = if (values(i).STRING() != null) {
-            string(values(i).STRING())
+            val value = string(values(i).STRING())
+            // SPARK-32840: For invalid cases, e.g. INTERVAL '1 day 2' hour,
+            // INTERVAL 'interval 1' day, we need to check ahead before they are concatenated with
+            // units and become valid ones, e.g. '1 day 2 hour'.
+            // Ideally, we only ensure the value parts don't contain any units here.
+            if (value.exists(Character.isLetter)) {
+              throw new ParseException("Can only use numbers in the interval value part for" +
+                s" multiple unit value pairs interval form, but got invalid value: $value", ctx)
+            }
+            value
           } else {
             values(i).getText
           }
4 changes: 4 additions & 0 deletions sql/core/src/test/resources/sql-tests/inputs/interval.sql
@@ -188,3 +188,7 @@ select interval '1.2';
select interval '- 2';
select interval '1 day -';
select interval '1 day 1';

select interval '1 day 2' day;
select interval 'interval 1' day;
select interval '-\t 1' day;
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 100
+-- Number of queries: 103


-- !query
@@ -1054,3 +1054,39 @@ Cannot parse the INTERVAL value: 1 day 1(line 1, pos 7)
== SQL ==
select interval '1 day 1'
-------^^^


-- !query
select interval '1 day 2' day
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Can only use numbers in the interval value part for multiple unit value pairs interval form, but got invalid value: 1 day 2(line 1, pos 16)

== SQL ==
select interval '1 day 2' day
----------------^^^


-- !query
select interval 'interval 1' day
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Can only use numbers in the interval value part for multiple unit value pairs interval form, but got invalid value: interval 1(line 1, pos 16)

== SQL ==
select interval 'interval 1' day
----------------^^^


-- !query
select interval '-\t 1' day
-- !query schema
struct<INTERVAL '-1 days':interval>
-- !query output
-1 days
38 changes: 37 additions & 1 deletion sql/core/src/test/resources/sql-tests/results/interval.sql.out
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 100
+-- Number of queries: 103


-- !query
@@ -1026,3 +1026,39 @@ Cannot parse the INTERVAL value: 1 day 1(line 1, pos 7)
== SQL ==
select interval '1 day 1'
-------^^^


-- !query
select interval '1 day 2' day
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Can only use numbers in the interval value part for multiple unit value pairs interval form, but got invalid value: 1 day 2(line 1, pos 16)

== SQL ==
select interval '1 day 2' day
----------------^^^


-- !query
select interval 'interval 1' day
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Can only use numbers in the interval value part for multiple unit value pairs interval form, but got invalid value: interval 1(line 1, pos 16)

== SQL ==
select interval 'interval 1' day
----------------^^^


-- !query
select interval '-\t 1' day
-- !query schema
struct<INTERVAL '-1 days':interval>
-- !query output
-1 days