Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-32785][SQL] Interval with dangling parts should not results null #29635

Closed
wants to merge 5 commits into from
Closed

[SPARK-32785][SQL] Interval with dangling parts should not results null #29635

wants to merge 5 commits into from

Conversation

yaooqinn
Copy link
Member

@yaooqinn yaooqinn commented Sep 3, 2020

What changes were proposed in this pull request?

bugfix for incomplete interval values, e.g. interval '1', interval '1 day 2', currently these cases will result null, but actually we should fail them with IllegalArgumentsException

Why are the changes needed?

correctness

Does this PR introduce any user-facing change?

yes, incomplete intervals will throw exception now

before

bin/spark-sql -S -e "select interval '1', interval '+', interval '1 day -'"

NULL NULL NULL

after

-- !query
select interval '1'
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Cannot parse the INTERVAL value: 1(line 1, pos 7)

== SQL ==
select interval '1'

How was this patch tested?

unit tests added

@yaooqinn
Copy link
Member Author

yaooqinn commented Sep 3, 2020

cc @cloud-fan @MaxGekk @maropu @HyukjinKwon thanks for reviewing

@SparkQA
Copy link

SparkQA commented Sep 3, 2020

Test build #128248 has finished for PR 29635 at commit 6d27597.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -77,6 +77,19 @@ class IntervalUtilsSuite extends SparkFunSuite with SQLHelper {
}
}

test("string to interval: interval with dangling parts should not results null") {
checkFromInvalidString("+", "dangling interval value sign '+'")
checkFromInvalidString("-", "dangling interval value sign '-'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about Expect a number after '-' but hit EOL

checkFromInvalidString("-", "dangling interval value sign '-'")
checkFromInvalidString("+ 2", "dangling interval value '2'")
checkFromInvalidString("- 1", "dangling interval value '1'")
checkFromInvalidString("1", "dangling interval value '1'")
Copy link
Contributor

@cloud-fan cloud-fan Sep 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about Expect a unit name after '1' but hit EOL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, but looks a bit weird for interval '-.' to expect a unit name

@SparkQA
Copy link

SparkQA commented Sep 3, 2020

Test build #128254 has finished for PR 29635 at commit eb99bd5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 3, 2020

Test build #128261 has finished for PR 29635 at commit ae6d9be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

can we add a migration guide?

@yaooqinn
Copy link
Member Author

yaooqinn commented Sep 4, 2020

OK, are you targeting this to 3.0 or not?

@cloud-fan
Copy link
Contributor

yea we should target 3.0. Let's add migration guide for 3.1 in this PR, and do the same for 3.0.2 in the backport PR.

@SparkQA
Copy link

SparkQA commented Sep 4, 2020

Test build #128293 has finished for PR 29635 at commit daed4e6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in de44e9c Sep 7, 2020
cloud-fan pushed a commit that referenced this pull request Sep 8, 2020
…t null

THIS PR brings #29635 to branch-3.0 and targets v3.0.2

### What changes were proposed in this pull request?

bugfix for incomplete interval values, e.g. interval '1', interval '1 day 2', currently these cases will result null, but actually we should fail them with IllegalArgumentsException

### Why are the changes needed?

correctness

### Does this PR introduce _any_ user-facing change?

yes, incomplete intervals will throw exception now

#### before
```
bin/spark-sql -S -e "select interval '1', interval '+', interval '1 day -'"

NULL NULL NULL
```
#### after

```
-- !query
select interval '1'
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Cannot parse the INTERVAL value: 1(line 1, pos 7)

== SQL ==
select interval '1'
```

### How was this patch tested?

unit tests added

Closes #29658 from yaooqinn/SPARK-32785-30.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@@ -40,6 +40,8 @@ license: |

- In Spark 3.1, `path` option cannot coexist when the following methods are called with path parameter(s): `DataFrameReader.load()`, `DataFrameWriter.save()`, `DataStreamReader.load()`, or `DataStreamWriter.start()`. In addition, `paths` option cannot coexist for `DataFrameReader.load()`. For example, `spark.read.format("csv").option("path", "/tmp").load("/tmp2")` or `spark.read.option("path", "/tmp").csv("/tmp2")` will throw `org.apache.spark.sql.AnalysisException`. In Spark version 3.0 and below, `path` option is overwritten if one path parameter is passed to above methods; `path` option is added to the overall paths if multiple path parameters are passed to `DataFrameReader.load()`. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.pathOptionBehavior.enabled` to `true`.

- In Spark 3.1, incomplete interval literals, e.g. `INTERVAL '1'`, `INTERVAL '1 DAY 2'` will fail with IllegalArgumentException. In Spark 3.0, they result `NULL`s.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Spark 3.1, incomplete interval literals, e.g. INTERVAL '1', INTERVAL '1 DAY 2' will fail with IllegalArgumentException. In Spark 3.0, they result NULLs.

=>

In Spark 3.1, IllegalArgumentException is returned for the incomplete interval literals, e.g. INTERVAL '1', INTERVAL '1 DAY 2', which are invalid. In Spark 3.0, these literals result in NULLs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you fix it in followup, @yaooqinn ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will raise a PR today~

maropu pushed a commit that referenced this pull request Oct 21, 2020
…lete interval literals

### What changes were proposed in this pull request?

Address comments  #29635 (comment) to improve migration guide

### Why are the changes needed?

improve migration guide

### Does this PR introduce _any_ user-facing change?

NO,only doc update

### How was this patch tested?

passing GitHub action

Closes #30113 from yaooqinn/SPARK-32785-F.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
maropu pushed a commit that referenced this pull request Oct 21, 2020
…complete interval literals

### What changes were proposed in this pull request?

Address comments  #29635 (comment) to improve migration guide

### Why are the changes needed?

improve migration guide

### Does this PR introduce _any_ user-facing change?

NO,only doc update

### How was this patch tested?

passing GitHub action

Closes #30117 from yaooqinn/SPARK-32785-F30.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
holdenk pushed a commit to holdenk/spark that referenced this pull request Oct 27, 2020
…t null

THIS PR brings apache#29635 to branch-3.0 and targets v3.0.2

### What changes were proposed in this pull request?

bugfix for incomplete interval values, e.g. interval '1', interval '1 day 2', currently these cases will result null, but actually we should fail them with IllegalArgumentsException

### Why are the changes needed?

correctness

### Does this PR introduce _any_ user-facing change?

yes, incomplete intervals will throw exception now

#### before
```
bin/spark-sql -S -e "select interval '1', interval '+', interval '1 day -'"

NULL NULL NULL
```
#### after

```
-- !query
select interval '1'
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

Cannot parse the INTERVAL value: 1(line 1, pos 7)

== SQL ==
select interval '1'
```

### How was this patch tested?

unit tests added

Closes apache#29658 from yaooqinn/SPARK-32785-30.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
holdenk pushed a commit to holdenk/spark that referenced this pull request Oct 27, 2020
…complete interval literals

### What changes were proposed in this pull request?

Address comments  apache#29635 (comment) to improve migration guide

### Why are the changes needed?

improve migration guide

### Does this PR introduce _any_ user-facing change?

NO,only doc update

### How was this patch tested?

passing GitHub action

Closes apache#30117 from yaooqinn/SPARK-32785-F30.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants