[SPARK-33460][SQL] Accessing map values should fail if key is not found #30386

Closed
wants to merge 3 commits into from

leanken-zz
Contributor

What changes were proposed in this pull request?

Instead of returning NULL, throw a runtime NoSuchElementException on invalid key access in map-like functions, such as element_at and GetMapValue, when ANSI mode is on.
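
The intended behavior can be sketched in plain Java. This is a minimal illustration only, not Spark's actual implementation: the class `MapAccessSketch`, the method `elementAt`, and the `ansiEnabled` flag are hypothetical names standing in for the map lookup and the ANSI mode setting.

```java
import java.util.Map;
import java.util.NoSuchElementException;

public class MapAccessSketch {
    // Hypothetical stand-in for a map lookup such as element_at.
    // ansiEnabled models the ANSI mode setting; it is not a Spark API.
    public static <K, V> V elementAt(Map<K, V> map, K key, boolean ansiEnabled) {
        if (!map.containsKey(key)) {
            if (ansiEnabled) {
                // Behavior proposed in this PR: fail instead of returning null.
                throw new NoSuchElementException("Key " + key + " does not exist.");
            }
            // Non-ANSI behavior: a missing key yields null.
            return null;
        }
        // The key exists; the stored value itself may still be null.
        return map.get(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = Map.of("a", 1);
        System.out.println(elementAt(m, "a", true));   // key present
        System.out.println(elementAt(m, "b", false));  // key missing, ANSI off
        try {
            elementAt(m, "b", true);                   // key missing, ANSI on
        } catch (NoSuchElementException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

Note that under this sketch a null result with ANSI mode on can only mean that the key exists and its value is null, which is the distinction the change is meant to make visible.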

Why are the changes needed?

For ANSI mode.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new unit tests and ran the existing unit tests.

…alid key access when ANSI mode is on.

Change-Id: I562d15f30f157bfeac7900b4e46993abeae66570
@leanken-zz
Contributor Author

@cloud-fan FYI.

Change-Id: Ic592678e4452f550b2c645be9cba5cac1c009b82
@SparkQA

SparkQA commented Nov 16, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35752/

@SparkQA

SparkQA commented Nov 16, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35753/

@SparkQA

SparkQA commented Nov 16, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35752/

@SparkQA

SparkQA commented Nov 16, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35753/

null
}
}
else if (values.isNullAt(i)) {
Contributor

nit: merge this into the previous line.

Change-Id: I1551c79b4d404cde0a0d4d898e32e61cc0acfa7b
@SparkQA

SparkQA commented Nov 16, 2020

Test build #131150 has finished for PR 30386 at commit 2649359.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 16, 2020

Test build #131149 has finished for PR 30386 at commit d830d14.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class GetMapValue(

@SparkQA

SparkQA commented Nov 16, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35768/

@SparkQA

SparkQA commented Nov 16, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35768/

@cloud-fan
Contributor

GA passed, merging to master, thanks!

@cloud-fan cloud-fan closed this in b5eca18 Nov 16, 2020
@SparkQA

SparkQA commented Nov 16, 2020

Test build #131165 has finished for PR 30386 at commit 9e29957.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

gengliangwang added a commit that referenced this pull request Aug 15, 2022
… map column

### What changes were proposed in this pull request?

Change the syntax of map column access under ANSI mode: always return null results instead of throwing `MAP_KEY_DOES_NOT_EXIST` errors.
This PR also removes the internal configuration `spark.sql.ansi.strictIndexOperator`.

### Why are the changes needed?

Since #30386, Spark always throws an error on invalid access to a map column. The ANSI SQL standard has no such syntax because it has no Map type. It has a similar type, `multiset`, which returns null when a non-existent element is accessed.
Also, I investigated PostgreSQL, Snowflake, and BigQuery, and all of them return null when a map (JSON) key does not exist.
I suggest loosening the syntax here. When users hit the error, most of them will just use `try_element_at()` to get the same syntax or turn off ANSI SQL mode.

### Does this PR introduce _any_ user-facing change?

Yes, see above

### How was this patch tested?

Unit tests

Closes #37503 from gengliangwang/returnNullOnInvalidMapAccess.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>