[SPARK-33460][SQL] Accessing map values should fail if key is not found #30386
…alid key access when ANSI mode is on. Change-Id: I562d15f30f157bfeac7900b4e46993abeae66570
@cloud-fan FYI.
Change-Id: Ic592678e4452f550b2c645be9cba5cac1c009b82
Kubernetes integration test starting
Kubernetes integration test starting
Kubernetes integration test status success
Kubernetes integration test status failure
null
}
}
else if (values.isNullAt(i)) {
nit: merge it to the previous line.
Test build #131150 has finished for PR 30386 at commit
Test build #131149 has finished for PR 30386 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
GA passed, merging to master, thanks!
Test build #131165 has finished for PR 30386 at commit
… map column

### What changes were proposed in this pull request?

Change the syntax of map column access under ANSI mode: always return null results instead of throwing `MAP_KEY_DOES_NOT_EXIST` errors. This PR also removes an internal configuration, `spark.sql.ansi.strictIndexOperator`.

### Why are the changes needed?

Since #30386, Spark always throws an error on invalid access to a map column. There is no such syntax in the ANSI SQL standard, since the standard has no Map type. It has a similar type, `multiset`, which returns null on access to a non-existing element. Also, I investigated PostgreSQL/Snowflake/BigQuery, and all of them return null when a map (JSON) key does not exist. I suggest loosening the syntax here. When users get the error, most of them will just use `try_element_at()` to get the same syntax, or just turn off the ANSI SQL mode.

### Does this PR introduce _any_ user-facing change?

Yes, see above.

### How was this patch tested?

Unit tests.

Closes #37503 from gengliangwang/returnNullOnInvalidMapAccess.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
What changes were proposed in this pull request?
Instead of returning NULL, throw a runtime NoSuchElementException on access to an invalid (missing) key in map-like functions, such as element_at and GetMapValue, when ANSI mode is on.
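The behavior described above can be illustrated with a minimal standalone sketch. This is not Spark's implementation: the class `MapAccessSketch`, the method `elementAt`, and the `ansiEnabled` flag are hypothetical names used only to model the two modes on a plain `java.util.Map`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical model of the lookup semantics described in this PR;
// it does not use or reproduce any Spark code.
class MapAccessSketch {

    // With ansiEnabled, a missing key is reported as an error;
    // without it, the lookup quietly yields null.
    static <K, V> V elementAt(Map<K, V> map, K key, boolean ansiEnabled) {
        if (!map.containsKey(key)) {
            if (ansiEnabled) {
                throw new NoSuchElementException("Key " + key + " does not exist.");
            }
            return null;
        }
        // The key exists; the stored value may itself be null.
        return map.get(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);

        System.out.println(elementAt(m, "a", true));   // key present: value returned in either mode
        System.out.println(elementAt(m, "b", false));  // key missing, ANSI off: null

        try {
            elementAt(m, "b", true);                   // key missing, ANSI on: exception
        } catch (NoSuchElementException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

The sketch separates "key is missing" from "key maps to a null value", which is the distinction the ANSI-mode check relies on: only the former raises an error.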
Why are the changes needed?
For ANSI mode.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added new unit tests and ran the existing unit tests.