[SPARK-40066][SQL] ANSI mode: always return null on invalid access to map column #37503
Closed
gengliangwang wants to merge 7 commits into apache:master from gengliangwang:returnNullOnInvalidMapAccess
Conversation
cc @leanken-zz as well
cloud-fan reviewed Aug 15, 2022
```
@@ -2943,15 +2943,6 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)

-  val ANSI_STRICT_INDEX_OPERATOR = buildConf("spark.sql.ansi.strictIndexOperator")
```
cloud-fan: we should put it in removedSQLConfigs
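For context, SQLConf keeps a registry of removed configurations so that setting a dropped key fails with a helpful message. Below is a minimal, self-contained sketch of that pattern; the `RemovedConfig` shape shown here is a local stand-in, and the version string and message text are illustrative assumptions, not the exact entry that was merged:

```scala
// Hypothetical sketch of a removed-config registry: map each dropped key to a
// record explaining when and why it was removed. Version and comment text are
// assumptions for illustration only.
case class RemovedConfig(key: String, version: String, defaultValue: String, comment: String)

val removedSQLConfigs: Map[String, RemovedConfig] = Seq(
  RemovedConfig("spark.sql.ansi.strictIndexOperator", "3.4.0", "true",
    "Under ANSI mode, invalid access to a map column now always returns null; " +
      "use try_element_at() when an explicit null-on-missing-key lookup is needed.")
).map(cfg => cfg.key -> cfg).toMap
```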
gengliangwang: Sure, updated
cloud-fan approved these changes Aug 15, 2022
gengliangwang: @cloud-fan thanks for the review. Merging to master
HyukjinKwon added a commit that referenced this pull request on Aug 17, 2022:
…etting its dataType

### What changes were proposed in this pull request?
This PR is a followup of #37503 that adds a check if the `ElementAt` expression is resolved or not before getting its dataType.

### Why are the changes needed?
To make the tests pass with ANSI enabled. Currently it fails (https://github.com/apache/spark/runs/7870131749?check_suite_focus=true) as below:

```
[info] - map_filter *** FAILED *** (243 milliseconds)
[info]   org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object
[info]   at org.apache.spark.sql.catalyst.expressions.UnresolvedNamedLambdaVariable.dataType(higherOrderFunctions.scala:46)
[info]   at org.apache.spark.sql.catalyst.expressions.ElementAt.initQueryContext(collectionOperations.scala:2275)
[info]   at org.apache.spark.sql.catalyst.expressions.SupportQueryContext.$init$(Expression.scala:603)
[info]   at org.apache.spark.sql.catalyst.expressions.ElementAt.<init>(collectionOperations.scala:2105)
[info]   at org.apache.spark.sql.functions$.element_at(functions.scala:3958)
[info]   at org.apache.spark.sql.DataFrameFunctionsSuite.$anonfun$new$452(DataFrameFunctionsSuite.scala:2476)
[info]   at org.apache.spark.sql.functions$.createLambda(functions.scala:4029)
[info]   at org.apache.spark.sql.functions$.map_filter(functions.scala:4256)
[info]   at org.apache.spark.sql.DataFrameFunctionsSuite.$anonfun$new$451(DataFrameFunctionsSuite.scala:2476)
[info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:133)
[info]   at org.apache.spark.sql.DataFrameFunctionsSuite.$anonfun$new$445(DataFrameFunctionsSuite.scala:2478)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:190)
[info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:204)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:188)
```

### Does this PR introduce _any_ user-facing change?
No, test-only.

### How was this patch tested?
Manually tested with ANSI mode enabled.

Closes #37548 from HyukjinKwon/SPARK-40066.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
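For readers unfamiliar with the failure above: calling `dataType` on an unresolved Catalyst expression throws `UnresolvedException`, so the follow-up guards that call behind a `resolved` check. Below is a self-contained sketch of that guard pattern using simplified stand-in types, not Spark's actual classes:

```scala
// Simplified, hypothetical model of the guard: only ask an expression for its
// dataType once it reports itself as resolved.
sealed trait DataType
case object MapType extends DataType

trait Expr {
  def resolved: Boolean
  def dataType: DataType // throws for unresolved expressions, as in Spark
}

case object UnresolvedLambdaVar extends Expr {
  val resolved = false
  def dataType: DataType =
    throw new IllegalStateException("Invalid call to dataType on unresolved object")
}

// The fix amounts to checking `resolved` before touching `dataType`, so
// constructing an expression over an unresolved child no longer throws.
def initQueryContext(child: Expr): Option[String] =
  if (child.resolved && child.dataType == MapType) Some("query context") else None

// With an unresolved child (e.g. a lambda variable inside map_filter),
// this now returns None instead of throwing.
assert(initQueryContext(UnresolvedLambdaVar).isEmpty)
```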
Pull request description:

### What changes were proposed in this pull request?
Change the behavior of map column access under ANSI mode: always return null instead of throwing `MAP_KEY_DOES_NOT_EXIST` errors. This PR also removes the internal configuration `spark.sql.ansi.strictIndexOperator`.

### Why are the changes needed?
Since #30386, Spark always throws an error on invalid access to a map column. The ANSI SQL standard has no such requirement, since it has no Map type; its closest analogue, `multiset`, returns null when a non-existent element is accessed. I also investigated PostgreSQL/Snowflake/BigQuery, and all of them return null when a map (JSON) key does not exist.

I suggest loosening the behavior here. When users hit the error, most of them will just use `try_element_at()` to get the same null-returning behavior, or simply turn off ANSI SQL mode.

### Does this PR introduce any user-facing change?
Yes, see above. A short sketch of the new behavior follows.
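A minimal sketch of the change, assuming a local SparkSession built with this patch (the session setup, sample data, and column names are illustrative):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{element_at, lit, map}

val spark = SparkSession.builder().master("local[1]").appName("map-access-demo").getOrCreate()
import spark.implicits._

spark.conf.set("spark.sql.ansi.enabled", "true")

// A single-row DataFrame with a map column m = {"a" -> 1}.
val df = Seq(1).toDF("i").select(map(lit("a"), lit(1)).as("m"))

// Before this change (with ANSI mode on), looking up a missing key failed with
// MAP_KEY_DOES_NOT_EXIST; after it, the lookup simply yields NULL.
df.select(element_at($"m", "b")).show()

// try_element_at already returned NULL for a missing key, so both now agree.
spark.sql("SELECT try_element_at(map('a', 1), 'b')").show()
```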
### How was this patch tested?
Unit tests