[SPARK-46830][SQL] Fix collation strength for parameter markers in EXECUTE IMMEDIATE #55219
Open
ilicmarkodb wants to merge 1 commit into apache:master
What changes were proposed in this pull request?
Fix parameter marker collation strength in `EXECUTE IMMEDIATE` and parameterized queries so that parameters get implicit collation strength instead of explicit.

- In `ParameterHandler.convertToSql`, collated string parameters (including those nested in arrays, maps, structs) are now serialized as `CAST('value' AS STRING COLLATE X)` instead of `'value' COLLATE X`, giving them implicit collation strength when re-parsed.
- `NullType` children in CAST which gives Default (not Implicit) strength regardless.
- `DataTypeUtils.hasNonDefaultStringCharOrVarcharType` to recursively check if a type contains any explicitly collated STRING/CHAR/VARCHAR.
- `ElementAt` collation context propagation in `CollationTypeCoercion.findCollationContext` to extract collation context from map value type or array element type.
- `ElementAt` key coercion rule in `CollationTypeCoercion` to cast the lookup key to match a collated map's key type (analogous to existing `GetMapValue` rule).

Why are the changes needed?
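The motivation below rests on how collation strengths interact when two differently collated operands meet. As a rough illustration only, here is a toy Python model of that interaction. It is not Spark's resolution code: the `Strength`, `Collated`, and `resolve` names are hypothetical, the column is assumed to carry implicit strength, and only the two cases this description mentions are modeled.

```python
from dataclasses import dataclass
from enum import IntEnum


class Strength(IntEnum):
    # Assumption: a higher value takes precedence over a lower one.
    DEFAULT = 0
    IMPLICIT = 1
    EXPLICIT = 2


@dataclass(frozen=True)
class Collated:
    collation: str
    strength: Strength


class IndeterminateCollation(Exception):
    """Toy stand-in for the INDETERMINATE_COLLATION_IN_EXPRESSION error."""


def resolve(left: Collated, right: Collated) -> Collated:
    """Pick the collation for an expression combining two operands."""
    if left.collation == right.collation:
        return Collated(left.collation, max(left.strength, right.strength))
    if left.strength != right.strength:
        # The stronger side wins outright.
        return left if left.strength > right.strength else right
    if left.strength is Strength.IMPLICIT:
        # Two different implicit collations: no winner can be chosen.
        raise IndeterminateCollation(f"{left.collation} vs {right.collation}")
    raise ValueError("case not modeled in this sketch")


column = Collated("UNICODE", Strength.IMPLICIT)

# Old behavior: the parameter was explicit, so it silently won.
old_param = Collated("UTF8_LCASE", Strength.EXPLICIT)
print(resolve(old_param, column).collation)

# New behavior: the parameter is implicit, so the conflict is reported.
new_param = Collated("UTF8_LCASE", Strength.IMPLICIT)
try:
    resolve(new_param, column)
except IndeterminateCollation as err:
    print("indeterminate:", err)
```

In this model the only difference between the old and new outcome is the strength attached to the parameter, which is the point of the change.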
Previously, parameters in `EXECUTE IMMEDIATE` and parameterized queries had explicit collation strength. This caused incorrect behavior: for example, a parameter with `COLLATE UTF8_LCASE` would win over a column's collation instead of producing an `INDETERMINATE_COLLATION_IN_EXPRESSION` error, which is the correct behavior for implicit-strength collations meeting a different column collation.

Additionally, `element_at()` on maps with collated keys failed with `DATATYPE_MISMATCH.MAP_FUNCTION_DIFF_TYPES` because `CollationTypeCoercion` lacked a coercion rule for `ElementAt` (unlike `GetMapValue`, which already had one).

Does this PR introduce any user-facing change?
Yes. Parameters in `EXECUTE IMMEDIATE` and parameterized queries now have implicit collation strength, matching the behavior of string literals. This means collation conflicts between parameters and columns with different collations will now correctly raise `INDETERMINATE_COLLATION_IN_EXPRESSION` instead of silently using the parameter's collation.

How was this patch tested?
Added 20 new tests in `CollationSuite` covering: `spark.sql()` API) vs columns and with collation strength

Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (claude-opus-4-6)
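
As a closing illustration of the serialization change and the recursive type check described under the proposed changes, here is a small self-contained Python sketch. It is a toy model, not the Scala code in this PR: the type classes, `has_non_default_string`, `explicit_form`, and `cast_form` are hypothetical names, only STRING is modeled (not CHAR/VARCHAR), the default collation is assumed to be `UTF8_BINARY`, and quote escaping is simplified and may not match Spark's literal escaping.

```python
from dataclasses import dataclass
from typing import Tuple, Union

DEFAULT_COLLATION = "UTF8_BINARY"  # assumption for this sketch


@dataclass(frozen=True)
class StringType:
    collation: str = DEFAULT_COLLATION


@dataclass(frozen=True)
class IntegerType:
    pass


@dataclass(frozen=True)
class ArrayType:
    element: "DataType"


@dataclass(frozen=True)
class MapType:
    key: "DataType"
    value: "DataType"


@dataclass(frozen=True)
class StructType:
    fields: Tuple["DataType", ...]


DataType = Union[StringType, IntegerType, ArrayType, MapType, StructType]


def has_non_default_string(dt: DataType) -> bool:
    """Recursively check whether a type contains a non-default-collated string."""
    if isinstance(dt, StringType):
        return dt.collation != DEFAULT_COLLATION
    if isinstance(dt, ArrayType):
        return has_non_default_string(dt.element)
    if isinstance(dt, MapType):
        return has_non_default_string(dt.key) or has_non_default_string(dt.value)
    if isinstance(dt, StructType):
        return any(has_non_default_string(f) for f in dt.fields)
    return False


def quote(value: str) -> str:
    # Simplified escaping: double any single quote.
    return "'" + value.replace("'", "''") + "'"


def explicit_form(value: str, collation: str) -> str:
    # Old serialization: a COLLATE clause, which re-parses with explicit strength.
    return f"{quote(value)} COLLATE {collation}"


def cast_form(value: str, collation: str) -> str:
    # New serialization: a CAST to a collated STRING type.
    return f"CAST({quote(value)} AS STRING COLLATE {collation})"


nested = ArrayType(MapType(StringType(), StringType("UTF8_LCASE")))
print(has_non_default_string(nested))
print(explicit_form("abc", "UTF8_LCASE"))
print(cast_form("abc", "UTF8_LCASE"))
```

The two string forms differ only in syntax here; per the description above, it is the re-parse in Spark that assigns explicit strength to the first and implicit strength to the second.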