[SPARK-47300][SQL] `quoteIfNeeded` should quote identifier starts with digits #45401
Do we already have it?
@yaooqinn
LGTM
@cloud-fan Seems the test failure is related:
[info] - SPARK-34872: quoteIfNeeded should quote a string which contains non-word characters *** FAILED *** (3 milliseconds)
[info] "[`1a`]" did not equal "[1a]" (StringUtilsSuite.scala:136)
[info] Analysis:
[info] "[`1a`]" -> "[1a]"
How about deduplicating the tests (spark/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala, lines 132 to 140 in 5089140) and maybe moving them to QuotingUtilsSuite?
It seems that the same failure (which @MaxGekk reported) still exists in the latest CI run.
+1, LGTM (if CI passes after adjusting the test suite)
@@ -129,16 +129,6 @@ class StringUtilsSuite extends SparkFunSuite with SQLHelper {
}
}

test("SPARK-34872: quoteIfNeeded should quote a string which contains non-word characters") {
it's covered by the new test
Merged to master. Thank you @cloud-fan @dongjoon-hyun @HyukjinKwon @MaxGekk
…h digits

### What changes were proposed in this pull request?

`quoteIfNeeded` is used to generate pretty strings of identifiers in error message, EXPLAIN result, etc. It's mostly for humans to read but sometimes people may copy-paste the identifier string and put it in a SQL. There is a small issue in `quoteIfNeeded`: it does not quote number literals like `0d`, and people will get parse exception if they directly put the identifier string in SQL.

This PR fixes the issue by always quoting the identifier if it does not start with alphabet or underscore. Note, there might be program trying to parse the identifier string, but this change is safe as these programs should already handle quoting and it's ok to quote more.

### Why are the changes needed?

make identifier string parsable is more user-friendly.

### Does this PR introduce _any_ user-facing change?

yes, the identifier string in the error message and EXPLAIN result will be properly quoted.

### How was this patch tested?

new test suite

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#45401 from cloud-fan/quote.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
What changes were proposed in this pull request?
`quoteIfNeeded` is used to generate pretty strings of identifiers in error messages, EXPLAIN results, etc. It's mostly for humans to read, but sometimes people copy-paste the identifier string into a SQL statement. There is a small issue in `quoteIfNeeded`: it does not quote number literals like `0d`, so people get a parse exception if they put the identifier string directly in SQL.

This PR fixes the issue by always quoting the identifier if it does not start with a letter or an underscore. Note that there may be programs that parse the identifier string, but this change is safe because such programs should already handle quoting, and it's OK to quote more.
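The rule described above can be sketched in a few lines of Scala. This is an illustration only, not the code merged in the PR: the object name `QuoteSketch`, the helper `quoteIdentifier`, and the exact regular expression are assumptions, as is the choice to escape an embedded backtick by doubling it.

```scala
import java.util.regex.Pattern

// Hypothetical stand-in for the real utility; names and regex are assumptions.
object QuoteSketch {
  // Assumed rule: a leading letter or underscore, then letters, digits, or underscores.
  private val unquotedIdent = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*")

  // Wrap the name in backticks, doubling any embedded backtick (assumed escaping).
  def quoteIdentifier(name: String): String =
    "`" + name.replace("`", "``") + "`"

  // Return the identifier unchanged only when the whole string matches the pattern.
  def quoteIfNeeded(part: String): String =
    if (unquotedIdent.matcher(part).matches()) part else quoteIdentifier(part)
}
```

Under these assumptions, `QuoteSketch.quoteIfNeeded("1a")` returns the backtick-quoted form, which matches the failure reported in the old `StringUtilsSuite` test (it expected the unquoted `1a`), while `a1` and `_ab` are returned unchanged.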
Why are the changes needed?
Making the identifier string parsable is more user-friendly.
Does this PR introduce any user-facing change?
Yes, identifier strings in error messages and EXPLAIN results will be properly quoted.
How was this patch tested?
new test suite
Was this patch authored or co-authored using generative AI tooling?
no