[SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql #29982
If a comment
ok to test
Test build #129585 has finished for PR 29982 at commit
Kubernetes integration test starting
Kubernetes integration test status success
@maropu Thanks for your comments, I have modified the title.
Kubernetes integration test starting
Test build #129618 has finished for PR 29982 at commit
Test build #129616 has finished for PR 29982 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test status success
Test build #129621 has finished for PR 29982 at commit
Kubernetes integration test starting
Kubernetes integration test status success
...e-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
...e-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
Thanks for fixing this, @turboFei. Looks fine cc: @HyukjinKwon @yaooqinn @wangyum
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
Test build #133631 has finished for PR 29982 at commit
Kubernetes integration test starting
Kubernetes integration test status success
…park-sql

### What changes were proposed in this pull request?

Currently, spark-sql cannot parse SQL statements that contain bracketed comments. For example, the SQL statements:

```
/* SELECT 'test'; */ SELECT 'test';
```

are split into two statements.

The first one: `/* SELECT 'test'`
The second one: `*/ SELECT 'test'`

spark-sql then throws an exception because the first one is illegal.

In this PR, we ignore the content of bracketed comments while splitting SQL statements. In addition, we ignore comments without any content.

### Why are the changes needed?

spark-sql might split statements inside bracketed comments, which is incorrect.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added UT.

Closes #29982 from turboFei/SPARK-33110.

Lead-authored-by: fwang12 <fwang12@ebay.com>
Co-authored-by: turbofei <fwang12@ebay.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit a071826)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Many thanks, @turboFei and @yaooqinn! Merged to master/3.1. FYI: @dongjoon-hyun @HyukjinKwon
@turboFei Could you open a PR to fix it for branch-3.0/2.4?
Sure.
@turboFei I found some GA flakiness caused by this commit, e.g., Could you check/fix it? FYI: @dongjoon-hyun @HyukjinKwon
Oh, I will fix it today. Should we ignore the comments during two for example,
should be transferred as Or I just remove the test case like that?
What's the root cause of the flakiness? It depends on the cause, I think.
CC @bogdanghit
There is a bug in the statementBegin method.
Created #31054 to fix this issue.
… in spark-sql

### What changes were proposed in this pull request?

Currently, spark-sql cannot parse SQL statements that contain bracketed comments. For example, the SQL statements:

```
/* SELECT 'test'; */ SELECT 'test';
```

are split into two statements.

The first one: `/* SELECT 'test'`
The second one: `*/ SELECT 'test'`

spark-sql then throws an exception because the first one is illegal.

In this PR, we ignore the content of bracketed comments while splitting SQL statements. In addition, we ignore comments without any content.

NOTE: This backport comes from #29982

### Why are the changes needed?

spark-sql might split statements inside bracketed comments, which is incorrect.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added UT.

Closes #31033 from turboFei/SPARK-33100.

Authored-by: fwang12 <fwang12@ebay.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
…kend

### What changes were proposed in this pull request?

In the current spark-sql CLI, if the last SQL statement is an unclosed comment, it is not passed to the backend engine and is silently ignored. As a result, when a user writes SQL with a malformed comment, it is ignored and no exception is thrown. For example:

```
spark-sql> /* This is a comment without end symbol SELECT 1;
spark-sql>
```

After this PR:

```
spark-sql> /* This is a comment without end symbol SELECT 1;
Error in query:
Unclosed bracketed comment(line 1, pos 0)
== SQL ==
/* This is a comment without end symbol SELECT 1;
^^^
```

This behavior was introduced in SPARK-33100 by #29982.

Related Hive code: https://github.com/apache/hive/blob/1090c93b1a02d480bdee2af2cecf503f8a54efc6/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java#L488-L490

### Why are the changes needed?

Precise exceptions are thrown for malformed statements, which makes troubleshooting easier for users.

### Does this PR introduce _any_ user-facing change?

Yes, when a user writes a malformed comment at the end of a SQL statement, SQL file, or query. Previously it was ignored because it was not a statement. Now it is passed to the backend engine, and if the statement is incorrect, a SQL exception is thrown.

### How was this patch tested?

Added UT and tested manually.

```
spark-sql> /* SELECT /*+ HINT() 4; */;
Error in query:
mismatched input ';' expecting {'(', 'ADD', 'ALTER', 'ANALYZE', 'CACHE', 'CLEAR', 'COMMENT', 'COMMIT', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DFS', 'DROP', 'EXPLAIN', 'EXPORT', 'FROM', 'GRANT', 'IMPORT', 'INSERT', 'LIST', 'LOAD', 'LOCK', 'MAP', 'MERGE', 'MSCK', 'REDUCE', 'REFRESH', 'REPLACE', 'RESET', 'REVOKE', 'ROLLBACK', 'SELECT', 'SET', 'SHOW', 'START', 'TABLE', 'TRUNCATE', 'UNCACHE', 'UNLOCK', 'UPDATE', 'USE', 'VALUES', 'WITH'}(line 1, pos 26)
== SQL ==
/* SELECT /*+ HINT() 4; */;
--------------------------^^^
spark-sql> /* SELECT /*+ HINT() 4; */
         > SELECT 1;
1
Time taken: 3.16 seconds, Fetched 1 row(s)
spark-sql> /* SELECT /*+ HINT() */ 4; */;
spark-sql>
         > ;
spark-sql>
         > /* SELECT /*+ HINT() 4\\;
         > SELECT 1;
Error in query:
Unclosed bracketed comment(line 1, pos 0)
== SQL ==
/* SELECT /*+ HINT() 4\\;
^^^
SELECT 1;
spark-sql>
```

Closes #34815 from AngersZhuuuu/SPARK-37555.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
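The unclosed-comment check described in that commit message can be illustrated with a small standalone sketch. This is Python for illustration only, not the Scala code in SparkSQLCLIDriver or the Spark parser. The function name `has_unclosed_bracketed_comment` is hypothetical, and the sketch assumes that comments do not nest and that quotes are escaped with a backslash, so it will not necessarily agree with spark-sql on the nested or hint-style inputs in the session above.

```python
def has_unclosed_bracketed_comment(sql: str) -> bool:
    """Return True if `sql` ends while still inside a /* ... */ comment.

    Hypothetical helper: ignores nesting and treats a comment opener
    inside a quoted string as ordinary text.
    """
    in_comment = False
    quote = None  # the active quote character, or None outside a string
    i = 0
    while i < len(sql):
        ch = sql[i]
        two = sql[i:i + 2]
        if in_comment:
            if two == "*/":
                in_comment = False
                i += 2
                continue
        elif quote is not None:
            if ch == "\\":
                i += 2  # skip the escaped character
                continue
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif two == "/*":
            in_comment = True
            i += 2
            continue
        i += 1
    return in_comment


if __name__ == "__main__":
    # The unclosed comment from the example above is flagged.
    print(has_unclosed_bracketed_comment(
        "/* This is a comment without end symbol SELECT 1;"))
    # A properly closed comment is not.
    print(has_unclosed_bracketed_comment("/* closed */ SELECT 1;"))
```

A CLI could use a check like this to decide whether to forward the trailing text to the engine (so that the engine reports the error) instead of silently dropping it.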
What changes were proposed in this pull request?
Currently, spark-sql cannot parse SQL statements that contain bracketed comments.
For example, the SQL statements:
/* SELECT 'test'; */ SELECT 'test';
are split into two statements.
The first one:
/* SELECT 'test'
The second one:
*/ SELECT 'test'
spark-sql then throws an exception because the first one is illegal.
In this PR, we ignore the content of bracketed comments while splitting SQL statements.
In addition, we ignore comments without any content.
Why are the changes needed?
spark-sql might split statements inside bracketed comments, which is incorrect.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added UT.
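The splitting behavior described above can be sketched as follows. This is a minimal Python illustration, not the Scala implementation changed in this PR. The names `split_statements` and `is_comment_only` are hypothetical, and the sketch assumes non-nested comments and backslash escapes inside quotes. It also reads "ignore the comment without any content" as "drop a statement that consists only of a bracketed comment", which is an interpretation and not something the description states.

```python
from typing import List


def is_comment_only(stmt: str) -> bool:
    """True if `stmt` is exactly one /* ... */ comment (hypothetical helper)."""
    return (
        len(stmt) >= 4
        and stmt.startswith("/*")
        and stmt.find("*/", 2) == len(stmt) - 2
    )


def split_statements(text: str) -> List[str]:
    """Split on ';' but not inside quotes or bracketed comments."""
    statements: List[str] = []
    current: List[str] = []
    quote = None        # active quote character, or None
    in_comment = False  # True while inside /* ... */
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if in_comment:
            current.append(ch)
            if ch == "*" and nxt == "/":
                current.append(nxt)
                in_comment = False
                i += 1
        elif quote is not None:
            current.append(ch)
            if ch == "\\" and nxt:
                current.append(nxt)  # keep the escaped character as-is
                i += 1
            elif ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == "/" and nxt == "*":
            in_comment = True
            current.append(ch)
            current.append(nxt)
            i += 1
        elif ch == ";":
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    statements.append("".join(current).strip())
    # Drop empty fragments and comment-only statements.
    return [s for s in statements if s and not is_comment_only(s)]


if __name__ == "__main__":
    sql = "/* SELECT 'test'; */ SELECT 'test';"
    print(sql.split(";"))         # naive split breaks inside the comment
    print(split_statements(sql))  # comment-aware split keeps one statement
```

With the example from the description, a naive split on `;` yields the two illegal fragments shown above, while the comment-aware split returns a single statement that still contains the comment.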