[SPARK-28880][SQL] Support ANSI nested bracketed comments #27495
@gatorsmile Thank you.
Test build #118054 has finished for PR 27495 at commit
I personally think these supported comment syntaxes should be documented in the Spark SQL Guide (as the PostgreSQL/Vertica docs above do). WDYT? cc: @dilipbiswal
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
test("bracketed comment case one") {
  val plan = table("a").select(star())
  assertEqual("/* This is an example of SQL which should not execute:\n" +
Could you follow a format like this?
assertEqual(
"""
|/*
| XXX
| */
|SELECT ...
""".stripMargin,
OK. Thanks.
@@ -1794,7 +1794,7 @@ BRACKETED_EMPTY_COMMENT
     ;

 BRACKETED_COMMENT
-    : '/*' ~[+] .*? '*/' -> channel(HIDDEN)
+    : '/*' ~[+] ( ~'/' | ~'*' '/' ~'*' )*? BRACKETED_COMMENT? ( ~'/' | ~'*' '/' ~'*' )*? '*/' -> channel(HIDDEN)
: '/*' ~[+] (BRACKETED_COMMENT|.)*? '*/' -> channel(HIDDEN) ?
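To make the idea behind the recursive rule concrete, here is a minimal plain-Java sketch of nested bracketed comment matching. It is not the ANTLR lexer and not Spark's code: the class and method names (`NestedCommentSketch`, `skipBracketedComment`, `stripComments`) are hypothetical, the `~[+]` hint exclusion is not modeled, and it ignores string literals. It is also stricter than a non-greedy `(BRACKETED_COMMENT|.)*?` loop, because it always treats an inner `/*` as the start of a nested comment.

```java
final class NestedCommentSketch {
    // Assumes text.startsWith("/*", start). Returns the index just past the
    // matching close marker, or -1 if the comment is never closed.
    public static int skipBracketedComment(String text, int start) {
        int i = start + 2; // skip the opening "/*"
        while (i + 1 < text.length()) {
            if (text.startsWith("*/", i)) {
                return i + 2; // this level is closed
            }
            if (text.startsWith("/*", i)) {
                // Nested comment: recurse, as the rule references itself.
                i = skipBracketedComment(text, i);
                if (i < 0) {
                    return -1;
                }
            } else {
                i++; // any other character belongs to the comment body
            }
        }
        return -1;
    }

    // Removes every (possibly nested) bracketed comment from the input.
    public static String stripComments(String sql) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < sql.length()) {
            if (sql.startsWith("/*", i)) {
                int end = skipBracketedComment(sql, i);
                if (end < 0) {
                    throw new IllegalArgumentException("Unclosed bracketed comment");
                }
                i = end;
            } else {
                out.append(sql.charAt(i++));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripComments("/* outer /* inner */ still outer */SELECT 1"));
    }
}
```

The recursion plays the role of the self-reference in the lexer rule: the outer comment only ends at a close marker that is not consumed by an inner comment.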
I'm sorry, I may have missed something earlier.
I tried this again and found it works well.
Can you remove the jira numbers below?
I want to remove the JIRA numbers in #27481. The golden files can't be generated correctly.
If the golden file is not correct, we should fix this in this PR because
Could I create another ticket and update the golden file with the new ticket ID?
Test build #118063 has finished for PR 27495 at commit
Since that's a minor fix, please include it in this PR.
Test build #118066 has finished for PR 27495 at commit
Test build #118069 has finished for PR 27495 at commit
Test build #118070 has finished for PR 27495 at commit
Test build #118071 has finished for PR 27495 at commit
I have no idea that
Ah, I got it now. The output is not generated correctly because of an issue in the testing logic itself in
@maropu Yes, thank you. I will throw an exception in the golden files until we can fix it.
Test build #118095 has finished for PR 27495 at commit
@cloud-fan This is a very good idea. I learned from it and will give it a try.
@cloud-fan
@cloud-fan @gengliangwang
   * Returns true if the first character is '+'.
   */
  public boolean isHint() {
    int firstChar = _input.LA(1);
_input.LA(1) returns next char, shall we name it nextChar?
OK
@@ -61,6 +61,19 @@ grammar SqlBase;
    * When true, the behavior of keywords follows ANSI SQL standard.
    */
   public boolean SQL_standard_keyword_behavior = false;
+
+  /**
+   * Verify whether current token is a valid hint token (which follows '/*' and is '+').
I'd like to enrich this comment:
This is called when we see '/*' and try to match it as a comment. If the next char is '+', this should
be parsed as hint later and we can't match it as a comment.
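The lookahead described in that comment can be illustrated without ANTLR. The sketch below is an assumption-laden stand-in, not the `isHint()` method from the PR: `HintLookaheadSketch`, `Kind`, and `classify` are hypothetical names, and a plain `String` index replaces the lexer's `_input.LA(1)` call. It only shows the decision: after seeing `/*`, peek at the next character, and if it is `+`, leave the text to be handled as a hint instead of matching a comment.

```java
final class HintLookaheadSketch {
    public enum Kind { HINT, COMMENT, OTHER }

    // Classifies the text starting at `pos` by peeking one character past "/*".
    public static Kind classify(String sql, int pos) {
        if (!sql.startsWith("/*", pos)) {
            return Kind.OTHER; // not a comment or hint opener at all
        }
        int next = pos + 2; // index of the character right after "/*"
        // At end of input there is no '+', so the text is not a hint.
        boolean nextIsPlus = next < sql.length() && sql.charAt(next) == '+';
        return nextIsPlus ? Kind.HINT : Kind.COMMENT;
    }

    public static void main(String[] args) {
        System.out.println(classify("/*+ BROADCAST(t) */ SELECT 1", 0));
        System.out.println(classify("/* plain comment */ SELECT 1", 0));
    }
}
```

Because the check inspects the next character without consuming it, an empty comment such as `/**/` is classified as a comment with no separate rule for that case.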
This is better. Thanks.
@@ -1797,11 +1810,11 @@ SIMPLE_COMMENT
     ;

 BRACKETED_EMPTY_COMMENT
do we still need it?
(BRACKETED_COMMENT|.)*? should work for empty comment
No need.
Test build #118785 has finished for PR 27495 at commit
  /**
   * This method will be called when we see '/*' and try to match it as a bracketed comment.
   * If the next character is '+', it should be parsed as hint later, otherwise we cannot match
otherwise -> and
OK.
   * If the next character is '+', it should be parsed as hint later, otherwise we cannot match
   * it as a bracketed comment.
   *
   * Returns true if the first character is '+'.
next character
My fault.
Test build #118788 has finished for PR 27495 at commit
retest this please
Test build #118789 has finished for PR 27495 at commit
retest this please
Test build #118790 has finished for PR 27495 at commit
retest this please
Test build #118786 has finished for PR 27495 at commit
Test build #118798 has finished for PR 27495 at commit
LGTM, thanks for the work!
Merging to master.
@cloud-fan @maropu @gengliangwang @dongjoon-hyun @gatorsmile Thanks for all your help.
### What changes were proposed in this pull request?
Although Spark SQL supports bracketed comments, `SQLQueryTestSuite` does not handle them well, so the generated golden files do not display bracketed comments correctly. This PR improves the handling of bracketed comments and adds three test cases to `PlanParserSuite`.
Spark SQL does not support nested bracketed comments; apache#27495 is intended to add that support.
### Why are the changes needed?
The golden files are not displayed correctly.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
New UT.
Closes apache#27481 from beliefer/ansi-brancket-comments.
Authored-by: beliefer <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
Spark SQL currently supports single-line comments and bracketed comments. This PR adds support for nested bracketed comments.
Several mainstream databases support this syntax.
**PostgreSQL:**
https://www.postgresql.org/docs/11/sql-syntax-lexical.html#SQL-SYNTAX-COMMENTS
**Vertica:**
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Expressions/Comments.htm?zoom_highlight=comments
Note: Spark SQL had no unit tests for single-line comments or bracketed comments, so this PR adds some.
### Why are the changes needed?
Nested bracketed comments are part of the ANSI standard.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
New UT
Closes apache#27495 from beliefer/nested-brancket-comments.
Authored-by: beliefer <beliefer@163.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>