[SPARK-14398] [SQL] Audit non-reserved keyword list in ANTLR4 parser #12191
bomeng wants to merge 12 commits into apache:master from bomeng:SPARK-14398
|
Test build #55066 has finished for PR 12191 at commit
|
|
Test build #55080 has finished for PR 12191 at commit
|
|
Test build #55091 has finished for PR 12191 at commit
|
|
@bomeng what is the point of adding non-reserved keywords if they are not used in parser rules? The main point of this ticket is that we need to make sure that we do not have regressions compared to the old situation; a non-reserved keyword in the old situation should not be reserved in the new situation. Did you find any of these cases? |
|
If we don't actually use these non-reserved keywords in any rules, I think we don't need to add them to the list. It might cause confusion too. |
|
Sorry for my misunderstanding. I thought we wanted to keep all the keywords that were defined in ANTLR3, so that if we want to use them later, we do not have to add them back case by case. Among the items I added, some of them (e.g. ASC, DESC) need to be in the non-reserved list, since they are used in the parser and were non-reserved before. Should I only focus on those? What is the best way to do it? Please advise. Thanks. |
|
@bomeng No worries. Please focus on the keywords that are reserved in the ANTLR4 parser, but were not in the ANTLR3 parser. The exception being join keywords. We can add the other keywords when we need them. |
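The check described above (keywords reserved in the ANTLR4 parser that were not reserved in the ANTLR3 parser, with join keywords excepted) amounts to a set difference. A minimal sketch, assuming the keyword lists have already been extracted from the two grammars by hand; the function name `newly_reserved` and the abbreviated lists are hypothetical and only for illustration, not the real grammar contents:

```python
def newly_reserved(old_non_reserved, new_keywords, new_non_reserved, ignore=frozenset()):
    """Return keywords that were non-reserved in the old parser but are reserved in the new one."""
    # A keyword is reserved in the new parser if it is defined there
    # but does not appear in the new non-reserved list.
    new_reserved = set(new_keywords) - set(new_non_reserved)
    # Regressions: previously non-reserved, now reserved, and not deliberately excepted.
    return sorted((set(old_non_reserved) & new_reserved) - set(ignore))


# Hypothetical, abbreviated keyword lists (assumptions, not the actual grammars).
antlr3_non_reserved = {"ASC", "DESC", "LIMIT", "RENAME", "SETS", "LEFT"}
antlr4_keywords = {"ASC", "DESC", "LIMIT", "RENAME", "SETS", "LEFT", "SELECT"}
antlr4_non_reserved = {"LIMIT", "RENAME"}
join_keywords = {"LEFT"}  # excepted on purpose, per the comment above

regressions = newly_reserved(
    antlr3_non_reserved, antlr4_keywords, antlr4_non_reserved, ignore=join_keywords
)
print(regressions)
```

Anything this reports is a candidate for the non-reserved list; keywords that were never non-reserved, or that no parser rule uses, do not show up.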
|
Another try... This time, I've scanned all the existing keywords one by one and added the missing non-reserved ones back, so it is a more conservative approach. Later on, if we need to support more syntax, we can add more keywords then. Thanks. |
|
Test build #55247 has finished for PR 12191 at commit
|
    | STATISTICS | ANALYZE | PARTITIONED | EXTERNAL | DEFINED | RECORDWRITER
    | REVOKE | GRANT | LOCK | UNLOCK | MSCK | EXPORT | IMPORT | LOAD | VALUES | COMMENT | ROLE
    | ROLES | COMPACTIONS | PRINCIPALS | TRANSACTIONS | INDEX | INDEXES | LOCKS | OPTION
    | ASC | DESC | LIMIT | METADATA | MINUS | PLUS | RENAME | SETS
PLUS (+) and MINUS (-) are a bit funny, and really shouldn't be used as identifiers. Let's leave them out.
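One way to catch symbolic tokens such as PLUS and MINUS before they reach the non-reserved list is to filter candidates by their literal text. A rough sketch, assuming a hand-built mapping from token name to literal; the mapping and the helper name `word_like_tokens` are hypothetical and not part of the grammar:

```python
import re


def word_like_tokens(token_literals):
    """Keep only the token names whose literal text looks like an identifier."""
    word = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
    return sorted(
        name for name, literal in token_literals.items() if word.fullmatch(literal)
    )


# Hypothetical token-name -> literal mapping (an assumption for illustration).
token_literals = {
    "ASC": "ASC",
    "DESC": "DESC",
    "SETS": "SETS",
    "PLUS": "+",   # operator symbol, not usable as an identifier
    "MINUS": "-",  # operator symbol, not usable as an identifier
}

candidates = word_like_tokens(token_literals)
print(candidates)
```

Only word-like tokens remain as candidates, so operator symbols are dropped without having to be spotted by eye.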
|
@hvanhovell I've made the changes by removing +/-. I'd really like to sort the keywords in the file if you agree; right now I have to search for them one by one, which is tedious. Do you think it is worth opening another JIRA for that? |
|
Test build #55384 has finished for PR 12191 at commit
|
|
Test build #55386 has finished for PR 12191 at commit
|
|
@bomeng Can you update the description too? Thanks. |
|
description is updated. thanks. |
|
Test build #55855 has finished for PR 12191 at commit
|
|
@bomeng sorry for not getting back to you sooner. Is sorting the list only for aesthetics and ease of searching? If so, it does not really seem worth the effort; what do you think? It might have a little merit in terms of performance to group all |
|
Yes, the reason for sorting the keywords is ease of searching. |
|
The compiler should emit a LGTM |
|
Merging to master. Thanks! |
## What changes were proposed in this pull request?

I compared the non-reserved keyword lists in ANTLR3 and ANTLR4 one by one, checked all the existing keywords defined in ANTLR4, and added the missing keywords to the non-reserved keyword list. If we need to support more syntax, we can add more keywords then.

Any recommendation for the above is welcome.

## How was this patch tested?

I manually checked the keywords one by one. Please let me know if there is a better way to test.

Another thought: I suggest putting all the keyword definitions and the non-reserved list in order; that will make them much easier to check in the future.

Author: bomeng <bmeng@us.ibm.com>

Closes apache#12191 from bomeng/SPARK-14398.