-
-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lexer - Solving ambiguity on function keywords #385
Conversation
Codecov ReportBase: 95.64% // Head: 95.66% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #385 +/- ##
============================================
+ Coverage 95.64% 95.66% +0.02%
- Complexity 2144 2156 +12
============================================
Files 67 67
Lines 4548 4569 +21
============================================
+ Hits 4350 4371 +21
Misses 198 198
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
This comment was marked as resolved.
This comment was marked as resolved.
Note: When this gets merged please hard reset your master branch to phpmyadmin/sql-parser#master and use specific branch names for your PRs ;) |
Yes, i've mistakenly used the master branch here, I will make sure to do this after this get merged |
src/Lexer.php
Outdated
/** | ||
* Resolves the ambiguity when dealing with the functions keywords. | ||
* | ||
* In SQL statements, the function keywords might be used as table names or columns names. | ||
* To solve this ambiguity, the solution is to find the next token, excluding whitespaces and | ||
* comments, right after the function keyword position. The function keyword is for sure used | ||
* as column name or table name if the next token found is any of: | ||
* | ||
* - "FROM" (the FROM keyword like in "SELECT Country x, AverageSalary avg FROM..."); | ||
* - "WHERE" (the WHERE keyword like in "DELETE FROM emp x WHERE x.salary = 20"); | ||
* - "SET" (the SET keyword like in "UPDATE Country x, City y set x.Name=x.Name"); | ||
* - "," (a comma separator like 'x,' in "UPDATE Country x, City y set x.Name=x.Name"); | ||
* - "." (a dot separator like in "x.asset_id FROM (SELECT evt.asset_id FROM evt)". | ||
* - "NULL" (when used as a table alias like in "avg.col FROM (SELECT ev.col FROM ev) avg"). | ||
* | ||
* This method will change the flag of the function keyword tokens when any of those | ||
* condition above is true. Otherwise, the | ||
* default flag (function keyword) will be kept. | ||
* | ||
* @return void | ||
*/ | ||
private function solveAmbiguityOnFunctionKeywords() | ||
{ | ||
$iBak = $this->list->idx; | ||
$keywordFunction = Token::TYPE_KEYWORD | Token::FLAG_KEYWORD_FUNCTION; | ||
while (($keywordToken = $this->list->getNextOfTypeAndFlag(Token::TYPE_KEYWORD, $keywordFunction)) !== null) { | ||
$next = $this->list->getNext(); | ||
if ( | ||
($next->type !== Token::TYPE_KEYWORD || ! in_array($next->value, ['FROM', 'SET', 'WHERE'], true)) | ||
&& ($next->type !== Token::TYPE_OPERATOR || ! in_array($next->value, ['.', ','], true)) | ||
&& ($next->value !== null) | ||
) { | ||
continue; | ||
} | ||
|
||
$keywordToken->type = Token::TYPE_NONE; | ||
$keywordToken->flags = Token::TYPE_NONE; | ||
$keywordToken->keyword = $keywordToken->value; | ||
} | ||
|
||
$this->list->idx = $iBak; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope that I'm not missing any other cases. This way the keywords that represent functions names can be used freely as aliases or table names if the lexer detected that they're not being used as functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes
Can you squash your commits on the next push ? |
Signed-off-by: iifawzi <iifawzie@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really nice, I think we should wait for the parser to be released before merging this into it. I consider this one as potentially risky, not sure why.
But anyway we can agree this is not emergency and can wait for the next window to go in ?
Since it affects the lexer, I agree that it's potentially risky. But still, the risks that this might introduce are minimal since we're only changing the token type in very specific cases, and otherwise it's kept as it's. Anyway, this's not an emergency yes, it can for sure wait for the next release. |
Pull-request: #385 Signed-off-by: William Desportes <williamdes@wdes.fr>
Fixing: #327 and #378
Signed-off-by: iifawzi iifawzie@gmail.com