Lexical Feedback #1158
Can one of the admins verify this patch?
    | simple_variable { $$ = zend_ast_create(ZEND_AST_VAR, $1); }
;

property_name:
Why not handle property names with the same system?
Because property names have sigils and therefore no naming limitations. If you handle them the same way as the other member names, they will be restricted by the inclusive SEMI_RESERVED rule: https://github.com/php/php-src/pull/1158/files#diff-7eff82c2c5b45db512a9dc49fb990bb8R273
I'm sure you don't want that.
Good point, I forgot that method names have a more limited set of allowed keywords :)
BTW, I found a way to uncomment https://github.com/php/php-src/pull/1158/files#diff-7eff82c2c5b45db512a9dc49fb990bb8R280 and make these words semi-reserved too, but that would require people aliasing methods whose names clash with method modifier names to use the verbose syntax. Ex:
trait TraitA {
    private function private(){}
}
class A {
    use TraitA {
        private as protected protected;
    }
}
I'm leaving this out of the patch though, as it has to be compatible with the older one.
identifier:
        T_STRING { $$ = $1; }
    | /* if */ SEMI_RESERVED { REWIND } /* and rematch as */ T_STRING { $$ = $3; }
It should be possible to get the string representation of the keyword without "rewind" and "rematch".
If something like the following works, you won't need to change the lexer at all.
identifier:
T_STRING { $$ = $1; }
| SEMI_RESERVED { zval zv; ZVAL_STRINGL(&zv, LANG_SCNG(yy_text), LANG_SCNG(yy_leng)); $$ = zend_ast_create_zval(&zv); }
;
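To make the idea concrete, here is a minimal sketch in plain C, outside of Bison, of what the second alternative of the rule would do. Everything in it is an assumption made for illustration: the token kinds, the `token` struct (whose `text` and `len` fields stand in for `yy_text` and `yy_leng`), and the helper names are hypothetical and are not php-src code. The keyword is accepted by the parser and its name is copied from the text the lexer already matched, so no rewind is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token kinds; a real parser has many more. */
typedef enum { TOK_STRING, TOK_EMPTY, TOK_LIST, TOK_CLASS } token_kind;

/* A token as a lexer might hand it over: its kind plus the raw matched text. */
typedef struct {
    token_kind kind;
    const char *text; /* stands in for yy_text (not NUL-terminated at len) */
    size_t len;       /* stands in for yy_leng */
} token;

/* Inclusive list of keyword tokens allowed as member names (illustrative). */
bool is_semi_reserved(token_kind kind)
{
    switch (kind) {
    case TOK_EMPTY:
    case TOK_LIST:
        return true;
    default:
        return false; /* e.g. TOK_CLASS stays fully reserved in this sketch */
    }
}

/*
 * Mirrors the two alternatives of an `identifier` rule: a plain string token,
 * or a semi-reserved keyword whose name is copied from the matched text.
 * Returns a malloc'd, NUL-terminated copy (caller frees), or NULL if the
 * token cannot serve as an identifier or allocation fails.
 */
char *identifier_from_token(const token *tok)
{
    if (tok->kind != TOK_STRING && !is_semi_reserved(tok->kind)) {
        return NULL;
    }
    char *name = malloc(tok->len + 1);
    if (name == NULL) {
        return NULL;
    }
    memcpy(name, tok->text, tok->len);
    name[tok->len] = '\0';
    return name;
}
```

In this sketch, making another keyword semi-reserved only means adding its token kind to `is_semi_reserved`; the lexer itself stays untouched.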
Looks like a great idea, because we no longer waste time interacting with the lexer twice. On the other hand, I was doing the ext tokenizer port based on this lexical feedback, and now the lexer is completely unaware of context again.
I'll use this anyway and will try to find another solution on the tokenizer extension side. Thanks ^^
Just a heads up: this PR is still active and soon we'll have updates.
The implementation has no regression risks, has an even smaller footprint than the previous attempt involving a purely lexical approach, and is highly predictable and highly configurable. To make a word semi-reserved you only need to edit the "SEMI_RESERVED" parser rule, which is an inclusive list of all the words that should be matched as T_STRING in specific contexts. Example:

```
method_modifiers function returns_ref identifier '(' parameter_list ')' ...
```

instead of:

```
method_modifiers function returns_ref T_STRING '(' parameter_list ')' ...
```

TODO: port ext tokenizer
The class_name_scalar rule was removed from the grammar: "::class" is now just a reserved case-insensitive class constant. The lexer is now completely unaware of the semi-reserved words context.
…calar back in grammar
That was clearly not the best way to achieve it, so let's keep T_CLASS reserved for now. Turning it into a reserved case-insensitive constant was a bad idea because many static analyzers rely on ::class as a language construct; we can't simply "downgrade" it to a special constant. It's no big deal: reverting this bit is fine and won't have a big impact on the final result. This reverts commit c931644.
FYI, I'll be waiting for the anonymous classes patch to be merged before continuing.
@marcioAlmada will this support methods named |
@jrnickell absolutely. Looks like I forgot
php -r "class X { static function empty(){ return true; } } var_dump(X::empty());"
# bool(true)
php -r "class X { function empty(){ return true; } } var_dump((new X)->empty());"
# bool(true)
php -r "class X { const EMPTY = true; } var_dump(X::EMPTY);"
# bool(true)
Good catch, the tests will be updated soon.
@marcioAlmada awesome! Thanks
I'm also working on this other experiment #1221 as we still have some time.
Closed in favor of #1221.
This is related to the "Context Sensitive Lexer" RFC: https://wiki.php.net/rfc/context_sensitive_lexer. This new W.I.P patch is an alternative implementation of #1054.
About this new implementation:
Tasks
%expect equal to 2