Lexical Feedback - B #1221
7ec2cc4 to 33d844f
This approach ended up being much simpler than #1158, with the advantage of a near-zero maintenance cost, so it's worth moving forward with it.
de191c7 to a7a7cbd
I intend to open a mailing list thread to evaluate this patch next week, so we can analyze if it's good enough to fulfill the RFC without problems. Thanks.
3c255b6 to d908055
Hi Marcio, thanks for the awesome work! Just a little suggestion: once the patch is considered ready to be merged, I'd squash all commits into a single one to have a clearer git history in master.
c9518f2 to 3c17b00
I posted this on the mailing list and apparently there are no objections there. Does this mean the PR can be merged, or should we wait longer?
451fdc8 to 95d517b
The implementation has no regression risks, has an even smaller footprint than the previous attempt involving a pure lexical approach, and is highly predictable and highly configurable. To make a word semi-reserved you only need to edit the "SEMI_RESERVED" parser rule, which is an inclusive list of all the words that should be matched as T_STRING in specific contexts. Example:
```
method_modifiers function returns_ref identifier '(' parameter_list ')' ...
```
instead of:
```
method_modifiers function returns_ref T_STRING '(' parameter_list ')' ...
```
TODO: port ext tokenizer
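As an illustration of what the `identifier` rule is meant to allow, here is a minimal sketch of user code that names class members after reserved words. The `Collection` class and its members are hypothetical and only serve the example; it assumes a PHP version that includes the context sensitive lexer (PHP 7.0 or later).

```php
<?php
// Hypothetical class, for illustration only.
class Collection
{
    // `foreach` is a reserved word, used here as a class constant name.
    const FOREACH = 'iteration';

    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // `list` is a reserved word, used here as a method name.
    public function list()
    {
        return $this->items;
    }

    // `new` is a reserved word, used here as a static method name.
    public static function new(array $items)
    {
        return new self($items);
    }
}

$collection = Collection::new([1, 2, 3]);
$listed = $collection->list();
$label = Collection::FOREACH;
```

The reserved words keep their usual meaning everywhere else; only the member-name positions accept them.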
We basically added a mechanism to store the token stream during parsing and exposed the entire parser stack on the tokenizer extension through an opt-in flag: `token_get_all($src, TOKEN_PARSE)`. This change allows easy future language enhancements regarding context-aware parsing and scanning without further maintenance on the tokenizer extension, while also resolving known inconsistencies that the "parseless" tokenizer extension has when it handles `__halt_compiler()`.
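A small sketch of how the opt-in flag could be observed from userland, assuming PHP 7.1 or later with the tokenizer extension enabled. The helper `tokenNameFor` is hypothetical and exists only for this example.

```php
<?php
$source = '<?php class Collection { public function list() {} }';

// Hypothetical helper: name of the first token whose text equals $word.
function tokenNameFor(array $tokens, string $word): ?string
{
    foreach ($tokens as $token) {
        // Single-character tokens are plain strings; skip them.
        if (is_array($token) && $token[1] === $word) {
            return token_name($token[0]);
        }
    }
    return null;
}

// Scanner only: no parser context is available for the word `list`.
$plain = tokenNameFor(token_get_all($source), 'list');

// Scanner plus parser: the method name is matched through the parser rule.
$parsed = tokenNameFor(token_get_all($source, TOKEN_PARSE), 'list');

echo "without flag: $plain, with TOKEN_PARSE: $parsed\n";
```

With `TOKEN_PARSE` the method name should come back as `T_STRING`, while the plain call is expected to report the keyword token, since the scanner alone cannot tell the two contexts apart.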
95d517b to c2f3091
Hi @marcioAlmada! AFAIK code like `class True {}` would still not be possible with your patch, right? There are some BC issues in the Symfony Validator component and I wanted to check whether this patch will solve them. Thanks!
@acasademont It won't. Even if we manage to allow reserved words as class names and namespace names in the future, the words representing types would still collide because types and classes dispute the same keyspace. The only solution for that is syntactic stropping but I don't want to speculate about that now. One step at a time :)
I think the only way it would work is to make PHP core case-sensitive, so that `True` and `true` are different. If PHP reserved words only cover lower case, it will work perfectly. For instance, `string` is the primitive type, but `String` could be used as a class name. Even `String` and `STRing` would be considered different. But I think such a breaking change will not occur at least until PHP 8. There's also no way to know whether reserved words will only cover lower case even if PHP becomes case-sensitive.
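To make the collision concrete, here is a minimal sketch of the case-insensitive behaviour being discussed. The `Greeter` class is hypothetical; the snippet only shows that class names and the `true` keyword ignore letter case today.

```php
<?php
// Hypothetical class, for illustration only.
class Greeter {}

// Class names are resolved case-insensitively.
$sameClass = class_exists('GREETER') && (new greeter) instanceof Greeter;

// The boolean keyword is case-insensitive as well.
$sameKeyword = (TRUE === true) && (True === true);

var_dump($sameClass, $sameKeyword);
```

Because both lookups ignore case, a class named `True` cannot be told apart from the keyword by spelling alone.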
Thanks!
This is related to the Context Sensitive Language RFC:
About this implementation:
About the tokenizer extension port:
We basically added a mechanism to store the token stream during parsing and exposed the entire parser stack on the tokenizer extension through an opt-in flag: `token_get_all($src, TOKEN_PARSE)`. This has a few advantages:

- No BC break for `token_get_all`, as the `TOKEN_PARSE` flag is opt in
- The inconsistencies in how `__halt_compiler()` is handled are gone. Currently the tokenizer extension tries to "mimic" the parser and doesn't do an accurate job; `__halt_compiler` is buggy without real parsing :(
- […] `TOKEN_PARSE` flag, as the following code is now possible: […]

TO DO

- Add `TOKEN_PARSE` flag and collect token stream concurrently to parsing so `token_get_all()` can be optionally context aware
- Make `token_get_all($source, TOKEN_PARSE)` emit a catchable `ParseException`
- Handle `__halt_compiler();` in a backwards compatible way

This pull request supersedes #1158.
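The catchable-error item can be sketched as follows. This is an assumption-laden example, not the patch's own test: the class is called `ParseException` in this pull request, while released PHP 7 versions name it `ParseError`, so the sketch catches `\Throwable` to avoid depending on either name.

```php
<?php
// Deliberately invalid source: `function` is not followed by a name.
$broken = '<?php function {';

try {
    token_get_all($broken, TOKEN_PARSE);
    $caught = null;
} catch (\Throwable $error) {
    // Reached when the parser rejects the source.
    $caught = $error;
}

// Without the flag only the scanner runs, so tokens are still returned.
$tokens = token_get_all($broken);

echo $caught === null ? "no error\n" : get_class($caught) . "\n";
```

Callers that need tokens for possibly invalid code can fall back to the plain call, as shown on the last lines.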