PHP_CodeSniffer should use TOKEN_PARSE flag for token_get_all #3020
Comments
Code that'd be touched by this today is, for example, this:
<?php
class A
{
    const PUBLIC = 1;
}
With TOKEN_PARSE the constant name is T_STRING; without it, it's T_PUBLIC.
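The difference described above can be checked with a small script. This is only a sketch, assuming PHP 7.1 or later; `constantNameToken()` is a hypothetical helper written for this illustration and is not part of PHP or PHP_CodeSniffer.

```php
<?php
// The sample class from the comment above, as a string to tokenize.
$code = "<?php\nclass A\n{\n    const PUBLIC = 1;\n}\n";

/**
 * Hypothetical helper: return the name of the first non-whitespace
 * token that follows T_CONST, or null if there is none.
 */
function constantNameToken(array $tokens): ?string
{
    $seenConst = false;
    foreach ($tokens as $token) {
        if (!is_array($token)) {
            continue; // single-character tokens such as "{" or "="
        }
        if ($token[0] === T_CONST) {
            $seenConst = true;
            continue;
        }
        if ($seenConst && $token[0] !== T_WHITESPACE) {
            return token_name($token[0]);
        }
    }
    return null;
}

// Without the flag the lexer reports the keyword token (T_PUBLIC).
echo constantNameToken(token_get_all($code)), PHP_EOL;

// With the flag the parser relabels the constant name (T_STRING).
echo constantNameToken(token_get_all($code, TOKEN_PARSE)), PHP_EOL;
```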
@ondrejmirtes PHPCS 3.x has a minimum PHP version of PHP 5.4. That flag was added to PHP in version 7.0, so unfortunately, this is not an option at this time.
PHPCS 4.x is set to have a minimum PHP version of PHP 7.2, so hopefully it will be an option for 4.x. Yes, it would be great to have it enabled, as it would make a number of backfills (partially) redundant or, at the very least, allow them to be simplified.
(Though enabling it would also constitute a BC-break, as the tokens provided by PHPCS will be different, so this would need to go into a major anyway... quite apart from the PHPCS tokenizer needing lots more unit tests to safeguard against unexpected BC-breaks, or at least to discover them and be able to annotate them in the changelog.)
This could work (https://3v4l.org/eNmOv):
if (defined('TOKEN_PARSE')) {
    token_get_all($code, TOKEN_PARSE);
} else {
    token_get_all($code);
}
I'd argue this isn't even a BC break, as logic that parses code like this has to handle the normal case with T_STRING anyway, so the worst thing that can happen is that some extra conditions that are aware of this behaviour would no longer be executed...
It most definitely would be a BC-break as the token stream will be different and people will have written sniffs based on the expected token stream as it currently is. Just for fun, I've just run some tests with the flag enabled on PHP 7.4.6:
And some external standards:
Looking at the errors, it looks like that flag makes the PHP tokenizer parse error intolerant, which is undesirable for PHPCS for two reasons:
Note: as I stated above, I ran the tests on PHP 7.4.6. If I'd rerun these on a lower PHP version, expect all the error numbers listed above to go up. So, having looked at this more closely, even though it would simplify some backfills, I honestly don't think the flag should ever be turned on for PHPCS. |
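The parse-error intolerance mentioned above can be reproduced with a short script. This is a minimal sketch, assuming PHP 7.0 or later; the broken input is an invented example and does not come from the PHPCS test suite.

```php
<?php
// Deliberately invalid PHP: the constant has neither a name nor a value.
$code = "<?php\nclass A\n{\n    const = ;\n}\n";

// Without the flag only the lexer runs, so broken code still yields tokens.
$tokens = token_get_all($code);
echo 'Without TOKEN_PARSE: ', count($tokens), ' tokens', PHP_EOL;

// With the flag the code also goes through the parser, which rejects it.
$threw = false;
try {
    token_get_all($code, TOKEN_PARSE);
} catch (ParseError $e) {
    $threw = true;
    echo 'With TOKEN_PARSE: ', $e->getMessage(), PHP_EOL;
}
```

A tool that has to report on files containing syntax errors would need to catch this exception and fall back to the plain call, which adds a second token stream variant to support.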
My concern is that a lot of sniffs that are currently looking at code like this:
<?php
class A
{
    const FOO = 1;
}
will currently definitely break for:
<?php
class A
{
    const PUBLIC = 1;
}
because their authors are not aware they can encounter
Re: the |
PHPCS already accounts for a lot of those types of situations and changes the token to The
class Foo {
    function if() {}
}
If the same is not done for class constants, that could be added, but that's definitely not a reason to turn on the
Oh, and just for argument's sake, I've just checked the current PHPCS tokenization of your code sample:
class A
{
    const PUBLIC = 1;
}
And PHPCS already tokenizes the
So PHP_CodeSniffer already kind-of simulates the TOKEN_PARSE behaviour, which I wasn't aware of. |
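To illustrate the general idea of such a simulation, here is a rough sketch of a post-processing pass over the native token stream. It is not PHP_CodeSniffer's actual implementation: `relabelNames()` is a hypothetical function, and it only covers the simplest case, where a name directly follows `function` or `const`. It ignores by-reference functions, comments between the keyword and the name, typed constants, and multiple constants in one declaration.

```php
<?php
/**
 * Hypothetical backfill: relabel the identifier that directly follows
 * `function` or `const` as T_STRING, even when the lexer reported it
 * as a keyword token such as T_IF or T_PUBLIC.
 */
function relabelNames(array $tokens): array
{
    $expectName = false;
    foreach ($tokens as $i => $token) {
        if (!is_array($token)) {
            // "(" after `function` means a closure: there is no name.
            $expectName = false;
            continue;
        }
        if ($token[0] === T_WHITESPACE) {
            continue;
        }
        if ($expectName) {
            // Only relabel tokens whose text looks like an identifier.
            if (preg_match('/^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$/', $token[1])) {
                $tokens[$i][0] = T_STRING;
            }
            $expectName = false;
            continue;
        }
        $expectName = ($token[0] === T_FUNCTION || $token[0] === T_CONST);
    }
    return $tokens;
}

$source = "<?php class Foo { function if() {} const PUBLIC = 1; }";
foreach (relabelNames(token_get_all($source)) as $token) {
    if (is_array($token) && $token[0] !== T_WHITESPACE) {
        echo token_name($token[0]), ' ', $token[1], PHP_EOL;
    }
}
```

Because the pass runs on the output of the plain `token_get_all($code)` call, it behaves the same on every PHP version and still works on files with syntax errors.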
@jrfnl Thanks for all the work exploring this and explaining how PHP_CodeSniffer's tokenizer works.
Yes, that's correct. PHP_CodeSniffer has been around for quite a while, so it's seen a lot of different PHP versions and has tried to maintain backwards compatibility as much as possible through these sort of backfills. It's a QA tool, so it's important it meets developers at the PHP version they are at, not just where they would like to be. I don't have any plans to implement the TOKEN_PARSE flag as my top priority is to ensure the token stack remains the same throughout PHP_CodeSniffer versions to enable custom sniffs to be migrated with as little work as possible. I'm going to close this issue so it is clear no work is planned here, but thanks for raising it as it turned into a very worthwhile discussion. Feel free to continue, as I can always reopen if there is a benefit to changing the behaviour of the tokenizer that I'm not seeing yet. |
In PHP 8 there's a new reserved keyword, match, but it's allowed as a class method name. Without the TOKEN_PARSE flag, a class method named match will be present as a T_MATCH token in the token list. With the TOKEN_PARSE flag, it will continue to be present as T_STRING.
Maybe there's already an example like this today that'd make working with the token stream easier.
PHP_CodeSniffer currently doesn't use the flag: PHP_CodeSniffer/src/Tokenizers/PHP.php, line 470 in ce62dee
Ref: nikic/PHP-Parser#684 (comment)