Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for supporting PHP 8.2+ Disjunctive Normal Form Types in the Tokenizer #387

Closed
jrfnl opened this issue Mar 12, 2024 · 12 comments · Fixed by #461
Closed

Proposal for supporting PHP 8.2+ Disjunctive Normal Form Types in the Tokenizer #387

jrfnl opened this issue Mar 12, 2024 · 12 comments · Fixed by #461

Comments

@jrfnl
Copy link
Member

jrfnl commented Mar 12, 2024

The problem

PHP 8.2 introduced Disjunctive Normal Form Types, which now adds parentheses to the tokens allowed within type declarations.

This is currently not yet supported by PHP_CodeSniffer and the PHP_CodeSniffer\Tokenizers\PHP class will not always handle tokens within DNF types correctly, which can lead to false positives/false negatives being reported for sniffs and incorrect fixes being made.

Typical issues in the Tokenizer:

  • The array keyword when used in a type declaration may not be retokenized from T_ARRAY to T_STRING.
  • The true/false/null keywords when used in a type declaration may not be retokenized from T_STRING to T_TRUE/T_FALSE/T_NULL.
  • The & and | characters may not be retokenized to T_TYPE_UNION/T_TYPE_INTERSECTION.

Typical resulting sniff issues:

  • Sniffs handling arbitrary parentheses spacing may act on parentheses in DNF types.
  • Sniffs listening for the array keyword may try to act on the array keyword when used in a DNF type.
  • Sniffs listening for bitwise and/or operators may act on the & and | characters in DNF types.
  • etc

These issues also impact the functioning of the following PHPCS native helper functions and by extension all sniffs which use these methods:

  • File::getMethodParameters() (parameter types)
  • File::getMethodProperties() (return types)
  • File::getMemberProperties() (property types)
  • File::isReference()

Compounding the Problem

Aside from the above, standards may want to treat (spacing around) parentheses in DNF types differently from (spacing around) parentheses used in other places in a code base.

Proposed Solution

To support DNF types the PHP_CodeSniffer\Tokenizers\PHP class will need significant updates.

When making these updates, I propose to include a change to retokenize the open and close parentheses when used in DNF types as T_TYPE_OPEN_PARENTHESIS and T_TYPE_CLOSE_PARENTHESIS respectively.

Having separate, PHPCS native tokens for the DNF parentheses will allow for sniff developers to decide on a case-by-case basis whether a sniff should or should not act on DNF parentheses.

This change also follows in the footsteps of the retokenization of the & and | tokens (when used in types) to T_TYPE_* tokens.

As part of this change, the new T_TYPE_OPEN_PARENTHESIS and T_TYPE_CLOSE_PARENTHESIS tokens will need to be handled correctly by the PHP_CodeSniffer\Tokenizers\Tokenizer class and tokens should get the parenthesis_opener, parenthesis_closer and nested_parenthesis tokens, same as if the tokens were the "normal" T_OPEN_PARENTHESIS and T_CLOSE_PARENTHESIS tokens.

Open question

As PHP 8.2 has been out for a while, some external standards may have implemented work-arounds for the current state of things.

So, aside from opinions on the above proposal itself, I would also like to hear opinions on when this change should be made.

  • Should this change go into the next PHPCS 3.x minor (once the PR has been prepared) ?
  • Or should this change go into PHPCS 4.0 ? (expected in 3-4 months)

/cc @asispts @dingo-d @fredden @GaryJones @greg0ire @kukulich @michalbundyra @Ocramius @sirbrillig @stronk7 @weierophinney @wimg

Related issues

@Ocramius
Copy link

Should this change go into the next PHPCS 3.x minor (once the PR has been prepared) ?

Could the old class be deprecated/left in place, while usages migrate to a new class that contains these adjustments?

@jrfnl
Copy link
Member Author

jrfnl commented Mar 12, 2024

Should this change go into the next PHPCS 3.x minor (once the PR has been prepared) ?

Could the old class be deprecated/left in place?

Not really. It's a monster-class without any modularity.

usages migrate to a new class that contains these adjustments?

This is exactly why I would like opinions on the timeline. PHPCS 4.0 will contain various other tokenizer changes too, so would allow for more gradual migration.

@kukulich
Copy link
Contributor

I'm always happy with more new tokens because my sniffs could be more simple :)

Should this change go into the next PHPCS 3.x minor (once the PR has been prepared) ?
Or should this change go into PHPCS 4.0 ? (expected in 3-4 months)

DNF don't work currently so I think it could go to PHPCS 3.x. However I think you should choose what is more easy for you. DNF support can wait another 3-4 months if it will make your work on PHPCS easier :)

@stronk7
Copy link
Contributor

stronk7 commented Mar 12, 2024

Thanks for the complete report, easy to digest!

I like the new T_TYPE_* approach. And I think that it will make things easy for people to move on them (exisiting Sniffs).

About the target release, applying the safety-principle, I'd say 4.0, because of the potential breaking impact on already existing Sniffs out there (not sure if there is any reliable way to try to look for existing cases).

In our case (Moodle), we still have PHP 8.1 as min requirement in our current dev version, so we are DNFT-free for now.

Side-note, only partially related: One thing that always make me think about all these Tokenizer issues/changes... is how others out there (maybe using other type of - AST, ... - parsers) are impacted by PHP changes (more or less than us). Just saying, I don't know/feel that there is any easy route to provide any tokenizer <==> parser conversion easily.

Ciao :-)

@sirbrillig
Copy link

I think that the proposed new tokens sound like a good approach since they have a distinct meaning.

As for timing, I agree with the above that since it's currently effectively broken, a minor version should be fine. This is at least a bug fix and at most a new feature.

That said, I can imagine that it could be considered breaking if someone hacked around these tokens already. But if they did I would hope that they posted an issue on this repo since it seems to me to be missing support? And if it does break a sniff, the fix is easy enough by pinning the previous minor version.

You have your finger on the pulse of sniff development, though, so if you think it likely that folks would be badly impacted by this change and there's no way to let them know ahead of time, then a major version is also fine. We've been behind PHP versions for quite a while now and another instance is not going to be the end of the world.

@jrfnl
Copy link
Member Author

jrfnl commented Mar 12, 2024

@kukulich @stronk7 @sirbrillig Thank you for your input so far.

Personally, I'm not aware of standards which have work-arounds in place (which is part of the reason for opening this issue: to try and find out).

Timing-wise, if nobody speaks up about having put work-arounds in place, I think adding this to the next minor (once the patch is available) would be best as it takes pressure off PHPCS 4.0, for which quite some work still needs to be done, and would allow for the support to reach end-users sooner, especially as it is likely to take a while before external standards will support PHPCS 4.0.

We've been behind PHP versions for quite a while now and another instance is not going to be the end of the world.

Well, I've been trying to play catch up with the PHPCS 3.8.0 and 3.9.0 releases. Syntax-support wise, this is the last "big" one we're still missing as far as I'm concerned.

@GaryJones
Copy link
Contributor

I was going to say PHPCS 4.0, but if you're not aware of any current workarounds, then it makes sense to bring this forward to 3.x.

@jrfnl
Copy link
Member Author

jrfnl commented Apr 10, 2024

As nobody has spoken up about having work-arounds in place already, I intend to start work on this over the next few days and hope to include DNF support in the 3.10.0 release.

@jrfnl
Copy link
Member Author

jrfnl commented Apr 22, 2024

I've just pulled the first PR in a series of 6 PRs related to this. See #461

This first PR adds the tokenizer support for DNF types as outlined above.

Once that PR has been merged, I have three PRs lined up to add support for DNF types to the File::getMethodParameters() method, the File::getMethodProperties() method and the File::getMemberVariables() method.
One small PR to add some related tests for the File::isReference() method.

And then finally a PR to handle DNF types in the Generic.PHP.LowerCaseType sniff, which is the only PHPCS native sniff which needs changes to support DNF (from what I can see. What with the new tokens, false positives in all other sniffs should automatically be fixed via the tokenizer PR).

I'd like to invite everyone to put the PR through the wringer and test it thoroughly to confirm I've caught all the edge cases and/or to find those edge cases I've still missed.

Leaving a comment saying you've tested the PR with your sniffs is appreciated as it will strengthen confidence in the implementation.

@jrfnl
Copy link
Member Author

jrfnl commented Apr 28, 2024

FYI: I've merged the PR with the Tokenizer changes to improve the chances of people using the master branch testing this change in hopes of any undesired side-effects being caught before the next release.

I've also pulled the changes to the File methods now (PR #471, #472, #473) and will merge those in a couple of days (providing no bugs are discovered in the PRs).

@jrfnl
Copy link
Member Author

jrfnl commented May 2, 2024

FYI: I've merged the above mentioned PRs updating the File methods to increase the surface for testing the implementation of DNF support in PHPCS via external standards.

I've also now pulled PR #478 which updates the one sniff which needed updating (AFAICS).

@jrfnl
Copy link
Member Author

jrfnl commented May 13, 2024

Just checking in to see if anyone has been testing the implementation with their external standard and if there is any feedback. If not, I intend to release PHPCS 3.10.0 later this week with this change included.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants