PHP 8.2 | Tokenizer/PHP: add support for DNF types
This commit adds tokenizer support for DNF types as per the proposal outlined in 387.

This means that:
* Two new tokens are introduced, `T_TYPE_OPEN_PARENTHESIS` and `T_TYPE_CLOSE_PARENTHESIS`, for the parentheses used in DNF types.
    This allows sniffs to target those tokens specifically and prevents sniffs which look for the "normal" open/close parenthesis tokens from acting on DNF parentheses.
* These new tokens, like other parentheses, will get the `parenthesis_opener` and `parenthesis_closer` token array indexes and the tokens between them will have the `nested_parenthesis` index.
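
As a rough illustration of the second point, the sketch below walks from a DNF opener to its closer via the `parenthesis_closer` index. The `$tokens` array is a hand-built, simplified stand-in for the slice of the token stack covering a `(A&B)|null` property type, and `dnfGroupContents()` is a hypothetical helper, not part of PHP_CodeSniffer.

```php
<?php
// Hand-built, simplified stand-in for the tokens of `(A&B)|null`.
// Assumption: real PHPCS tokens carry more keys ('code', 'line', 'column', ...).
$tokens = [
    5  => ['type' => 'T_TYPE_OPEN_PARENTHESIS', 'content' => '(', 'parenthesis_opener' => 5, 'parenthesis_closer' => 9],
    6  => ['type' => 'T_STRING', 'content' => 'A', 'nested_parenthesis' => [5 => 9]],
    7  => ['type' => 'T_TYPE_INTERSECTION', 'content' => '&', 'nested_parenthesis' => [5 => 9]],
    8  => ['type' => 'T_STRING', 'content' => 'B', 'nested_parenthesis' => [5 => 9]],
    9  => ['type' => 'T_TYPE_CLOSE_PARENTHESIS', 'content' => ')', 'parenthesis_opener' => 5, 'parenthesis_closer' => 9],
    10 => ['type' => 'T_TYPE_UNION', 'content' => '|'],
    11 => ['type' => 'T_NULL', 'content' => 'null'],
];

/**
 * Hypothetical helper: return the text between a DNF opener and its closer.
 * Returns an empty string when the given token is not a DNF opener.
 */
function dnfGroupContents(array $tokens, int $opener): string
{
    if ($tokens[$opener]['type'] !== 'T_TYPE_OPEN_PARENTHESIS') {
        return '';
    }

    $closer   = $tokens[$opener]['parenthesis_closer'];
    $contents = '';
    for ($i = ($opener + 1); $i < $closer; $i++) {
        $contents .= $tokens[$i]['content'];
    }

    return $contents;
}

echo dnfGroupContents($tokens, 5), PHP_EOL; // Prints "A&B".
```

Because the type is checked first, a caller looking only at DNF groups never acts on a function call or declaration parenthesis by accident.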

Based on the currently added tests, the commit safeguards that:
* The `|` in types is still tokenized as `T_TYPE_UNION`, even in DNF types.
* The `&` in types is still tokenized as `T_TYPE_INTERSECTION`, even in DNF types.
* The `static` keyword for properties is still tokenized as `T_STATIC`, even when it directly precedes a DNF type (which could be mistaken for a function call).
* The arrow function retokenization to `T_FN` with a `T_FN_ARROW` scope opener is handled correctly when DNF types are involved, including when the arrow function is declared to return by reference.
* The keyword tokens, like `self`, `parent`, `static`, `true` or `false`, when used in DNF types are still tokenized as their own tokens and not as `T_STRING`.
* The `array` keyword when used in DNF types is still tokenized as `T_STRING` and not as `T_ARRAY`.
* A `?` intended as an (illegal) nullability operator in combination with a DNF type is still tokenized as `T_NULLABLE` and not as `T_INLINE_THEN`.
* A function declaration open parenthesis before a typed parameter isn't accidentally retokenized to `T_TYPE_OPEN_PARENTHESIS`.
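
For context on why this retokenization is needed: PHP's native tokenizer returns every parenthesis as the same plain `(` or `)` token, whether it belongs to a function declaration or to a DNF type. A minimal sketch using the native `PhpToken` class (PHP 8.0+); the sample code string is purely illustrative and is only tokenized, never executed:

```php
<?php
// Sample declaration with a DNF parameter type. It is tokenized, not run,
// so this sketch does not itself need PHP 8.2.
$code = '<?php function name((A&B)|null $param) {}';

$parens = [];
foreach (PhpToken::tokenize($code) as $token) {
    // PhpToken::is() accepts a token ID, a text string, or an array of either.
    if ($token->is(['(', ')']) === true) {
        $parens[] = $token->text;
    }
}

echo implode(' ', $parens), PHP_EOL; // Prints "( ( ) )".
```

Nothing in the native token stream separates the outer declaration parentheses from the inner type parentheses, which is the distinction the new `T_TYPE_OPEN_PARENTHESIS` and `T_TYPE_CLOSE_PARENTHESIS` tokens record.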

Includes ample unit tests.

Even so, strenuous testing of this PR is recommended: with so many moving parts involved, it is very easy for something to have been overlooked.

Related to 105

Closes 387
Closes squizlabs/PHP_CodeSniffer 3731
jrfnl committed Apr 22, 2024
1 parent 83f3859 commit ceab210
Showing 18 changed files with 1,282 additions and 63 deletions.
159 changes: 130 additions & 29 deletions src/Tokenizers/PHP.php
@@ -464,6 +464,8 @@ class PHP extends Tokenizer
T_CLOSE_SHORT_ARRAY => 1,
T_TYPE_UNION => 1,
T_TYPE_INTERSECTION => 1,
T_TYPE_OPEN_PARENTHESIS => 1,
T_TYPE_CLOSE_PARENTHESIS => 1,
];

/**
@@ -747,6 +749,9 @@ protected function tokenize($string)

/*
Special case for `static` used as a function name, i.e. `static()`.
Note: this may incorrectly change the static keyword directly before a DNF property type.
If so, this will be caught and corrected for in the additional processing.
*/

if ($tokenIsArray === true
@@ -2712,21 +2717,23 @@ protected function processAdditional()
if (isset($this->tokens[$x]) === true && $this->tokens[$x]['code'] === T_OPEN_PARENTHESIS) {
$ignore = Tokens::$emptyTokens;
$ignore += [
- T_ARRAY => T_ARRAY,
- T_CALLABLE => T_CALLABLE,
- T_COLON => T_COLON,
- T_NAMESPACE => T_NAMESPACE,
- T_NS_SEPARATOR => T_NS_SEPARATOR,
- T_NULL => T_NULL,
- T_TRUE => T_TRUE,
- T_FALSE => T_FALSE,
- T_NULLABLE => T_NULLABLE,
- T_PARENT => T_PARENT,
- T_SELF => T_SELF,
- T_STATIC => T_STATIC,
- T_STRING => T_STRING,
- T_TYPE_UNION => T_TYPE_UNION,
- T_TYPE_INTERSECTION => T_TYPE_INTERSECTION,
T_ARRAY => T_ARRAY,
T_CALLABLE => T_CALLABLE,
T_COLON => T_COLON,
T_NAMESPACE => T_NAMESPACE,
T_NS_SEPARATOR => T_NS_SEPARATOR,
T_NULL => T_NULL,
T_TRUE => T_TRUE,
T_FALSE => T_FALSE,
T_NULLABLE => T_NULLABLE,
T_PARENT => T_PARENT,
T_SELF => T_SELF,
T_STATIC => T_STATIC,
T_STRING => T_STRING,
T_TYPE_UNION => T_TYPE_UNION,
T_TYPE_INTERSECTION => T_TYPE_INTERSECTION,
T_TYPE_OPEN_PARENTHESIS => T_TYPE_OPEN_PARENTHESIS,
T_TYPE_CLOSE_PARENTHESIS => T_TYPE_CLOSE_PARENTHESIS,
];

$closer = $this->tokens[$x]['parenthesis_closer'];
@@ -3029,10 +3036,15 @@ protected function processAdditional()
continue;
} else if ($this->tokens[$i]['code'] === T_BITWISE_OR
|| $this->tokens[$i]['code'] === T_BITWISE_AND
|| $this->tokens[$i]['code'] === T_OPEN_PARENTHESIS
|| $this->tokens[$i]['code'] === T_CLOSE_PARENTHESIS
) {
/*
Convert "|" to T_TYPE_UNION or leave as T_BITWISE_OR.
Convert "&" to T_TYPE_INTERSECTION or leave as T_BITWISE_AND.
Convert "(" and ")" to T_TYPE_(OPEN|CLOSE)_PARENTHESIS or leave as T_(OPEN|CLOSE)_PARENTHESIS.
All type related tokens will be converted in one go as soon as this section is hit.
*/

$allowed = [
@@ -3048,20 +3060,22 @@ protected function processAdditional()
T_NS_SEPARATOR => T_NS_SEPARATOR,
];

- $suspectedType = null;
- $typeTokenCount = 0;
$suspectedType = null;
$typeTokenCountAfter = 0;

for ($x = ($i + 1); $x < $numTokens; $x++) {
if (isset(Tokens::$emptyTokens[$this->tokens[$x]['code']]) === true) {
continue;
}

if (isset($allowed[$this->tokens[$x]['code']]) === true) {
- ++$typeTokenCount;
++$typeTokenCountAfter;
continue;
}

- if ($typeTokenCount > 0
if (($typeTokenCountAfter > 0
|| ($this->tokens[$i]['code'] === T_CLOSE_PARENTHESIS
&& isset($this->tokens[$i]['parenthesis_owner']) === false))
&& ($this->tokens[$x]['code'] === T_BITWISE_AND
|| $this->tokens[$x]['code'] === T_ELLIPSIS)
) {
@@ -3092,6 +3106,7 @@ protected function processAdditional()
&& $this->tokens[$this->tokens[$x]['scope_condition']]['code'] === T_FUNCTION
) {
$suspectedType = 'return';
break;
}

if ($this->tokens[$x]['code'] === T_EQUAL) {
@@ -3103,35 +3118,95 @@ protected function processAdditional()
break;
}//end for

- if ($typeTokenCount === 0 || isset($suspectedType) === false) {
- // Definitely not a union or intersection type, move on.
if (($typeTokenCountAfter === 0
&& ($this->tokens[$i]['code'] !== T_CLOSE_PARENTHESIS
|| isset($this->tokens[$i]['parenthesis_owner']) === true))
|| isset($suspectedType) === false
) {
// Definitely not a union, intersection or DNF type, move on.
continue;
}

if ($suspectedType === 'property or parameter') {
unset($allowed[T_STATIC]);
}

- $typeTokenCount = 0;
- $typeOperators = [$i];
- $confirmed = false;
$typeTokenCountBefore = 0;
$typeOperators = [$i];
$confirmed = false;
$maybeNullable = null;

for ($x = ($i - 1); $x >= 0; $x--) {
if (isset(Tokens::$emptyTokens[$this->tokens[$x]['code']]) === true) {
continue;
}

if ($suspectedType === 'property or parameter'
&& $this->tokens[$x]['code'] === T_STRING
&& strtolower($this->tokens[$x]['content']) === 'static'
) {
// Static keyword followed directly by an open parenthesis for a DNF type.
// This token should be T_STATIC and was incorrectly identified as a function call before.
$this->tokens[$x]['code'] = T_STATIC;
$this->tokens[$x]['type'] = 'T_STATIC';

if (PHP_CODESNIFFER_VERBOSITY > 1) {
$line = $this->tokens[$x]['line'];
echo "\t* token $x on line $line changed back from T_STRING to T_STATIC".PHP_EOL;
}
}

if ($suspectedType === 'property or parameter'
&& $this->tokens[$x]['code'] === T_OPEN_PARENTHESIS
) {
// We need to prevent the open parenthesis for a function/fn declaration from being retokenized
// to T_TYPE_OPEN_PARENTHESIS if this is the first parameter in the declaration.
if (isset($this->tokens[$x]['parenthesis_owner']) === true
&& $this->tokens[$this->tokens[$x]['parenthesis_owner']]['code'] === T_FUNCTION
) {
$confirmed = true;
break;
} else {
// This may still be an arrow function which hasn't been handled yet.
for ($y = ($x - 1); $y > 0; $y--) {
if (isset(Tokens::$emptyTokens[$this->tokens[$y]['code']]) === false
&& $this->tokens[$y]['code'] !== T_BITWISE_AND
) {
// Non-whitespace content.
break;
}
}

if ($this->tokens[$y]['code'] === T_FN) {
$confirmed = true;
break;
}
}
}//end if

if (isset($allowed[$this->tokens[$x]['code']]) === true) {
- ++$typeTokenCount;
++$typeTokenCountBefore;
continue;
}

- // Union and intersection types can't use the nullable operator, but be tolerant to parse errors.
- if ($typeTokenCount > 0 && $this->tokens[$x]['code'] === T_NULLABLE) {
// Union, intersection and DNF types can't use the nullable operator, but be tolerant to parse errors.
if (($typeTokenCountBefore > 0
|| ($this->tokens[$x]['code'] === T_OPEN_PARENTHESIS && isset($this->tokens[$x]['parenthesis_owner']) === false))
&& ($this->tokens[$x]['code'] === T_NULLABLE
|| $this->tokens[$x]['code'] === T_INLINE_THEN)
) {
if ($this->tokens[$x]['code'] === T_INLINE_THEN) {
$maybeNullable = $x;
}

continue;
}

- if ($this->tokens[$x]['code'] === T_BITWISE_OR || $this->tokens[$x]['code'] === T_BITWISE_AND) {
if ($this->tokens[$x]['code'] === T_BITWISE_OR
|| $this->tokens[$x]['code'] === T_BITWISE_AND
|| $this->tokens[$x]['code'] === T_OPEN_PARENTHESIS
|| $this->tokens[$x]['code'] === T_CLOSE_PARENTHESIS
) {
$typeOperators[] = $x;
continue;
}
@@ -3217,14 +3292,40 @@ protected function processAdditional()
$line = $this->tokens[$x]['line'];
echo "\t* token $x on line $line changed from T_BITWISE_OR to T_TYPE_UNION".PHP_EOL;
}
- } else {
} else if ($this->tokens[$x]['code'] === T_BITWISE_AND) {
$this->tokens[$x]['code'] = T_TYPE_INTERSECTION;
$this->tokens[$x]['type'] = 'T_TYPE_INTERSECTION';

if (PHP_CODESNIFFER_VERBOSITY > 1) {
$line = $this->tokens[$x]['line'];
echo "\t* token $x on line $line changed from T_BITWISE_AND to T_TYPE_INTERSECTION".PHP_EOL;
}
} else if ($this->tokens[$x]['code'] === T_OPEN_PARENTHESIS) {
$this->tokens[$x]['code'] = T_TYPE_OPEN_PARENTHESIS;
$this->tokens[$x]['type'] = 'T_TYPE_OPEN_PARENTHESIS';

if (PHP_CODESNIFFER_VERBOSITY > 1) {
$line = $this->tokens[$x]['line'];
echo "\t* token $x on line $line changed from T_OPEN_PARENTHESIS to T_TYPE_OPEN_PARENTHESIS".PHP_EOL;
}
} else if ($this->tokens[$x]['code'] === T_CLOSE_PARENTHESIS) {
$this->tokens[$x]['code'] = T_TYPE_CLOSE_PARENTHESIS;
$this->tokens[$x]['type'] = 'T_TYPE_CLOSE_PARENTHESIS';

if (PHP_CODESNIFFER_VERBOSITY > 1) {
$line = $this->tokens[$x]['line'];
echo "\t* token $x on line $line changed from T_CLOSE_PARENTHESIS to T_TYPE_CLOSE_PARENTHESIS".PHP_EOL;
}
}//end if
}//end foreach

if (isset($maybeNullable) === true) {
$this->tokens[$maybeNullable]['code'] = T_NULLABLE;
$this->tokens[$maybeNullable]['type'] = 'T_NULLABLE';

if (PHP_CODESNIFFER_VERBOSITY > 1) {
$line = $this->tokens[$maybeNullable]['line'];
echo "\t* token $maybeNullable on line $line changed from T_INLINE_THEN to T_NULLABLE".PHP_EOL;
}
}

2 changes: 2 additions & 0 deletions src/Util/Tokens.php
@@ -82,6 +82,8 @@
define('T_ATTRIBUTE_END', 'PHPCS_T_ATTRIBUTE_END');
define('T_ENUM_CASE', 'PHPCS_T_ENUM_CASE');
define('T_TYPE_INTERSECTION', 'PHPCS_T_TYPE_INTERSECTION');
define('T_TYPE_OPEN_PARENTHESIS', 'PHPCS_T_TYPE_OPEN_PARENTHESIS');
define('T_TYPE_CLOSE_PARENTHESIS', 'PHPCS_T_TYPE_CLOSE_PARENTHESIS');

// Some PHP 5.5 tokens, replicated for lower versions.
if (defined('T_FINALLY') === false) {
17 changes: 17 additions & 0 deletions tests/Core/Tokenizer/ArrayKeywordTest.inc
@@ -39,3 +39,20 @@ class Bar {
/* testOOPropertyType */
protected array $property;
}

class DNFTypes {
/* testOOConstDNFType */
const (A&B)|array|(C&D) NAME = [];

/* testOOPropertyDNFType */
protected (A&B)|ARRAY|null $property;

/* testFunctionDeclarationParamDNFType */
public function name(null|array|(A&B) $param) {
/* testClosureDeclarationParamDNFType */
$cl = function ( array|(A&B) $param) {};

/* testArrowDeclarationReturnDNFType */
$arrow = fn($a): (A&B)|Array => new $a;
}
}
18 changes: 18 additions & 0 deletions tests/Core/Tokenizer/ArrayKeywordTest.php
@@ -131,6 +131,24 @@ public static function dataArrayType()
'OO property type' => [
'testMarker' => '/* testOOPropertyType */',
],

'OO constant DNF type' => [
'testMarker' => '/* testOOConstDNFType */',
],
'OO property DNF type' => [
'testMarker' => '/* testOOPropertyDNFType */',
'testContent' => 'ARRAY',
],
'function param DNF type' => [
'testMarker' => '/* testFunctionDeclarationParamDNFType */',
],
'closure param DNF type' => [
'testMarker' => '/* testClosureDeclarationParamDNFType */',
],
'arrow return DNF type' => [
'testMarker' => '/* testArrowDeclarationReturnDNFType */',
'testContent' => 'Array',
],
];

}//end dataArrayType()
9 changes: 9 additions & 0 deletions tests/Core/Tokenizer/BackfillFnTokenTest.inc
@@ -119,6 +119,15 @@ $arrowWithUnionParam = fn(Traversable&Countable $param) : int => (new SomeClass(
/* testIntersectionReturnType */
$arrowWithUnionReturn = fn($param) : \MyFoo&SomeInterface => new SomeClass($param);

/* testDNFParamType */
$arrowWithUnionParam = fn((Traversable&Countable)|null $param) : SomeClass => new SomeClass($param) ?? null;

/* testDNFReturnType */
$arrowWithUnionReturn = fn($param) : false|(\MyFoo&SomeInterface) => new \MyFoo($param) ?? false;

/* testDNFParamTypeWithReturnByRef */
$arrowWithParamReturnByRef = fn &((A&B)|null $param) => $param * 10;

/* testTernary */
$fn = fn($a) => $a ? /* testTernaryThen */ fn() : string => 'a' : /* testTernaryElse */ fn() : string => 'b';

48 changes: 48 additions & 0 deletions tests/Core/Tokenizer/BackfillFnTokenTest.php
@@ -547,6 +547,54 @@ public function testIntersectionReturnType()
}//end testIntersectionReturnType()


/**
* Test arrow function with a DNF parameter type.
*
* @covers PHP_CodeSniffer\Tokenizers\PHP::processAdditional
*
* @return void
*/
public function testDNFParamType()
{
$token = $this->getTargetToken('/* testDNFParamType */', T_FN);
$this->backfillHelper($token);
$this->scopePositionTestHelper($token, 17, 29);

}//end testDNFParamType()


/**
* Test arrow function with a DNF return type.
*
* @covers PHP_CodeSniffer\Tokenizers\PHP::processAdditional
*
* @return void
*/
public function testDNFReturnType()
{
$token = $this->getTargetToken('/* testDNFReturnType */', T_FN);
$this->backfillHelper($token);
$this->scopePositionTestHelper($token, 16, 29);

}//end testDNFReturnType()


/**
* Test arrow function which returns by reference with a DNF parameter type.
*
* @covers PHP_CodeSniffer\Tokenizers\PHP::processAdditional
*
* @return void
*/
public function testDNFParamTypeWithReturnByRef()
{
$token = $this->getTargetToken('/* testDNFParamTypeWithReturnByRef */', T_FN);
$this->backfillHelper($token);
$this->scopePositionTestHelper($token, 15, 22);

}//end testDNFParamTypeWithReturnByRef()


/**
* Test arrow functions used in ternary operators.
*