Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC Tokens::insertSlices #4250

Merged
merged 1 commit into from
Jan 25, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
8 changes: 4 additions & 4 deletions src/Fixer/Alias/PowToExponentiationFixer.php
Original file line number Diff line number Diff line change
Expand Up @@ -131,10 +131,10 @@ private function findPowCalls(Tokens $tokens)
}

/**
* @param int $functionNameIndex
* @param int $openParenthesisIndex
* @param int $closeParenthesisIndex
* @param array<int,int> $arguments
* @param int $functionNameIndex
* @param int $openParenthesisIndex
* @param int $closeParenthesisIndex
* @param array<int, int> $arguments
*
* @return int number of tokens added to the collection
*/
Expand Down
42 changes: 20 additions & 22 deletions src/Fixer/ArrayNotation/WhitespaceAfterCommaInArrayFixer.php
Original file line number Diff line number Diff line change
Expand Up @@ -48,34 +48,32 @@ public function isCandidate(Tokens $tokens)
*/
protected function applyFix(\SplFileInfo $file, Tokens $tokens)
{
$tokensToInsert = [];

for ($index = $tokens->count() - 1; $index >= 0; --$index) {
if ($tokens[$index]->isGivenKind([T_ARRAY, CT::T_ARRAY_SQUARE_BRACE_OPEN])) {
$this->fixSpacing($index, $tokens);
if (!$tokens[$index]->isGivenKind([T_ARRAY, CT::T_ARRAY_SQUARE_BRACE_OPEN])) {
continue;
}
}
}

/**
* Method to fix spacing in array declaration.
*
* @param int $index
*/
private function fixSpacing($index, Tokens $tokens)
{
if ($tokens[$index]->isGivenKind(CT::T_ARRAY_SQUARE_BRACE_OPEN)) {
$startIndex = $index;
$endIndex = $tokens->findBlockEnd(Tokens::BLOCK_TYPE_ARRAY_SQUARE_BRACE, $startIndex);
} else {
$startIndex = $tokens->getNextTokenOfKind($index, ['(']);
$endIndex = $tokens->findBlockEnd(Tokens::BLOCK_TYPE_PARENTHESIS_BRACE, $startIndex);
}
if ($tokens[$index]->isGivenKind(CT::T_ARRAY_SQUARE_BRACE_OPEN)) {
$startIndex = $index;
$endIndex = $tokens->findBlockEnd(Tokens::BLOCK_TYPE_ARRAY_SQUARE_BRACE, $startIndex);
} else {
$startIndex = $tokens->getNextTokenOfKind($index, ['(']);
$endIndex = $tokens->findBlockEnd(Tokens::BLOCK_TYPE_PARENTHESIS_BRACE, $startIndex);
}

for ($i = $endIndex - 1; $i > $startIndex; --$i) {
$i = $this->skipNonArrayElements($i, $tokens);
if ($tokens[$i]->equals(',') && !$tokens[$i + 1]->isWhitespace()) {
$tokens->insertAt($i + 1, new Token([T_WHITESPACE, ' ']));
for ($i = $endIndex - 1; $i > $startIndex; --$i) {
$i = $this->skipNonArrayElements($i, $tokens);
if ($tokens[$i]->equals(',') && !$tokens[$i + 1]->isWhitespace()) {
$tokensToInsert[$i + 1] = new Token([T_WHITESPACE, ' ']);
}
}
}

if ([] !== $tokensToInsert) {
$tokens->insertSlices($tokensToInsert);
}
}

/**
Expand Down
8 changes: 3 additions & 5 deletions src/Fixer/FunctionNotation/NativeFunctionInvocationFixer.php
Original file line number Diff line number Diff line change
Expand Up @@ -275,7 +275,7 @@ private function fixFunctionCalls(Tokens $tokens, callable $functionFilter, $sta
{
$functionsAnalyzer = new FunctionsAnalyzer();

$insertAtIndexes = [];
$tokensToInsert = [];
for ($index = $start; $index < $end; ++$index) {
if (!$functionsAnalyzer->isGlobalFunctionCall($tokens, $index)) {
continue;
Expand All @@ -298,12 +298,10 @@ private function fixFunctionCalls(Tokens $tokens, callable $functionFilter, $sta
continue; // do not bother if previous token is already namespace separator
}

$insertAtIndexes[] = $index;
$tokensToInsert[$index] = new Token([T_NS_SEPARATOR, '\\']);
}

foreach (array_reverse($insertAtIndexes) as $index) {
$tokens->insertAt($index, new Token([T_NS_SEPARATOR, '\\']));
}
$tokens->insertSlices($tokensToInsert);
}

/**
Expand Down
66 changes: 54 additions & 12 deletions src/Tokenizer/Tokens.php
Original file line number Diff line number Diff line change
Expand Up @@ -895,31 +895,73 @@ public function findSequence(array $sequence, $start = 0, $end = null, $caseSens
public function insertAt($index, $items)
{
$items = \is_array($items) || $items instanceof self ? $items : [$items];
$itemsCnt = \count($items);

if (0 === $itemsCnt) {
$this->insertSlices([$index => $items]);
}

/**
* Insert a slices or individual Tokens into multiple places in a single run.
*
* This approach is kind-of an experiment - it's proven to improve performance a lot for big files that needs plenty of new tickets to be inserted,
* like edge case example of 3.7h vs 4s (https://github.com/FriendsOfPHP/PHP-CS-Fixer/issues/3996#issuecomment-455617637),
* yet at same time changing a logic of fixers in not-always easy way.
*
* To be discuss:
* - should we always aim to use this method?
* - should we deprecate `insertAt` method ?
*
* The `$slices` parameter is an assoc array, in which:
* - index: starting point for inserting of individual slice, with indexes being relatives to original array collection before any Token inserted
* - value under index: a slice of Tokens to be inserted
*
* @internal
*
* @param array<int, array<Token>|Token|Tokens> $slices
*/
public function insertSlices(array $slices)
{
$itemsCount = 0;
foreach ($slices as $slice) {
$slice = \is_array($slice) || $slice instanceof self ? $slice : [$slice];
$itemsCount += \count($slice);
}

if (0 === $itemsCount) {
return;
}

$oldSize = \count($this);
$this->changed = true;
$this->blockEndCache = [];
$this->setSize($oldSize + $itemsCnt);
$this->setSize($oldSize + $itemsCount);

krsort($slices);

$insertBound = $oldSize - 1;

// since we only move already existing items around, we directly call into SplFixedArray::offset* methods.
// that way we get around additional overhead this class adds with overridden offset* methods.
for ($i = $oldSize + $itemsCnt - 1; $i >= $index; --$i) {
$oldItem = parent::offsetExists($i - $itemsCnt) ? parent::offsetGet($i - $itemsCnt) : new Token('');
parent::offsetSet($i, $oldItem);
}
foreach ($slices as $index => $slice) {
$slice = \is_array($slice) || $slice instanceof self ? $slice : [$slice];
$sliceCount = \count($slice);

for ($i = 0; $i < $itemsCnt; ++$i) {
if ('' === $items[$i]->getContent()) {
throw new \InvalidArgumentException('Must not add empty token to collection.');
for ($i = $insertBound; $i >= $index; --$i) {
$oldItem = parent::offsetExists($i) ? parent::offsetGet($i) : new Token('');
parent::offsetSet($i + $itemsCount, $oldItem);
}

$this->registerFoundToken($items[$i]);
parent::offsetSet($i + $index, $items[$i]);
$insertBound = $index - $sliceCount;
$itemsCount -= $sliceCount;

foreach ($slice as $indexItem => $item) {
if ('' === $item->getContent()) {
throw new \InvalidArgumentException('Must not add empty token to collection.');
}

$this->registerFoundToken($item);
$newOffset = $index + $itemsCount + $indexItem;
parent::offsetSet($newOffset, $item);
}
}
}

Expand Down
26 changes: 26 additions & 0 deletions tests/Fixer/FunctionNotation/NativeFunctionInvocationFixerTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -215,6 +215,18 @@ public function bar($foo)
[
'<?php

\json_encode($foo);
\strlen($foo);
',
'<?php

json_encode($foo);
strlen($foo);
',
],
[
'<?php

class Foo
{
public function bar($foo)
Expand All @@ -232,6 +244,20 @@ public function bar($foo)
return JSON_ENCODE($foo);
}
}
',
],
'fix multiple calls in single code' => [
'<?php

\json_encode($foo);
\strlen($foo);
\strlen($foo);
',
'<?php
Copy link
Member Author

@keradus keradus Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, unchanged Tokens collection is size of 20.
The fixer has to add 3 new tokens to this collection.

original code has to run offsetSet 45 times, as it has to move all tokens from strlen, then middle strlen, then from json_encode - which means that last strlen was moved 3 times!
my proposal has to run offsetSet only 20 times, as it has to move one token no more than once

Basically, this gives the possibility to reduce O(n) of calling offsetSet
from O(x*y)
to O(x+y)
where x is number of existing tokens and y is number of new tokens to insert


json_encode($foo);
strlen($foo);
strlen($foo);
',
],
];
Expand Down
9 changes: 9 additions & 0 deletions tests/Fixtures/Integration/misc/meta_insert_64566_tokens.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
--TEST--
Test of super huge file that would require 64566 tokens to be inserted! Basically, this tests Tokens::insertSlices, without which this test would take few hours to execute.
Note: requiring PHP 7+ just to skip PHP 5.6 with memory issues for this huge test case.
--RULESET--
{
"whitespace_after_comma_in_array": true
}
--REQUIREMENTS--
{"php": 70000}