Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Tokens::insertSlices not moving around all affected tokens #6058

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 15 additions & 10 deletions src/Tokenizer/Tokens.php
Expand Up @@ -842,6 +842,7 @@ public function insertAt(int $index, $items): void
public function insertSlices(array $slices): void
{
$itemsCount = 0;

foreach ($slices as $slice) {
$slice = \is_array($slice) || $slice instanceof self ? $slice : [$slice];
$itemsCount += \count($slice);
Expand All @@ -859,20 +860,28 @@ public function insertSlices(array $slices): void

krsort($slices);

if (array_key_first($slices) > $oldSize) {
throw new \OutOfBoundsException('Cannot insert outside of collection.');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced it's good validation, as it's checking only index for first slice. And it looks like there are not tests checking it? :(

ref #6172

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we sort the keys above it, so we know all other keys are of a lower value so we only need to test the first,
all others are tested to be greater or equal to zero

}

$insertBound = $oldSize - 1;

// since we only move already existing items around, we directly call into SplFixedArray::offset* methods.
// that way we get around additional overhead this class adds with overridden offset* methods.
foreach ($slices as $index => $slice) {
if (!\is_int($index) || $index < 0) {
throw new \OutOfBoundsException(sprintf('Invalid index "%s".', $index));
}

$slice = \is_array($slice) || $slice instanceof self ? $slice : [$slice];
$sliceCount = \count($slice);

for ($i = $insertBound; $i >= $index; --$i) {
$oldItem = parent::offsetExists($i) ? parent::offsetGet($i) : new Token('');
parent::offsetSet($i + $itemsCount, $oldItem);
parent::offsetSet($i + $itemsCount, parent::offsetGet($i));
}

$insertBound = $index - $sliceCount;
// adjust $insertBound as tokens between this index and the next index in loop
$insertBound = $index - 1;
$itemsCount -= $sliceCount;

foreach ($slice as $indexItem => $item) {
Expand All @@ -881,8 +890,8 @@ public function insertSlices(array $slices): void
}

$this->registerFoundToken($item);
$newOffset = $index + $itemsCount + $indexItem;
parent::offsetSet($newOffset, $item);

parent::offsetSet($index + $itemsCount + $indexItem, $item);
}
}
}
Expand All @@ -892,11 +901,7 @@ public function insertSlices(array $slices): void
*/
public function isChanged(): bool
{
if ($this->changed) {
return true;
}

return false;
return $this->changed;
Comment on lines -895 to +904
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i remember the idea here was to have different lines for different paths, so we can have explicit code coverage not just by one-liner covered, but to ensure both paths are covered for this crucial core-level functionality

@SpacePossum , should we move back to multiline approach, perhaps?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow, what is the benefit of higher LOC's here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i remember initial reasoning. we wanted to ensure, initially, that we don't test only cases like "this->changed=true"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh like in the coverage it won't show if both true and false and checked.... that is true. I value a more direct and compact notation even if that means a bit less verbose coverage report.

}

public function isEmptyAt(int $index): bool
Expand Down
184 changes: 184 additions & 0 deletions tests/Tokenizer/TokensTest.php
Expand Up @@ -1375,6 +1375,190 @@ public function provideGetMeaningfulTokenSiblingCases(): \Generator
yield [null, 0, -1, '<?php /**/ foo();'];
}

/**
* @dataProvider provideInsertSlicesAtMultiplePlacesCases
*
* @param array<int, Token> $slices
*/
public function testInsertSlicesAtMultiplePlaces(string $expected, array $slices): void
{
$input = <<<'EOF'
<?php

$after = get_class($after);
$before = get_class($before);
EOF;

$tokens = Tokens::fromCode($input);
$tokens->insertSlices([
16 => $slices,
6 => $slices,
]);

static::assertSame($expected, $tokens->generateCode());
}

public function provideInsertSlicesAtMultiplePlacesCases(): \Generator
{
yield 'one slice count' => [
<<<'EOF'
<?php

$after = get_class($after);
$before = get_class($before);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this case is wrong, we must avoid having whitespace next to whitespace as 2 separated tokens!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somehow, in line 1398, we check only text-representation of collection. this is wrong, as to whitespace token should not be place one to each other (new Token([T_WHITESPACE, ' ']), new Token([T_WHITESPACE, ' ']))
after strong check of assertTokens, this case is failing d865593

EOF
,
[new Token([T_WHITESPACE, ' '])],
];

yield 'two slice count' => [
<<<'EOF'
<?php

$after = (string) get_class($after);
$before = (string) get_class($before);
EOF
,
[new Token([T_STRING_CAST, '(string)']), new Token([T_WHITESPACE, ' '])],
];

yield 'three slice count' => [
<<<'EOF'
<?php

$after = !(bool) get_class($after);
$before = !(bool) get_class($before);
EOF
,
[new Token('!'), new Token([T_BOOL_CAST, '(bool)']), new Token([T_WHITESPACE, ' '])],
];
}

public function testInsertSlicesChangesState(): void
{
$tokens = Tokens::fromCode('<?php echo 1234567890;');

static::assertFalse($tokens->isChanged());
static::assertFalse($tokens->isTokenKindFound(T_COMMENT));
static::assertSame(5, $tokens->getSize());

$tokens->insertSlices([1 => new Token([T_COMMENT, '/* comment */'])]);

static::assertTrue($tokens->isChanged());
static::assertTrue($tokens->isTokenKindFound(T_COMMENT));
static::assertSame(6, $tokens->getSize());
}

/**
* @dataProvider provideInsertSlicesCases
*/
public function testInsertSlices(Tokens $expected, Tokens $tokens, array $slices): void
{
$tokens->insertSlices($slices);
static::assertTokens($expected, $tokens);
}

public function provideInsertSlicesCases(): iterable
{
// basic insert of single token at 3 different locations including appending as new token

$template = "<?php\n%s\n/* single token test header */%s\necho 1;\n%s";
$commentContent = '/* test */';
$commentToken = new Token([T_COMMENT, $commentContent]);
$from = Tokens::fromCode(sprintf($template, '', '', ''));

yield 'single insert @ 1' => [
Tokens::fromCode(sprintf($template, $commentContent, '', '')),
clone $from,
[1 => $commentToken],
];

yield 'single insert @ 3' => [
Tokens::fromCode(sprintf($template, '', $commentContent, '')),
clone $from,
[3 => Tokens::fromArray([$commentToken])],
];

yield 'single insert @ 9' => [
Tokens::fromCode(sprintf($template, '', '', $commentContent)),
clone $from,
[9 => [$commentToken]],
];

// basic tests for single token, array of that token and tokens object with that token

$openTagToken = new Token([T_OPEN_TAG, "<?php\n"]);
$expected = Tokens::fromArray([$openTagToken]);

$slices = [
[0 => $openTagToken],
[0 => [$openTagToken]],
[0 => Tokens::fromArray([$openTagToken])],
];

foreach ($slices as $i => $slice) {
yield 'insert open tag @ 0 into empty collection '.$i => [$expected, new Tokens(), $slice];
}

// test insert lists of tokens, index out of order

$setOne = [
new Token([T_ECHO, 'echo']),
new Token([T_WHITESPACE, ' ']),
new Token([T_CONSTANT_ENCAPSED_STRING, '"new"']),
new Token(';'),
];

$setTwo = [
new Token([T_WHITESPACE, ' ']),
new Token([T_COMMENT, '/* new comment */']),
];

$setThree = Tokens::fromArray([
new Token([T_VARIABLE, '$new']),
new Token([T_WHITESPACE, ' ']),
new Token('='),
new Token([T_WHITESPACE, ' ']),
new Token([T_LNUMBER, '8899']),
new Token(';'),
new Token([T_WHITESPACE, "\n"]),
]);

$template = "<?php\n%s\n/* header */%s\necho 789;\n%s";
$expected = Tokens::fromCode(
sprintf(
$template,
'echo "new";',
' /* new comment */',
"\$new = 8899;\n"
)
);
$from = Tokens::fromCode(sprintf($template, '', '', ''));

yield 'insert 3 token collections' => [$expected, $from, [9 => $setThree, 1 => $setOne, 3 => $setTwo]];

$sets = [];

for ($j = 0; $j < 4; ++$j) {
$set = ['tokens' => [], 'content' => ''];

for ($i = 0; $i < 10; ++$i) {
$content = sprintf('/* new %d|%s */', $j, $i);

$set['tokens'][] = new Token([T_COMMENT, $content]);
$set['content'] .= $content;
}

$sets[$j] = $set;
}

yield 'overlapping inserts of bunch of comments ' => [
Tokens::fromCode(sprintf("<?php\n%s/* line 1 */\n%s/* line 2 */\n%s/* line 3 */%s", $sets[0]['content'], $sets[1]['content'], $sets[2]['content'], $sets[3]['content'])),
Tokens::fromCode("<?php\n/* line 1 */\n/* line 2 */\n/* line 3 */"),
[1 => $sets[0]['tokens'], 3 => $sets[1]['tokens'], 5 => $sets[2]['tokens'], 6 => $sets[3]['tokens']],
];
}

private static function assertFindBlockEnd(int $expectedIndex, string $source, int $type, int $searchIndex): void
{
Tokens::clearCache();
Expand Down