Skip to content
This repository has been archived by the owner on Dec 3, 2023. It is now read-only.

[CodingStandard] add ArrayPropertyDefaultValueFixer #165

Merged
merged 14 commits into from
Jun 29, 2017

Conversation

TomasVotruba
Copy link
Member

@TomasVotruba TomasVotruba commented May 5, 2017

This fixer checks for undefined default values for arrray properties.

class SomeClass
{
    /**
     * @var int[]
     */
    public $property;
}

That would fail on:

foreach ($this->property as ...) {

}

@todo

composer.json Outdated
@@ -67,7 +67,8 @@
"phpstan/phpstan-nette": "^0.6",
"phpstan/phpstan-doctrine": "^0.6",
"object-calisthenics/phpcs-calisthenics-rules": "v3.0.0-RC3",
"roave/security-advisories": "dev-master"
"roave/security-advisories": "dev-master",
"gecko-packages/gecko-php-unit": "^2.0"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, you are using it directly ? :D

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I've to update that package soon for new PHPUnit ;)

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required by test case and not included in original package :( should be, to prevent such additions.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong place. It should be in require section, as AbstractFixerTestCase in /src uses it.

Now I realize I didn't understand your surprised comment initially :). No, I'm not using it directly. It's missed dependency, that is required by PHP-CS-Fixer, but forgotten there.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, nice catch ;)
then it would be better to fix it in original repo instead of making workarounds :D let me fix it for you

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keradus Thanks! I can finally remove it

use PhpCsFixer\Tokenizer\Tokens;
use SplFileInfo;

final class ArrayPropertyDefaultValueFixer extends AbstractFixer
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't ever rely on internal class of external library

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend implementing FixerInterface and not extend the internal abstract. The interface is not that complex and I'm happy to help out out it if needed.
Also; we keep a BC promise on the interface, which we do not for internal classes. So its the safer bet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try, thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SpacePossum Works fine 👍


private function addDefaultValueForArrayProperty(Token $semicolonToken): void
{
$semicolonToken->setContent(' = [];');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is violation and root cause of your problem reported by Travis.

Basically, token_get_all return an array collection of tokens.
All of them are either array (eg [T_CLASS, 'class', _line_number_]) or primitive string (eg ;).
At PHP CS Fixer, we wrap that array collection in Tokens class and single token (which we call token prototype) in Token (to normalize access across array/string token).

Token must be a valid representation of single token in language parsing terminology.
Here, you try to put 6 tokens in one instance (space, =, space, [, ], ;). That's abuse of token representation and that's why it cause you an issue.
Instead of overriding token with a lot of code, you need to inject new tokens into collection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And solution?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly what I've tried before, but doesn't work: 73cffaf

Copy link

@keradus keradus May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was the issue ?
oh, that's new commit, Travis will uncover the issue

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do a few things here;

  • remove the token and insert new ones
  • override the token and after that insert new ones (or in other order)
  • override the token and insert new ones (in one go)

It doesn't matter which one you pick (not much at least), the important part is that each token is added to the Tokens collection and the one you no longer want is removed.

In this case I would do something like;

    private function addDefaultValueForArrayProperty(Tokens $tokens, int $semicolonTokenIndex): void
    {
        $tokens[$semicolonTokenIndex]->clear();
        $tokens->insertAt($semicolonTokenIndex, new Token([T_WHITESPACE, ' ']));
        $tokens->insertAt($semicolonTokenIndex, new Token('='));
        $tokens->insertAt($semicolonTokenIndex, new Token([T_WHITESPACE, ' ']));
        $tokens->insertAt($semicolonTokenIndex, new Token([CT::T_ARRAY_SQUARE_BRACE_OPEN, '[']));
        $tokens->insertAt($semicolonTokenIndex, new Token([CT::T_ARRAY_SQUARE_BRACE_CLOSE, ']']));
    }

note: CT:: is in namespace namespace PhpCsFixer\Tokenizer
note 2: didn't test the code ;)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note 3: I missed the last token for ;, but its more of the same ;)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one small improvement.
adding Token instances has the biggest complexity in Tokens collection.

$tokens->insertAt(
    $semicolonTokenIndex,
    [ array of `Token` instances ]
);

would be 5 times faster

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note 3: I missed the last token for ;, but its more of the same ;)

then... no need to remove ; first ;) it's the token you cleared on start.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try this, thank you!

$semicolonToken->setContent(' = [];');
$tokens->insertAt($semicolonPosition, new Token(']'));
$tokens->insertAt($semicolonPosition, new Token('['));
$tokens->insertAt($semicolonPosition, new Token(' '));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so... this is not a valid lexem.
new Token([T_WHITESPACE, ' ']) as shown in PHPUnit failure log

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the error and would have no idea that should be solved like this.

Also token accepts string|array, confusing.

Thanks, will try.

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still something wrong: a32fba2#diff-77b0aab940d6b4e15bf451af42434261R97

I feels like journey to Mount Doom to add just few characters. Could we do something about that? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting lost more and more and don't understand any error PHPUnit or Travis shows. Yes, I know about them, still don't how-to-fix them..

Could you please write exact code I should put there to make it work? I would really aprecciate it!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sadly, that's how internal PHP's parser works

feel free to propose how to handle it in nicer way

waiting for travis for new issue it will reveal

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomasVotruba , you cannot replace one token with content of multiple tokens...

Next, you cannot simply inject string, as string is pointless without context. you need to provide context-specified values, so you provide tokens.
if not, what is class content? what it represents? info that one is just about creating new class definition, or he is getting class name via Foo::class syntax ?

possible workaround:

$new = Tokens::fromCode("<?php $mySuperContent=[];");
$new->clearRange(...); // drop initial code, like php tag opening

$tokens->insertAt($index, $new);

But... parsing the code is slow. and you are not inserting random content - you know what you want to insert.

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomasVotruba , you cannot replace one token with content of multiple tokens...

I know, that's what the method would do.

  • get a code, in this case = []
  • parse it to tokens
  • add those tokens before $position

Something you wrote, just not in 3 lines, but under one clear method.

Now there are many WTFs and hidden knowledge (you know as author, I don't user) like:

  • why backwards?
  • where to find tokens?
  • why some raw, some in CT and some in string?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still, it would be a method at Tokens, not at Token.
originally we had solution that works similar to what you propose internally (it wasn't nice for dev, but it done what you are saying). by removing it we gained ~50% performance boost.

about knowledge - I agree that sth may not be clear, if you are able to improve docs - go ahead !

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I enter this one a bit late, but I would like to point out one of the "difficulties" we face with tokens. I get it would be nice to override some tokens with new code in string form.
In your example you mention something like [ ], lets say I would like to replace a token with some like [$a].
Within PHP this can mean a lot of things, like;
$a[$a]; -> array index
[$a] = b(); -> array destruction on PHP 7.1 (CT tokens as PHP does not tokenize different)
b = [$a]; -> new array with element one element; $a

In order to figure out how this code-string inserted should be tokenized to new tokens the context is needed or a lot back tracking has to be done. Therefor we did the trade off for performance over complex, slow and hard to maintain code.

So yes, dealing with Tokens does require some background knowledge about how PHP does it and how we worked around some of it within the fixer. This might not be perfect or under documented, so feel free to contribute or talk to us, as documenting it all is just to much we can handle atm.

Copy link
Member Author

@TomasVotruba TomasVotruba May 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's never too late to improve the code 👍
I don't want to replace anything in this case, just add some code.

I realize the fixer is complicated, so just this should clear the intent


public function fix(SplFileInfo $file, Tokens $tokens): void
{
foreach ($tokens as $index => $token) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding element to collection you are traversing forward is not the best idea, as you may end up with not visiting all of the elements:
https://3v4l.org/YBaFj
iterate backward instead

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you point out what si wrong. Could you also add solution right to your comment? would bring me much more value.

So, how?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterate backward instead

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace

foreach ($tokens as $index => $token) {

with something like

for ($index = count($tokens) - 1; $index > 1; --$index) {
    $token = $tokens[$index];
}

as long as you insert tokens after the used $index you'll no longer miss candidates

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to iterate backward?

@SpacePossum Thanks!

Why is that not used in other fixers then?
Is that really relevant now? All works fine.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is used in many fixers
it's not used when it's not needed (eg when fixer is not adding new content)

it is. your scenario works as you have it very small and few-elements offset is not big issue, as it means you not trying to fix end of script, which is usually class closing.

Imagine you would have 100 elements to fix in big crappy file. then fixer would fix only first ~80 properties

Copy link

@SpacePossum SpacePossum May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some fixers do this, typically fixers that add tokens at some point.
It tends to become an issue when adding tokens while there are more candidates behind it.
For utest we add multiple fix candidates for this after each other.
In a abstract way, lets say we replace A with BC in;
XA => XBC; all good
XAYA => XBCYA ; because we stopped iterating at Y (after first fix), expected XBCYBC final result is.
Doing the reverse will not cause this issue;
XAYA => XAYBC (after first fix), XBCYBC (after second)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you guys, I will try the reverse loop

continue;
}

$semicolonSignOrArrayOpenerPosition = $tokens->getNextMeaningfulToken($index + 4);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is one issue. jumping by arbitrary +4 is causing the requested code to be added twice.
i have no clue how you ended up with this magic value.

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to found out, if there is = [] or ; defined with the property.
How would you solve it?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. don't use magic offsets
  2. use getNextTokenOfKind to search for T_VARIABLE

but that's the solution for your assumption that docblock is put in correct place

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine.


$semicolonSignOrArrayOpenerPosition = $tokens->getNextMeaningfulToken($index + 4);
$semicolonSignOrArrayOpenerToken = $tokens[$semicolonSignOrArrayOpenerPosition];
if ($semicolonSignOrArrayOpenerToken->equals('[')) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next bug.

$foo = [];  // case 1
$foo = array(); // case 2
$foo = null; // case 3
$foo; // case 4

so what actually you wanna detect?
case 4.

what you are detecting? nothing. this condition is always false. as [ is array index opening token.
as you saw previously, short array opening token is CT::T_ARRAY_SQUARE_BRACE_OPEN

instead of trying to guess that there is no need for fixing, check the opposite - is there a need for fix:

if (! $tokenAfterVariable->equals(';')) {
    $this->insert...
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome answer. That fixed the issue, you're the man!

Thanks a lot.

/**
* @var int[]
*/
public $property;
Copy link

@keradus keradus May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add utest with mix of visibility/final/static properties

also, be aware that one could omit visibility attribute

Copy link
Member Author

@TomasVotruba TomasVotruba May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, just wanted to run at least the simplest use case first.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, i guess I just to much used to TDD

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PHP-CS-Fixer and TDD doesn't combine to well. There are missing many explicit exceptions.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then... collaborate ;)

/**
* @var int[]
*/
public $property = [];
Copy link

@keradus keradus May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be aware of crappy code that one could make:

/**
 * @var int[]
 */
function count_my_array(array $array) { ... }

your code will crash on this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a another fixer for that.

But...

How to include it in the Fixer? Not sure how to detect, if is property on next line or not.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Case I presented is different - your fixer shall not touch that code, as one put documentation for properties ('@var' instead of '@param') but he tried to document a function.
You must not assume the documentation is valid, put in correct place and has meaning.

with current pattern, when you have docblock, see what is the next token - is it function related code or var related code, but that's tricky as it could be final public static... and then function or $var....

I would rather change your main loop to not search for docblock token, but for T_VARIABLE, and then you are sure that code you gonna modify would be a property, not function.
Then, having properties, you could search for ; or docblock

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather change your main loop to not search for docblock token, but for T_VARIABLE, and then you are sure that code you gonna modify would be a property, not function.
Then, having properties, you could search for ; or docblock

That was my first initial trial, but I was not able to find the docblock for property. findGivenKind() was throwing some errors I didn't understand.

I will look on that now.

Copy link

@SpacePossum SpacePossum May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test the meaningful tokens after the PHPDoc;
possible modifiers for properties are:

  • T_VAR
  • visibility token (T_PRIVATE, T_PUBLIC, T_PROTECTED)
  • T_PROTECTED

followed by

  • T_VARIABLE

if this does not apply than your are looking at either a method or class constant.

Copy link

@keradus keradus May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or grab variable and test the meaningful tokens before it (yes, you could search backward). all the same except variable would be replaced with docblock.

I would strongly advice to investigate T_VARIABLE's neighborhood as the fixer is trying to fix it. not make everything dependent on docblock, which may even not be for variable.

@keradus
Copy link

keradus commented May 6, 2017

ok, I think that we make pretty deep analysis with @SpacePossum ;P
going sleep ;)

@TomasVotruba
Copy link
Member Author

TomasVotruba commented May 6, 2017

Thanks you guys ❤️ , me too.

I have some ideas how to improve this process thanks to you 👍

GN

keradus added a commit to PHP-CS-Fixer/PHP-CS-Fixer that referenced this pull request May 9, 2017
This PR was merged into the 2.2 branch.

Discussion
----------

GeckoPHPUnit is not dev dependency

deprecated-packages/symplify#165 (comment)

Commits
-------

9ef65af GeckoPHPUnit is not dev dependency
@TomasVotruba TomasVotruba changed the title [CodingStandard] complete README and few checkers [CodingStandard] add ArrayPropertyDefaultValueFixer Jun 29, 2017
@TomasVotruba
Copy link
Member Author

Thanks for feedback, buys.

I've added few changes and made it work for normal and static properties.

I merged this, because it drags too long and I couldn't remember my work weeks ago :(

Saying that, any additional feedback is appreciated.

@TomasVotruba TomasVotruba merged commit 9390e1a into master Jun 29, 2017
@TomasVotruba TomasVotruba deleted the coding-standard-fixers branch June 29, 2017 17:07
@SpacePossum
Copy link

Nice to see this merged, I would say your fixer is good for what you want to do, so it nice work done : D
Let this be a two way street, if you want me to have a second look on something for refinement or whatever, just ping me 👍

@TomasVotruba
Copy link
Member Author

Thanks for your affirmation, it means a lot to me from you.

Thanks for your feedback offer as well. I will definitely use it. I'd rather wait for next fixer, so it motivates me to do it sooner :)

@TomasVotruba TomasVotruba mentioned this pull request Jul 8, 2017
13 tasks
@deprecated-packages deprecated-packages locked as resolved and limited conversation to collaborators Oct 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants