Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"binary_operator_spaces" and "whitespace_after_comma_in_array" rules are extremely slow #4656

Closed
mvorisek opened this issue Nov 22, 2019 · 6 comments

Comments

@mvorisek
Copy link
Contributor

mvorisek commented Nov 22, 2019

The PHP version you are using ($ php -v):

=> 7.3

PHP CS Fixer version you are using ($ php-cs-fixer -V):

=> 2.16.0

All rules of the CS fixer completes within 10 - 20 seconds (tested all @PSR1, @PSR2 and @Symfony one by one), except:

    'whitespace_after_comma_in_array' => true,
    'binary_operator_spaces' => true,

the whitespace_after_comma_in_array rule completed in ~17 minutes, the binary_operator_spaces takes currently forever (more than 4 hours, actually never completed).
on very large arrays formatted like this:

array(1=>2, 3=>4,...)

notice the missing spaces. If the file is preprocessed by regular expression, i.e. "=>" converted to " => " (with one space on each side), then the rules completes also very fast.

Test file:

https://github.com/tecnickcom/TCPDF/blob/6.3.2/fonts/cid0ct.php
@mvorisek
Copy link
Contributor Author

Is here any core developer of the CS Fixer that knows the fixer internals and how to fix this effectively? Large arrays with bad syntax are very common if some maps like this are generated by code.

@SpacePossum
Copy link
Contributor

Hi and thanks for reporting!

I know a bit or two about the binary_operator_spaces fixer and it has some serious issues, however refactoring it will be hard as well.

However, one of the biggest issues is not just with that fixer but the Tokens collection. Inserting into a large Tokens collection triggers a lot of array coping with all the memory and speed issues that comes with that.

That is why the speed of the fixer is OK as long as it doesn't touches the tokens.

There are details here #3996 as well and a proposed fix that might be of interest for you #4250

@mvorisek
Copy link
Contributor Author

Hi @SpacePossum , thank you for your message!

I meant if there can be done an easy fix especially for large arrays, i.e. traverse 1 array level at once, fix the elements and then replace it with the results. Does it makes some sense?

@SpacePossum
Copy link
Contributor

It is worth checking if there is such a fix doable!

However there are some things to keep in mind like usage of multi line comments and spacing, heredoc, nowdoc, anonymous classes, lamdas, etc. within the array. All the fun stuff :)

@mvorisek
Copy link
Contributor Author

mvorisek commented Nov 17, 2020

Here are stats after running only the whitespace_after_comma_in_array rule on https://github.com/tecnickcom/TCPDF/tree/6.3.5 fonts folder (php files with size 100 - 1500 KB):

Fixed all files in 48181.029 seconds, 522.000 MB memory used

@mvorisek
Copy link
Contributor Author

duplicate of #5279 and probably improved by #4250

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants