feat: Ability to run Fixer with parallel runner 🎉 #7777
OK, let me just merge as-is
Congratulations on 7777! 😎
some early feedback while screening the proposal
Most of the projects I use php-cs-fixer on do not have more than 500 files. Wouldn't 250 per process be a bit of a waste of all the cores available then?
(But maybe because of the max 500 files, I never had speed issues with php-cs-fixer, to be honest :)).
On Tue, 30 Jan 2024 at 23:00, Dariusz Rumiński commented on this pull request, in src/Runner/Parallel/ParallelConfig.php:
> + * This source file is subject to the MIT license that is bundled
+ * with this source code in the file LICENSE.
+ */
+
+namespace PhpCsFixer\Runner\Parallel;
+
+use Fidry\CpuCoreCounter\CpuCoreCounter;
+use Fidry\CpuCoreCounter\Finder\DummyCpuCoreFinder;
+use Fidry\CpuCoreCounter\Finder\FinderRegistry;
+
+/**
+ * @author Greg Korba ***@***.***>
+ */
+final class ParallelConfig
+{
+ private const DEFAULT_FILES_PER_PROCESS = 10;
That's why I was wondering not about using 1000, but 250: still having a chance to fairly distribute problematic files and not have one hanging job while everyone else has finished, but still avoiding the overhead of initialising the full app for only 10 files.
ah, so the timeout is not for passing any msg from worker to supervisor, but for the whole worker to finish the job? if we had the worker reporting each file's progress, would it prevent having big timeouts? (i.e. is it a timeout until we hear anything back from the worker, or a timeout for the worker to finish everything?)
[not judging the direction of next steps, I simply need to understand how this works]
@jorismak default chunk size is 10, so in your case it's ~50 chunks distributed across available workers. But it's configurable (amount of cores, chunk size, timeout) 😉.
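The arithmetic behind "~50 chunks" can be sketched as follows. This is only an illustration using PHP's built-in `array_chunk()`; the `chunkFiles()` helper and the generated file names are hypothetical and are not taken from Fixer's code, whose actual chunking logic is not shown in this thread.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP CS Fixer): splits a file list
 * into chunks of at most $filesPerProcess entries each.
 *
 * @param list<string> $files
 *
 * @return list<list<string>>
 */
function chunkFiles(array $files, int $filesPerProcess): array
{
    if ($filesPerProcess < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    return array_chunk($files, $filesPerProcess);
}

// Assumed project size from the comment above: 500 files.
$files = array_map(
    static fn (int $i): string => sprintf('src/File%03d.php', $i),
    range(1, 500)
);

// Default chunk size mentioned in the thread: 10 files per process.
$chunks = chunkFiles($files, 10);

echo count($chunks), " chunks\n"; // prints "50 chunks"
```

With a chunk size of 250 the same 500 files would yield only 2 chunks, which is the concern raised above for machines with more than two cores.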
cool. I replied on a comment to set it to 250 and I was thinking 'oh
but...' :).
Configurable is of course the best answer.
Looks awesome!
Do you think it would be possible to implement high level tests?
@julienfalque thank you very much for the review ❤️. I'll address your comments soon. In terms of tests, I thought about running separate job in CI workflow that would run Fixer in Docker (using |
- handled process errors (linting, failed fixes etc.) are now collected with ~original exception as source when possible. These don't break the main process unless `--stop-on-violation` is used. It was like that before too, but `ParallelisationException` was used for recreating the error's source, which was not correct.
- other errors (socket in/out errors caught by React, or unhandled errors that kill the worker) are re-thrown in the main process, which effectively stops further execution
}

if (ParallelAction::WORKER_ERROR_REPORT === $workerResponse['action']) {
    throw WorkerException::fromRaw($workerResponse); // @phpstan-ignore-line
side note: yes, we pass `action` to `fromRaw()`. if we would have `{action: string, payload: datamodel}`, we would be able to pass data model only ;P
Incredible work. I'm very happy to merge this in current format.
I acknowledge that it took me a while to be able to review it. I'm happy with outcome we built.
Awesome work @Wirone! Thanks to everyone who contributed!
Thank you too @keradus and @julienfalque for reviewing the code and providing feedback! It took quite some time, but the fixes done after each review iteration were worth it, because the final implementation is much better than the initial one 😁. @keradus found my really deeply-hidden misconceptions and mistakes and I am really grateful for that, even if it could have been frustrating at times 😅. Happy to see it merged, it's a great day for the PHP community 🥳!
So massive; with empty caches in both cases:

Before:

$ time composer php-cs-fixer
> @php ./bin/php-cs-fixer.phar fix --diff
PHP CS Fixer 3.57.1 (3810546) 7th Gear by Fabien Potencier, Dariusz Ruminski and contributors.
PHP runtime: 8.3.7
Running analysis on 1 core sequentially.
You can enable parallel runner and speed up the analysis! Please see usage docs for more information.
Loaded config default from ".php-cs-fixer.dist.php".
6474/6474 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%
Fixed 0 of 6474 files in 346.230 seconds, 57.016 MB memory used
real 5m46.724s
user 5m39.636s
sys 0m1.490s

Now:

$ time composer php-cs-fixer
> @php ./bin/php-cs-fixer.phar fix --diff
PHP CS Fixer 3.57.1 (3810546) 7th Gear by Fabien Potencier, Dariusz Ruminski and contributors.
PHP runtime: 8.3.7
Running analysis on 12 cores with 10 files per process.
Parallel runner is an experimental feature and may be unstable, use it at your own risk. Feedback highly appreciated!
Loaded config default from ".php-cs-fixer.dist.php".
6474/6474 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%
Fixed 0 of 6474 files in 38.130 seconds, 27.016 MB memory used
real 0m38.647s
user 7m8.250s
sys 0m2.629s

(of course this is a machine with lots-o-cores, so)

This will also make fine-tuning the config on big code bases bearable! This is not hypothetical: when you have to wait 5+ minutes on fast machines, you think twice before touching the config 😅
Am I right that the deps need to be added manually currently? Our situation is the following. We have many repos with CI. What we need is either for this to work out of the box (w/o any dep or config to be added) or to be able to pass some CLI argument like
@mvorisek it can't be enabled with CLI option, it must be added to the config. Decision was to continue with sequential analysis by default and switch to CPU autodetection in v4. PHAR and shim follow it too. But it uses autodetection when "future mode" is enabled or Support for CLI options can be added, I just did not want to do it in this PR which was big enough already. |
@mvorisek you don't need to modify |
That is great, thank you, @Wirone 🎉🎉
Parallel Runner
Fixer is a great and widely used tool, but compared to other modern PHP tools it lacked one crucial thing: the ability to utilise more CPU cores. Until now 🥳!
I've managed to hook into the current runner and provide parallel analysis, heavily inspired by PHPStan's implementation. By default Fixer still uses sequential analysis, but the parallel runner can be easily enabled through config with `->setParallelConfig(\PhpCsFixer\Runner\Parallel\ParallelConfigFactory::detect())` (with core auto-detection) or `->setParallelConfig(new \PhpCsFixer\Runner\Parallel\ParallelConfig(5, 20))` (explicit config).

Fixes #2803
ℹ️ If you like this change, consider following me and/or sponsoring my OSS work 😎.
Test it on your code!
Just add this to your config (assuming you're using default config builder):
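A minimal sketch of such a config, assuming a standard `.php-cs-fixer.dist.php` built with the default `Config` builder. Only the two `setParallelConfig()` calls come from the description above; the finder and the rule set are illustrative assumptions, so adapt them to your project.

```php
<?php

use PhpCsFixer\Config;
use PhpCsFixer\Finder;
use PhpCsFixer\Runner\Parallel\ParallelConfig;
use PhpCsFixer\Runner\Parallel\ParallelConfigFactory;

// Assumption: all PHP files live under the project root.
$finder = (new Finder())->in(__DIR__);

return (new Config())
    // Assumption: an example rule set, not prescribed by this PR.
    ->setRules(['@PSR12' => true])
    // Option 1: let Fixer auto-detect the number of available CPU cores.
    ->setParallelConfig(ParallelConfigFactory::detect())
    // Option 2 (use instead of option 1): explicit values, here
    // 5 processes with 20 files per process, as in the description above.
    // ->setParallelConfig(new ParallelConfig(5, 20))
    ->setFinder($finder);
```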
and then you have 2 options:
Docker image
docker run --rm -it -v $(pwd):/code wirone/php-cs-fixer:parallel check
Image is multi-arch, so it should work on any kind of hardware/software. Let me know if you have any problems.
Override Composer package with a fork
Modify your `composer.json`:
and run `composer update friendsofphp/php-cs-fixer -w`
Concern: more dependencies
Everything comes at some cost; we can't achieve parallel analysis with our internal code only. I mean, we could, but it does not make sense 😅. I managed to lower Composer constraints for ReactPHP packages so these should be compatible everywhere (or at least almost). For example `react/promise` v2.6 or `react/socket` v1.0 (installed on PHP 7.4 with `--prefer-lowest`) are from 2018. All these packages support PHP >=5.3, so I believe they should not cause any issues when it comes to compatibility with people's runtimes and apps.
CPU core auto-detection
Auto-detection works properly, at least for cases I could test locally (native execution on the host, execution in Docker with limited CPU cores):
As you can see in the CI, it also works properly in GitHub Actions, where it detects 4 CPUs, which also speeds up all the Fixer jobs 🙂.
TODO
I wanted to provide this change as a draft to collect the feedback - both technical from the review, but also from users' perspective (UX, performance, potential problems).
PR is marked as draft to prevent merge, but review can be done, having this in mind:
`xargs`-based references
Real world impact
I made some workbench tests and below you can find the numbers for sequential and parallel runs for several projects. Analysis for external projects was done with locally built Docker image containing code from this branch, with parallel auto-detection (effectively 7 cores on MacBook Pro M1, because I have limit set on OrbStack level), and code mounted as a volume:
friendsofphp/php-cs-fixer
180.956 seconds, 152.279 respectively before iterator fix
symfony/symfony
CuyZ/Valinor
(info)