feat: Ability to run Fixer with parallel runner 🎉 #7777

Merged
merged 77 commits on May 15, 2024

Member

@Wirone Wirone commented Jan 24, 2024

Parallel Runner

Fixer is a great and widely used tool, but compared to other modern PHP tools it lacked one crucial thing: the ability to utilise more CPU cores. Until now 🥳!

I've managed to hook into the current runner and provide parallel analysis, heavily inspired by PHPStan's implementation. By default Fixer still uses sequential analysis, but the parallel runner can be easily enabled through the config with ->setParallelConfig(\PhpCsFixer\Runner\Parallel\ParallelConfigFactory::detect()) (with core auto-detection) or ->setParallelConfig(new \PhpCsFixer\Runner\Parallel\ParallelConfig(5, 20)) (explicit config).
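For context, here is a minimal sketch of where that call could sit in a complete config file. The file name, the Finder path and the rule set are placeholders chosen for illustration, not taken from this PR; only the two setParallelConfig() variants come from the description above. The meaning of the two integer arguments (presumably number of processes and files per process) is an assumption.

```php
<?php
// .php-cs-fixer.dist.php - illustrative sketch only.
// The Finder path and the rule set below are placeholders.

use PhpCsFixer\Config;
use PhpCsFixer\Finder;
use PhpCsFixer\Runner\Parallel\ParallelConfig;
use PhpCsFixer\Runner\Parallel\ParallelConfigFactory;

$finder = Finder::create()->in(__DIR__ . '/src'); // assumed project layout

// Variant A: let Fixer detect the number of available cores.
$parallelConfig = ParallelConfigFactory::detect();

// Variant B: explicit values, as in the description above.
// Assumption: the arguments are (processes, files per process).
// $parallelConfig = new ParallelConfig(5, 20);

return (new Config())
    ->setRules(['@PSR12' => true]) // placeholder rule set
    ->setFinder($finder)
    ->setParallelConfig($parallelConfig);
```

Without the setParallelConfig() call, the run stays sequential, which is the default described above.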

Fixes #2803

ℹ️ If you like this change, consider following me and/or sponsoring my OSS work 😎.

Test it on your code!

Just add this to your config (assuming you're using default config builder):

->setParallelConfig(\PhpCsFixer\Runner\Parallel\ParallelConfigFactory::detect())

and then you have 2 options:

Docker image

docker run --rm -it -v $(pwd):/code wirone/php-cs-fixer:parallel check

Image is multi-arch, so it should work on any kind of hardware/software. Let me know if you have any problems.

Override Composer package with a fork

Modify your composer.json:

{
    "repositories": [
        {
            "type": "vcs",
            "url": "https://github.com/Wirone/PHP-CS-Fixer"
        }
    ],
    "require": {
        "friendsofphp/php-cs-fixer": "dev-codito/bombazo as 3.7777"
    }
}

and run composer update friendsofphp/php-cs-fixer -w

Concern: more dependencies

Everything comes at some cost: we can't achieve parallel analysis with our internal code only. I mean, we could, but it does not make sense 😅. I managed to lower the Composer constraints for the ReactPHP packages, so these should be compatible everywhere (or at least almost). For example, react/promise v2.6 or react/socket v1.0 (installed on PHP 7.4 with --prefer-lowest) are from 2018. All these packages support PHP >=5.3, so I believe they should not cause any compatibility issues with people's runtimes and apps.

CPU core auto-detection

Auto-detection works properly, at least for cases I could test locally (native execution on the host, execution in Docker with limited CPU cores):

image

As you can see in the CI, it also works properly in GitHub Actions, where it detects 4 CPUs, which also speeds up all the Fixer jobs 🙂.

TODO

I wanted to provide this change as a draft to collect feedback, both technical from the review and from the users' perspective (UX, performance, potential problems). The PR is marked as draft to prevent merge, but review can be done with this in mind:

  • Remove BC break before continuing
  • Tests. I did not write them because I wasn't sure what the final contract would look like.
  • Handling errors in worker (displaying them on the main process' side)
  • Proper cache support
  • Check required ReactPHP versions, maybe we can lower the constraints and make installation more inclusive (for projects that already use Fixer and ReactPHP with lower version)
  • Usage docs
  • Remove quasi-parallel script from the Composer scripts and any xargs-based references
  • Resolve this list

Real world impact

I ran some benchmarks, and below you can find the numbers for sequential and parallel runs for several projects. Analysis of external projects was done with a locally built Docker image containing code from this branch, with parallel auto-detection (effectively 7 cores on a MacBook Pro M1, because I have a limit set at the OrbStack level), and code mounted as a volume:

docker build --target dist -t fixer:local .
cd /path/to/project/for/analyse
docker run --rm -it -v $(pwd):/code fixer:local check -vvv

| Repository | Files count | Sequential | Parallel |
|---|---|---|---|
| friendsofphp/php-cs-fixer | 1080 | 65.319 seconds | 8.088 seconds (11.623 before iterator fix) |
| GetResponse (with `@PER-CS2.0` ruleset) | 31376 | 518.740 seconds | 75.879 seconds (7 cores in Docker), 76.262 (10 cores natively on host); 180.956 seconds, 152.279 respectively before iterator fix |
| symfony/symfony | 6220 | 183.977 seconds | 35.440 seconds (64.352 before iterator fix) |
| CuyZ/Valinor (info) | 605 | 6.369 seconds | 1.959 seconds (4.785 before iterator fix) |

@Wirone Wirone self-assigned this Jan 24, 2024
@coveralls

coveralls commented Jan 24, 2024

Coverage Status

coverage: 95.72% (-0.4%) from 96.118%
when pulling 1bae68b on Wirone:codito/bombazo
into 8d5cccf on PHP-CS-Fixer:master.

@keradus
Member

keradus commented Jan 24, 2024

OK, let me just merge as-is

@mvorisek
Contributor

Congratulations on 7777! 😎

@Wirone Wirone changed the title feat: The one you're waiting for 😎 feat: Ability to run Fixer with parallel runner 🎉 Jan 28, 2024
@Wirone Wirone added topic/I/O topic/core Core features of Fixer's engine labels Jan 28, 2024
Member

@keradus keradus left a comment

some early feedback while screening the proposal

@Wirone Wirone mentioned this pull request Jan 30, 2024
3 tasks
@jorismak

jorismak commented Jan 31, 2024 via email

@Wirone
Member Author

Wirone commented Jan 31, 2024

@jorismak the default chunk size is 10, so in your case it's ~50 chunks distributed across the available workers. But it's configurable (number of cores, chunk size, timeout) 😉.
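The arithmetic behind "~50 chunks" can be sketched as follows. This is an illustration only, not Fixer's internal code: the file count of 500 is inferred from the numbers above, the worker count is hypothetical, and a simple round-robin stands in for however the runner actually hands chunks to workers.

```php
<?php
// Illustrative only: not Fixer's internal code.
// Assumption: ~500 files, inferred from "~50 chunks" at a chunk size of 10.
$files = array_map(
    static fn (int $i): string => "src/File{$i}.php", // hypothetical file names
    range(1, 500)
);

$filesPerProcess = 10; // default chunk size mentioned above
$workers = 4;          // hypothetical number of worker processes

// Split the file list into chunks of at most $filesPerProcess files.
$chunks = array_chunk($files, $filesPerProcess);

// Round-robin assignment, used here only to visualise the distribution.
$assignment = array_fill(0, $workers, 0);
foreach (array_keys($chunks) as $index) {
    ++$assignment[$index % $workers];
}

printf("%d chunks of up to %d files\n", count($chunks), $filesPerProcess);
foreach ($assignment as $worker => $chunkCount) {
    printf("worker %d: %d chunks\n", $worker, $chunkCount);
}
```

A smaller chunk size spreads work more evenly across workers at the cost of more round trips between the main process and the workers; a larger one does the opposite.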

@jorismak

jorismak commented Jan 31, 2024 via email

@Wirone Wirone force-pushed the codito/bombazo branch 2 times, most recently from 6deb7e0 to d660808 Compare February 1, 2024 07:18
Member

@julienfalque julienfalque left a comment

Looks awesome!

Do you think it would be possible to implement high level tests?

@Wirone
Member Author

Wirone commented Feb 2, 2024

@julienfalque thank you very much for the review ❤️. I'll address your comments soon.

In terms of tests, I thought about running a separate job in the CI workflow that would run Fixer in Docker (using docker run), because we can use --cpus 2 to ensure a stable number of CPUs and assert that Running analysis on 2 cores with X files per process is in the output. I haven't looked at how it's tested in PHPStan yet; I wanted to finalise the internal API for this and then figure out how to test it 😅. Any suggestions are highly welcome, though.

- handled process errors (linting, failed fixes etc.) are now collected with the ~original exception as source when possible. These don't break the main process unless `--stop-on-violation` is used. It was like that before too, but `ParallelisationException` was used for recreating the error's source, which was not correct.
- other errors (socket in/out errors caught by React, or unhandled errors that kill the worker) are re-thrown in the main process, which effectively stops further execution
@Wirone
Member Author

Wirone commented May 15, 2024

@keradus to maintain 77 commits I've addressed your comments in interactive rebase, changed commits are:

I believe it's now RTM, since last 2 points on this list can be resolved later 🙂.

}

if (ParallelAction::WORKER_ERROR_REPORT === $workerResponse['action']) {
    throw WorkerException::fromRaw($workerResponse); // @phpstan-ignore-line
Member

side note:
yes, we pass action to fromRaw(). If we had {action: string, payload: datamodel}, we would be able to pass the data model only ;P

Member

@keradus keradus left a comment

Incredible work. I'm very happy to merge this in current format.

I acknowledge that it took me a while to be able to review it. I'm happy with the outcome we built.

@keradus keradus merged commit 5c90224 into PHP-CS-Fixer:master May 15, 2024
27 of 28 checks passed
@julienfalque
Member

julienfalque commented May 15, 2024

Awesome work @Wirone! Thanks to everyone who contributed!

@Wirone
Member Author

Wirone commented May 15, 2024

Thank you too @keradus and @julienfalque for reviewing the code and providing feedback! It took quite some time, but the fixes done after each review iteration were worth it, because the final implementation is much better than the initial one 😁. @keradus found my really deeply hidden misconceptions and mistakes and I am really grateful for that, even if it could have been frustrating at times 😅. Happy to see it merged, it's a great day for the PHP community 🥳!

@mfn

mfn commented May 16, 2024

So massive; with empty caches in both cases:

Before:

 $ time composer php-cs-fixer
> @php ./bin/php-cs-fixer.phar fix --diff
PHP CS Fixer 3.57.1 (3810546) 7th Gear by Fabien Potencier, Dariusz Ruminski and contributors.
PHP runtime: 8.3.7
Running analysis on 1 core sequentially.
You can enable parallel runner and speed up the analysis! Please see usage docs for more information.
Loaded config default from ".php-cs-fixer.dist.php".
 6474/6474 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%


Fixed 0 of 6474 files in 346.230 seconds, 57.016 MB memory used

real	5m46.724s
user	5m39.636s
sys	0m1.490s

Now:

 $ time composer php-cs-fixer
> @php ./bin/php-cs-fixer.phar fix --diff
PHP CS Fixer 3.57.1 (3810546) 7th Gear by Fabien Potencier, Dariusz Ruminski and contributors.
PHP runtime: 8.3.7
Running analysis on 12 cores with 10 files per process.
Parallel runner is an experimental feature and may be unstable, use it at your own risk. Feedback highly appreciated!
Loaded config default from ".php-cs-fixer.dist.php".
 6474/6474 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%


Fixed 0 of 6474 files in 38.130 seconds, 27.016 MB memory used

real	0m38.647s
user	7m8.250s
sys	0m2.629s

(of course this is a machine with lots-o-cores, so)

This will also make fine-tuning the config on big code bases bearable! This is not hypothetical: when you have to wait 5+ minutes on fast machines, you think twice before touching the config 😅

@mvorisek
Contributor

Am I right that the deps currently need to be added manually?

Our situation is the following: we have many repos with CI. What we need is either for this to work out of the box (w/o any dep or config to be added) or to be able to pass some CLI argument like --parallel[=numCores]. Is that currently possible? Or can at least the packed https://github.com/PHP-CS-Fixer/shim package support it?

@Wirone
Member Author

Wirone commented May 16, 2024

@mvorisek it can't be enabled with a CLI option; it must be added to the config. The decision was to continue with sequential analysis by default and switch to CPU auto-detection in v4. PHAR and shim follow it too. But it uses auto-detection when "future mode" is enabled or PHP_CS_FIXER_PARALLEL is set 🙂.

Support for CLI options can be added, I just did not want to do it in this PR which was big enough already.

@mvorisek
Contributor

PHP_CS_FIXER_PARALLEL - so what CLI commands do we currently need to execute in https://github.com/atk4/ui/blob/5.0.0/.github/workflows/test-unit.yml#L64, i.e. w/o having to modify the repo's composer.json or Fixer config, to benefit from this PR?

@Wirone
Member Author

Wirone commented May 16, 2024

@mvorisek you don't need to modify composer.json, all the required dependencies are added directly to the Fixer, so you just need to update to 3.57 and enable parallel runner in the config or via PHP_CS_FIXER_PARALLEL=1 (set globally or as a one-time value PHP_CS_FIXER_PARALLEL=1 php-cs-fixer check ...).

@mvorisek
Contributor

That is great, thank you, @Wirone 🎉🎉

Successfully merging this pull request may close these issues.

Support for parallelisation of analysis (utilise several CPUs)