Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster FileHelper::normalizePath() #735

Merged
merged 5 commits into from Oct 25, 2021

Conversation

staabm
Copy link
Contributor

@staabm staabm commented Oct 25, 2021

running phpstan on one of our projects consistently leads to a profile like

grafik
see that FileHelper::normalizePath() is a very hot path, mostly dominated by PCRE invocations.

this PR proposes small adjustments so normalize no longer needs to invoke PCRE in the common case.
with this changes we get a solid speed improvement:

grafik

grafik

also note a improvement in memory consumption.

profiles compared are based on the latest commit on master b726e08

if ($matches !== null) {
[, $scheme, $path] = $matches;
} else {
$scheme = null;
$path = $originalPath;
}

$path = str_replace('\\', '/', $path);
$path = Strings::replace($path, '~/{2,}~', '/');
$path = str_replace(['\\', '//', '///', '////'], '/', $path);
Copy link
Contributor Author

@staabm staabm Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that we won't see path with more then 4 slashes in a row.
the previous regex supported "any number of slashes" but I think thats more we actually need.

@@ -41,16 +41,21 @@ public function absolutizePath(string $path): string
/** @api */
public function normalizePath(string $originalPath, string $directorySeparator = DIRECTORY_SEPARATOR): string
{
$matches = \Nette\Utils\Strings::match($originalPath, '~^([a-z]+)\\:\\/\\/(.+)~');
$isLocalPath = $originalPath && $originalPath[0] === '/';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the path we get passed start with a / (regular unix path).
these can never match the match-regex below which tries to detect path from within phar files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about Windows?

Copy link
Contributor Author

@staabm staabm Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah also thought about that (I am even working on windows most of the time ;-)).

we could either do a educated guess like the 2nd char is a : as in C:\.

maybe it would even be possible to check whether the first char of the string is != 'p', as the code later on is checking for a phar:// prefix. not sure this code is meant to handle more cases other then phar://?

@staabm staabm marked this pull request as ready for review October 25, 2021 12:46
@ondrejmirtes ondrejmirtes merged commit 1505153 into phpstan:master Oct 25, 2021
@ondrejmirtes
Copy link
Member

Thank you!

@staabm staabm deleted the speedup-file-helper branch October 25, 2021 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants