[Routing] Optimised dumped matcher #21926
At the risk of sounding like a stick in the mud, adding ~470 LOC seems like a big change sans benchmarks to show the performance improvement and to show that performance doesn't suffer with few routes.
@assertchris It's actually not adding that. Those are extra compiled cases, so there are more test fixtures.
@assertchris in addition to that, the vast majority of that code is used at compile time, not at runtime. There are also optimisations which benefit small groups.
Still, @assertchris is right about one thing: we want to see the numbers when you claim it improves performance.
@stof it has the same behaviour in smaller cases, so I don't know what you want to see. Also, in those cases neither before nor after would show up in blackfire; it's literally double-digit nanosecond wall time we're talking about.
@arnaud-lb since you did the initial work on the prefix optimisation I thought I'd ping you.
* @param $prefix
* @param $route
*
* @return bool|StaticPrefixCollection
please use `null`, not `false`, for the not-found case. This would create an API with a nullable return type rather than with a mixed type.
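A minimal sketch of what this suggestion amounts to, assuming PHP 7.1+ nullable return types. `PrefixGroup` and `findGroup` are hypothetical names used for illustration only; they are not part of the PR.

```php
<?php

// Hypothetical stand-in for a prefix group; not the PR's StaticPrefixCollection.
final class PrefixGroup
{
    public $prefix;

    public function __construct(string $prefix)
    {
        $this->prefix = $prefix;
    }
}

/**
 * Returns the group registered under $prefix, or null when there is none.
 *
 * @param PrefixGroup[] $groups
 */
function findGroup(array $groups, string $prefix): ?PrefixGroup
{
    foreach ($groups as $group) {
        if ($group->prefix === $prefix) {
            return $group;
        }
    }

    // "Not found" is null, so the return type stays ?PrefixGroup instead of bool|PrefixGroup.
    return null;
}
```

With `null` as the only non-object result, callers can test the result with a strict `null === $group` comparison and the signature can be expressed as a single nullable type.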
*
* @param StaticPrefixCollection|array $item
* @param $prefix
* @param $route
invalid phpdoc: missing types
// Lower index to pass through the same index again after optimizing.
// The first item of the replacements might be a group needing optimization.
--$index;
continue;
useless continue, as it is the end of the iteration anyway
@@ -223,14 +223,39 @@ private static function compilePattern(Route $route, $pattern, $isHost)
 }

 return array(
-'staticPrefix' => 'text' === $tokens[0][0] ? $tokens[0][1] : '',
+'staticPrefix' => static::determineStaticPrefix($route, $tokens),
access to private APIs MUST use `self`, not `static`. Otherwise it breaks when using inheritance as it will use `ChildClass::determineStaticPrefix`, which is forbidden by the visibility.
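To illustrate the point about `self` versus `static`, here is a small sketch. The class names `Compiler` and `ChildCompiler` and the helper `callFails` are hypothetical and unrelated to the Routing component. It shows the case where a subclass declares its own private method with the same name, which is where late static binding runs into the visibility check.

```php
<?php

class Compiler
{
    // Private helper: only callable from code inside Compiler.
    private static function prefix()
    {
        return 'parent';
    }

    public static function viaSelf()
    {
        // self:: always resolves to Compiler::prefix().
        return self::prefix();
    }

    public static function viaStatic()
    {
        // static:: resolves against the class the call was made on (late static binding).
        return static::prefix();
    }
}

class ChildCompiler extends Compiler
{
    // Same name, but private to ChildCompiler, so Compiler may not call it.
    private static function prefix()
    {
        return 'child';
    }
}

// Reports whether invoking $call throws an \Error (PHP 7+).
function callFails(callable $call)
{
    try {
        $call();

        return false;
    } catch (\Error $e) {
        return true;
    }
}
```

Here `ChildCompiler::viaSelf()` returns `'parent'`, while `ChildCompiler::viaStatic()` throws an `\Error` because `Compiler` tries to call a private method that belongs to `ChildCompiler`.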
if ('text' !== $tokens[0][0]) {
    return $route->hasDefault($tokens[0][3])
        ? ''
        : $tokens[0][1];
should stay on the same line
// bar2
if (0 === strpos($pathinfo, '/a/b\'b/') && preg_match('#^/a/b\'b/(?P<bar1>[^/]++)$#s', $pathinfo, $matches)) {
this inlining looks suspicious to me. It will force checking the prefix twice before doing the regex checks
@stof I've disabled inlining for groups which contain a route that has the same prefix as the group it resides in.
return $this->mergeDefaults(array_replace($matches, array('_route' => 'foo4')), array ());
}
// ababa
if ('/ababa' === $pathinfo) {
for fully static routes, I agree that checking a prefix first looks useless.
*/
private function accepts($prefix)
{
    return '' === $this->prefix || strpos($prefix, $this->prefix, 0) === 0;
passing the offset explicitly is useless, as you use the default one
Well, I never said that the numbers you are showing us should be on the smaller case. You talk about a 60x performance improvement. I want proof.
@stof ok, how do you want the proof and what is the baseline?
@frankdejonge the comparison should be between the master branch and your optimized version. See #5734 (comment) about the way @arnaud-lb did it when applying the initial optimization.
@stof I'll create the benchmark and post it with the results here.
@stof I believe you meant "Thank you for your contribution, this would be a great addition to the framework. Could we maybe have some benchmarks on fewer routes / more user-land samples before we make a decision?" but for some reason your autocorrect went quite poorly.
Seems great @frankdejonge, impressive improvements! If I understand correctly, you pushed the optimizations further by allowing the routes to be re-ordered (while taking into account that some routes must still be matched before others - those with parameters at least), and stripping inefficient / too small groupings?
@arnaud-lb that's exactly correct!
DumperPrefixCollection is an internal class. So if the new logic does not use it anymore, the class should be removed (or the same class should be reused differently). No need to keep dead internal code around.
While the benchmarks are running I'll provide some more context. Part of the reason for this optimisation is that we use a locale prefix in our bundles. More specifically, we also use the BeSimpleRouting bundle, which adds multiple routes based on locale per route definition. This causes the prefixed routes to not be grouped, missing the opportunity to benefit from the prefix grouping currently in place. The problem is not specific to the BeSimpleRouting bundle either: every bundle which provides this will benefit from this more intelligent grouping. Also, in most cases this dumping strategy outputs less code to do the same work. Because of the advanced grouping, we can also be smarter about excluding multiple paths. For example, if a group is followed by another group, the second group can never match if the first group did. Simply by chaining the groups in if/elseif statements, additional exclusions are possible.
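The if/elseif chaining described above could look roughly like the following hand-written sketch. It is not actual dumper output; the function name `matchPath` and the locale-prefixed routes are hypothetical.

```php
<?php

// Hand-written illustration of grouped prefix checks; the routes are hypothetical.
function matchPath($pathinfo)
{
    if (0 === strpos($pathinfo, '/en/')) {
        if ('/en/about' === $pathinfo) {
            return array('_route' => 'about.en');
        }
        if ('/en/contact' === $pathinfo) {
            return array('_route' => 'contact.en');
        }
    } elseif (0 === strpos($pathinfo, '/nl/')) {
        // Only evaluated when the '/en/' prefix check failed.
        if ('/nl/about' === $pathinfo) {
            return array('_route' => 'about.nl');
        }
        if ('/nl/contact' === $pathinfo) {
            return array('_route' => 'contact.nl');
        }
    }

    return null; // no route matched
}
```

A path starting with `/en/` never reaches the `/nl/` prefix check, and a path matching neither prefix skips all four route comparisons after two prefix checks.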
Also, believe it or not, this is not even the most we can push out of the routing bundle just by sorting and grouping routes. There's still the possibility of taking into account segments after parameters to intelligently group routes. For instance, if you have the paths (in order)
After profiling it seems that big as well as smaller groups benefit from this optimisation. As an added benefit the cost of routing is more stable (less difference between lower and upper bounds), which is especially beneficial for determining scaling needs.
@stof the old code and tree builder have been deleted.
@dmaicher could you share a route definition collection with which I can simulate this?
I think I know which case this is...
@dmaicher should be good now.
Still the same problem 😢 I can try to put together a minimal route definition set tomorrow morning to reproduce the problem.
@dmaicher I've recreated the issue, working on the solution.
@dmaicher could you try again?
@frankdejonge that case is fine now but I still have some more failing tests 😢
@dmaicher what kind of failures?
Here is one more failing example:
Dumped matcher for your version:
elseif (0 === strpos($pathinfo, '/statistics/semantic')) {
    // ca_statistics_semantic
    if ('/statistics/semantic/' === $pathinfo) {
        if (!in_array($canonicalMethod, array('GET', 'POST'))) {
            $allow = array_merge($allow, array('GET', 'POST'));
            goto not_ca_statistics_semantic;
        }
        return array(...)
Dumped matcher for
if (0 === strpos($pathinfo, '/statistics/semantic')) {
    // ca_statistics_semantic
    if (rtrim($pathinfo, '/') === '/statistics/semantic') {
        if (!in_array($this->context->getMethod(), array('GET', 'POST', 'HEAD'))) {
            $allow = array_merge($allow, array('GET', 'POST', 'HEAD'));
            goto not_ca_statistics_semantic;
        }
        if (substr($pathinfo, -1) !== '/') {
            return $this->redirect($pathinfo.'/', 'ca_statistics_semantic');
        }
        return array (...)
This could be the effect of other changes. It looks like your dumper doesn't allow a trailing slash to be absent, which is caused by having a non-redirecting matcher. I'll look into this some more, try to reproduce it and come back. I'm AFK now, but it might be good to look into the strict matching setting on your end too.
@dmaicher I would have expected a piece of code like this:
// a_fifth
if ('/a/55' === $trimmedPathinfo) {
    if (substr($pathinfo, -1) !== '/') {
        return $this->redirect($pathinfo.'/', 'a_fifth');
    }
    return array('_route' => 'a_fifth');
}
This is generated for a route if it has redirect support. But I see in the code that those redirects only happen if the matcher supports redirects and the route either has no methods (thus matching all methods) or contains the HEAD method. This code wasn't changed in this PR. Perhaps @fabpot could shed some light on this? I saw he was the last one to touch that bit of logic.
Maybe my test failures are related to other changes on
Ok @frankdejonge I think it's related to something else indeed. I just took
@dmaicher cool, thanks for trying out this PR in a real project btw! Very valuable!
I also did a quick benchmark for your PR on my project using this script: https://gist.github.com/dmaicher/5e85c23145e84a4400354224da85bd08
I did 5 runs for each version.
your PR changes:
I don't see a clear performance advantage here but it might be because the optimization is not effective for my route collection. @frankdejonge do you see something wrong with the benchmark script? Maybe some other people could try it on real projects? 😊
@dmaicher it really depends on the project. Also, the timings of the script are not the cleanest indicator of whether there are speed improvements. Did you look at the benchmark script I posted above?
@dmaicher Also, it could be that your routes grouped nicely with the current optimisation methods, but not all of them do. In the case of the application I was working on, the algorithm didn't match at all.
@frankdejonge yes I also believe your new optimizations simply don't apply for my application 😉 This particular app has 4 different hosts and a lot of routes are filtered/matched by host first. I checked your benchmark script but my script was simply easier to run on an existing app without extracting route definitions first. The good news is that your solution is for sure not slower than
@dmaicher nope, on average they are faster in your results too, so that's good.
@dmaicher all good on your side, no remaining failing edge cases?
@nicolas-grekas yes the remaining test failures were not related to the changes in this PR (Running a Symfony 2.8 app and updated
Thank you @frankdejonge.
This PR was squashed before being merged into the 3.3-dev branch (closes #21926).

Discussion
----------

[Routing] Optimised dumped matcher

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

TL;DR: I've optimised the PhpMatcherDumper output for a <del>60x</del> 4.4x performance improvement on a collection of ~800 routes by inducing cyclomatic complexity.

[EDIT] The 60x performance boost was only visible when profiling with blackfire, which is quite possibly a result of the cost of profiling playing a part. After doing some more profiling, the realistic benefit of the optimisation is more likely to be in the range of 1.3x to 4.4x.

After the previous optimisation I began looking at how the PrefixCollection was adding its performance boost. I spotted another way to do this, which has the same theory behind it (excluding groups based on prefixes). The current implementation only groups when one prefix resides in the other. In this new implementation I've created a way to detect common prefixes, which allows for much more efficient grouping. Every time a route is added to the group it'll either merge into an existing group, merge into a new group with a route that has a common prefix, or merge into a new group with an existing group that has a common prefix.

However, when a parameter is present, grouping must only be done AFTER that route; this case is accounted for. In all other cases, where there's no collision, routes can be grouped freely because if a group was matched, other groups wouldn't have matched.

After all the groups are created, the groups are optimised. Groups with fewer than 3 children are inlined into the parent group. This is because a group with 2 children would potentially result in 3 prefix checks, while if they are inlined it's 2 checks.

Like with the previous optimisation I've profiled this using blackfire. But the match function didn't show up anymore. I've added `usleep` calls in the dumped matcher during profiling, which made it show up again. I've verified with @simensen that this is because the wall time of the function was too small for it to be of any interest. When it DID get detected, because of more tasks running, it would show up with around 250 nanoseconds. In comparison, the previous speed improvement brought the wall time down from 7ms to ~2.5ms on a set of ~800 routes.

Because of the altered grouping behaviour I've not modified the PrefixCollection but I've created a new StaticPrefixCollection and updated the PhpMatcherDumper to use that instead.

Commits
-------

449b691 [Routing] Optimised dumped matcher
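
The common-prefix grouping described in the PR description can be sketched as follows. This is a deliberately simplified illustration, not the StaticPrefixCollection implementation: the functions `commonPrefix` and `groupByCommonPrefix` are hypothetical, only fully static paths are handled, and the ordering rules for routes with parameters and the inlining of small groups are left out.

```php
<?php

// Longest shared leading substring of two paths, compared character by character.
function commonPrefix($a, $b)
{
    $max = min(strlen($a), strlen($b));
    $i = 0;
    while ($i < $max && $a[$i] === $b[$i]) {
        ++$i;
    }

    return substr($a, 0, $i);
}

// Groups static paths under a shared prefix longer than '/'.
// Simplification: a path merges into the first group it shares a prefix with.
function groupByCommonPrefix(array $paths)
{
    $groups = array();

    foreach ($paths as $path) {
        $placed = false;

        foreach ($groups as $prefix => $members) {
            // Drop a trailing slash so a bare '/' does not count as a shared prefix.
            $shared = rtrim(commonPrefix($prefix, $path), '/');

            if ('' !== $shared) {
                // Re-key the group under the (possibly shorter) shared prefix.
                unset($groups[$prefix]);
                $groups[$shared] = array_merge($members, array($path));
                $placed = true;
                break;
            }
        }

        if (!$placed) {
            // No overlap with any group yet: the path starts its own group.
            $groups[$path] = array($path);
        }
    }

    return $groups;
}
```

For the paths `/en/about`, `/en/contact`, `/nl/about` and `/nl/contact` this yields two groups keyed by `/en` and `/nl`, even though no path is a prefix of another, which is the case the description says the earlier grouping missed.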