minor #15852 [CssSelector] Optimize regexs matching simple selectors …
…(stof)

This PR was merged into the 2.3 branch.

Discussion
----------

[CssSelector] Optimize regexs matching simple selectors

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | n/a
| License       | MIT
| Doc PR        | n/a

These shortcut parsers are applied first when converting a CSS selector to XPath, so that simple selectors (tag matching, class matching with an optional tag, id matching with an optional tag) are handled faster.
None of the regexes defined here can match more input by backtracking into an identifier. Backtracking therefore only slows down the regex engine when the regex, or one of its optional parts, does not match (for instance for any more complex selector, or even for simple selectors without a namespace or without a tag name). Making the quantifiers possessive solves this issue.
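
As a rough illustration of the claim above, the following sketch runs the old (greedy) and new (possessive) class-shortcut patterns from this commit against a few selectors and records whether each one matches. The sample selectors and the `$results` variable are hypothetical, chosen only for this example; it checks that both patterns accept and reject the same inputs and does not measure timing.

```php
<?php
// Patterns copied from ClassParser before and after this change.
$greedy     = '/^(([a-z]+)\|)?([\w-]+|\*)?\.([\w-]+)$/i';
$possessive = '/^(?:([a-z]++)\|)?+([\w-]++|\*)?+\.([\w-]++)$/i';

// Hypothetical sample selectors: simple ones the shortcut should accept,
// and more complex ones it should reject.
$selectors = array(
    'input.field',
    'test|input.field',
    '.field',
    'div > p.field',
    'a.b.c',
    'input',
);

$results = array();
foreach ($selectors as $selector) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$selector] = array(
        preg_match($greedy, $selector),
        preg_match($possessive, $selector),
    );
}

foreach ($results as $selector => $pair) {
    printf("%-18s greedy=%d possessive=%d\n", $selector, $pair[0], $pair[1]);
}
```

Both columns agree for every selector here, because the character classes used for identifiers (`[a-z]`, `[\w-]`) never overlap with the delimiter that follows them (`|`, `.`), so giving characters back cannot turn a failed match into a successful one.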

I also turned some capturing groups (around the namespace and the namespace delimiter) into non-capturing groups, as we don't care about them in the output (they exist only to make that part of the pattern optional).
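
To make the effect on `$matches` concrete, here is a small sketch that applies the old and new class patterns from this commit to the sample selector used in the code comments. The variable names `$old`, `$new` and `$bare` are illustrative only.

```php
<?php
$source = 'test|input.ab6bd_field';

// Old pattern: the outer group around "namespace + |" is capturing.
preg_match('/^(([a-z]+)\|)?([\w-]+|\*)?\.([\w-]+)$/i', $source, $old);

// New pattern: the outer group is non-capturing, so every index shifts down by one.
preg_match('/^(?:([a-z]++)\|)?+([\w-]++|\*)?+\.([\w-]++)$/i', $source, $new);

// $old: 0 => full match, 1 => 'test|', 2 => 'test', 3 => 'input', 4 => 'ab6bd_field'
// $new: 0 => full match, 1 => 'test',  2 => 'input', 3 => 'ab6bd_field'
var_dump($old, $new);

// Without a namespace or element, the optional groups are reported as empty
// strings, which is why the parsers use `?: null` when building the nodes.
preg_match('/^(?:([a-z]++)\|)?+([\w-]++|\*)?+\.([\w-]++)$/i', '.ab6bd_field', $bare);
var_dump($bare[1] ?: null, $bare[2] ?: null, $bare[3]);
```

This is why the parsers in the diff now read `$matches[1]`, `$matches[2]` and `$matches[3]` instead of `$matches[2]`, `$matches[3]` and `$matches[4]`.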

Commits
-------

d5abe0b [CssSelector] Optimize regexs matching simple selectors
fabpot committed Sep 21, 2015
2 parents 6947f69 + 877e88b commit 6bc51c1
Showing 3 changed files with 20 additions and 23 deletions.
15 changes: 7 additions & 8 deletions Parser/Shortcut/ClassParser.php
@@ -33,15 +33,14 @@ public function parse($source)
     {
         // Matches an optional namespace, optional element, and required class
         // $source = 'test|input.ab6bd_field';
-        // $matches = array (size=5)
-        //     0 => string 'test:input.ab6bd_field' (length=22)
-        //     1 => string 'test:' (length=5)
-        //     2 => string 'test' (length=4)
-        //     3 => string 'input' (length=5)
-        //     4 => string 'ab6bd_field' (length=11)
-        if (preg_match('/^(([a-z]+)\|)?([\w-]+|\*)?\.([\w-]+)$/i', trim($source), $matches)) {
+        // $matches = array (size=4)
+        //     0 => string 'test|input.ab6bd_field' (length=22)
+        //     1 => string 'test' (length=4)
+        //     2 => string 'input' (length=5)
+        //     3 => string 'ab6bd_field' (length=11)
+        if (preg_match('/^(?:([a-z]++)\|)?+([\w-]++|\*)?+\.([\w-]++)$/i', trim($source), $matches)) {
             return array(
-                new SelectorNode(new ClassNode(new ElementNode($matches[2] ?: null, $matches[3] ?: null), $matches[4])),
+                new SelectorNode(new ClassNode(new ElementNode($matches[1] ?: null, $matches[2] ?: null), $matches[3])),
             );
         }

13 changes: 6 additions & 7 deletions Parser/Shortcut/ElementParser.php
@@ -32,13 +32,12 @@ public function parse($source)
     {
         // Matches an optional namespace, required element or `*`
         // $source = 'testns|testel';
-        // $matches = array (size=4)
-        //     0 => string 'testns:testel' (length=13)
-        //     1 => string 'testns:' (length=7)
-        //     2 => string 'testns' (length=6)
-        //     3 => string 'testel' (length=6)
-        if (preg_match('/^(([a-z]+)\|)?([\w-]+|\*)$/i', trim($source), $matches)) {
-            return array(new SelectorNode(new ElementNode($matches[2] ?: null, $matches[3])));
+        // $matches = array (size=3)
+        //     0 => string 'testns|testel' (length=13)
+        //     1 => string 'testns' (length=6)
+        //     2 => string 'testel' (length=6)
+        if (preg_match('/^(?:([a-z]++)\|)?([\w-]++|\*)$/i', trim($source), $matches)) {
+            return array(new SelectorNode(new ElementNode($matches[1] ?: null, $matches[2])));
         }

         return array();
15 changes: 7 additions & 8 deletions Parser/Shortcut/HashParser.php
@@ -33,15 +33,14 @@ public function parse($source)
     {
         // Matches an optional namespace, optional element, and required id
         // $source = 'test|input#ab6bd_field';
-        // $matches = array (size=5)
-        //     0 => string 'test:input#ab6bd_field' (length=22)
-        //     1 => string 'test:' (length=5)
-        //     2 => string 'test' (length=4)
-        //     3 => string 'input' (length=5)
-        //     4 => string 'ab6bd_field' (length=11)
-        if (preg_match('/^(([a-z]+)\|)?([\w-]+|\*)?#([\w-]+)$/i', trim($source), $matches)) {
+        // $matches = array (size=4)
+        //     0 => string 'test|input#ab6bd_field' (length=22)
+        //     1 => string 'test' (length=4)
+        //     2 => string 'input' (length=5)
+        //     3 => string 'ab6bd_field' (length=11)
+        if (preg_match('/^(?:([a-z]++)\|)?+([\w-]++|\*)?+#([\w-]++)$/i', trim($source), $matches)) {
             return array(
-                new SelectorNode(new HashNode(new ElementNode($matches[2] ?: null, $matches[3] ?: null), $matches[4])),
+                new SelectorNode(new HashNode(new ElementNode($matches[1] ?: null, $matches[2] ?: null), $matches[3])),
             );
         }
