Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

[2.2] [CssSelector] fixes and enhancements #4271

Closed
wants to merge 6 commits into
from

Conversation

Projects
None yet
5 participants

ju1ius commented May 13, 2012

fixes nth-child & nth-last-child
fixes :not()
fixes issue #3547
fixes attribute selector matching
added new tests

Most changes ported from the Nokogiri gem.

This pull request passes (merged 3f9982c into e193452).

The same changes has been in lxml, but then changed to something else again for performance reasons.

See lxml/lxml@5fc9eac

I'm no expert in XPath, so I don't know what to think about these changes. What do you think?

Owner

ju1ius replied May 13, 2012

I think it needs benchmarking.
But if the performance regression is minor, then it's worth the tradeoff.
Will investigate.

Owner

ju1ius replied May 13, 2012

Based on this benchmark:

Using descendant:: instead of // gives at most a 2% speed improvement, with no difference memory usage.
So this is a minor optimization and I think we should switch to using // right away, as it fixes some edge cases.

The benchmark also reveals that using name queries like *[name() = 'tagname' and whatever] instead of just tagname[whatever] is more than two times slower.
So a good optimization would be to avoid unnecessary calls to XPathExpr::addNameTest(), or refactor it to use the element part of the expression whenever possible.

Owner

ju1ius replied May 13, 2012

Reading through the code, I can't come up with a case where the use of the name() test is useful.
As far as I understand, the name function is useful only for complex matching expressions, like //*[starts-with(name(), 'h')].
What do you think ?

@stloyd stloyd and 1 other commented on an outdated diff May 13, 2012

src/Symfony/Component/CssSelector/Node/FunctionNode.php
}
$xpath->addCondition(sprintf('position() = %s', $b));
-
@stloyd

stloyd May 13, 2012

Contributor

This should be reverted.

@ju1ius

ju1ius May 13, 2012

Why ?
This is the intended behavior and conforms to what at least Gecko and WebKit do...
See the selector matching tests.

@stloyd stloyd commented on an outdated diff May 13, 2012

src/Symfony/Component/CssSelector/Node/FunctionNode.php
$xpath->addStarPrefix();
- if ($a == 0) {
+
+ if ($a === 0) {
@stloyd

stloyd May 13, 2012

Contributor

It should be 0 === $a.

@stloyd stloyd commented on an outdated diff May 13, 2012

src/Symfony/Component/CssSelector/Node/FunctionNode.php
- if ($a != 1) {
- $expr = array(sprintf('(position() %s) mod %s = 0', $bNeg, $a));
- } else {
- $expr = array();
- }
-
- if ($b >= 0) {
- $expr[] = sprintf('position() >= %s', $b);
- } elseif ($b < 0 && $last) {
- $expr[] = sprintf('position() < (last() %s)', $b);
+ if ($b === 0) {
@stloyd

stloyd May 13, 2012

Contributor

Should be 0 === $b.

@stloyd stloyd commented on an outdated diff May 13, 2012

src/Symfony/Component/CssSelector/Node/FunctionNode.php
- if ($expr) {
- $xpath->addCondition($expr);
- }
+ $xpath->addCondition(sprintf(
+ "(%s %s %s) and (((%s - %s) mod %s) = 0)",
+ $position, $compare, $b,
+ $position, $b, abs($a)
@stloyd

stloyd May 13, 2012

Contributor

You should split every variable into new line, or none of them.

@stloyd stloyd commented on an outdated diff May 13, 2012

src/Symfony/Component/CssSelector/Node/FunctionNode.php
@@ -224,6 +202,10 @@ protected function _xpath_contains($xpath, $expr)
protected function _xpath_not($xpath, $expr)
{
// everything for which not expr applies
+ if ($expr instanceof ElementNode) {
+ $xpath->addCondition(sprintf("not(name() = '%s')", $expr->toXpath()));
+ return $xpath;
@stloyd

stloyd May 13, 2012

Contributor

Missing new line before return.

@stloyd stloyd and 1 other commented on an outdated diff May 13, 2012

.../Component/CssSelector/Tests/SelectorMatchingTest.php
+
+ <b foo="à">B-1</b>
+ <b foo="é">B-2</b>
+ <b foo="î">B-3</b>
+ <b foo="ö">B-4</b>
+ <b foo="é-ö">B-5</b>
+ <b foo="àé">B-6</b>
+ <b foo="àé öù">B-7</b>
+ <b foo="où îö ùÿ">B-8</b>
+
+</root>
+EOS;
+
+ private static
+ $_DOM = null,
+ $_XPATH = null;
@stloyd

stloyd May 13, 2012

Contributor

Please fix CS in this file to match Symfony2 CS.

@ju1ius

ju1ius May 13, 2012

Thanks for the link.
Will fix all related issues.

@stloyd stloyd and 1 other commented on an outdated diff May 13, 2012

.../Component/CssSelector/Tests/SelectorMatchingTest.php
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+namespace Symfony\Component\CssSelector\Tests;
+
+use Symfony\Component\CssSelector\CssSelector;
+
+class SelectorMatchingTest extends \PHPUnit_Framework_TestCase
+{
+ private static $_XML = <<<EOS
@stloyd

stloyd May 13, 2012

Contributor

Why not as an external fixture ?

@ju1ius

ju1ius May 13, 2012

I looked at the DomCrowler tests, and the xml documents are defined inline.
I like the idea to make it external however.
Where should I put the file ?
Under Tests/Fixtures ?

This pull request passes (merged a730cf1 into e193452).

fabpot commented on 48117d3 May 13, 2012

Can you create a PR with just this commit? After that, I would like to backport fixes that have been done upstream, and try to not diverge too much from upstream (as it makes maintenance easier for us to backport future upstream fixes).

Owner

ju1ius replied May 14, 2012

Just in case, the cssselect part of lxml has been moved to it's own separate project at https://github.com/SimonSapin/cssselect.
They have already done the change from descendant::* to descendant-or-self::*, which is the same as //.
I'll try to do as you said and PR the other changes to upstream if needed.

Exactly. The project is not independent and many changes have been made. If you can work on that, this would be wonderful!

Owner

fabpot commented Jun 17, 2012

@ju1ius Will you have time to work on this soon? If not, I can try to find some time as I want these fixes to be part of 2.1.

ju1ius commented Jun 21, 2012

Hi,
I'm afraid I won't have the time to work on this until the second week of July, as I'll be very busy with my musical works.
I'll be glad to help out if it fits the timeline.

At the moment I only had the time to integrate upstream changes into my css-parser library (a hand-written LL(2) parser and library which I wrote for my port the premailer gem to php, and which uses some code from the CssSelector component).

Looking at the code might be of little value for you, however I think the selector matching tests could be useful.

Member

stof commented Oct 13, 2012

@ju1ius any news on this ?

Owner

fabpot commented Oct 29, 2012

I'm going to do the work if @ju1ius is not available.

@fabpot fabpot added a commit that referenced this pull request Mar 23, 2013

@fabpot fabpot merged branch jfsimon/css-selector-rewriting (PR #7463)
This PR was merged into the master branch.

Discussion
----------

[CssSelector] fully rewritted component

The `CssSelector` component is a port of the Python https://github.com/SimonSapin/cssselect library. Previous implementation was a port of the `v0.1` tag, this implementation is a port of the `v0.7.1` tag. As Python and PHP have different philosophies, this is not a simple language-to-language translation, I needed to re-architecture the lib.

**Note about BC:** This new version introduces some changes making fail legacy tests.
New XPath should be equivalents, these changes are:
-  When having a condition on an class, legacy condition is prefixed with a test of class existence. Example: `[contains(@class, 'foo')]` is transformed to `[@class and contains(@class, 'foo')]`.
-  When having conditions on descendants, `/descendant::*` is transformed to `/descendant-or-self::*/*`.

I updated legacy tests (stored in `CssSelectorTest` class) accordingly.

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | yes
| BC breaks?    | see above
| Deprecations? | no
| Tests pass?   | yes

Should fix #3615 and #4271

Commits
-------

c6f87d0 [CssSelector] fully rewritted component
d855650
Owner

fabpot commented Mar 23, 2013

Closing in favor of #7463

@fabpot fabpot closed this Mar 23, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment