Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for :empty #20

Merged
merged 1 commit into from
Nov 15, 2012
Merged

Fix for :empty #20

merged 1 commit into from
Nov 15, 2012

Conversation

sjp
Copy link
Contributor

@sjp sjp commented Nov 15, 2012

Now just checking whether the matched non-element node has a (possibly coerced) string-length() > 0.

<test>
    <p>          </p>
    <q></q>
    <r><![CDATA[]]></r>
    <![CDATA[abcdef]]>
</test>

In this case on this document :empty would return <q></q> and <r><![CDATA[]]></r>.

I believe this is the correct behaviour regarding CDATA, but I'm not 100% on that.

@SimonSapin
Copy link
Contributor

So, the bug is that the previous code matched an element containing only white-space, but shouldn’t. Is that correct?

Regarding CDATA, an interesting test case would be <s><![CDATA[abcdef]]></s>.

@sjp
Copy link
Contributor Author

sjp commented Nov 15, 2012

Oh, sorry that was a typo on my end (I meant to add wrapping elements...). The following will not match empty under the current implementation.

<s><![CDATA[abcdef]]></s>

The selectors spec says the following:

...only element nodes and content nodes (such as text nodes, CDATA nodes, and entity references) whose data has a non-zero length must be considered as affecting emptiness;

This suggests to me that anything that matches a non-element that has a zero string-length() is a reasonable interpretation of the spec in terms of implementing :empty.

SimonSapin added a commit that referenced this pull request Nov 15, 2012
Have :empty not match whitespace-only elements.
@SimonSapin SimonSapin merged commit 653a5a5 into scrapy:master Nov 15, 2012
@SimonSapin
Copy link
Contributor

Merged, thanks.

jsonn referenced this pull request in jsonn/pkgsrc Apr 11, 2014
Version 0.9.1
-------------

Released on 2013-10-17.

* **Backward incompatible change from 0.9**:
  :meth:`~GenericTranslator.selector_to_xpath` defaults to
  ignoring pseudo-elements,
  as it did in 0.8 and previous versions.
  (:meth:`~GenericTranslator.css_to_xpath` doesn’t change.)
* Drop official support for Python 2.4 and 3.1,
  as testing was becoming difficult.
  Nothing will break overnight,
  but future releases may on may not work on these versions.
  Older releases will remain available on PyPI.


Version 0.9
-----------

Released on 2013-10-11.

Add parser support for :attr:`functional
pseudo-elements <Selector.pseudo_element>`.

*Update:*
This version accidentally introduced a **backward incompatible** change:
:meth:`~GenericTranslator.selector_to_xpath` defaults to
rejecting pseudo-elements instead of ignoring them.


Version 0.8
-----------

Released on 2013-03-15.

Improvements:

* `#22 <https://github.com/SimonSapin/cssselect/issues/22>`_
  Let extended translators override what XPathExpr class is used
* `#19 <https://github.com/SimonSapin/cssselect/issues/19>`_
  Use the built-in ``lang()`` XPath function
  for implementing the ``:lang()`` pseudo-class
  with XML documents.
  This is probably faster than ``ancestor-or-self::``.

Bug fixes:

* `#14 <https://github.com/SimonSapin/cssselect/issues/14>`_
  Fix non-ASCII pseudo-classes. (Invalid selector instead of crash.)
* `#20 <https://github.com/SimonSapin/cssselect/issues/20>`_
  As per the spec, elements containing only whitespace are not considered empty
  for the ``:empty`` pseudo-class.


Version 0.7.1
-------------

Released on 2012-06-14. Code name *remember-to-test-with-tox*.

0.7 broke the parser in Python 2.4 and 2.5; the tests in 2.x.
Now all is well again.

Also, pseudo-elements are now correctly made lower-case. (They are supposed
to be case-insensitive.)
jsonn referenced this pull request in jsonn/pkgsrc Apr 13, 2014
Version 0.9.1
-------------

Released on 2013-10-17.

* **Backward incompatible change from 0.9**:
  :meth:`~GenericTranslator.selector_to_xpath` defaults to
  ignoring pseudo-elements,
  as it did in 0.8 and previous versions.
  (:meth:`~GenericTranslator.css_to_xpath` doesn’t change.)
* Drop official support for Python 2.4 and 3.1,
  as testing was becoming difficult.
  Nothing will break overnight,
  but future releases may on may not work on these versions.
  Older releases will remain available on PyPI.


Version 0.9
-----------

Released on 2013-10-11.

Add parser support for :attr:`functional
pseudo-elements <Selector.pseudo_element>`.

*Update:*
This version accidentally introduced a **backward incompatible** change:
:meth:`~GenericTranslator.selector_to_xpath` defaults to
rejecting pseudo-elements instead of ignoring them.


Version 0.8
-----------

Released on 2013-03-15.

Improvements:

* `#22 <https://github.com/SimonSapin/cssselect/issues/22>`_
  Let extended translators override what XPathExpr class is used
* `#19 <https://github.com/SimonSapin/cssselect/issues/19>`_
  Use the built-in ``lang()`` XPath function
  for implementing the ``:lang()`` pseudo-class
  with XML documents.
  This is probably faster than ``ancestor-or-self::``.

Bug fixes:

* `#14 <https://github.com/SimonSapin/cssselect/issues/14>`_
  Fix non-ASCII pseudo-classes. (Invalid selector instead of crash.)
* `#20 <https://github.com/SimonSapin/cssselect/issues/20>`_
  As per the spec, elements containing only whitespace are not considered empty
  for the ``:empty`` pseudo-class.


Version 0.7.1
-------------

Released on 2012-06-14. Code name *remember-to-test-with-tox*.

0.7 broke the parser in Python 2.4 and 2.5; the tests in 2.x.
Now all is well again.

Also, pseudo-elements are now correctly made lower-case. (They are supposed
to be case-insensitive.)
jsonn referenced this pull request in jsonn/pkgsrc Jun 11, 2014
Version 0.9.1
-------------

Released on 2013-10-17.

* **Backward incompatible change from 0.9**:
  :meth:`~GenericTranslator.selector_to_xpath` defaults to
  ignoring pseudo-elements,
  as it did in 0.8 and previous versions.
  (:meth:`~GenericTranslator.css_to_xpath` doesn’t change.)
* Drop official support for Python 2.4 and 3.1,
  as testing was becoming difficult.
  Nothing will break overnight,
  but future releases may on may not work on these versions.
  Older releases will remain available on PyPI.


Version 0.9
-----------

Released on 2013-10-11.

Add parser support for :attr:`functional
pseudo-elements <Selector.pseudo_element>`.

*Update:*
This version accidentally introduced a **backward incompatible** change:
:meth:`~GenericTranslator.selector_to_xpath` defaults to
rejecting pseudo-elements instead of ignoring them.


Version 0.8
-----------

Released on 2013-03-15.

Improvements:

* `#22 <https://github.com/SimonSapin/cssselect/issues/22>`_
  Let extended translators override what XPathExpr class is used
* `#19 <https://github.com/SimonSapin/cssselect/issues/19>`_
  Use the built-in ``lang()`` XPath function
  for implementing the ``:lang()`` pseudo-class
  with XML documents.
  This is probably faster than ``ancestor-or-self::``.

Bug fixes:

* `#14 <https://github.com/SimonSapin/cssselect/issues/14>`_
  Fix non-ASCII pseudo-classes. (Invalid selector instead of crash.)
* `#20 <https://github.com/SimonSapin/cssselect/issues/20>`_
  As per the spec, elements containing only whitespace are not considered empty
  for the ``:empty`` pseudo-class.


Version 0.7.1
-------------

Released on 2012-06-14. Code name *remember-to-test-with-tox*.

0.7 broke the parser in Python 2.4 and 2.5; the tests in 2.x.
Now all is well again.

Also, pseudo-elements are now correctly made lower-case. (They are supposed
to be case-insensitive.)
jsonn referenced this pull request in jsonn/pkgsrc Oct 11, 2014
Version 0.9.1
-------------

Released on 2013-10-17.

* **Backward incompatible change from 0.9**:
  :meth:`~GenericTranslator.selector_to_xpath` defaults to
  ignoring pseudo-elements,
  as it did in 0.8 and previous versions.
  (:meth:`~GenericTranslator.css_to_xpath` doesn’t change.)
* Drop official support for Python 2.4 and 3.1,
  as testing was becoming difficult.
  Nothing will break overnight,
  but future releases may on may not work on these versions.
  Older releases will remain available on PyPI.


Version 0.9
-----------

Released on 2013-10-11.

Add parser support for :attr:`functional
pseudo-elements <Selector.pseudo_element>`.

*Update:*
This version accidentally introduced a **backward incompatible** change:
:meth:`~GenericTranslator.selector_to_xpath` defaults to
rejecting pseudo-elements instead of ignoring them.


Version 0.8
-----------

Released on 2013-03-15.

Improvements:

* `#22 <https://github.com/SimonSapin/cssselect/issues/22>`_
  Let extended translators override what XPathExpr class is used
* `#19 <https://github.com/SimonSapin/cssselect/issues/19>`_
  Use the built-in ``lang()`` XPath function
  for implementing the ``:lang()`` pseudo-class
  with XML documents.
  This is probably faster than ``ancestor-or-self::``.

Bug fixes:

* `#14 <https://github.com/SimonSapin/cssselect/issues/14>`_
  Fix non-ASCII pseudo-classes. (Invalid selector instead of crash.)
* `#20 <https://github.com/SimonSapin/cssselect/issues/20>`_
  As per the spec, elements containing only whitespace are not considered empty
  for the ``:empty`` pseudo-class.


Version 0.7.1
-------------

Released on 2012-06-14. Code name *remember-to-test-with-tox*.

0.7 broke the parser in Python 2.4 and 2.5; the tests in 2.x.
Now all is well again.

Also, pseudo-elements are now correctly made lower-case. (They are supposed
to be case-insensitive.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants