Cssselect method #47

wants to merge 11 commits into


None yet

3 participants


Add a cssselect method to all element. This is a partial fix for #45. It is based on #46 which makes the diff awkward, but only the last commit is relevant. It could be rebased on master (but without the translator part.)


Instead of the sys.modules stuff below, you could simply have base_cssselect = __import__('cssselect') here instead.


For sys.modules['cssselect'] yes, but what about sys.modules['cssselect.xpath'] for example?

Assuming the cssselect package triggers those imports then you can just use base_cssselect.xpath, otherwise use base_cssselect = __import__('cssselect.xpath') to ensure that module is imported. See the example here.


__import__('cssselect.xpath') ensures that the submodule is imported, but I think not that it is reachable as an attribute of the parent package. Currently, cssutils/__init__.py does not have from cssselect import xpath but from cssselect.xpath import foo, bar.

What can avoid the use of sys.modules according to the doc is something like this:

xpath = __import__('cssselect.xpath', globals(), locals(), ['xpath_literal'], 0)

... but I’m not sure it is better.

The submodule is set as an attribute on the parent package when it is imported, consider the following:

>>> import lxml
>>> lxml.cssselect
Traceback (most recent call last):
  File "<console>", line 1, in <module>
AttributeError: 'module' object has no attribute 'cssselect'
>>> __import__('lxml.cssselect')
<module 'lxml' from '/data/buildout/eggs/lxml-2.3.4-py2.7-macosx-10.6-intel.egg/lxml/__init__.pyc'>
>>> lxml.cssselect
<module 'lxml.cssselect' from '/data/buildout/eggs/lxml-2.3.4-py2.7-macosx-10.6-intel.egg/lxml/cssselect.pyc'>

(The same thing happens if you instead import lxml.cssselect)


Oh, I didn’t know that. I’ll try and see, thanks.

scoder commented on 7820b8d Apr 20, 2012

Ah, sorry, my bad. I should have checked. I had thought this method was already there and needed fixing, but it was only in lxml.html, not in lxml.etree. I didn't mean to add it to lxml.etree.


Do you mean that it should not be added there?

That's what I'm saying, yes. In any case, it's an addition to the API that is independent of the intention of this pull request.

Is there a real advantage of having it on all Elements, instead of just HtmlElements? I mean, it's convenient, ok, but it's not efficient and also won't work on all systems because it requires an external module to be installed. Making it a method feels too tightly integrated for an external module.

I'm really not sure, but I'm leaning towards considering the method on HtmlElements a legacy issue rather than a sparkling good idea. And if that's the case, we shouldn't add to it.


Ok, I’ll make two independent pull requests so they can be judged separately.


SimonSapin added some commits Apr 20, 2012
@SimonSapin SimonSapin Docstring fix. da3d1d3
@SimonSapin SimonSapin Revert "Add a cssselect method to all elements, not just HtmlElement"
This is an addition to the API that should be considered separatly
from using the new cssselect.

This reverts commit 7820b8d.
@SimonSapin SimonSapin Fix the test suite for _Element.cssselect removal. 5acdb84
@SimonSapin SimonSapin Add a cssselect method to all elements, not just HtmlElement
translator defaults to 'xml' in _Element and 'html' in HtmlElement

(Add it back again, but on a separate pull request.)
@SimonSapin SimonSapin closed this Apr 23, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment