lxml supports a number of interesting languages for tree traversal and element
selection. The most important is obviously XPath_, but there is also
ObjectPath_ in the `lxml.objectify`_ module. The newest child of this family
is `CSS selection`_, which is made available in the form of the
``lxml.cssselect`` module.

Although it started its life in lxml, cssselect_ is now an independent project.
It translates CSS selectors to XPath 1.0 expressions that can be used with
lxml's XPath engine. ``lxml.cssselect`` adds a few convenience shortcuts on
top of that package.

.. _XPath: xpathxslt.html#xpath
.. _ObjectPath: objectify.html#objectpath
.. _`lxml.objectify`: objectify.html
.. _`CSS selection`:
.. _cssselect:

.. contents::
..
   1  The CSSSelector class
   2  CSS Selectors
     2.1  Namespaces
   3  Limitations


The CSSSelector class
=====================

The most important class in the ``lxml.cssselect`` module is ``CSSSelector``. It
provides the same interface as the XPath_ class, but accepts a CSS selector
expression as input:

.. sourcecode:: pycon

    >>> from lxml.cssselect import CSSSelector
    >>> sel = CSSSelector('div.content')
    >>> sel  #doctest: +ELLIPSIS
    <CSSSelector ... for 'div.content'>
    >>> sel.css
    'div.content'

The selector actually compiles to XPath, and you can see the
expression by inspecting the object:

.. sourcecode:: pycon

    >>> sel.path
    "descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' content ')]"

To use the selector, simply call it with a document or element object:

.. sourcecode:: pycon

    >>> from lxml.etree import fromstring
    >>> h = fromstring('''<div id="outer">
    ...   <div id="inner" class="content body">
    ...       text
    ...   </div></div>''')
    >>> [e.get('id') for e in sel(h)]
    ['inner']

Using ``CSSSelector`` is equivalent to translating with ``cssselect``
and using the ``XPath`` class:

.. sourcecode:: pycon

    >>> from cssselect import GenericTranslator
    >>> from lxml.etree import XPath
    >>> sel = XPath(GenericTranslator().css_to_xpath('div.content'))

``CSSSelector`` takes a ``translator`` parameter to let you choose which
translator to use. It can be ``'xml'`` (the default), ``'xhtml'``, ``'html'``
or a `Translator object`_.
.. _Translator object:

The cssselect method
====================

lxml ``Element`` objects have a ``cssselect`` convenience method.

.. sourcecode:: pycon

    >>> h.cssselect('div.content') == sel(h)
    True

Note, however, that pre-compiling the expression with the ``CSSSelector`` or
``XPath`` class can provide a substantial speedup when the same selector is
applied repeatedly.

The method also accepts a ``translator`` parameter. On ``HtmlElement``
objects, the default is changed to ``'html'``.


Supported Selectors
===================

Most `Level 3`_ selectors are supported. The details are in the
`cssselect documentation`_.

.. _Level 3:
.. _cssselect documentation:

Namespaces
----------

In CSS you can use ``namespace-prefix|element``, similar to
``namespace-prefix:element`` in an XPath expression. In fact, it maps
one-to-one, and the same rules are used to map namespace prefixes to
namespace URIs: the ``CSSSelector`` class accepts a dictionary as its
``namespaces`` argument.