=====================
APIs specific to lxml
=====================
lxml tries to follow established APIs wherever possible. Sometimes,
however, the need to expose a feature in an easy way led to the
invention of a new API.

lxml.etree
==========

lxml.etree tries to follow the etree API wherever it can. There are,
however, some incompatibilities (see compatibility.txt). There are also
some extensions.

xpath method on ElementTree, Element
------------------------------------

lxml.etree extends the ElementTree and Element interfaces with an
xpath method. For ElementTree, the xpath method performs a global
XPath query against the document. When xpath is used on an element,
the expression is evaluated with that element as the XPath context
node.

You call the xpath() method with the XPath expression to use, and
optionally a second namespaces argument, which should be a dictionary
mapping the namespace prefixes used in the XPath expression to
namespace URIs.

The return value of xpath depends on the XPath expression used:

* 1 or 0, when the XPath expression has a boolean result
* a float, when the XPath expression has a numeric result
* a (unicode) string, when the XPath expression has a string result
* a list of items, when the XPath expression has a node-set result. The
  items may be element nodes or strings: text nodes and attributes are
  returned as strings (the text node content or the attribute value),
  and comments are returned as strings wrapped in ``<!--`` and ``-->``
  markers.

Example::

    >>> import lxml.etree
    >>> from StringIO import StringIO
    >>> f = StringIO('<foo><bar></bar></foo>')
    >>> doc = lxml.etree.parse(f)
    >>> r = doc.xpath('/foo/bar')
    >>> len(r)
    1
    >>> r[0].tag
    'bar'
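
The return types listed above can be checked with a short script. This is a minimal sketch that assumes Python 3 and a recent lxml release rather than the version documented here; recent releases return Python booleans instead of 1 or 0, and other details may differ as well. The sample document and all variable names are illustrative only.

```python
from lxml import etree

# Illustrative sample document (not taken from the lxml test suite).
root = etree.fromstring('<foo><bar id="x">Text</bar><bar/></foo>')
tree = root.getroottree()

has_bar = tree.xpath('boolean(/foo/bar)')       # boolean result
bar_count = tree.xpath('count(/foo/bar)')       # numeric result (float)
first_text = tree.xpath('string(/foo/bar[1])')  # string result
bars = tree.xpath('/foo/bar')                   # list of element nodes
texts = tree.xpath('/foo/bar/text()')           # text nodes come back as strings
ids = tree.xpath('/foo/bar/@id')                # attribute values come back as strings

for label, value in [('boolean', has_bar), ('number', bar_count),
                     ('string', first_text), ('elements', bars),
                     ('text nodes', texts), ('attributes', ids)]:
    print(label, type(value).__name__)
```

Printing the type names rather than the values keeps the output independent of how a given lxml version represents each result.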

Example of using namespace prefixes::

    >>> f = StringIO('''\
    ... <a:foo xmlns:a="http://codespeak.net/ns/test1"
    ...        xmlns:b="http://codespeak.net/ns/test2">
    ...   <b:bar>Text</b:bar>
    ... </a:foo>
    ... ''')
    >>> doc = lxml.etree.parse(f)
    >>> r = doc.xpath('/t:foo/b:bar', {'t': 'http://codespeak.net/ns/test1',
    ...                                'b': 'http://codespeak.net/ns/test2'})
    >>> len(r)
    1
    >>> r[0].tag
    '{http://codespeak.net/ns/test2}bar'
    >>> r[0].text
    'Text'
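
The difference between querying the whole document and querying from an element can be sketched as follows. The sketch assumes Python 3 and a recent lxml release, where the prefix mapping is passed as the ``namespaces`` keyword argument instead of positionally; the variable names are hypothetical.

```python
from io import BytesIO
from lxml import etree

xml = b'''<a:foo xmlns:a="http://codespeak.net/ns/test1"
       xmlns:b="http://codespeak.net/ns/test2">
  <b:bar>Text</b:bar>
</a:foo>'''
doc = etree.parse(BytesIO(xml))

# Prefixes used in the expression need not match those in the document;
# only the namespace URIs have to agree.
ns = {'t': 'http://codespeak.net/ns/test1',
      'b': 'http://codespeak.net/ns/test2'}

# On the tree: an absolute path evaluated against the whole document.
from_tree = doc.xpath('/t:foo/b:bar', namespaces=ns)

# On an element: a relative path evaluated with that element as context node.
root = doc.getroot()
from_element = root.xpath('b:bar', namespaces=ns)

print(from_tree[0].tag, from_element[0].text)
```

Both queries select the same ``bar`` element; the relative form is convenient once you already hold a reference to a subtree.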

XSLT
----

lxml.etree introduces a new class, lxml.etree.XSLT. The class can be
given an ElementTree object to construct an XSLT transformer::

    >>> f = StringIO('''\
    ... <xsl:stylesheet version="1.0"
    ...     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    ...   <xsl:template match="*" />
    ...   <xsl:template match="/">
    ...     <foo><xsl:value-of select="/a/b/text()" /></foo>
    ...   </xsl:template>
    ... </xsl:stylesheet>''')
    >>> xslt_doc = lxml.etree.parse(f)
    >>> style = lxml.etree.XSLT(xslt_doc)

You can then apply the style to an ElementTree document, which
results in another ElementTree object::

    >>> f = StringIO('<a><b>Text</b></a>')
    >>> doc = lxml.etree.parse(f)
    >>> result = style.apply(doc)

The result object can be accessed like a normal ElementTree document::

    >>> result.getroot().text
    'Text'

but can also be turned into an (XML or text) string using the style's
``tostring`` method::

    >>> style.tostring(result)
    '<?xml version="1.0"?>\n<foo>Text</foo>\n'
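
For readers on a newer installation, the same transformation can be sketched as below. This assumes Python 3 and a recent lxml release, where the XSLT object is called directly and the result is serialized with ``str()`` in place of the ``apply`` and ``tostring`` methods shown above; treat it as an illustration under those assumptions, with hypothetical variable names.

```python
from lxml import etree

xslt_doc = etree.fromstring('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="*" />
  <xsl:template match="/">
    <foo><xsl:value-of select="/a/b/text()" /></foo>
  </xsl:template>
</xsl:stylesheet>''')

# Compile the stylesheet once; the resulting object can be reused.
transform = etree.XSLT(xslt_doc)

doc = etree.fromstring('<a><b>Text</b></a>')
result = transform(doc)            # calling the object applies the stylesheet

root_text = result.getroot().text  # the result behaves like an ElementTree
serialized = str(result)           # serialized form of the output document

print(root_text)
print(serialized)
```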

RelaxNG
-------

lxml.etree introduces a new class, lxml.etree.RelaxNG. The class can
be given an ElementTree object to construct a Relax NG validator::

    >>> f = StringIO('''\
    ... <element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
    ...   <zeroOrMore>
    ...     <element name="b">
    ...       <text />
    ...     </element>
    ...   </zeroOrMore>
    ... </element>
    ... ''')
    >>> relaxng_doc = lxml.etree.parse(f)
    >>> relaxng = lxml.etree.RelaxNG(relaxng_doc)

You can then validate an ElementTree document against it. The validate
method returns a true value (1) if the document is valid against the
Relax NG schema, and a false value (0) if not::

    >>> valid = StringIO('<a><b></b></a>')
    >>> doc = lxml.etree.parse(valid)
    >>> relaxng.validate(doc)
    1
    >>> invalid = StringIO('<a><c></c></a>')
    >>> doc2 = lxml.etree.parse(invalid)
    >>> relaxng.validate(doc2)
    0
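
The validation flow above can also be sketched for a newer setup. This assumes Python 3 and a recent lxml release, in which ``validate`` returns a Python boolean and ``assertValid`` raises ``DocumentInvalid`` for a non-conforming document; neither detail is described in the text above, and the variable names are hypothetical.

```python
from lxml import etree

schema_doc = etree.fromstring('''\
<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="b">
      <text />
    </element>
  </zeroOrMore>
</element>''')
relaxng = etree.RelaxNG(schema_doc)

valid_doc = etree.fromstring('<a><b></b></a>')
invalid_doc = etree.fromstring('<a><c></c></a>')

# validate() reports the outcome as a truth value.
valid_result = relaxng.validate(valid_doc)
invalid_result = relaxng.validate(invalid_doc)

# assertValid() signals failure with an exception instead.
try:
    relaxng.assertValid(invalid_doc)
    raised = False
except etree.DocumentInvalid:
    raised = True

print(bool(valid_result), bool(invalid_result), raised)
```

The boolean form suits simple branching, while the exception form is handy when an invalid document should abort processing.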

write_c14n on ElementTree
-------------------------

The lxml.etree.ElementTree class has a method write_c14n, which takes
one argument: a file object. The file object receives a UTF-8
representation of the canonicalized form of the XML, following the W3C
C14N recommendation. For example::

    >>> f = StringIO('<a><b/></a>')
    >>> tree = lxml.etree.parse(f)
    >>> f2 = StringIO()
    >>> tree.write_c14n(f2)
    >>> f2.getvalue()
    '<a><b></b></a>'
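
On Python 3 the canonical output is bytes, so a binary buffer is needed. The following is a minimal sketch under that assumption, using a recent lxml release and illustrative variable names:

```python
from io import BytesIO
from lxml import etree

tree = etree.parse(BytesIO(b'<a><b/></a>'))

# write_c14n emits UTF-8 encoded bytes, so the target must accept bytes.
out = BytesIO()
tree.write_c14n(out)
canonical = out.getvalue()

print(canonical)
```

Canonicalization expands the empty-element tag ``<b/>`` into a start and end tag pair, which is why the output differs from the input.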