XPath and XSLT
==

**XPath** is a query language for selecting nodes and computing values from an XML document.

The full power of **XPath** shows once the developer knows how to phrase queries precisely and has accounted for every case an expression can match.
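
As a minimal sketch of what "every case an expression can match" means, the example below runs three expressions against a small inline document. The `SAMPLE` document and its `groupe` element are hypothetical and only serve the illustration; the real `document.xml` may be structured differently.

```python
from lxml import etree

# Hypothetical sample document, not the notebook's document.xml.
SAMPLE = b"""<liste>
  <personne id="1"><nom>Dupont</nom></personne>
  <personne id="2"><nom>Martin</nom></personne>
  <groupe>
    <personne><nom>Durand</nom></personne>
  </groupe>
</liste>"""

root = etree.fromstring(SAMPLE)

direct = root.xpath('/liste/personne')    # only direct children of <liste>
anywhere = root.xpath('//personne')       # <personne> at any depth
with_id = root.xpath('//personne[@id]')   # only those carrying an id attribute

print(len(direct), len(anywhere), len(with_id))
```

The nested `personne` is matched by `//personne` but not by `/liste/personne`, which is exactly the kind of case worth listing before settling on an expression.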

**XSLT** stands for *eXtensible Stylesheet Language Transformations*. As the name suggests, it is an **XML**-based language for describing transformations that turn an XML document conforming to one schema into another XML document conforming to a different schema.

The input or output document may use a specific XML vocabulary such as *(X)HTML* or *SVG*, but in every case these documents must be well-formed, otherwise the transformation cannot take place.
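
The well-formedness requirement can be observed directly at parse time. The sketch below assumes two hypothetical inline documents and a helper named `try_parse` (not part of lxml); it relies on `etree.fromstring` raising `etree.XMLSyntaxError` when the input is not well-formed.

```python
from lxml import etree

well_formed = b"<liste><personne id='1'/></liste>"
malformed = b"<liste><personne id='1'></liste>"  # <personne> is never closed


def try_parse(data):
    """Return True if data parses as well-formed XML, False otherwise."""
    try:
        etree.fromstring(data)
        return True
    except etree.XMLSyntaxError as exc:
        print(f"rejected: {exc}")
        return False


ok_result = try_parse(well_formed)
bad_result = try_parse(malformed)
print(ok_result, bad_result)
```

A document rejected at this stage never reaches the XSLT processor, so parse errors are the first thing to check when a transformation fails.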

Loading an XML document
--

In [None]:
with open('document.xml') as f:
    print(f.read())

In [None]:
from lxml import etree

In [None]:
# Open in binary mode so that lxml applies the encoding declared in the XML prolog.
with open('document.xml', 'rb') as f:
    tree = etree.parse(f)

XPath
--

In [None]:
# Index the root element directly; getchildren() is deprecated in lxml.
tree.getpath(tree.getroot()[1])

In [None]:
personnes = tree.xpath('/liste/personne')

In [None]:
len(personnes)

In [None]:
personnes[0].tag

In [None]:
tree.xpath('count(/*/personne)')

In [None]:
for i in range(1, 4):
    expr = f"/*/personne[{i}]"
    print(f"{expr}: {tree.xpath(expr)}")

In [None]:
tree.xpath("/*/personne[@id]")

In [None]:
tree.xpath("/*/personne[@nom]")

In [None]:
tree.xpath("/*/personne[@id=2]")

In [None]:
tree.xpath("/*/personne[@id=$id]", id=2)
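
Because the cells above are shown without their output, here is a self-contained sketch of the same queries against a hypothetical three-person document (the real `document.xml` may differ). It assumes `nom` is a child element rather than an attribute, which is why `[@nom]` matches nothing while `[nom]` matches every person.

```python
from lxml import etree

# Hypothetical stand-in for document.xml.
SAMPLE = b"""<liste>
  <personne id="1"><nom>Dupont</nom></personne>
  <personne id="2"><nom>Martin</nom></personne>
  <personne><nom>Durand</nom></personne>
</liste>"""
tree = etree.ElementTree(etree.fromstring(SAMPLE))

count = tree.xpath('count(/*/personne)')        # XPath numbers come back as floats
second = tree.xpath('/*/personne[2]')           # positions are 1-based
has_id = tree.xpath('/*/personne[@id]')         # attribute id is present
has_nom_attr = tree.xpath('/*/personne[@nom]')  # attribute nom: absent in this sample
has_nom_child = tree.xpath('/*/personne[nom]')  # child element nom
names = tree.xpath('/*/personne[@id=$id]/nom/text()', id=2)  # XPath variable

print(count, tree.getpath(second[0]))
print(len(has_id), len(has_nom_attr), len(has_nom_child), names)
```

Passing the value through `$id` instead of formatting it into the expression string keeps the query fixed and avoids quoting problems when the value comes from user input.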

OpenDocument files
--

In [None]:
import zipfile
with zipfile.ZipFile('document.odt') as f:
    content = f.read('content.xml')

In [None]:
document_tree = etree.fromstring(content)

In [None]:
namespaces = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
    "manifest": "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0",
    "presentation": "urn:oasis:names:tc:opendocument:xmlns:presentation:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
}

In [None]:
# (outline level, text) for every heading element <text:h> in the document
outline_level = f"{{{namespaces['text']}}}outline-level"
titres = [
    (n.get(outline_level), n.text)
    for n in document_tree.xpath('//text:h', namespaces=namespaces)
]

In [None]:
from pprint import pprint
pprint(titres)
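
To make the role of the namespace mapping concrete, the sketch below runs the same kind of heading query on a hypothetical inline fragment shaped like an OpenDocument `content.xml` (a real file contains many more declarations). It also uses `itertext()` instead of `.text`, an assumption about what is wanted: `.text` stops at the first child element, so a heading containing a `text:span` would otherwise be truncated.

```python
from lxml import etree

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
ns = {"text": TEXT_NS, "office": OFFICE_NS}

# Hypothetical, heavily simplified content.xml.
SAMPLE = f"""<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body><office:text>
    <text:h text:outline-level="1">Introduction</text:h>
    <text:p>Some paragraph.</text:p>
    <text:h text:outline-level="2">Details <text:span>in bold</text:span></text:h>
  </office:text></office:body>
</office:document-content>""".encode()

root = etree.fromstring(SAMPLE)

# Attribute names in lxml use Clark notation: {namespace-uri}local-name.
level_attr = f"{{{TEXT_NS}}}outline-level"
headings = [
    (h.get(level_attr), "".join(h.itertext()))
    for h in root.xpath("//text:h", namespaces=ns)
]
unprefixed = root.xpath("//h")  # no namespace, so nothing matches

print(headings)
print(unprefixed)
```

The prefixes in the `ns` dictionary are local to the query; only the namespace URIs have to match those declared in the document.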

XSLT
--

In [None]:
with open('document.xslt') as f:
    print(f.read())

In [None]:
# Binary mode lets lxml apply the encoding declared in the stylesheet itself.
with open('document.xslt', 'rb') as f:
    xslt = etree.XSLT(etree.parse(f))

In [None]:
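# Keyword arguments are evaluated as XPath expressions; strparam() passes a literal string.
new_tree = xslt(tree, date=etree.XSLT.strparam('20110901'))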
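
Since `document.xslt` is not reproduced here, the following sketch uses a hypothetical inline stylesheet and source document to show how a stylesheet parameter reaches the output. The element names `annuaire` and `entree` are invented for the example. It also illustrates why string parameters are best wrapped in `etree.XSLT.strparam`: lxml evaluates plain keyword values as XPath expressions, so an unquoted date containing hyphens is computed as a subtraction.

```python
from lxml import etree

# Hypothetical stylesheet: wraps each <nom> in an <entree> and stamps a date.
STYLESHEET = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="date"/>
  <xsl:template match="/liste">
    <annuaire date="{$date}">
      <xsl:for-each select="personne">
        <entree><xsl:value-of select="nom"/></entree>
      </xsl:for-each>
    </annuaire>
  </xsl:template>
</xsl:stylesheet>"""

SOURCE = (b"<liste><personne><nom>Dupont</nom></personne>"
          b"<personne><nom>Martin</nom></personne></liste>")

transform = etree.XSLT(etree.fromstring(STYLESHEET))
doc = etree.fromstring(SOURCE)

# Literal string parameter: passed through unchanged.
quoted = transform(doc, date=etree.XSLT.strparam("2011-09-01")).getroot()
# Plain string: evaluated as the XPath expression 2011 - 09 - 01.
unquoted = transform(doc, date="2011-09-01").getroot()

entries = [e.text for e in quoted]
print(quoted.get("date"), unquoted.get("date"), entries)
```

The source tree is left untouched by the transformation; the result is a new tree, which is why the notebook can print both afterwards.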

In [None]:
print(etree.tostring(new_tree, pretty_print=True).decode())

In [None]:
print(etree.tostring(tree, pretty_print=True).decode())

In [None]:
with open('transforme.xml', 'wb') as f:
    f.write(etree.tostring(new_tree, pretty_print=True))

In [None]:
with open('transforme.xml') as f:
    print(f.read())

---