WeasyPrint converts web documents (HTML with CSS, SVG, …) to PDF.
WeasyPrint converts web documents (HTML, CSS, ...) to PDF.

See the documentation at http://weasyprint.org/

Dependencies
------------

Listed in setup.py, will install automatically if you use easy_install or pip:

* html5lib
* lxml
* cssutils
* Attest

Not listed in setup.py since they are either not on PyPI or tricky to compile.
You need to install these manually:

* PyCairo
* PyGTK
* python-rsvg

About the PyGTK dependency
--------------------------

WeasyPrint does not use GTK+, but it uses Pango for text rendering and rsvg
for SVG rendering. Both of them can work without GTK+, but their Python
bindings either are part of PyGTK (for Pango) or depend on PyGTK (for rsvg).

If someday we have GObject introspection for all of Pango, rsvg and cairo,
we can switch to those and drop the PyGTK dependency.

Standards conformance
---------------------

WeasyPrint strives for web standards conformance. For some standards, however,
conformance is just that of the libraries we use:

* HTML parsing (turning bytes into a DOM tree): we currently use lxml.html
  (see below)
* CSS parsing: cssutils
* CSS selectors: lxml.cssselect (conforms to CSS3 with some exceptions, see
  http://lxml.de/cssselect.html#limitations)
* SVG: rsvg

The main area where WeasyPrint has only itself to blame for conformance is
the graphical rendering and layout of documents (that is, all of CSS except
syntax and selectors).

Inline SVG
----------

SVG, even when inlined in the HTML document, is rendered by the rsvg library
independently of the rest of the document. In CSS speak, we consider it to be
a “replaced element”.

HTML parsing
------------

We use lxml to parse HTML into an object tree. lxml’s own parser is very fast,
but it can optionally use the html5lib parser. html5lib implements the HTML5
parsing algorithm, so it should give better results on broken HTML, though
“they all parse pretty-good HTML the same.”

  http://stackoverflow.com/questions/2676872/how-to-parse-malformed-html-in-python-using-standard-libraries/2680724#2680724

lxml vs ElementTree
-------------------

lxml uses the same API as ElementTree, so some programs can use either of
them. However, we need lxml.cssselect, which does not exist in ElementTree.
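
To illustrate the shared API, here is a minimal sketch of code that runs unchanged on either library. It is not taken from WeasyPrint itself: the ``SAMPLE`` markup and the ``paragraph_texts`` helper are hypothetical names made up for this example, and the import fallback is only one common way to pick whichever library is installed.

```python
# Prefer lxml when it is installed; otherwise fall back to the standard
# library's ElementTree. Both expose fromstring() and Element.iter().
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical, well-formed sample markup used only for this illustration.
SAMPLE = "<html><body><p class='intro'>Hello</p><p>World</p></body></html>"


def paragraph_texts(markup):
    """Return the text of every <p> element, in document order."""
    root = etree.fromstring(markup)
    return [p.text for p in root.iter("p")]


print(paragraph_texts(SAMPLE))
```

Code that only walks the tree like this can use either library. Anything that relies on lxml.cssselect for CSS selectors cannot take the ElementTree branch, which is why WeasyPrint depends on lxml.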