This Python 2 and 3 project (tested with CPython 2.7 and 3.3 as well as PyPy 2) allows for easy creation of XML. The hierarchical structure of XML is easy to spot and the code to create XML is much shorter than using SAX, DOM or similar APIs. There is also functionality to efficiently validate and transform XML while it is being created.
This project uses the MIT License, so you may freely use, distribute and
modify it, provided you include the content of License.txt.
Install using setuptools:
easy_install ecoxipy
The ECoXiPy Documentation and the source distribution are available from its PyPI entry.
0.4.0
- Added: An output implementation may specify a method fragment with one iterable argument, which is used to create a representation for an XML fragment.
- Added: An output implementation may specify a method preprocess with one argument. If such a method exists, it is called with each content object.
- Added: Module ecoxipy.html; moved html5, HTML5_ELEMENTS and HTML5_ELEMENT_NAMES from ecoxipy.decorators here (THEY ARE NO LONGER AVAILABLE THERE).
- Added: The decorator-creator ecoxipy.html.html5_cats makes selected HTML5 categories available for XML creation.
- Added: The HTML5 template function ecoxipy.html.html5_template.
- Added: The module ecoxipy.validation provides a base for creating only valid XML.
- Added: The module ecoxipy.transformation implements an API for transformation of XML. Its class MarkupTransformer works as an ecoxipy.Output implementation modifying XML before creating its output representation.
- Improved: Preprocessing of content objects.
- Changed: No longer using abstract classes in ecoxipy.pyxom.
- Changed: Now classes are also abstract for Python 3 (by using tinkerpy.metaclass).
0.3.1
- Improved: Performance, through more duck-typing and fewer isinstance checks, and by using collections.deque instead of list for children handling.
- Fixed: Text handling of ecoxipy.etree_output.ETreeOutput.
- Added: Performance tests for ecoxipy.etree_output.ETreeOutput.
0.3.0
This is a major release introducing new capabilities and Python 3 support.
- Added: Support for Python 3.
- Added: Use | on ecoxipy.MarkupBuilder to create comments.
- Added: Use slicing on ecoxipy.MarkupBuilder to create documents or processing instructions.
- Added: The module ecoxipy.parsing contains SAX to ECoXiPy parsing facilities.
- Changed: Renamed module ecoxipy.element_output to ecoxipy.pyxom.output and moved the XML representation classes to their own module ecoxipy.pyxom, naming them PyXOM (Pythonic XML Object Model) and adding new functionality. DOM creation was removed from those classes.
- Changed: All XML data is internally handled as Unicode; an ecoxipy.MarkupBuilder instance converts byte strings from an encoding given on creation (defaults to UTF-8).
- Changed: XML parsing is now handled by ecoxipy.MarkupBuilder instead of the ecoxipy.Output implementations.
- Changed: Text node creation is now handled by ecoxipy.MarkupBuilder instead of the ecoxipy.Output implementations.
- Improved: Unpacking of content while processing is done recursively on iterable and callable content.
0.2.0
- Added: Use & on ecoxipy.MarkupBuilder to create text nodes.
- Improvement: Better performance of ecoxipy.string_output.StringOutput.
0.1.0
- Initial release.
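Several entries above concern how content passed to the builder is processed, for example the recursive unpacking of iterable and callable content introduced in 0.3.0. The following is a minimal sketch of that idea in plain Python 3, not ECoXiPy's actual implementation: the helper name unpack is hypothetical, and attribute dictionaries and XML nodes, which ECoXiPy handles specially, are left out.

```python
def unpack(content):
    """Recursively flatten content (hypothetical helper, not ECoXiPy code)."""
    for item in content:
        if isinstance(item, (str, bytes)):
            # Strings are iterable, but they are treated as leaf values.
            yield item
        elif callable(item):
            # Callables are executed and their result is processed again.
            for sub in unpack([item()]):
                yield sub
        else:
            try:
                iterator = iter(item)
            except TypeError:
                # Neither iterable nor callable: a leaf value.
                yield item
            else:
                # Iterables and generators are unpacked recursively.
                for sub in unpack(iterator):
                    yield sub


# A list, a nested generator and a callable returning a list.
children = ['a', ['b', ('c' for _ in range(1))], lambda: ['d', 42]]
flat = list(unpack(children))
print(flat)
```

Strings are checked first because they are iterable themselves; without that check they would be split into single characters.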
Here's a simple HTML5 document template function:
# In the function it is applied to the "html5" decorator creates the variable
# "_b" being an instance of "ecoxipy.MarkupBuilder" with
# "ecoxipy.string_output.StringOutput" for XML creation. It also creates for
# each HTML5 element a variable being a method of "_b", with the name of
# element, variable and method all being equal.
from ecoxipy.html import html5

@html5
def create_testdoc(title, subtitle, *content):
    # Slicing without start argument on the builder creates a document, the
    # stop argument defines the document type declaration and the step
    # argument defines if the XML declaration should be omitted.
    return _b[:'html':True](
        # Method calls on a MarkupBuilder instance create elements with the
        # name equal to the method name.
        html(
            # Child dictionary entries become attributes, especially useful
            # for non-identifier attribute names:
            {'data-info': 'Created by Ecoxipy'},
            head(
                _b.title(
                    # Children which are not of the XML representation
                    # and are either "str" or "unicode" instances or are
                    # neither iterables, generators nor callables, become text
                    # nodes:
                    title
                )
            ),
            body(
                article(
                    # Child iterables and generators are unpacked
                    # automatically:
                    [h1(title), h2(subtitle)], # Iterable
                    (p(item) for item in content), # Generator
                    # Child callables are executed:
                    hr,
                    # Calling a MarkupBuilder creates a XML fragment from the
                    # arguments, here strings are regarded as raw XML.
                    _b(
                        # Explicitly create text node:
                        _b & '<THE END>',
                        '<footer>Copyright 2013</footer>' # raw XML
                    )
                )
            ),
            # You can also create comments:
            _b | "This is a comment.",
            # Slicing with a start argument creates a processing instruction:
            _b['pi-target':'PI content.'],
            # Named arguments of element method-calls become attributes:
            xmlns='http://www.w3.org/1999/xhtml/'
        )
    )
It could be used like this:
>>> create_testdoc('Test', 'A Simple Test Document', 'Hello World & Universe!', 'How are you?')
b'<!DOCTYPE html><html data-info="Created by Ecoxipy" xmlns="http://www.w3.org/1999/xhtml/"><head><title>Test</title></head><body><article><h1>Test</h1><h2>A Simple Test Document</h2><p>Hello World &amp; Universe!</p><p>How are you?</p><hr/>&lt;THE END&gt;<footer>Copyright 2013</footer></article></body><!--This is a comment.--><?pi-target PI content.?></html>'
Pretty-printing the result yields the following HTML:
<!DOCTYPE html>
<html data-info="Created by Ecoxipy" xmlns="http://www.w3.org/1999/xhtml/">
    <head>
        <title>Test</title>
    </head>
    <body>
        <article>
            <h1>Test</h1>
            <h2>A Simple Test Document</h2>
            <p>Hello World &amp; Universe!</p>
            <p>How are you?</p>
            <hr/>
            &lt;THE END&gt;
            <footer>Copyright 2013</footer>
        </article>
    </body>
    <!--This is a comment.-->
    <?pi-target PI content.?>
</html>
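The text does not say which tool produced the pretty-printed listing. One possible way to get a comparable view with only the standard library is xml.dom.minidom. This is a sketch on a small hand-written byte string (the variable raw is an assumption standing in for the output of create_testdoc), and its exact whitespace will differ from the listing above.

```python
from xml.dom import minidom

# Stand-in for a byte string such as the one returned by create_testdoc.
raw = (b'<html><head><title>Test</title></head>'
       b'<body><p>Hello World &amp; Universe!</p></body></html>')

# Parse the bytes into a DOM tree and serialise it with indentation.
pretty = minidom.parseString(raw).toprettyxml(indent='    ')
print(pretty)
```

Note that toprettyxml returns a text string and prepends an XML declaration, so the result is meant for reading rather than as a drop-in replacement for the original bytes.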
Install egg for development:
python setup.py develop
This also installs TinkerPy; if you skip this step, you must install TinkerPy manually.
Build documentation with Sphinx (which of course must be installed):
python setup.py build_sphinx
Execute unit tests:
python setup.py test
Setup
The same XHTML5 document is created with different APIs. All output
implementations of ECoXiPy (in ecoxipy.string_output, ecoxipy.dom_output,
ecoxipy.pyxom.output and ecoxipy.etree_output) are tested as well as
xml.sax, xml.dom.minidom and xml.etree.ElementTree. For each of the APIs
one test creates its native representation and one test transforms this into
a UTF-8 encoded byte string, as most XML will ultimately be serialised in
this form. The SAX and ecoxipy.string_output tests create byte strings in
both test types.
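To make the two test types concrete, here is a rough sketch using xml.etree.ElementTree, one of the APIs listed above. It is not taken from the project's test suite; the function names build_native and build_bytes and the sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_native(title, paragraphs):
    """Create the API's native representation (an Element tree)."""
    html = ET.Element('html')
    head = ET.SubElement(html, 'head')
    ET.SubElement(head, 'title').text = title
    body = ET.SubElement(html, 'body')
    for text in paragraphs:
        ET.SubElement(body, 'p').text = text
    return html


def build_bytes(title, paragraphs):
    """Create the native representation, then serialise it as UTF-8 bytes."""
    return ET.tostring(build_native(title, paragraphs), encoding='utf-8')


native = build_native('Test', ['Hello World & Universe!'])
encoded = build_bytes('Test', ['Hello World & Universe!'])
print(type(native).__name__, type(encoded).__name__)
```

The second function does strictly more work than the first, which is why timing both separately shows how much of the cost lies in serialisation.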
Running
To run the timeit tests execute in a terminal from the project's root directory:
python -m tests.performance.timeit_tests <string output> <repetitions> <data count> [<CSV output path>]
Use no arguments to get help.
To run a batch of tests with CPython 2.7, CPython 3.3 and PyPy, once to create
native structures and once to create byte strings, writing the results to the
file timeit.csv, execute the Bash script run_timeit_tests.
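For readers unfamiliar with the standard timeit module, the following is a minimal sketch of this kind of measurement. It is not the project's tests.performance.timeit_tests module: the build function is a stand-in, and the variables repetitions and data_count only loosely mirror the command-line arguments shown above (how the real script interprets them is an assumption).

```python
import timeit
import xml.etree.ElementTree as ET


def build(data_count):
    """Stand-in workload: create data_count elements and serialise them."""
    root = ET.Element('root')
    for i in range(data_count):
        ET.SubElement(root, 'item').text = str(i)
    return ET.tostring(root, encoding='utf-8')


repetitions = 20   # how often the workload runs per measurement
data_count = 100   # size of the generated document

# Take three measurements, each running the workload `repetitions` times.
timings = timeit.repeat(lambda: build(data_count), repeat=3,
                        number=repetitions)
best = min(timings) / repetitions
print('best time per run: {:.6f} s'.format(best))
```

Taking the minimum of several measurements is the usual convention with timeit, since higher values tend to reflect interference from other processes rather than the code under test.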
Running cProfile tests:
python tests/performance/profiling_tests.py
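As a rough illustration of what a cProfile run involves, the sketch below profiles a stand-in workload with the standard library and prints the most expensive calls. It does not reproduce tests/performance/profiling_tests.py; the build function and the document size are assumptions.

```python
import cProfile
import io
import pstats
import xml.etree.ElementTree as ET


def build(data_count):
    """Stand-in workload: create data_count elements and serialise them."""
    root = ET.Element('root')
    for i in range(data_count):
        ET.SubElement(root, 'item').text = str(i)
    return ET.tostring(root, encoding='utf-8')


# Collect profiling data only while the workload runs.
profiler = cProfile.Profile()
profiler.enable()
build(1000)
profiler.disable()

# Write the five entries with the highest cumulative time to a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

Where timeit answers how long something takes overall, the profile shows which calls account for that time.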
Results
These timeit tests show that the overhead of using ECoXiPy is not large and
that the differences between the APIs depend on the Python platform used,
with surprising results using PyPy. If encoded strings are wanted as
output, ecoxipy.string_output is a viable alternative to using xml.sax or
xml.etree. The full results are available; see the graph: