IvIePhisto/ECoXiPy


ECoXiPy - Easy Creation of XML in Python

This Python 2 and 3 project (tested with CPython 2.7 and 3.3 as well as PyPy 2) allows for easy creation of XML. The hierarchical structure of XML is easy to spot and the code to create XML is much shorter than using SAX, DOM or similar APIs. There is also functionality to efficiently validate and transform XML while it is being created.
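For comparison, here is a minimal sketch of what building even a tiny document looks like with the standard library's DOM API (xml.dom.minidom). It uses only the standard library, not ECoXiPy; the function name build_with_dom and the sample data are assumptions chosen for illustration, and the sketch targets Python 3.

```python
from xml.dom import minidom


def build_with_dom(title, paragraphs):
    """Build a tiny HTML-like document with the standard DOM API.

    Hypothetical helper for illustration only; it is not part of ECoXiPy.
    """
    doc = minidom.Document()
    html = doc.createElement('html')
    doc.appendChild(html)

    # Every element must be created and attached in separate steps.
    head = doc.createElement('head')
    html.appendChild(head)
    title_element = doc.createElement('title')
    title_element.appendChild(doc.createTextNode(title))
    head.appendChild(title_element)

    body = doc.createElement('body')
    html.appendChild(body)
    for text in paragraphs:
        paragraph = doc.createElement('p')
        paragraph.appendChild(doc.createTextNode(text))
        body.appendChild(paragraph)
    return doc


doc = build_with_dom('Test', ['Hello World & Universe!', 'How are you?'])
# Serialise only the root element, without the XML declaration.
markup = doc.documentElement.toxml()
print(markup)
```

The nesting of the resulting XML is not visible in this code, which is the contrast the Example section below is meant to show.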

This project uses the MIT License, so you may freely use, distribute and modify it, provided you include the content of License.txt.

Getting Started

Install using setuptools:

easy_install ecoxipy

The ECoXiPy Documentation and the source distribution are available from its PyPI entry.

Release History

0.4.0

  • Added: An output implementation may specify a method fragment with one iterable argument, which is used to create a representation for an XML fragment.
  • Added: An output implementation may specify a method preprocess with one argument. If such a method exists, it is called with each content object.
  • Added: Module ecoxipy.html, moved html5, HTML5_ELEMENTS and HTML5_ELEMENT_NAMES from ecoxipy.decorators here (THEY ARE NO LONGER AVAILABLE THERE).
  • Added: The decorator-creator ecoxipy.html.html5_cats makes selected HTML5 categories available for XML creation.
  • Added: The HTML5 template function ecoxipy.html.html5_template.
  • Added: The module ecoxipy.validation provides a base for creating only valid XML.
  • Added: The module ecoxipy.transformation implements an API for transformation of XML. Its class MarkupTransformer works as an ecoxipy.Output implementation modifying XML before creating its output representation.
  • Improved: Preprocessing of content objects.
  • Changed: No longer using abstract classes in ecoxipy.pyxom.
  • Changed: Now classes are also abstract for Python 3 (by using tinkerpy.metaclass).

0.3.1

  • Improved: Performance, through more duck-typing and fewer isinstance checks, and by using collections.deque instead of list for children handling.
  • Fixed: Text handling of ecoxipy.etree_output.ETreeOutput.
  • Added: Performance tests for ecoxipy.etree_output.ETreeOutput.

0.3.0

This is a major release introducing new capabilities and Python 3 support.

  • Added: Support for Python 3.
  • Added: Use | on ecoxipy.MarkupBuilder to create comments.
  • Added: Use slicing on ecoxipy.MarkupBuilder to create documents or processing instructions.
  • Added: The module ecoxipy.parsing contains SAX to ECoXiPy parsing facilities.
  • Changed: Renamed module ecoxipy.element_output to ecoxipy.pyxom.output and moved the XML representation classes to their own module ecoxipy.pyxom, now named PyXOM (Pythonic XML Object Model), with new functionality added. DOM creation was removed from those classes.
  • Changed: All XML data is internally handled as Unicode. An ecoxipy.MarkupBuilder instance converts byte strings from an encoding given on creation (defaults to UTF-8).
  • Changed: XML parsing is now handled by ecoxipy.MarkupBuilder instead of the ecoxipy.Output implementations.
  • Changed: Text node creation is now handled by ecoxipy.MarkupBuilder instead of the ecoxipy.Output implementations.
  • Improved: Unpacking of content while processing is done recursively on iterable and callable content.

0.2.0

  • Added: Use & on ecoxipy.MarkupBuilder to create text nodes.
  • Improved: Better performance of ecoxipy.string_output.StringOutput.

0.1.0

  • Initial release.

Example

Here's a simple HTML5 document template function:

# In the function it is applied to, the "html5" decorator creates the
# variable "_b", an instance of "ecoxipy.MarkupBuilder" that uses
# "ecoxipy.string_output.StringOutput" for XML creation. For each HTML5
# element it also creates a variable holding a method of "_b"; the element,
# the variable and the method all share the same name.

from ecoxipy.html import html5

@html5
def create_testdoc(title, subtitle, *content):
    # Slicing without start argument on the builder creates a document, the
    # stop argument defines the document type declaration and the step
    # argument defines if the XML declaration should be omitted.
    return _b[:'html':True](
        # Method calls on a MarkupBuilder instance create elements with the
        # name equal to the method name.
        html(
            # Child dictionary entries become attributes, especially useful
            # for non-identifier attribute names:
            {'data-info': 'Created by Ecoxipy'},
            head(
                _b.title(
                    # Children which are not of the XML representation
                    # and are either "str" or "unicode" instances or are
                    # neither iterables, generators nor callables, become text
                    # nodes:
                    title
                )
            ),
            body(
                article(
                    # Child iterables and generators are unpacked
                    # automatically:
                    [h1(title), h2(subtitle)],          # Iterable
                    (p(item) for item in content),      # Generator

                    # Child callables are executed:
                    hr,

                    # Calling a MarkupBuilder creates an XML fragment from the
                    # arguments. Here, strings are regarded as raw XML.
                    _b(
                        # Explicitly create text node:
                        _b & '<THE END>',
                        '<footer>Copyright 2013</footer>'       # raw XML
                    )
                )
            ),

            # You can also create comments:
            _b | "This is a comment.",

            # Slicing with a start argument creates a processing instruction:
            _b['pi-target':'PI content.'],

            # Named arguments of element method-calls become attributes:
            xmlns='http://www.w3.org/1999/xhtml/'
        )
    )

It could be used like this:

>>> create_testdoc('Test', 'A Simple Test Document', 'Hello World & Universe!', 'How are you?')
b'<!DOCTYPE html><html data-info="Created by Ecoxipy" xmlns="http://www.w3.org/1999/xhtml/"><head><title>Test</title></head><body><article><h1>Test</h1><h2>A Simple Test Document</h2><p>Hello World &amp; Universe!</p><p>How are you?</p><hr/>&lt;THE END&gt;<footer>Copyright 2013</footer></article></body><!--This is a comment.--><?pi-target PI content.?></html>'

Pretty-printing the result yields the following HTML:

<!DOCTYPE html>
<html data-info="Created by Ecoxipy" xmlns="http://www.w3.org/1999/xhtml/">
    <head>
        <title>Test</title>
    </head>
    <body>
        <article>
            <h1>Test</h1>
            <h2>A Simple Test Document</h2>
            <p>Hello World &amp; Universe!</p>
            <p>How are you?</p>
            <hr/>
            &lt;THE END&gt;
            <footer>Copyright 2013</footer>
        </article>
    </body>
    <!--This is a comment.-->
    <?pi-target PI content.?>
</html>
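The builder syntax used in the example (attribute access for elements, & for text, | for comments, slicing for processing instructions, recursive unpacking of iterables and callables) maps onto ordinary Python special methods. The following is a toy sketch of that idea for Python 3, using only the standard library. The names ToyBuilder and Markup are hypothetical, and this is not ECoXiPy's implementation: it only builds strings, sorts attributes alphabetically, and omits documents, raw XML parsing and output implementations.

```python
from xml.sax.saxutils import escape, quoteattr


class Markup(str):
    """Hypothetical marker type: a string that already is serialised markup."""


class ToyBuilder(object):
    """Hypothetical miniature builder; NOT how ECoXiPy is implemented."""

    def __getattr__(self, name):
        # Unknown attributes become element factories named like the element.
        def element(*children, **attributes):
            attributes = dict(attributes)
            content = []
            for child in self._unpack(children):
                if isinstance(child, dict):
                    attributes.update(child)      # dictionaries become attributes
                elif isinstance(child, Markup):
                    content.append(child)         # markup is inserted as-is
                else:
                    content.append(escape(child)) # plain strings become text
            attrs = ''.join(' %s=%s' % (key, quoteattr(value))
                            for key, value in sorted(attributes.items()))
            if not content:
                return Markup('<%s%s/>' % (name, attrs))
            return Markup('<%s%s>%s</%s>' % (name, attrs, ''.join(content), name))
        return element

    def _unpack(self, items):
        # Recursively resolve callables and iterables into single children.
        for item in items:
            if callable(item):
                yield from self._unpack([item()])
            elif isinstance(item, (str, dict)):
                yield item
            elif hasattr(item, '__iter__'):
                yield from self._unpack(item)
            else:
                yield str(item)

    def __and__(self, text):
        # builder & 'text' creates an escaped text node.
        return Markup(escape(text))

    def __or__(self, text):
        # builder | 'text' creates a comment.
        return Markup('<!--%s-->' % text)

    def __getitem__(self, key):
        # builder['target':'content'] creates a processing instruction.
        return Markup('<?%s %s?>' % (key.start, key.stop))


b = ToyBuilder()
result = b.article(
    {'data-info': 'toy'},
    [b.h1('Title'), (b.p(text) for text in ['a & b'])],
    b.hr,
    b & '<THE END>',
    b | 'A comment.',
    b['pi-target':'PI content.'],
    id='main',
)
print(result)
```

Returning a str subclass for finished markup is one simple way to tell "already XML" apart from "text that still needs escaping"; a real library would more likely use dedicated node objects.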

Development

Install egg for development:

python setup.py develop

This also installs TinkerPy. If you skip this step, you must install TinkerPy manually.

Common Tasks

Build documentation with Sphinx (which of course must be installed):

python setup.py build_sphinx

Execute unit tests:

python setup.py test

Performance Tests

Setup

The same XHTML5 document is created with different APIs. All output implementations of ECoXiPy (in ecoxipy.string_output, ecoxipy.dom_output, ecoxipy.pyxom.output and ecoxipy.etree_output) are tested, as well as xml.sax, xml.dom.minidom and xml.etree.ElementTree. For each API, one test creates its native representation and one test transforms this into a UTF-8 encoded byte string, as most XML will ultimately be serialised in this form. The SAX and ecoxipy.string_output tests create byte strings in both test types.

Running

To run the timeit tests execute in a terminal from the project's root directory:

python -m tests.performance.timeit_tests <string output> <repetitions> <data count> [<CSV output path>]

Use no arguments to get help.

To run a batch of tests with CPython 2.7, CPython 3.3 and PyPy, once to create native structures and once to create byte strings, writing the results to the file timeit.csv, execute the Bash script run_timeit_tests.

Running cProfile tests:

python tests/performance/profiling_tests.py
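As a rough idea of what a cProfile-based test does, the following sketch profiles a small document-creating function with the standard library only. It is an assumption-laden illustration for Python 3, not the content of profiling_tests.py; the function create_document and its argument are hypothetical.

```python
import cProfile
import io
import pstats
import xml.etree.ElementTree as ET


def create_document(data_count):
    """Hypothetical workload: build a document and serialise it."""
    root = ET.Element('html')
    body = ET.SubElement(root, 'body')
    for index in range(data_count):
        ET.SubElement(body, 'p').text = 'Item %d' % index
    return ET.tostring(root, encoding='utf-8')


profiler = cProfile.Profile()
# Run the workload under the profiler and keep its return value.
document = profiler.runcall(create_document, 1000)

# Write the five most expensive entries (by cumulative time) to a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```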

Results

These timeit tests show that the overhead of using ECoXiPy is modest and that the differences between the APIs depend on the Python platform used, with surprising results on PyPy. If encoded strings are wanted as output, ecoxipy.string_output is a viable alternative to xml.sax or xml.etree. The full results are available; the following graph summarises them:

Performance Testing Results Graph