manage collections of SAX processors


README for XML-SAX-Machines

XML::SAX::Machines is a collection of APIs that allow complex SAX machines
to be constructed without a huge amount of extra typing.

This distribution contains three kinds of modules: machines, helpers, and
filters.  Here's how they are laid out:

- XML::SAX::* contains machines and helpers.
    - XML::SAX::Machines lets you import the "classic" constructor
      functions like Tap(), Pipeline(), Manifold(), and ByRecord().
    - Each machine type has a class that implements it, like
      XML::SAX::Tap, XML::SAX::Pipeline, etc.
    - There is currently only one available helper,
      XML::SAX::EventMethodMaker, which is most useful for building a
      collection of methods to handle different events in the same way,
      without having to know all of their names.  It is also useful as a
      reference for all of the SAX events by looking at the source code,
      which contains simple tables of what events occur for what kind of
      handler (compiled by Robin Berjon).

- XML::Filter::* contains filters that are used by ByRecord and Manifold
  machines to handle SAX events (machines don't handle SAX events, they
  delegate to the generators/filters/handlers they contain).
    - XML::Filter::DocSplitter - Splits one doc into multiple
      documents, optionally coordinating with an aggregator like
      XML::Filter::Merger to reassemble them.  ByRecord uses this.
    - XML::Filter::Distributor - buffers a document and reemits it to
      each handler in turn. Used by Manifold.
    - XML::Filter::Tee - a dynamically reconfigurable tee fitting.  Does
      not buffer.  Used by Tap.  Morally equivalent to
      XML::Filter::SAXT but more flexible.
    - XML::Filter::Merger - collects multiple documents and merges them,
      inserting all secondary documents into one master document.
      Used by both ByRecord and Manifold.

All of the XML::Filter::* classes are useful outside of the machines
that use them.  For instance, XML::Filter::DocSplitter has been used
(not by me) in a Pipeline to split a huge record-oriented file into
individual files containing single records (using a custom class derived
from XML::SAX::Writer).  XML::Filter::Merger is useful as a general way
to implement <XInclude> style processing when XInclude is not a good

See the examples/ directory for, well, examples.  Feel free to write up
creative examples; eventually I'd like to compile a cookbook.

To give a more concrete idea of how SAX machines are typically used,
here's how to build a pipeline of SAX processors:

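    use XML::SAX::Machines qw( Pipeline );
    use My::SAX::Filter2;

    my $output;    # the generated XML accumulates here

    my $p = Pipeline(
        "My::SAX::Filter1",              # class name: loaded and built for you
        My::SAX::Filter2->new( ... ),    # an already-built filter instance
        \$output,                        # scalar ref: a writer is created for it
    );

    $p->parse_uri( $ARGV[0] );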

That loads (if need be) XML::SAX::Writer and calls its new() function
with an Output => \$output option, takes the passed-in instance of
My::SAX::Filter2 and calls its set_handler() method to point it to the
XML::SAX::Writer that was just created, and then loads (if need be)
My::SAX::Filter1 and calls its new() function with a Handler => option
pointing to the My::SAX::Filter2 instance.
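
For comparison, the wiring described above can be sketched by hand,
without a machine.  This is only an illustration, not the code Pipeline
actually runs.  The two filter packages are hypothetical pass-through
stand-ins built on XML::SAX::Base.  The parser from
XML::SAX::ParserFactory is an assumption about what feeds the first
filter, and parse_string() is used in place of parse_uri() so that the
sketch needs no input file.

```perl
use strict;
use warnings;

use XML::SAX::ParserFactory;
use XML::SAX::Writer;

# Hypothetical stand-ins for My::SAX::Filter1 and My::SAX::Filter2.
# They inherit from XML::SAX::Base, so every event is simply forwarded
# to whatever handler is set.
{
    package My::SAX::Filter1;
    use base 'XML::SAX::Base';
}
{
    package My::SAX::Filter2;
    use base 'XML::SAX::Base';
}

my $output = '';

# 1. The writer at the end of the chain collects the XML in $output.
my $writer = XML::SAX::Writer->new( Output => \$output );

# 2. The pre-built Filter2 instance is pointed at the writer.
my $filter2 = My::SAX::Filter2->new;
$filter2->set_handler( $writer );

# 3. Filter1 is constructed with Filter2 as its handler.
my $filter1 = My::SAX::Filter1->new( Handler => $filter2 );

# A parser feeds the head of the chain.  parse_string() keeps the sketch
# self-contained; parse_uri( $ARGV[0] ) would read a file instead.
my $parser = XML::SAX::ParserFactory->parser( Handler => $filter1 );
$parser->parse_string( '<doc><item>1</item></doc>' );

print $output, "\n";
```

The Pipeline() call replaces all of this with a single list of stages,
which is the "extra typing" the machines are meant to save.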