Skip to content


Subversion checkout URL

You can clone with
Download ZIP
A javascript object extractor that parses XML using SAX and an XPATH like specification.
Branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.

The idea behind saxtract is to use a combination of SAX parsing and XPath data extraction. This means you do not need to load the entire DOM to leverage the simplicity of XPath. Saxtract uses a spec object to define the data to extract during parsing. For example:

var spec = {
    '/root/@id': 'id'

Says to take whatever matches the xpath /root/@id and store it in the result object under the key id. So if you were to parse this XML:

var xml = "<root id='abc' />";


var result = saxtract.parse(xml, spec);

Your result would look like this (using JSON.stringify):


A more real world example pulled directly from the unit tests (test/saxtract_test.js) shows:

    var saxtract = require("../saxtract"),
        assert = require("assert"),
        expected = {
            id: '5293',
            name: 'Robert Ludlum',
            link: ''
        result = saxtract.parse(
            "<?xml version='1.0' encoding='UTF-8'?>\n" +
            "<GoodreadsResponse>\n" +
            "  <Request>\n" + 
            "    <authentication>true</authentication>\n" +
            "    <key><![CDATA[API_KEY_GOES_HERE]]></key>\n" +
            "    <method><![CDATA[api_author_link]]></method>\n" +
            "  </Request>\n" +
            "  <author id='5293'>\n" +
            "    <name><![CDATA[Robert Ludlum]]></name>\n" +
            "    <link>;utm_source=author_link</link>\n" +
            "  </author>\n" +
                '/GoodreadsResponse/author/@id': 'id',
                '/GoodreadsResponse/author/name': 'name',
                '/GoodreadsResponse/author/link': 'link',

    console.log("*****expected*****\n" + JSON.stringify(expected, null, 2) + 
        "\n*****actual*****\n" + JSON.stringify(result, null, 2) + 

    assert.deepEqual(expected, result);

I will add to this as I have time, but if you are actually interested, you can look at test/saxtract_test.js which has the most up to date examples.

Something went wrong with that request. Please try again.