saxtract

Require

Saxtract now supports both libxmljs and saxjs. You must include one or the other in your package.json as neither is a dependency of saxtract. You indicate which one you desire to use thusly:

var saxtract = require("saxtract").libxmljs;

// or

var saxtract = require("saxtract").saxjs;

General Usage

The idea behind saxtract is to use a combination of SAX parsing and XPath data extraction. This means you do not need to load the entire DOM to leverage the simplicity of XPath. Saxtract uses a spec object to define the data to extract during parsing. For example:

var spec = {
    '/root/@id': 'id'
};

Says to take whatever matches the xpath /root/@id and store it in the result object under the key id. So if you were to parse this XML:

var xml = "<root id='abc' />";

Thusly:

var result = saxtract.parse(xml, spec);

Your result would look like this (using JSON.stringify):

{'id':'abc'}

A more real world example pulled directly from the unit tests (test/saxtract_test.js) shows:

    var saxtract = require("../saxtract").saxjs,
        assert = require("assert"),
        expected = {
            id: '5293',
            name: 'Robert Ludlum',
            link: 'http://www.goodreads.com/author/show/5293.Robert_Ludlum?utm_medium=api&utm_source=author_link'
        },
        result = saxtract.parse(
            "<?xml version='1.0' encoding='UTF-8'?>\n" +
            "<GoodreadsResponse>\n" +
            "  <Request>\n" + 
            "    <authentication>true</authentication>\n" +
            "    <key><![CDATA[API_KEY_GOES_HERE]]></key>\n" +
            "    <method><![CDATA[api_author_link]]></method>\n" +
            "  </Request>\n" +
            "  <author id='5293'>\n" +
            "    <name><![CDATA[Robert Ludlum]]></name>\n" +
            "    <link>http://www.goodreads.com/author/show/5293.Robert_Ludlum?utm_medium=api&amp;utm_source=author_link</link>\n" +
            "  </author>\n" +
            "</GoodreadsResponse>", 
            {
                '/GoodreadsResponse/author/@id': 'id',
                '/GoodreadsResponse/author/name': 'name',
                '/GoodreadsResponse/author/link': 'link',
            });
    
    console.log("*****expected*****\n" + JSON.stringify(expected, null, 2) + 
        "\n*****actual*****\n" + JSON.stringify(result, null, 2) + 
        "\n*****end*****");
    
    assert.deepEqual(expected, result);

I will add to this as I have time, but if you are actually interested, you can look at test/saxtract_test.js which has the most up to date examples.

Options

Logging

Logging can be turned on using:

    var saxtract = require("saxtract").saxjs;
    saxtract.logging = true;

    ...

Whitespace Preservation

Whitespace preservation can be enabled globally using:

    var saxtract = require("saxtract").saxjs;
    saxtract.preserveWhitespace = true;

    ...

or per call to parse:

    var saxtract = require("saxtract").saxjs;
    saxtract.parse(xml, spec, preserveWhitespace);

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
common		common
lib		lib
test		test
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
README.md		README.md
package.json		package.json
saxtract.js		saxtract.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

common

common

lib

lib

test

test

.gitignore

.gitignore

CHANGELOG.md

CHANGELOG.md

README.md

README.md

package.json

package.json

saxtract.js

saxtract.js

Repository files navigation

saxtract

Require

General Usage

Options

Logging

Whitespace Preservation

About

Releases

Packages

Languages

lucastheisen/saxtract

Folders and files

Latest commit

History

Repository files navigation

saxtract

Require

General Usage

Options

Logging

Whitespace Preservation

About

Resources

Stars

Watchers

Forks

Languages