/saxen/ parser

A tiny, super fast, namespace aware sax-style XML parser written in plain JavaScript.

Features

  • (optional) entity decoding and attribute parsing
  • (optional) namespace aware
  • element / attribute normalization in namespaced mode
  • tiny (2.6Kb minified + gzipped)
  • pretty damn fast

Usage

var {
  Parser
} = require('saxen');

var parser = new Parser();

// enable namespace parsing: element prefixes will be
// automatically adjusted to the ones configured here;
// elements in other namespaces will still be processed
parser.ns({
  'http://foo': 'foo',
  'http://bar': 'bar'
});

parser.on('openTag', function(elementName, attrGetter, decodeEntities, selfClosing, getContext) {

  elementName;
  // with prefix, i.e. foo:blub

  var attrs = attrGetter();
  // { 'bar:aa': 'A', ... }
});

parser.parse('<blub xmlns="http://foo" xmlns:bar="http://bar" bar:aa="A" />');
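
As a rough sketch of how several hooks can work together, the helper below builds a nested tree from openTag, closeTag and text events. `buildTree` and the node shape (`name`, `attrs`, `children`, `text`) are hypothetical and not part of /saxen/. The sketch assumes that closeTag also fires for self-closing elements, which the `selfClosing` argument of that hook suggests but which is not spelled out here.

```javascript
// Hypothetical helper, not part of saxen: builds a nested tree from hook calls.
// `parser` only needs an `on(hookName, callback)` method.
function buildTree(parser) {
  // synthetic root that holds the top-level element(s)
  var root = { name: '#document', attrs: {}, children: [] };
  var stack = [ root ];

  parser.on('openTag', function(elementName, attrGetter) {
    // attributes are only parsed when attrGetter() is called
    var node = { name: elementName, attrs: attrGetter(), children: [] };

    stack[stack.length - 1].children.push(node);
    stack.push(node);
  });

  parser.on('closeTag', function() {
    // assumption: closeTag fires once per opened element;
    // never pop the synthetic root
    if (stack.length > 1) {
      stack.pop();
    }
  });

  parser.on('text', function(value) {
    // raw text, entities are left undecoded in this sketch
    stack[stack.length - 1].children.push({ text: value });
  });

  return root;
}
```

With a real parser, call `buildTree(parser)` before `parser.parse(xml)` and read the returned root once parsing has finished.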

Supported Hooks

We support the following parse hooks:

  • openTag(elementName, attrGetter, decodeEntities, selfClosing, contextGetter)
  • closeTag(elementName, decodeEntities, selfClosing, contextGetter)
  • error(err, contextGetter)
  • warn(warning, contextGetter)
  • text(value, decodeEntities, contextGetter)
  • cdata(value, contextGetter)
  • comment(value, decodeEntities, contextGetter)
  • attention(str, decodeEntities, contextGetter)
  • question(str, contextGetter)

In contrast to error, warn receives recoverable errors, such as malformed attributes.
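
To illustrate the split between the two hooks, here is a minimal sketch that records warnings and errors in separate lists. `collectDiagnostics` is a hypothetical helper. The value returned by `contextGetter()` is stored as-is because its shape is not described here, and what the parser does after the error hook returns is outside the scope of this sketch.

```javascript
// Hypothetical helper, not part of saxen: keeps recoverable and
// non-recoverable problems apart so callers can decide how strict to be.
function collectDiagnostics(parser) {
  var diagnostics = { warnings: [], errors: [] };

  parser.on('warn', function(warning, contextGetter) {
    // recoverable problem, e.g. a malformed attribute
    diagnostics.warnings.push({ problem: warning, context: contextGetter() });
  });

  parser.on('error', function(err, contextGetter) {
    // non-recoverable problem
    diagnostics.errors.push({ problem: err, context: contextGetter() });
  });

  return diagnostics;
}
```

A caller could, for example, accept a document with warnings but reject it as soon as `diagnostics.errors` is non-empty.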

In proxy mode, openTag and closeTag receive a view of the current element in place of the raw element name. In addition, element attributes are not passed as a getter to openTag. Instead, they are exposed via element.attrs:

  • openTag(element, decodeEntities, selfClosing, contextGetter)
  • closeTag(element, selfClosing, contextGetter)

Namespace Handling

In namespace mode, the parser will adjust tag and attribute namespace prefixes before passing the element's name to openTag or closeTag. To do that, you need to configure default prefixes for well-known namespaces:

parser.ns({
  'http://foo': 'foo',
  'http://bar': 'bar'
});

To skip the adjustment and still process namespace information:

parser.ns();
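
Because names reach the hooks with their prefix attached (for example foo:blub), a handler that wants to dispatch on the namespace has to split the name itself. The following is a minimal sketch of that step; `splitName` is a hypothetical helper and not part of /saxen/.

```javascript
// Hypothetical helper, not part of saxen: splits a prefixed name
// such as 'foo:blub' into its prefix and local part.
function splitName(name) {
  var idx = name.indexOf(':');

  if (idx === -1) {
    // no prefix present
    return { prefix: null, localName: name };
  }

  return {
    prefix: name.slice(0, idx),
    localName: name.slice(idx + 1)
  };
}
```

Inside an openTag hook, `splitName(elementName).prefix` can then be compared against the prefixes configured via `parser.ns({ ... })`.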

Proxy Mode

In this mode, the first argument passed to openTag and closeTag is an object that exposes more internal XML parse state. This needs to be explicitly enabled by instantiating the parser with { proxy: true }.

// instantiate parser with proxy=true
var parser = new Parser({ proxy: true });

parser.ns({
  'http://foo-ns': 'foo'
});

parser.on('openTag', function(el, decodeEntities, selfClosing, getContext) {
  el.originalName; // root
  el.name; // foo:root
  el.attrs; // { 'xmlns:foo': ..., id: '1' }
  el.ns; // { xmlns: 'foo', foo: 'foo', foo$uri: 'http://foo-ns' }
});

parser.parse('<root xmlns:foo="http://foo-ns" id="1" />');

Proxy mode comes with a performance penalty of roughly five percent.

Caution! For performance reasons the exposed element is a simple view into the current parser state. Because of that, it changes as the parser advances and cannot be cached. If you would like to retain a persistent copy of the values, create a shallow clone:

parser.on('openTag', function(el) {
  var copy = Object.assign({}, el);
  // copy, ready to keep around
});
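
If the nested `attrs` and `ns` objects should survive as well, a slightly more defensive copy can be sketched as follows. `snapshot` is a hypothetical helper. Whether the parser reuses those nested objects between elements is not stated here, so the extra copies are a precaution rather than a documented requirement.

```javascript
// Hypothetical helper, not part of saxen: copies the element view together
// with its nested attrs and ns objects, so later changes to the view or to
// those objects do not affect the copy.
function snapshot(el) {
  var copy = Object.assign({}, el);

  // assumption: attrs and ns are plain objects with primitive values
  copy.attrs = Object.assign({}, el.attrs);
  copy.ns = Object.assign({}, el.ns);

  return copy;
}
```

In an openTag hook this would be used as `elements.push(snapshot(el))`, at the cost of two additional object allocations per element.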

Non-Features

/saxen/ lacks some features known in other XML parsers such as sax-js:

  • no support for parsing loose documents, such as arbitrary HTML snippets
  • no support for text trimming
  • no automatic entity decoding
  • no automatic attribute parsing

...and that is ok ❤.
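
Entity decoding and trimming can still be done by hand where needed. The sketch below wraps both steps in a text hook using the `decodeEntities` function that the text hook receives as its second argument. `makeTextCollector` and `sink` are hypothetical names, and dropping whitespace-only text is a choice made for this example only.

```javascript
// Hypothetical helper, not part of saxen: creates a text hook that decodes
// entities and trims whitespace, then stores non-empty results in `sink`.
function makeTextCollector(sink) {
  return function(value, decodeEntities) {
    // decode on demand, then trim manually
    var text = decodeEntities(value).trim();

    // skip whitespace-only text, e.g. indentation between elements
    if (text) {
      sink.push(text);
    }
  };
}
```

It would be registered as `parser.on('text', makeTextCollector(texts))`, with `texts` being an array owned by the caller.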

Credits

We build on the awesome work done by easysax.

/saxen/ is named after Sachsen, a federal state of Germany. So geht sächsisch!

LICENSE

MIT
