My JavaScript parser
Latest commit c99240c May 26, 2013 Peter van der Zee Version bump


This is a JavaScript parser.
(c) Peter van der Zee


The Tokenizer is used by the parser. The parser tells the tokenizer whether the next token may be a regular expression or not. Without the parser, the tokenizer will fail if regular expression literals are used in the input.
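
To illustrate why the tokenizer needs that hint, here is a minimal sketch that is independent of ZeParser's own code. The helper name slashStartsRegex and the token shape ({type, value}) are hypothetical. The rule is only a rough heuristic: the real parser knows from its position in the grammar whether a regular expression is allowed, which is exactly what a standalone tokenizer lacks. For example, this heuristic gets "if (x) /re/.test(y)" and "return /re/" wrong.

```javascript
// Hypothetical helper, not part of ZeParser: guess whether a "/" starts a
// regular expression literal by looking at the previous non-whitespace token.
function slashStartsRegex(prev) {
	if (!prev) return true; // start of input: only an expression can follow
	// after a value (identifier, number, string) a "/" is treated as a division
	if (prev.type === 'identifier' || prev.type === 'number' || prev.type === 'string') return false;
	// a closing bracket usually ends a value as well
	if (prev.value === ')' || prev.value === ']') return false;
	// after an operator or an opening punctuator an expression is expected
	return true;
}

var afterValue = slashStartsRegex({type: 'identifier', value: 'a'}); // as in: a / b / c
var afterAssign = slashStartsRegex({type: 'punctuator', value: '='}); // as in: x = /b/g
```

The same three characters "/b/" are a division in the first case and a regular expression in the second, so the decision cannot be made from the characters alone.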


Parsing returns a "parse tree": nested arrays with tokens (regular objects) as leaves. Meta information is embedded as properties of both the arrays and the tokens.
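
A minimal sketch of what such a structure looks like and how to walk it. The tree below is built by hand and is not real parser output. The token fields, the desc property and the collectTokens helper are assumptions made up for this illustration.

```javascript
// Hand-built stand-in for a parse tree: nested arrays with token objects as leaves.
var tree = [
	[{value: 'foo'}, [{value: '('}, {value: ')'}]],
	[{value: ';'}]
];
tree[0].desc = 'expression'; // meta information is just a property on the array

// Collect all leaf tokens in source order by walking the arrays depth-first.
function collectTokens(node, out) {
	out = out || [];
	if (Array.isArray(node)) {
		node.forEach(function(child){ collectTokens(child, out); });
	} else {
		out.push(node); // anything that is not an array is a token
	}
	return out;
}

// Joining the token values in order gives back the source text.
var source = collectTokens(tree).map(function(t){ return t.value; }).join('');
```

Because the meta information lives on the arrays themselves, it does not get in the way of plain array iteration.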


ZeParser.createParser(input) returns a new ZeParser instance which has already parsed the input. Among others, the ZeParser instance will have the properties .tree, .wtree and .btree.

.tree is the parse tree mentioned above.
.wtree ("white" tree) is a regular array with all the tokens encountered (including whitespace, line terminators and comments).
.btree ("black" tree) is just like .wtree but without the whitespace, line terminators and comments. This is what the specification would call the "token stream".

I'm aware that the naming convention is a bit awkward. It's a tradeoff between short and descriptive. The streams are used quite often in the analysis.

Tokens are regular objects with several properties. Among them are .tokposw and .tokposb, which hold the token's own position in the .wtree and the .btree, respectively.
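
A minimal sketch of how the two streams and the position properties relate. The tokens are built by hand, and the isWhite flag is a hypothetical stand-in for however the tokenizer actually marks whitespace, line terminators and comments. In this sketch white tokens simply get no black position. Whether ZeParser does the same is not something this example claims.

```javascript
// Hand-built white stream for the input "a = 1".
var wtree = [
	{value: 'a'},
	{value: ' ', isWhite: true},
	{value: '='},
	{value: ' ', isWhite: true},
	{value: '1'}
];

// Derive the black stream and record each token's position in both streams.
var btree = [];
wtree.forEach(function(token, i){
	token.tokposw = i; // position in the white stream
	if (!token.isWhite) {
		token.tokposb = btree.length; // position in the black stream
		btree.push(token);
	}
});
```

Both streams hold the same token objects, so you can go from a token in one stream to its neighbours in the other through these positions.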

The parser has two modes for parsing: simple and extended. Simple mode is for just parsing and returning the streams and a simple parse tree. It carries little meta information and is mainly built for speed. Extended mode has everything required for Zeon to do its job. Extended mode is toggled by the instance property .ast, which is true by default :)

Non-factory example:

var input = "foo";
var tree = []; // this should probably be refactored away some day
var tokenizer = new Tokenizer(input); // ditto
var parser = new ZeParser(input, tokenizer, tree);
parser.parse(); // returns tree..., should never throw errors
parser.tokenizer.fixValues(); // makes sure all tokens have a .value property

Highlighting example:

var parser = ZeParser.createParser(textarea.value); // textarea.value:input
parser.tokenizer.fixValues(); // makes sure all tokens have a .value property
var wtree = parser.tokenizer.wtree; // all the tokens ("token stream", including whitespace)
textarea.className = '';
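var tokenstrings = wtree.map(function(t){
	// NOTE: parts of this snippet were lost in this copy. The property t.name and the
	// type number 13 are assumptions; the type number 14 (error token) is original.
	if (t.name == 14) textarea.className = 'error';
	return '<span class="t'+t.name+'">'+(t.name==13?'\u29e6':(t.name==14?'\u292c':t.value)).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;')+'</span>';
});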
// the string that would contain highlighted code
// tokenstrings.join('');