mattbierner edited this page Dec 9, 2014 · 24 revisions

Basic Parsers

parse.always(x)

Parser that always succeeds with x.

var p = parse.always('3');

parse.run(p, ''); // 3
parse.run(p, 'abc'); // 3

parse.never(x)

Parser that always fails with x.

parse.run(parse.never(5), 'abc'); // throws 5
parse.run(parse.never('Error'), 'abc'); // throws 'Error'

parse.bind(p, f)

Parser that parses 'p', passing the result to function 'f' which returns a parser to continue the computation.

'f' is passed three arguments: the value from 'p', the state from 'p' and the memotable from 'p'

var p = parse.bind(parse.always(1), \x -> parse.always(x + 1));
parse.run(p, ''); // 2
    
var err = parse.bind(parse.never(), \x -> parse.always(x + 1));
parse.run(err, ''); // throws UnknownError, 'f' never called.
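
To make the data flow of bind concrete, here is a minimal plain-JavaScript sketch. It is a toy model and not Bennu's implementation: a parser is assumed to be a function from an input and an index to a result object, the state and memo table arguments are left out, and `always`, `never`, `bind`, and `run` are local stand-ins rather than the library functions.

```javascript
// Toy model: a parser is (input, pos) => { ok, value, pos }.
// These helpers are local stand-ins, not Bennu's API.
const always = (x) => (input, pos) => ({ ok: true, value: x, pos });
const never = (e) => (input, pos) => ({ ok: false, value: e, pos });

// Run p; on success feed its value to f and run the parser f returns.
const bind = (p, f) => (input, pos) => {
  const r = p(input, pos);
  return r.ok ? f(r.value)(input, r.pos) : r; // on failure f is never called
};

// Return the success value, throw the failure value.
const run = (p, input) => {
  const r = p(input, 0);
  if (!r.ok) throw r.value;
  return r.value;
};

let calls = 0;
const inc = (x) => { calls += 1; return always(x + 1); };

console.log(run(bind(always(1), inc), '')); // 2
try {
  run(bind(never('Error'), inc), '');
} catch (e) {
  console.log(e, calls); // Error 1
}
```

The counter shows that `inc` ran once for the successful parser and not at all for the failing one.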

parse.fail([msg])

Parser that always fails with an error.

If 'msg' is not specified, fails with 'UnknownError'. Otherwise, fails with 'ParseError' with message 'msg'.

parse.run(parse.fail(), 'abc'); // throws UnknownError
parse.run(parse.fail('Error'), 'abc'); // throws ParseError with error message "Error"

Sequencing

parse.next(p, q)

Consumes 'p' then 'q'. Returns result from 'q'

var p = parse.next(
    parse_text.character('a'),
    parse_text.character('b'));

parse.run(p, 'ab'); // b 
parse.run(p, 'abc'); // b
parse.run(p, 'a'); // Error!
parse.run(p, 'aab'); // Error!
parse.run(p, 'ba'); // Error!

// If `p` fails, `q` is never run.
var p = parse.next(
    parse.never(),
    parse_text.character('b'));

parse.run(p, 'b'); // Error
parse.run(p, ''); // Error

parse.sequence(...parsers)

Parser that consumes each element of parsers in order, returning the last result.

var p = parse.sequence(
    parse_text.character('a'),
    parse_text.character('b'),
    parse_text.character('c'));

parse.run(p, 'a'); // Error, expected b
parse.run(p, 'ab'); // Error expected c
parse.run(p, 'abc'); // c
parse.run(p, 'abcd'); // c

parse.sequencea(parsers)

Same as parse.sequence but gets parsers from the array 'parsers'.

var p = parse.sequencea(['a', 'b', 'c'].map(parse_text.character));

parse.run(p, 'a'); // Error, expected b
parse.run(p, 'ab'); // Error expected c
parse.run(p, 'abc'); // c

parse.sequences(parsers)

Same as parse.sequence but gets parsers from a stream of parsers. This allows parsing with potentially infinite, lazy sequences of parsers.

var tenA = parse.sequences(
    stream.repeat(10, parse_text.character('a')));

parse.run(tenA, 'aaaaaaaaaaaa'); // a

Choice

parse.either(p, q)

Parser that succeeds with either 'p' or 'q'. Attempts 'p' first and if 'p' fails attempts 'q'

If both fail, fails with a MultipleError with errors from 'p' and 'q'.

var p = parse.either(
    parse_text.character('a'),
    parse_text.character('b'));

parse.run(p, 'a'); // a 
parse.run(p, 'b'); // b
parse.run(p, 'c'); // Error! MultipleError

// Either does not automatically backtrack, use 'parse.attempt'
var p = parse.either(
    parse.next(
        parse_text.character('a'),
        parse_text.character('b')),
    parse.next(
        parse_text.character('a'),
        parse_text.character('c')));
parse.run(p, 'ab'); // b
parse.run(p, 'ac'); // Error! First parser succeeded on 'a' then failed on 'c'.
// Returned error is the error from the first parser

parse.attempt(p)

Parser that attempts to parse 'p' and can backtrack if needed.

Useful with either parsers

// Modified example from 'parse.either'
var p = parse.either(
    parse.attempt(parse.next(
        parse_text.character('a'),
        parse_text.character('b'))),
    parse.next(
        parse_text.character('a'),
        parse_text.character('c')));

parse.run(p, 'ab'); // b
parse.run(p, 'ac'); // c
parse.run(p, 'z'); // Error! MultipleError 
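
The interaction between either and attempt can be sketched with a small plain-JavaScript model. This is an illustration under assumptions and not Bennu's implementation: each result carries a hypothetical `consumed` flag, `either` only tries its second parser when the first failed without consuming input, and `attempt` rewrites a failure so that it looks like nothing was consumed. Error merging into a MultipleError is left out.

```javascript
// Toy model, not Bennu's implementation:
// a parser is (input, pos) => { ok, value, pos, consumed }.
const character = (c) => (input, pos) =>
  input[pos] === c
    ? { ok: true, value: c, pos: pos + 1, consumed: true }
    : { ok: false, value: `expected ${c}`, pos, consumed: false };

// Run p then q; the pair has consumed input if either half did.
const next = (p, q) => (input, pos) => {
  const a = p(input, pos);
  if (!a.ok) return a;
  const b = q(input, a.pos);
  return { ...b, consumed: a.consumed || b.consumed };
};

// Try q only when p failed WITHOUT consuming input.
const either = (p, q) => (input, pos) => {
  const a = p(input, pos);
  return a.ok || a.consumed ? a : q(input, pos);
};

// Turn a failure after consumption into a failure without consumption.
const attempt = (p) => (input, pos) => {
  const a = p(input, pos);
  return a.ok ? a : { ...a, pos, consumed: false };
};

const ab = next(character('a'), character('b'));
const ac = next(character('a'), character('c'));

console.log(either(ab, ac)('ac', 0).ok);             // false: 'a' was consumed
console.log(either(attempt(ab), ac)('ac', 0).value); // c
```

Wrapping only the first alternative is enough here, because the second alternative is the last one tried.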

parse.look(p)

Parse p, but don't consume any input if it succeeds. Returns results from p. This is the same as Parsec's lookahead

var p = parse.sequence(
    parse_text.character('a'),
    parse.look(parse_text.character('b')));

parse.run(p, 'ab'); // 'b'
parse.run(p, 'ax'); // Error
parse.run(p, 'a'); // Error

// When look succeeds, it consumes no input
parse.run(parse.next(p, parse.anyToken), 'ab'); // 'b'
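
As a rough illustration of lookahead, the following plain-JavaScript sketch models a parser as a function from input and index to a result object. The model and the helper names are assumptions made for this example and are not Bennu's implementation; the point is only that a successful `look` keeps the value and restores the starting position.

```javascript
// Toy model, not Bennu's implementation:
// a parser is (input, pos) => { ok, value, pos }.
const character = (c) => (input, pos) =>
  input[pos] === c
    ? { ok: true, value: c, pos: pos + 1 }
    : { ok: false, value: `expected ${c}`, pos };

const anyToken = (input, pos) =>
  pos < input.length
    ? { ok: true, value: input[pos], pos: pos + 1 }
    : { ok: false, value: 'unexpected eof', pos };

// Run p then q, keeping q's result.
const next = (p, q) => (input, pos) => {
  const a = p(input, pos);
  return a.ok ? q(input, a.pos) : a;
};

// On success keep p's value but restore the starting position.
const look = (p) => (input, pos) => {
  const r = p(input, pos);
  return r.ok ? { ...r, pos } : r;
};

const p = next(character('a'), look(character('b')));

console.log(p('ab', 0));                     // { ok: true, value: 'b', pos: 1 }
console.log(next(p, anyToken)('ab', 0).pos); // 2
console.log(p('ax', 0).ok);                  // false
```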

parse.choice(...choices)

Attempts a variable number of parsers in order until one succeeds or all fail. Returns the result of the first to succeed.

When all fail, fails with a MultipleError with all errors from 'choices'.

var p = parse.choice(
    parse_text.character('a'),
    parse_text.character('b'),
    parse_text.character('c')); 
parse.run(p, 'a'); // a
parse.run(p, 'b'); // b
parse.run(p, 'c'); // c
parse.run(p, 'z'); // Error! MultipleError

parse.choicea(choices)

Same as choice, but gets choices from array:

var p = parse.choicea(['a', 'b', 'c'].map(parse_text.character));

parse.run(p, 'a'); // a
parse.run(p, 'b'); // b
parse.run(p, 'c'); // c
parse.run(p, 'z'); // Error! MultipleError

parse.optional(default, p)

Parses p once, or returns default if p fails.

var p = parse.optional('def', parse_text.character('b'));
parse.run(p, 'b'); // b
parse.run(p, ''); // 'def'
parse.run(p, 'z'); // 'def'

Enumeration

parse.many(p)

Consumes 'p' zero or more times and succeeds with a Nu stream of results.

var p = parse.many(parse_text.character('a'));
parse.run(p, ''); // empty stream 
parse.run(p, 'z'); // empty stream
parse.run(p, 'a'); // stream of ['a'] 
parse.run(p, 'aaa'); // stream of ['a', 'a', 'a'] 
parse.run(p, 'aabaa'); // stream of ['a', 'a'] 

parse.many1(p)

Parser that consumes 'p' one or more times and succeeds with a Nu stream of results.

var p = parse.many1(parse_text.character('a'));
parse.run(p, ''); // Error!
parse.run(p, 'z'); // Error!
parse.run(p, 'a'); // stream of ['a'] 
parse.run(p, 'aaa'); // stream of ['a', 'a', 'a'] 
parse.run(p, 'aabaa'); // stream of ['a', 'a'] 
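
The difference between many and many1 can be sketched in plain JavaScript. This is a simplified model and not Bennu's implementation: results are collected into arrays instead of Nu streams, backtracking rules are ignored, and the guard against a parser that succeeds without consuming input is an assumption about how such a case could be detected.

```javascript
// Toy model, not Bennu's implementation:
// a parser is (input, pos) => { ok, value, pos }.
const character = (c) => (input, pos) =>
  input[pos] === c
    ? { ok: true, value: c, pos: pos + 1 }
    : { ok: false, value: `expected ${c}`, pos };

// Zero or more: loop until p fails, then succeed with what was collected.
const many = (p) => (input, pos) => {
  const values = [];
  for (let r = p(input, pos); r.ok; r = p(input, pos)) {
    if (r.pos === pos) {
      // Without this guard the loop would never terminate.
      throw new Error('many: parser succeeded without consuming input');
    }
    values.push(r.value);
    pos = r.pos;
  }
  return { ok: true, value: values, pos };
};

// One or more: p must succeed once, then behave like many.
const many1 = (p) => (input, pos) => {
  const first = p(input, pos);
  if (!first.ok) return first;
  const rest = many(p)(input, first.pos);
  return { ok: true, value: [first.value, ...rest.value], pos: rest.pos };
};

const a = character('a');
console.log(many(a)('aabaa', 0).value); // [ 'a', 'a' ]
console.log(many(a)('z', 0).value);     // []
console.log(many1(a)('z', 0).ok);       // false
```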

parse.cons(p, q)

Parser that conses the result of 'p' onto the result of 'q'. 'q' must return a Nu stream.

var p = parse.cons(
   parse_text.character('a'),
   parse.enumeration(
      parse_text.character('b'),
      parse_text.character('c')));

parse.run(p, ''); // Error!
parse.run(p, 'z'); // Error!
parse.run(p, 'ab'); // Error!
parse.run(p, 'abc'); // stream of ['a', 'b', 'c'] 
parse.run(p, 'abcxyz'); // stream of ['a', 'b', 'c'] 

parse.append(p, q)

Parser that joins the result of 'p' onto result of 'q'. Both 'p' and 'q' must return a Nu stream.

var p = parse.append(
   parse.enumeration(
      parse_text.character('a'),
      parse_text.character('b')),
   parse.enumeration(
      parse_text.character('c'),
      parse_text.character('d')));

parse.run(p, ''); // Error!
parse.run(p, 'z'); // Error!
parse.run(p, 'ab'); // Error!
parse.run(p, 'abc'); // Error!
parse.run(p, 'abcd'); // stream of ['a', 'b', 'c', 'd'] 
parse.run(p, 'abcdefg'); // stream of ['a', 'b', 'c', 'd'] 

parse.enumeration(...parsers)

Consume parsers in order, building a stream of results.

var p = parse.enumeration(
    parse_text.character('a'),
    parse_text.character('b'));
parse.run(p, 'ab'); // stream of ['a', 'b'] 
parse.run(p, 'ax'); // Error, expected b found x. No backtracking

parse.enumerationa(parsers)

Same as parse.enumeration but gets parsers from the array 'parsers'.

parse.eager(p)

Parser that flattens results of 'p' to array

var p = parse.eager(parse.many(parse_text.character('a')));
parse.run(p, ''); // []
parse.run(p, 'z'); // []
parse.run(p, 'a'); //  ['a'] 
parse.run(p, 'aaa'); //  ['a', 'a', 'a'] 
parse.run(p, 'aabaa'); //  ['a', 'a'] 

parse.binds(p, f)

Same operation as bind, but 'p' succeeds with a Nu stream and 'f' is called with stream values as arguments.

var seq = parse.enumeration(parse.always(1), parse.always(2));

var p = parse.binds(seq, \x y -> parse.always(x + y));
parse.run(p, ''); // 3

var err = parse.binds(parse.never(), \x, y -> parse.always(x + y));
parse.run(err, ''); // throws UnknownError, 'f' never called.

Tokens

parse.token(consume, [err])

Parser that consumes a single item from the head of input if 'consume' returns true for that item. Fails to consume input if consume is false or input is empty.

When 'consume' succeeds, advances the input.

'err' is called to get the error object when 'consume' fails. Defaults to returning an UnexpectError

var p = parse.token(\x -> x === 'a');
parse.run(p, ''); // Error!
parse.run(p, 'b'); // Error!
parse.run(p, 'a'); // 'a'
parse.run(p, 'abc'); // 'a'

var p = parse.next(
    parse.token(\x -> x === 'a'),
    parse.token(\x -> x === 'b'));
parse.run(p, ''); // Error!
parse.run(p, 'b'); // Error!
parse.run(p, 'a'); // Error!
parse.run(p, 'ab'); // 'b'
parse.run(p, 'abc'); // 'b'

var p = parse.token(\x -> x === 'a', \pos, found -> new parse.ExpectError(pos, 'a', found));
parse.run(p, ''); // Error! ExpectError for expected 'a'
parse.run(p, 'b'); // Error!  ExpectError for expected 'a'
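
The role of the two arguments to token can be sketched in plain JavaScript. This is a toy model and not Bennu's implementation: the result shape, the `defaultErr` factory, and its message format are placeholders chosen for illustration, and plain `Error` objects stand in for the library's error types.

```javascript
// Toy model, not Bennu's implementation:
// a parser is (input, pos) => { ok, value, pos }.
// The default error factory here is a placeholder for illustration.
const defaultErr = (pos, found) =>
  new Error(`Unexpected ${found === undefined ? 'eof' : found} at ${pos}`);

const token = (consume, err = defaultErr) => (input, pos) => {
  const found = input[pos]; // undefined at end of input
  return pos < input.length && consume(found)
    ? { ok: true, value: found, pos: pos + 1 }    // success advances by one item
    : { ok: false, value: err(pos, found), pos }; // failure leaves pos alone
};

const a = token((x) => x === 'a');
console.log(a('abc', 0));             // { ok: true, value: 'a', pos: 1 }
console.log(a('b', 0).value.message); // Unexpected b at 0
console.log(a('', 0).value.message);  // Unexpected eof at 0

// A custom error factory receives the position and the item that was found.
const expectA = token(
  (x) => x === 'a',
  (pos, found) => new Error(`Expected a, found ${found} at ${pos}`));
console.log(expectA('b', 0).value.message); // Expected a, found b at 0
```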

parse.anyToken

Parser that consumes any token.

var p = parse.anyToken;
parse.run(p, ''); // Error! Unexpected eof
parse.run(p, 'b'); // 'b'
parse.run(p, 'a'); // 'a'

State Interaction

parse.getParserState

Succeeds with the current parse state.

parse.run(
     parse.next(
          parse_text.character('a'),
          parse.getParserState),
     'abc'); // returns a ParserState(Position(1), input, ud)

parse.setParserState(s)

Sets the parser state to s. Succeeds with s.

parse.modifyParserState(f)

Modify the state using function f, succeeding with the result and setting the state to be the result.

parse.run(
     parse.modifyParserState(\s ->
           s.setPosition(parse.Position.initial)),
     'abc'); // returns the parserState

parse.extract(f)

Parser that extracts a value from the parser state by calling function 'f' with the state.

parse.getState

Succeeds with user state.

parse.run( parse.getState, "abc", 'user state'); // returns 'user state'

parse.setState(s)

Sets the user state to s. Succeeds with s.

parse.run(
    parse.setState('new user state'),
    "abc",
    'user state'); // returns 'new user state'

parse.modifyState(f)

Modify the user state using function f, succeeding with the result and setting the user state to be the result.

parse.run(
    parse.sequence(
         parse.modifyState(\x -> x + 10),
         parse.modifyState(\x -> x / 2),
         parse.getState),
    "abc",
    0); // returns 5
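
How user state is threaded from one parser to the next can be sketched in plain JavaScript. This is a toy model and not Bennu's implementation: a parser is assumed to be a function from an immutable state record to a value plus a new state record, and the field names `position`, `input`, and `userState` are chosen for this example only.

```javascript
// Toy model, not Bennu's implementation: a parser is
// (state) => { value, state }, where state is an immutable
// { position, input, userState } record.
const getState = (s) => ({ value: s.userState, state: s });

const modifyState = (f) => (s) => {
  const userState = f(s.userState);
  // Succeed with the new user state and install it in a fresh record.
  return { value: userState, state: { ...s, userState } };
};

// Run parsers left to right, threading the state, keeping the last value.
const sequence = (...parsers) => (s) =>
  parsers.reduce((acc, p) => p(acc.state), { value: undefined, state: s });

const p = sequence(
  modifyState((x) => x + 10),
  modifyState((x) => x / 2),
  getState);

const initial = { position: 0, input: 'abc', userState: 0 };
console.log(p(initial).value);  // 5
console.log(initial.userState); // 0, the original record is untouched
```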

parse.getPosition

Get the current position.

parse.run(
    parse.next(
         parse_text.character('a'),
         parse.getPosition),
    "abc"); // returns Position(1)

parse.setPosition(pos)

Set the current position.

parse.getInput

Get the current input. This returns the stream of remaining input.

parse.run(
    parse.next(
         parse_text.character('a'),
         parse.bind(parse.getInput, \inputStream ->
              parse.always(stream.toArray(inputStream)))),
    "abc"); // returns ['b', 'c']

parse.setInput(input)

Set the current input to the stream input. Parsing continues on new input.

parse.run(
    parse.sequence(
         parse_text.character('a'),
         parse.setInput(stream.NIL),
         parse_text.character('b')),
    "abc"); // throws error, found eof expected 'b'

Objects

parse.Position

Default object used to track the parser's position in the input stream. 'parse.Position' simply keeps track of the index in the stream.

parse.InputState

Default state object. Keeps track of the stream and position.

Parser Creation

parse.rec(def)

Creates a function using a factory function 'def' to allow self references.

'def' is a function that is passed a reference to the object to be created and returns the object to be created.

For example, using a traditional definition the self reference to 'b' evaluates to undefined:

var b = parse.either(parse_text.character('b'), b)

// Really this is equivalent to:
var b = parse.either(parse_text.character('b'), undefined)

Using rec, we fix this.

var b = parse.rec(\self ->
    parse.either(parse_text.character('b'), self));

and now 'b' correctly references itself.
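
The self-reference problem and the fix can be reproduced in plain JavaScript. The `rec` helper below is a sketch of the general technique and is an assumption, not a copy of Bennu's source; `countB` is a hypothetical parser-like function used only to exercise it.

```javascript
// The problem: the right-hand side is evaluated before `pair` is assigned,
// so the inner reference is still undefined.
var pair = ['b', pair];
console.log(pair[1]); // undefined

// Sketch of a rec-like helper (an assumption about the technique).
const rec = (def) => {
  // `self` forwards to `value`, which is only read when `self` is called,
  // by which time `def` has returned and `value` is assigned.
  const self = (...args) => value(...args);
  const value = def(self);
  return value;
};

// Hypothetical parser-like function: counts leading 'b' characters from pos.
const countB = rec((self) => (input, pos) =>
  input[pos] === 'b' ? 1 + self(input, pos + 1) : 0);

console.log(countB('bbbx', 0)); // 3
```

The helper only works when 'def' does not call the self reference while it is still building the object, which holds for combinators that merely store their arguments.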

parse.label(name, impl)

Create a parser with display name name and implementation impl. Display names help with debugging.

Running

Running a parser to extract the success or failure result.

parse.run(p, input, [userData])

Run parser p against array-like object 'input'. Returns success results, throws error results.

parse.run(
    parse_text.string('ab'),
    'abc'); // returns 'ab'

parse.run(
    parse_text.string('ab'),
    'x'); // throws Expected 'a' found 'x'

parse.runStream(p, s, [userData])

Run parser p against a potentially infinite [Nu][nu] stream s. Returns success results, throws error results.

parse.runStream(
    parse_text.string('ab'),
    stream.from('abc')); // returns 'ab'

// Run against infinite, lazy stream
parse.runStream(
    parse_text.string('aaaaa'),
    gen.repeat(Infinity, 'a')); // returns 'aaaaa'

parse.runState(p, state)

Run parser p with 'state'. This is used if you have a custom parser state. Returns success results, throws error results.

parse.runState(
    parse_text.string('ab'),
    new parse.ParserState(
         parse.Position.initial,
         stream.from('abc'),
         {}));
 // returns 'ab'

parse.test(p, input, [userData])

Same as parse.run but returns if the parser succeeded or failed.

parse.test(
    parse_text.string('ab'),
    'abc'); // returns true

parse.test(
    parse_text.string('ab'),
    'x'); // returns false

parse.testStream(p, s, [userData])

Same as parse.runStream but returns if the parser succeeded or failed.

parse.testState(p, state)

Same as parse.runState but returns if the parser succeeded or failed.

parse.parse(p, input, userData, ok, err)

Run parser p against input, calling ok if the parser succeeds and err if it fails. Returns result of continuation.

parse.parse(
    parse_text.string('ab'),
    'abc',
    null,
    \x -> console.log("suc" + x),
    \x -> console.log("error" + x));

parse.parseStream(p, s, userData, ok, err)

Same as parse.parse but takes a stream s as input.

parse.parseState(p, state, ok, err)

Same as parse.parse but takes an explicit parser state 'state'.
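
The relationship between the run, test, and parse families can be sketched in plain JavaScript. This is a toy model and not Bennu's implementation: a parser is assumed to take the input and two continuations, user data is omitted, and `stringToy`, `parseToy`, `runToy`, and `testToy` are hypothetical names for this example.

```javascript
// Toy model, not Bennu's implementation: a parser takes the input and two
// continuations and returns whatever the chosen continuation returns.
const stringToy = (s) => (input, ok, err) =>
  input.startsWith(s) ? ok(s) : err(new Error(`Expected '${s}'`));

// parse: hand the caller's continuations straight to the parser.
const parseToy = (p, input, ok, err) => p(input, ok, err);

// run: return the success value, throw the failure value.
const runToy = (p, input) =>
  parseToy(p, input, (x) => x, (e) => { throw e; });

// test: collapse the outcome to a boolean.
const testToy = (p, input) =>
  parseToy(p, input, () => true, () => false);

console.log(runToy(stringToy('ab'), 'abc'));  // ab
console.log(testToy(stringToy('ab'), 'abc')); // true
console.log(testToy(stringToy('ab'), 'x'));   // false
console.log(parseToy(stringToy('ab'), 'abc',
  (x) => 'suc ' + x,
  (e) => 'error ' + e.message));              // suc ab
```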

Structures

parse.Parser(impl)

Base parser type. All Bennu parsers are instances of this object.

You will probably never need to use this object directly, but all base parsers are instances of this object and it implements the [Fantasy Land][fl] methods.

Error Objects

ParserError(message)

Error thrown when there is an error with the parser definition itself (for example, calling many on a parser that succeeds and consumes no input).

ParseError(position, message)

Base type of error thrown during parsing.

The position property gets the location where the error occurred.

The message property gets the complete error description (you can also use toString).

The errorMessage property gets the description of just the error, without the position information.

MultipleError(...errors)

Merges one or more ParseError objects into a single error.

The position is the position of the first error. The message is combined messages from errors.

UnknownError(pos)

Error whose exact cause is unknown.

UnexpectError(pos, unexpected)

ParseError when an unexpected token 'unexpected' is encountered at position 'pos'.

ExpectError(pos, expected, [found])

ParseError when an unexpected token 'found' is encountered at position 'pos' when 'expected' was expected.

[fl]: https://github.com/fantasyland/fantasy-land
[nu]: https://github.com/mattbierner/nu
