parser-core-functions.md

Parser Object :

  • It reads a stream of characters
    • The parser has functions to validate the stream
    • You can build your Parser by adding specific functions
  • The Parser is a monad
    • It wraps one (and only one) value
    • It has some functions to work on that value

Streaming inputs

  • The Parser is constructed with a Streaming function
  • The Parser will consume elements from the stream
  • The stream will stop if the Parser can't match the next element
    • the state of the Parser will be Rejected
  • If the stream finishes, the state of the Parser is Accepted
  • Once an element is consumed, the Parser can't go back
    • this allows speed and above all low memory use
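
The forward-only behaviour described above can be pictured with a small standalone sketch. It is not Masala's implementation: `consume`, `accepted` and `offset` are hypothetical names, and a plain string stands in for the stream.

```javascript
// Hypothetical sketch, not Masala's code: a forward-only scan.
// `expected` plays the role of the parser, `input` the role of the stream.
function consume(expected, input) {
    let offset = 0; // index of the next element to read; it only ever grows
    for (const char of expected) {
        if (input[offset] !== char) {
            // The next element does not match: stop reading and reject
            return { accepted: false, offset };
        }
        offset += 1; // element consumed, no going back
    }
    return { accepted: true, offset };
}

const ok = consume('abc', 'abc'); // accepted, 3 elements consumed
const ko = consume('abc', 'abX'); // rejected after 2 elements
```

Because only the current offset is kept, nothing already read needs to stay in memory, which is the point of the last bullet.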

Parser constructor

Usually, you would NOT create a Parser from its constructor; you combine existing parsers to create a new one. However, the constructor can solve specific problems when combining existing parsers is too difficult or inefficient.

    const newParser = new Parser(parseFunction);
    // But don't do that, except if you know what you are doing
  • difficulty : 3
  • construct a Parser object
  • parseFunction is a streaming function
    • reads characters at a given index
    • can end the stream
  • the parseFunction function will determine the behaviour of the Parser
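
To make the role of `parseFunction` more concrete, here is a miniature written from scratch. It only assumes the shape described in the bullets (a function that reads characters at a given index). `MiniParser` and the `{ accepted, value, offset }` result object are hypothetical, not Masala's real classes or signatures.

```javascript
// Hypothetical miniature, NOT Masala's real Parser class.
class MiniParser {
    // parseFunction: (input, index) => { accepted, value, offset }
    constructor(parseFunction) {
        this.parseFunction = parseFunction;
    }

    parse(input, index = 0) {
        return this.parseFunction(input, index);
    }
}

// A hand-written parser: accepts one uppercase ASCII letter
const upper = new MiniParser((input, index) => {
    const char = input[index];
    if (char !== undefined && char >= 'A' && char <= 'Z') {
        return { accepted: true, value: char, offset: index + 1 };
    }
    // Rejected: wrong character, or the stream already ended
    return { accepted: false, value: undefined, offset: index };
});

const accepted = upper.parse('Masala'); // accepted, value 'M'
const rejected = upper.parse('masala'); // rejected, nothing consumed
```

The behaviour of the whole parser lives in that single function, which is why writing one by hand is powerful but easy to get wrong.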

Here is an example of a home-made parser for going back after an Accept: [#138 (comment) 5]

Essential Parser functions

then

  • Construct a Tuple of values from previous accepted values
let stream = Streams.ofString('abc');
const charsParser = C.char('a')
    .then(C.char('b'))
    .then(C.char('c'))
    .then(F.eos().drop()); // End Of Stream; dropping its value, just checking it's here
let parsing = charsParser.parse(stream);
assertEquals(parsing.value.join(''), 'abc'); // then() collected 'a', 'b' and 'c'

drop()

  • difficulty : 1
  • Uses then() and returns only the left or right value
const stream = Streams.ofString('|4.6|');
const floorCombinator = C.char('|').drop()
    .then(N.number())    // we have ['|',4.6], we keep 4.6
    .then(C.char('|').drop())   // we have [4.6, '|'], we keep 4.6
    .map(x => Math.floor(x));

// Masala needs a stream of characters
const parsing = floorCombinator.parse(stream);
assertEquals( 4, parsing.value, 'Floor parsing');

then() and drop() will often be used to find the right value in your data.

map(f)

  • difficulty : 0
  • Change the value of the response
const stream = Streams.ofString("5x8");
const combinator = N.integer()
                    .then(C.charIn('x*').drop())
                    .then(N.integer())
                    // values are [5,8] : we map to its multiplication
                    .map(values => values[0] * values[1]);
assertEquals(combinator.parse(stream).value, 40)

returns(value)

  • difficulty : 1
  • Forces the value at a given point
  • It's a simplification of map
const stream = Streams.ofString("ab");
// given 'ab', value should be ['X', 'b']
const combinator = C.char('a')
                    .thenReturns('X')
                    .then(C.char('b'));
assertEquals(combinator.parse(stream).value, ['X', 'b'])

It could be done using map():

const combinator = C.char('a')
                    .map(anyVal => 'X')
                    .then(C.char('b'));

eos()

  • difficulty : 1
  • Tests whether the stream has reached its end

any()

  • difficulty : 0
  • Accepts the next character, whatever it is
  • consumes a character

TODO : There is no explicit test for any()

opt()

  • difficulty : 0
  • Allows optional use of a Parser
  • Internally used for optrep() function
        // ok for 'ac' but also 'abc'
        C.char('a').then(C.char('b').opt()).then(C.char('c'))

rep()

  • difficulty : 0
  • Ensures a parser is repeated at least once
const stream = Streams.ofString('aaa');
const parsing = C.char('a').rep().parse(stream);
test.ok(parsing.isAccepted());
// We need to call list.array()
test.deepEqual(parsing.value.array(),['a', 'a', 'a']);

rep() produces a List of values. You can get a standard array by calling list.array().

optrep

  • difficulty : 0
  • A Parser can be repeated zero or more times
// ok for 'ac' but also 'abbbbbc'
C.char('a').then(C.char('b').optrep()).then(C.char('c'))
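
The difference between rep() (one or more) and optrep() (zero or more) can be sketched without the library. The code below is a toy under stated assumptions: parsers are plain functions `(input, index) => result`, values are collected in a regular array rather than Masala's List, and all names are hypothetical.

```javascript
// Toy parsers, hypothetical names: (input, index) => { accepted, value, offset }
const char = c => (input, index) =>
    input[index] === c
        ? { accepted: true, value: c, offset: index + 1 }
        : { accepted: false, offset: index };

// Zero or more: always accepted, possibly with an empty array.
// Assumes p consumes at least one element whenever it accepts.
const optrep = p => (input, index) => {
    const values = [];
    let offset = index;
    for (let r = p(input, offset); r.accepted; r = p(input, offset)) {
        values.push(r.value);
        offset = r.offset;
    }
    return { accepted: true, value: values, offset };
};

// One or more: same loop, but an empty result is a rejection
const rep = p => (input, index) => {
    const result = optrep(p)(input, index);
    return result.value.length > 0 ? result : { accepted: false, offset: index };
};

const three = rep(char('a'))('aaab', 0);  // accepted, three 'a' collected
const none = rep(char('a'))('b', 0);      // rejected: at least one is required
const empty = optrep(char('b'))('c', 0);  // accepted with an empty array
```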

Useful but touchy

try() and or() are useful and often work together. or() alone is not difficult, but it is harder to understand when it must work with try().

try() and or()

  • Essential !

  • difficulty : 1

  • Try a succession of parsers

  • If success, then continues

  • If not, jump after the succession, and continues with or()

       try(   x().then(y())  ).or(...)
    

TODO : what's the difference with : ( x().then(y()) ).or()

TODO : There is a story of consumed input - in tests : 'expect (then.or) left to be consumed'

TODO : missing a pertinent test for using try()
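
One way to picture the "consumed input" story is the convention used by many parser combinator libraries: a choice only tries its alternative when the left side rejected without consuming anything, and try() erases the consumption. The toy below illustrates that convention; it is an assumption about the general technique, not a statement of Masala's internals. All names are hypothetical (`attempt` stands in for try, which is a reserved word in JavaScript).

```javascript
// Toy model: parsers are functions (input, index) => { accepted, value, offset, consumed }
const char = c => (input, index) =>
    input[index] === c
        ? { accepted: true, value: c, offset: index + 1, consumed: true }
        : { accepted: false, offset: index, consumed: false };

// Sequence: a rejection that happens after p succeeded is flagged as consumed
const then = (p, q) => (input, index) => {
    const first = p(input, index);
    if (!first.accepted) return first;
    const second = q(input, first.offset);
    if (!second.accepted) return { accepted: false, offset: second.offset, consumed: true };
    return { accepted: true, value: [first.value, second.value], offset: second.offset, consumed: true };
};

// Choice: the alternative runs only if the left side rejected without consuming
const or = (p, q) => (input, index) => {
    const left = p(input, index);
    return left.accepted || left.consumed ? left : q(input, index);
};

// Try: any rejection becomes a "nothing consumed" rejection at the start index
const attempt = p => (input, index) => {
    const result = p(input, index);
    return result.accepted ? result : { accepted: false, offset: index, consumed: false };
};

const ab = then(char('a'), char('b'));
const ac = then(char('a'), char('c'));

const withoutTry = or(ab, ac)('ac', 0);       // rejected: 'a' was already consumed
const withTry = or(attempt(ab), ac)('ac', 0); // accepted: the alternative gets its chance
```

Under this convention, `( x().then(y()) ).or(...)` gives up as soon as `x()` has consumed input and `y()` fails, while the `try(...)` version rewinds and lets `or()` run.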

flatMap(f)

  • difficulty : 3

  • parameter f is a function

  • pass parser.value to f function (TODO : better explain)

  • f can combine parsers to continue to read the stream, knowing the previous value

      'expect (flatMap) to be return a-b-c': function(test) {
          test.equal(parser.char("a")
              .flatMap(
                  aVal=> parser.char('b').then(parser.char('c'))
                  .map(bcVal=>aVal+'-'+bcVal.join('-')) //--> join 3 letters
              ) 
              .parse(Streams.ofString("abc")).value,
              'a-b-c',
              'should be accepted.');
        },
    

flatMap lets you decide how to read the rest of your document based on what was parsed previously.

/* We need to parse this:
        name: Nicolas
        hotel: SuperMarriot
        Nicolas: nz@robusta.io
 */
function combinator() {
    return readNextTag('name').map(name => ({name})) // parentheses needed to return an object literal
        .then(readNextTag('hotel')).map(([context, hotel]) => Object.assign(context, {hotel}))
        // we don't know that tag is Nicolas. It depends on running context 
        .flatMap(userEmail);
        // now parsing value has name, hotel and email keys
}


// We have Nicolas: nz@robusta.io
function userEmail(context) { // context injected is the running value of the parsing
    return readNextTag(context.name).map(email => Object.assign(context, {email}))
}
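
The mechanism behind flatMap can be reduced to a few lines. The sketch below is a standalone toy, not Masala's source: parsers are plain functions, and `digit`, `take` and `flatMap` are hypothetical helpers. It parses a length-prefixed string, where the first value decides which parser reads the rest.

```javascript
// Toy parsers, hypothetical names: (input, index) => { accepted, value, offset }
const digit = (input, index) =>
    /[0-9]/.test(input[index] ?? '')
        ? { accepted: true, value: Number(input[index]), offset: index + 1 }
        : { accepted: false, offset: index };

// take(n): accepts exactly n characters
const take = n => (input, index) =>
    index + n <= input.length
        ? { accepted: true, value: input.slice(index, index + n), offset: index + n }
        : { accepted: false, offset: index };

// flatMap: run p, then build the next parser from p's value
const flatMap = (p, f) => (input, index) => {
    const first = p(input, index);
    return first.accepted ? f(first.value)(input, first.offset) : first;
};

// The leading digit decides how many characters are read next
const lengthPrefixed = flatMap(digit, n => take(n));

const parsed = lengthPrefixed('3abcXYZ', 0); // reads the 3 characters after the digit
const tooShort = lengthPrefixed('9ab', 0);   // rejected: fewer than 9 characters left
```

This is the same idea as `userEmail(context)` above: the function receives the value parsed so far and returns the parser to run next.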

filter(predicate)

  • difficulty : 1
  • To be used once a value is defined
  • predicate is a function pVal -> boolean
  • Checks whether the parsed value satisfies the predicate
    • The parse will be Rejected if the predicate returns false

      'expect (filter) to be accepted': function(test) {
          test.equal(parser.char("a")
              .filter(a => a === 'a')
              .parse(Streams.ofString("a")).isAccepted(),
              true,
              'should be accepted.');
      }

match(matchValue)

  • difficulty : 0

  • Simplification of filter()

  • Checks whether the parsed value is equal to matchValue

      //given 123
      N.number().match(123)
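
To show why match() can be seen as a simplification of filter(), here is a standalone toy. It does not use Masala: `number`, `filter` and `match` are hypothetical functions, and rejecting at the starting index when the predicate fails is an assumption of this sketch.

```javascript
// Toy parser, hypothetical: reads an unsigned integer
const number = (input, index) => {
    const found = /^\d+/.exec(input.slice(index));
    return found
        ? { accepted: true, value: Number(found[0]), offset: index + found[0].length }
        : { accepted: false, offset: index };
};

// filter: keep the result only if the parsed value passes the predicate
const filter = (p, predicate) => (input, index) => {
    const result = p(input, index);
    if (result.accepted && !predicate(result.value)) {
        return { accepted: false, offset: index }; // assumption: reject at the start
    }
    return result;
};

// match: a filter whose predicate is strict equality with the expected value
const match = (p, expected) => filter(p, value => value === expected);

const same = match(number, 123)('123', 0);  // accepted
const other = match(number, 123)('124', 0); // rejected: parsed, but not equal
```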
    

error()

  • difficulty : 0
  • Forces an error
  • The parser will be rejected

TODO : Is it possible to have a value for this error ? It would give a live hint for the writer.

satisfy(predicate)

  • difficulty : 2
  • Used internally by higher level functions
  • If predicate is true, consumes an element from the stream, and the value is set to the element
  • If predicate is false, the element is not consumed
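
The two bullets above translate almost directly into code. This is a sketch of the idea only, with hypothetical names and a plain string as the stream. It also shows how higher-level parsers such as a single-character matcher could be layered on top.

```javascript
// Hypothetical sketch of the idea, not Masala's source
const satisfy = predicate => (input, index) => {
    const element = input[index];
    if (index < input.length && predicate(element)) {
        // Predicate holds: consume the element and use it as the value
        return { accepted: true, value: element, offset: index + 1 };
    }
    // Predicate fails (or end of stream): nothing is consumed
    return { accepted: false, offset: index };
};

// Higher-level parsers defined on top of satisfy
const char = c => satisfy(element => element === c);
const charIn = chars => satisfy(element => chars.includes(element));

const hit = charIn('x*')('*8', 0); // accepted, one element consumed
const miss = char('a')('b', 0);    // rejected, offset unchanged
```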