Skip to content

hugoetchegoyen/pegpp

Repository files navigation

The pegpp library
-----------------

Pegpp is a header-only library that allows building PEG grammars with embedded actions
in C++ without intermediate code generation.

The grammars are expressed in an EBNF-like syntax inspired by the PEG and LEG parser 
compilers by Ian Piumarta. Value stack handling is inspired by YACC.

Library sources
---------------

peg.h

Requirements
------------

C++17. Tested on Linux with gcc 8.2.0 through 11.2.0.

Documentation
-------------

pegpp.pdf

Examples
--------

Username:

    A simple example showing how LEG parsers can be easily translated to pegpp. 
    The LEG version (username.leg) was taken from the PEG/LEG distribution.

    The parser replaces all occurrences of the string "username" by the current user's 
    login name.

Pal:

    An example demonstrating the use of semantic predicates and parsing-time actions.
    This parser splits the input stream into palindromes and prints them one per line. 
    A palindrome is any symmetric string of length 1 or more.

    If there are no repeated characters the parser outputs the longest possible
    palindromes. With repeated characters the behaviour is more erratic: it outputs 
    palindromes, but not necessarily the longest ones.

    Some input -> output samples:

        abcba -> abcba

        aaaaa -> aa
                 aa
                 a

        aaaa  -> aaaa

        aaa   -> aa
                 a

Mpal:

    A variant of pal that uses memoization if argc > 1 (i.e. if some argument is given
    in the command line) and measures parsing time to demonstrate its benefits. 
    See pegpp.pdf for details.

Intcalc:

    An integer calculator that supports the four basic operations and 
    grouping with parentheses.

    Input/output sample:

    (1 + 2) * 3 -> 9

Intcalcerr:

    The same calculator with error reporting using labeled rules.

Numsum:

    An example to illustrate the use of a variant value stack.
    Copies input to output but recognizes sums of integers and replaces
    them with their numeric values.

    Input/output sample:

    aaabbb100+0001+000000003ccc011ddd -> aaabbb104ccc11ddd

Varcalc:

    A floating-point calculator with named variables.

        Values are doubles. Valid input formats: 

            17
            2.
            2.33
            .28
            3.21e3
            22e-5

        Supported operations:

            addition/substraction
            multiplication/division
            exponentiation
            grouping (parentheses)
            variable assignment
            printing (print command)

        Named variables: 

            Variable names have the format of C identifiers.
            Assignment is an expression (assignments can be cascaded).

        Operators from highest to lowest precedence and their associativity:

            ^       exponentiation      left
            + -     unary plus/minus    right
            * /     multiply/divide     left
            + -     add/substract       left
            =       assignment          right

    Varcalc reads statements. Newline or semi-colon end a statement.

    Statements can be:

        <expression>
        print <expression>

    Any text from // to the end of the line is a comment.
    White space, empty lines and comments are ignored. 
    Invalid statements are reported as errors and ignored. 
    Valid statements are executed.

    Assume the following input:

        a = b = 23 + c
        print 2 * a
        hello!
        print c
        print 2^(-1/2) 

    Variable c is being used without a previous assignment. The variable is
    automatically initialized to 0 and a warning is printed on standard error.

    This is the output:

        line 1: defining c = 0      (standard error)
        46                          (standard output)
        line 3: ERROR: hello!       (standard error)
        0                           (standard output)
        0.707107                    (standard output)