Skip to content

A simple parse tree generator for any user-defined LR(1) programming language

License

Notifications You must be signed in to change notification settings

alumik/parse-tree

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LR(1) Parse Tree Generator

version-0.2.2 Python->=3.11 license-MIT

Demo

Modify the config file if necessary and then execute:

python demo.py --config=<config> --text=<text>

The image of the output parse tree will be saved in out/parse-tree.svg. (This behavior can be changed in demo.py.)

Examples

Lexical Analysis

NFA to DFA

Take the regular expression (a|b)*abb for example.

  1. Create an NFA from the regular expression:

    import ptree
    
    from ptree.lexer.regex import Regex, RegexEngine
    
    regex = Regex(name='(a|b)*abb', pattern='(a|b)*abb')
    engine = RegexEngine()
    nfa = engine.parse(regex)
    ptree.render(nfa, directory='out', name='nfa', output_format='svg')

    NFA

  2. Convert the NFA to a DFA:

    dfa = nfa.to_dfa()
    ptree.render(dfa, directory='out', name='dfa', output_format='svg')

    DFA

Merge multiple NFAs

Take the regular expressions a*b+, a, abb for example.

  1. Create three NFAs from the regular expressions.

    import ptree
    
    from ptree.lexer.regex import Regex, RegexEngine
    
    regexes = [
        Regex(name='a*b+', pattern='a*b+'),
        Regex(name='a', pattern='a'),
        Regex(name='abb', pattern='abb')
    ]
    engine = RegexEngine()
    nfas = [engine.parse(regex) for regex in regexes]
    for i, nfa in enumerate(nfas):
        ptree.render(nfa, directory='out', name=f'nfa-{i}', output_format='svg')

    NFA-1

    NFA-2

    NFA-3

  2. Merge the NFAs.

    from ptree.lexer.nfa import NFA
    
    nfa = NFA.union(nfas)
    ptree.render(nfa, directory='out', name='nfa', output_format='svg')

    NFA

  3. Convert the NFA to DFA.

    dfa = nfa.to_dfa()
    ptree.render(dfa, directory='out', name='dfa', output_format='svg')

    DFA

Tokenize a string

  1. Create a lexer from the config file.

    The config file used in this example is:

    nonterminal_symbols:
    terminal_symbols:
        COMMENT: '(//[^\n]*)|(/\*([^\*]|(\*)*[^\*/])*(\*)*\*/)'
        KEYWORD: 'auto|short|int|long|float|double|char|struct|union|enum|typedef|const|unsigned|signed|extern|register|static|volatile|void|if|else|switch|case|for|do|while|goto|continue|break|default|sizeof|return|using|namespace'
        IDENTIFIER: '[A-Za-z_][A-Za-z0-9_]*'
        INTEGER: '[0-9]+'
        FLOAT: '[0-9]+\.[0-9]+'
        COMPARISON: '==|>|<|>=|<=|!='
        LSTREAM: '<<'
        RSTREAM: '>>'
        LP: '\('
        RP: '\)'
        LB: '{'
        RB: '}'
        COMMA: ','
        SEMICOLON: ';'
        LSB: '\['
        RSB: '\]'
        ASSIGN_OP: '='
        ADD_OP: '\+'
        SUB_OP: '\-'
        MULT_OP: '\*'
        DIV_OP: '/'
        MOD_OP: '%'
        POWER_OP: '\^'
        AND_OP: '&&'
        OR_OP: '\|\|'
        NOT_OP: '!'
        SPACE: '[ \t\n\r]+'
    ignored_symbols:
        ? SPACE
        ? COMMENT
    start_symbol:
    production_rules:
    import ptree
    
    config = ptree.load_config('config.yaml')
    grammar = ptree.Grammar(config)
    lexer = ptree.Lexer(config, symbol_pool=grammar.symbol_pool)
  2. Tokenize the input string.

    The input string is:

    int main() {int a = a + 1; cout << a << endl; return 0;}
    tokens = lexer.tokenize('''int main() {int a = a + 1; cout << a << endl; return 0;}''')
    ptree.pprint(tokens)
    +----+------------+--------+
    |    | SYMBOL     | VALUE  |
    +====+============+========+
    | 1  | KEYWORD    | int    |
    +----+------------+--------+
    | 2  | IDENTIFIER | main   |
    +----+------------+--------+
    | 3  | LP         | (      |
    +----+------------+--------+
    | 4  | RP         | )      |
    +----+------------+--------+
    | 5  | LB         | {      |
    +----+------------+--------+
    | 6  | KEYWORD    | int    |
    +----+------------+--------+
    | 7  | IDENTIFIER | a      |
    +----+------------+--------+
    | 8  | ASSIGN_OP  | =      |
    +----+------------+--------+
    | 9  | IDENTIFIER | a      |
    +----+------------+--------+
    | 10 | ADD_OP     | +      |
    +----+------------+--------+
    | 11 | INTEGER    | 1      |
    +----+------------+--------+
    | 12 | SEMICOLON  | ;      |
    +----+------------+--------+
    | 13 | IDENTIFIER | cout   |
    +----+------------+--------+
    | 14 | LSTREAM    | <<     |
    +----+------------+--------+
    | 15 | IDENTIFIER | a      |
    +----+------------+--------+
    | 16 | LSTREAM    | <<     |
    +----+------------+--------+
    | 17 | IDENTIFIER | endl   |
    +----+------------+--------+
    | 18 | SEMICOLON  | ;      |
    +----+------------+--------+
    | 19 | KEYWORD    | return |
    +----+------------+--------+
    | 20 | INTEGER    | 0      |
    +----+------------+--------+
    | 21 | SEMICOLON  | ;      |
    +----+------------+--------+
    | 22 | RB         | }      |
    +----+------------+--------+
    

Parse Tree Generation

Elementary arithmetic

  1. Write a config file (in YAML format).

    nonterminal_symbols:
        ? E
        ? T
        ? F
    terminal_symbols:
        '+': '\+'
        '-': '\-'
        '*': '\*'
        '/': '/'
        '(': '\('
        ')': '\)'
        'num': '[0-9]+'
    ignored_symbols:
    start_symbol: E
    production_rules:
        - E -> E + T
        - E -> E - T
        - E -> T
        - T -> T * F
        - T -> T / F
        - T -> F
        - F -> num
        - F -> ( E )
  2. Generate the parse table.

    import ptree
    
    config = ptree.load_config('config.yaml')
    grammar = ptree.Grammar(config)
    grammar.init()
    parser = ptree.Parser(grammar)
    ptree.pprint(grammar.parse_table)
    +----+----------------------------------------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |    | ACTION                                       | GOTO         | STATE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |    | +   | -   | *   | /   | (  | )   | num | $   | E  | T  | F  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 0  |     |     |     |     | s1 |     | s2  |     | 4  | 5  | 3  | {F ->  . ( E ), *; F ->  . num, *; T ->  . F, $; E ->  . E - T, -; _S ->  . E, $; E ->  . T, $; T ->  . F, /; F ->  . ( E ), $; T ->  . T * F, $; T ->  . T / F, $; E ->  . E + T, -; F ->  . num, $; T ->  . T * F, /; E ->  . E - T, $; T ->  . F, +; E ->  . E + T, $; F ->  . ( E ), /; E ->  . T, +; T ->  . F, -; T ->  . T * F, +; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; E ->  . E - T, +; T ->  . T * F, -; F ->  . num, -; E ->  . T, -; E ->  . E + T, +; F ->  . ( E ), -; T ->  . T * F, *; T ->  . T / F, *}                                                                       |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 1  |     |     |     |     | s6 |     | s7  |     | 8  | 10 | 9  | {F ->  . ( E ), *; F ->  . num, *; E ->  . E - T, -; T ->  . F, /; F -> ( . E ), /; E ->  . E + T, -; T ->  . T / F, ); E ->  . T, ); F ->  . ( E ), ); F -> ( . E ), +; T ->  . T * F, ); T ->  . T * F, /; F ->  . num, ); T ->  . T * F, *; T ->  . F, +; F ->  . ( E ), /; E ->  . E - T, ); E ->  . T, +; E ->  . E + T, ); T ->  . F, -; T ->  . T * F, +; F -> ( . E ), *; F -> ( . E ), -; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; E ->  . E - T, +; T ->  . T * F, -; F ->  . num, -; F -> ( . E ), $; E ->  . T, -; E ->  . E + T, +; F ->  . ( E ), -; T ->  . F, ); T ->  . T / F, *} |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 2  | r7  | r7  | r7  | r7  |    |     |     | r7  |    |    |    | {F -> num . , -; F -> num . , +; F -> num . , /; F -> num . , *; F -> num . , $}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 3  | r6  | r6  | r6  | r6  |    |     |     | r6  |    |    |    | {T -> F . , -; T -> F . , +; T -> F . , /; T -> F . , *; T -> F . , $}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 4  | s12 | s11 |     |     |    |     |     | acc |    |    |    | {E -> E . - T, -; E -> E . - T, $; E -> E . + T, +; _S -> E . , $; E -> E . + T, -; E -> E . + T, $; E -> E . - T, +}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 5  | r3  | r3  | s14 | s13 |    |     |     | r3  |    |    |    | {T -> T . / F, -; T -> T . / F, +; E -> T . , +; T -> T . * F, /; T -> T . * F, +; E -> T . , -; E -> T . , $; T -> T . / F, /; T -> T . / F, *; T -> T . / F, $; T -> T . * F, *; T -> T . * F, $; T -> T . * F, -}                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 6  |     |     |     |     | s6 |     | s7  |     | 15 | 10 | 9  | {F -> ( . E ), ); F ->  . ( E ), *; F ->  . num, *; E ->  . E - T, -; T ->  . F, /; F -> ( . E ), /; E ->  . E + T, -; T ->  . T / F, ); E ->  . T, ); F ->  . ( E ), ); F -> ( . E ), +; T ->  . T * F, ); T ->  . T * F, /; F ->  . num, ); T ->  . T * F, *; T ->  . F, +; F ->  . ( E ), /; E ->  . E - T, ); E ->  . T, +; E ->  . E + T, ); T ->  . F, -; T ->  . T * F, +; F -> ( . E ), *; F -> ( . E ), -; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; E ->  . E - T, +; T ->  . T * F, -; F ->  . num, -; E ->  . T, -; E ->  . E + T, +; F ->  . ( E ), -; T ->  . F, ); T ->  . T / F, *} |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 7  | r7  | r7  | r7  | r7  |    | r7  |     |     |    |    |    | {F -> num . , ); F -> num . , -; F -> num . , +; F -> num . , /; F -> num . , *}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 8  | s16 | s18 |     |     |    | s17 |     |     |    |    |    | {E -> E . + T, ); F -> ( E . ), +; E -> E . - T, ); F -> ( E . ), -; E -> E . - T, -; E -> E . + T, +; F -> ( E . ), *; F -> ( E . ), $; F -> ( E . ), /; E -> E . + T, -; E -> E . - T, +}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 9  | r6  | r6  | r6  | r6  |    | r6  |     |     |    |    |    | {T -> F . , -; T -> F . , +; T -> F . , ); T -> F . , /; T -> F . , *}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 10 | r3  | r3  | s20 | s19 |    | r3  |     |     |    |    |    | {T -> T . / F, -; T -> T . / F, +; E -> T . , +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; E -> T . , -; T -> T . / F, /; T -> T . / F, *; T -> T . * F, *; E -> T . , ); T -> T . * F, ); T -> T . / F, )}                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 11 |     |     |     |     | s1 |     | s2  |     |    | 21 | 3  | {F ->  . ( E ), *; F ->  . num, *; T ->  . F, $; E -> E - . T, +; T ->  . F, /; F ->  . ( E ), $; T ->  . T * F, $; T ->  . T / F, $; F ->  . num, $; T ->  . T * F, /; E -> E - . T, -; T ->  . F, +; F ->  . ( E ), /; T ->  . F, -; T ->  . T * F, +; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; E -> E - . T, $; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; T ->  . T * F, -; F ->  . num, -; F ->  . ( E ), -; T ->  . T * F, *; T ->  . T / F, *}                                                                                                                                                                                         |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 12 |     |     |     |     | s1 |     | s2  |     |    | 22 | 3  | {F ->  . ( E ), *; F ->  . num, *; T ->  . F, $; T ->  . F, /; F ->  . ( E ), $; T ->  . T * F, $; T ->  . T / F, $; E -> E + . T, +; F ->  . num, $; T ->  . T * F, /; T ->  . F, +; F ->  . ( E ), /; E -> E + . T, -; T ->  . F, -; T ->  . T * F, +; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; T ->  . T * F, -; F ->  . num, -; E -> E + . T, $; F ->  . ( E ), -; T ->  . T * F, *; T ->  . T / F, *}                                                                                                                                                                                         |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 13 |     |     |     |     | s1 |     | s2  |     |    |    | 23 | {T -> T / . F, +; F ->  . ( E ), *; F ->  . ( E ), /; T -> T / . F, /; T -> T / . F, *; F ->  . num, *; F ->  . num, /; F ->  . num, $; T -> T / . F, $; F ->  . num, -; T -> T / . F, -; F ->  . num, +; F ->  . ( E ), +; F ->  . ( E ), -; F ->  . ( E ), $}                                                                                                                                                                                                                                                                                                                                                                                                                 |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 14 |     |     |     |     | s1 |     | s2  |     |    |    | 24 | {F ->  . ( E ), *; T -> T * . F, /; T -> T * . F, *; F ->  . num, /; F ->  . num, *; F ->  . ( E ), /; T -> T * . F, $; F ->  . num, $; F ->  . ( E ), $; F ->  . num, -; T -> T * . F, -; F ->  . ( E ), +; F ->  . ( E ), -; T -> T * . F, +; F ->  . num, +}                                                                                                                                                                                                                                                                                                                                                                                                                 |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 15 | s16 | s18 |     |     |    | s25 |     |     |    |    |    | {E -> E . + T, ); F -> ( E . ), +; E -> E . - T, ); F -> ( E . ), -; E -> E . - T, -; E -> E . + T, +; F -> ( E . ), *; F -> ( E . ), ); F -> ( E . ), /; E -> E . + T, -; E -> E . - T, +}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 16 |     |     |     |     | s6 |     | s7  |     |    | 26 | 9  | {F ->  . ( E ), *; F ->  . num, *; T ->  . F, /; E -> E + . T, +; T ->  . T / F, ); F ->  . ( E ), ); T ->  . T * F, ); T ->  . T * F, /; F ->  . num, ); T ->  . T * F, *; T ->  . F, +; F ->  . ( E ), /; E -> E + . T, -; F ->  . ( E ), -; T ->  . F, -; T ->  . T * F, +; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; T ->  . T * F, -; F ->  . num, -; E -> E + . T, ); T ->  . F, ); T ->  . T / F, *}                                                                                                                                                                                         |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 17 | r8  | r8  | r8  | r8  |    |     |     | r8  |    |    |    | {F -> ( E ) . , $; F -> ( E ) . , /; F -> ( E ) . , +; F -> ( E ) . , -; F -> ( E ) . , *}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 18 |     |     |     |     | s6 |     | s7  |     |    | 27 | 9  | {F ->  . ( E ), *; F ->  . num, *; E -> E - . T, +; T ->  . F, /; T ->  . T / F, ); F ->  . ( E ), ); T ->  . T * F, ); T ->  . T * F, /; F ->  . num, ); T ->  . T * F, *; E -> E - . T, -; T ->  . F, +; F ->  . ( E ), /; T ->  . F, -; T ->  . T * F, +; T ->  . T / F, +; T ->  . T / F, /; F ->  . ( E ), +; F ->  . num, +; T ->  . T / F, -; F ->  . num, /; T ->  . F, *; T ->  . T / F, *; T ->  . T * F, -; F ->  . num, -; F ->  . ( E ), -; T ->  . F, ); E -> E - . T, )}                                                                                                                                                                                         |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 19 |     |     |     |     | s6 |     | s7  |     |    |    | 28 | {T -> T / . F, +; F ->  . ( E ), *; F ->  . ( E ), /; T -> T / . F, /; T -> T / . F, *; F ->  . num, *; F ->  . num, /; F ->  . ( E ), ); F ->  . num, -; T -> T / . F, -; F ->  . num, ); T -> T / . F, ); F ->  . ( E ), +; F ->  . ( E ), -; F ->  . num, +}                                                                                                                                                                                                                                                                                                                                                                                                                 |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 20 |     |     |     |     | s6 |     | s7  |     |    |    | 29 | {F ->  . ( E ), *; T -> T * . F, /; T -> T * . F, *; F ->  . num, /; F ->  . num, *; F ->  . ( E ), /; F ->  . ( E ), ); F ->  . num, -; T -> T * . F, ); F ->  . num, ); T -> T * . F, -; F ->  . ( E ), +; F ->  . ( E ), -; T -> T * . F, +; F ->  . num, +}                                                                                                                                                                                                                                                                                                                                                                                                                 |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 21 | r2  | r2  | s14 | s13 |    |     |     | r2  |    |    |    | {T -> T . / F, -; T -> T . / F, +; E -> E - T . , +; T -> T . * F, /; T -> T . * F, +; T -> T . / F, /; T -> T . / F, *; E -> E - T . , -; T -> T . / F, $; T -> T . * F, *; E -> E - T . , $; T -> T . * F, $; T -> T . * F, -}                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 22 | r1  | r1  | s14 | s13 |    |     |     | r1  |    |    |    | {T -> T . / F, -; E -> E + T . , -; E -> E + T . , $; T -> T . / F, +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; T -> T . / F, /; T -> T . / F, *; T -> T . / F, $; T -> T . * F, *; T -> T . * F, $; E -> E + T . , +}                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 23 | r5  | r5  | r5  | r5  |    |     |     | r5  |    |    |    | {T -> T / F . , $; T -> T / F . , *; T -> T / F . , -; T -> T / F . , +; T -> T / F . , /}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 24 | r4  | r4  | r4  | r4  |    |     |     | r4  |    |    |    | {T -> T * F . , *; T -> T * F . , $; T -> T * F . , -; T -> T * F . , /; T -> T * F . , +}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 25 | r8  | r8  | r8  | r8  |    | r8  |     |     |    |    |    | {F -> ( E ) . , /; F -> ( E ) . , ); F -> ( E ) . , +; F -> ( E ) . , -; F -> ( E ) . , *}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 26 | r1  | r1  | s20 | s19 |    | r1  |     |     |    |    |    | {T -> T . / F, -; E -> E + T . , -; T -> T . / F, +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; E -> E + T . , ); T -> T . / F, /; T -> T . / F, *; T -> T . * F, *; E -> E + T . , +; T -> T . * F, ); T -> T . / F, )}                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 27 | r2  | r2  | s20 | s19 |    | r2  |     |     |    |    |    | {T -> T . / F, -; T -> T . / F, +; E -> E - T . , +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; T -> T . / F, /; T -> T . / F, *; E -> E - T . , ); E -> E - T . , -; T -> T . * F, *; T -> T . * F, ); T -> T . / F, )}                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 28 | r5  | r5  | r5  | r5  |    | r5  |     |     |    |    |    | {T -> T / F . , *; T -> T / F . , ); T -> T / F . , -; T -> T / F . , +; T -> T / F . , /}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 29 | r4  | r4  | r4  | r4  |    | r4  |     |     |    |    |    | {T -> T * F . , *; T -> T * F . , -; T -> T * F . , ); T -> T * F . , /; T -> T * F . , +}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
    +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    
  3. Tokenize the input string.

    The input string is 3*(6+(4/2)-5)+8.

    lexer = ptree.Lexer(config, symbol_pool=grammar.symbol_pool)
    tokens = lexer.tokenize('3*(6+(4/2)-5)+8')
    ptree.pprint(tokens)
    +----+--------+-------+
    |    | SYMBOL | VALUE |
    +====+========+=======+
    | 1  | num    | 3     |
    +----+--------+-------+
    | 2  | *      | *     |
    +----+--------+-------+
    | 3  | (      | (     |
    +----+--------+-------+
    | 4  | num    | 6     |
    +----+--------+-------+
    | 5  | +      | +     |
    +----+--------+-------+
    | 6  | (      | (     |
    +----+--------+-------+
    | 7  | num    | 4     |
    +----+--------+-------+
    | 8  | /      | /     |
    +----+--------+-------+
    | 9  | num    | 2     |
    +----+--------+-------+
    | 10 | )      | )     |
    +----+--------+-------+
    | 11 | -      | -     |
    +----+--------+-------+
    | 12 | num    | 5     |
    +----+--------+-------+
    | 13 | )      | )     |
    +----+--------+-------+
    | 14 | +      | +     |
    +----+--------+-------+
    | 15 | num    | 8     |
    +----+--------+-------+
    
  4. Generate the parse tree.

    parse_tree = parser.parse(tokens)
    ptree.render(parse_tree, directory='out', name='parse-tree', output_format='svg')

    Parse Tree

Another example

  1. Get the predefined grammar.

    1. S'->S
    2. S->CβBA
    3. A->Aαβ
    4. A->αβ
    5. B->C
    6. B->Dβ
    7. C->α
    8. D->α
    
  2. Write a config file (in YAML format).

    nonterminal_symbols:
        # ? name
        ? A
        ? B
        ? C
        ? D
        ? S
    terminal_symbols:
        # name: regex
        α: a
        β: b
    ignored_symbols:
    start_symbol: S
    production_rules:
        # - left part -> right part
        - S -> C β B A
        - A -> A α β
        - A -> α β
        - B -> C
        - B -> D β
        - C -> α
        - D -> α
  3. Generate the parse table.

    import ptree
    
    config = ptree.load_config('config.yaml')
    grammar = ptree.Grammar(config)
    grammar.init()
    parser = ptree.Parser(grammar)
    ptree.pprint(grammar.parse_table)
    +----+-----------------+-------------------+-----------------------------------------------------------------------------------------+
    |    | ACTION          | GOTO              | STATE                                                                                   |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    |    | α   | β   | $   | A | B | C | D | S |                                                                                         |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 0  | s1  |     |     |   |   | 3 |   | 2 | {C ->  . α, β; _S ->  . S, $; S ->  . C β B A, $}                                       |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 1  |     | r6  |     |   |   |   |   |   | {C -> α . , β}                                                                          |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 2  |     |     | acc |   |   |   |   |   | {_S -> S . , $}                                                                         |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 3  |     | s4  |     |   |   |   |   |   | {S -> C . β B A, $}                                                                     |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 4  | s5  |     |     |   | 6 | 8 | 7 |   | {C ->  . α, α; S -> C β . B A, $; D ->  . α, β; B ->  . D β, α; B ->  . C, α}           |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 5  | r6  | r7  |     |   |   |   |   |   | {C -> α . , α; D -> α . , β}                                                            |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 6  | s10 |     |     | 9 |   |   |   |   | {A ->  . A α β, $; A ->  . A α β, α; A ->  . α β, $; S -> C β B . A, $; A ->  . α β, α} |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 7  |     | s11 |     |   |   |   |   |   | {B -> D . β, α}                                                                         |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 8  | r4  |     |     |   |   |   |   |   | {B -> C . , α}                                                                          |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 9  | s12 |     | r1  |   |   |   |   |   | {A -> A . α β, $; A -> A . α β, α; S -> C β B A . , $}                                  |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 10 |     | s13 |     |   |   |   |   |   | {A -> α . β, $; A -> α . β, α}                                                          |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 11 | r5  |     |     |   |   |   |   |   | {B -> D β . , α}                                                                        |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 12 |     | s14 |     |   |   |   |   |   | {A -> A α . β, $; A -> A α . β, α}                                                      |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 13 | r3  |     | r3  |   |   |   |   |   | {A -> α β . , $; A -> α β . , α}                                                        |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    | 14 | r2  |     | r2  |   |   |   |   |   | {A -> A α β . , $; A -> A α β . , α}                                                    |
    +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
    
  4. Tokenize the input string.

    The input string is abababab.

    lexer = ptree.Lexer(config, symbol_pool=grammar.symbol_pool)
    tokens = lexer.tokenize('abababab')
    ptree.pprint(tokens)

    DFA

    +---+--------+-------+
    |   | SYMBOL | VALUE |
    +===+========+=======+
    | 1 | α      | a     |
    +---+--------+-------+
    | 2 | β      | b     |
    +---+--------+-------+
    | 3 | α      | a     |
    +---+--------+-------+
    | 4 | β      | b     |
    +---+--------+-------+
    | 5 | α      | a     |
    +---+--------+-------+
    | 6 | β      | b     |
    +---+--------+-------+
    | 7 | α      | a     |
    +---+--------+-------+
    | 8 | β      | b     |
    +---+--------+-------+
    
  5. Generate the parse tree.

    parse_tree = parser.parse(tokens)
    ptree.render(parse_tree, directory='out', name='parse-tree', output_format='svg')

    Parse Tree

About

A simple parse tree generator for any user-defined LR(1) programming language

Topics

Resources

License

Stars

Watchers

Forks

Languages