forked from neogeny/TatSu

竜 TatSu generates Python parsers from grammars in a variation of EBNF

barichd/TatSu

At least for the people who send me mail about a new language that they're designing, the general advice is: do it to learn about how to write a compiler. Don't have any expectations that anyone will use it, unless you hook up with some sort of organization in a position to push it hard. It's a lottery, and some can buy a lot of the tickets. There are plenty of beautiful languages (more beautiful than C) that didn't catch on. But someone does win the lottery, and doing a language at least teaches you something.

Dennis Ritchie (1941-2011) Creator of the C programming language and of Unix

TatSu

def WARNING():
    return 'v4.4.0 is the last version of TatSu supporting Python 2.7'

TatSu (the successor to Grako) is a tool that takes grammars in a variation of EBNF as input, and outputs memoizing (Packrat) PEG parsers in Python.
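As a rough illustration of what "memoizing (Packrat)" means, the sketch below caches the result of each (rule, position) pair so that backtracking over an ordered choice never re-parses the same span. It is not TatSu's implementation; `PackratSketch`, `number`, and `pair` are hypothetical names made up for this example.

```python
import re

NUMBER = re.compile(r'\d+')


class PackratSketch:
    """Toy parser state: input text plus a memo table keyed by (rule, position)."""

    def __init__(self, text):
        self.text = text
        self.memo = {}
        self.calls = 0  # how many times a rule body actually ran

    def apply(self, rule, pos):
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.calls += 1
            self.memo[key] = rule(self, pos)
        return self.memo[key]


def number(parser, pos):
    # number = /\d+/ ; returns (text, next_pos) or None on failure
    match = NUMBER.match(parser.text, pos)
    return (match.group(), match.end()) if match else None


def pair(parser, pos):
    # pair = number '+' number | number '-' number ; (ordered choice)
    for op in ('+', '-'):
        left = parser.apply(number, pos)  # the second alternative hits the memo
        if left and parser.text.startswith(op, left[1]):
            right = parser.apply(number, left[1] + len(op))
            if right:
                return [left[0], op, right[0]], right[1]
    return None


parser = PackratSketch('12-34')
result = parser.apply(pair, 0)
```

Here `number` is requested at position 0 by both alternatives of `pair`, but its body runs only once; the second request is answered from the memo table.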

TatSu can compile a grammar stored in a string into a tatsu.grammars.Grammar object that can be used to parse any given input, much like the re module does with regular expressions, or it can generate a Python module that implements the parser.

TatSu supports left-recursive rules in PEG grammars using the algorithm by Laurent and Mens. The generated AST has the expected left associativity.
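To make "left associativity" concrete, the sketch below builds by hand the two nested-list shapes that `1 - 2 - 3` could be grouped into and evaluates both. The lists are constructed manually rather than produced by TatSu, and `left_nested`, `right_nested`, and `evaluate` are hypothetical helpers for this illustration only.

```python
from functools import reduce


def left_nested(tokens):
    """Group [operand, op, operand, op, ...] to the left: ((a op b) op c)."""
    first, rest = tokens[0], tokens[1:]
    pairs = zip(rest[0::2], rest[1::2])  # (operator, operand) pairs
    return reduce(lambda acc, pair: [acc, pair[0], pair[1]], pairs, first)


def right_nested(tokens):
    """Group the same tokens to the right: (a op (b op c))."""
    if len(tokens) == 1:
        return tokens[0]
    return [tokens[0], tokens[1], right_nested(tokens[2:])]


def evaluate(node):
    """Evaluate a nested list that contains only '-' operations."""
    if isinstance(node, str):
        return int(node)
    left, _op, right = node
    return evaluate(left) - evaluate(right)


TOKENS = ['1', '-', '2', '-', '3']
left = left_nested(TOKENS)    # [['1', '-', '2'], '-', '3']
right = right_nested(TOKENS)  # ['1', '-', ['2', '-', '3']]
```

Only the left-nested shape gives the conventional arithmetic result: `evaluate(left)` is `-4`, whereas `evaluate(right)` is `2`.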

Installation

$ pip install TatSu

Using the Tool

TatSu can be used as a library, much like Python's re, by embedding grammars as strings and generating grammar models instead of generating Python code.

  • tatsu.compile(grammar, name=None, **kwargs)

    Compiles the grammar and returns a model that can then be used to parse input.

  • tatsu.parse(grammar, input, **kwargs)

    Compiles the grammar and parses the given input, producing an AST as the result. The result is equivalent to calling:

    model = tatsu.compile(grammar)
    ast = model.parse(input)
    

    Compiled grammars are cached for efficiency.

  • tatsu.to_python_sourcecode(grammar, name=None, filename=None, **kwargs)

    Compiles the grammar to the Python source code that implements the parser.
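The functions listed above might be combined as in the following minimal sketch, which compiles a grammar once and reuses the model for several inputs, and separately asks for the generated parser source. The tiny grammar, the helper names `parse_many` and `parser_source`, and the parser name `'Numbers'` are assumptions chosen for this example; TatSu must be installed for the functions to run.

```python
GRAMMAR = r'''
    start = number $ ;
    number = /\d+/ ;
'''


def parse_many(inputs):
    """Compile GRAMMAR once, then reuse the model for every input."""
    import tatsu  # third-party: pip install TatSu

    model = tatsu.compile(GRAMMAR)
    return [model.parse(text) for text in inputs]


def parser_source():
    """Return the generated parser module as a string of Python source."""
    import tatsu  # third-party: pip install TatSu

    return tatsu.to_python_sourcecode(GRAMMAR, name='Numbers')
```

Calling `parse_many(['7', '42'])` returns one AST per input, and the string returned by `parser_source()` can be written to a `.py` file when a standalone parser module is preferred over an in-memory model.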

This is an example of how to use 竜 TatSu as a library:

# Raw string, so the backslash in /\d+/ reaches TatSu unchanged.
GRAMMAR = r'''
    @@grammar::CALC


    start = expression $ ;


    expression
        =
        | expression '+' term
        | expression '-' term
        | term
        ;


    term
        =
        | term '*' factor
        | term '/' factor
        | factor
        ;


    factor
        =
        | '(' expression ')'
        | number
        ;


    number = /\d+/ ;
'''


if __name__ == '__main__':
    import pprint
    import json
    from tatsu import parse
    from tatsu.util import asjson

    ast = parse(GRAMMAR, '3 + 5 * ( 10 - 20 )')
    print('# PPRINT')
    pprint.pprint(ast, indent=2, width=20)
    print()

    print('# JSON')
    print(json.dumps(asjson(ast), indent=2))
    print()

TatSu will use the first rule defined in the grammar as the start rule.

This is the output:

# PPRINT
[ '3',
  '+',
  [ '5',
    '*',
    [ '10',
      '-',
      '20']]]

# JSON
[
  "3",
  "+",
  [
    "5",
    "*",
    [
      "10",
      "-",
      "20"
    ]
  ]
]
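The nested-list AST printed above can be turned into a number with a short recursive walk. The sketch below hard-codes that AST instead of calling TatSu, so it runs without the library; `evaluate` and `OPS` are hypothetical names, and the branch for parenthesis tokens is an assumption covering grammars or versions whose output keeps `'('` and `')'` in the tree.

```python
import operator

# Map each operator token in the grammar to the function that implements it.
OPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}


def evaluate(node):
    """Recursively evaluate a nested-list AST of the shape shown above."""
    if isinstance(node, str):
        return int(node)  # leaf: a number matched by /\d+/
    if len(node) == 3 and node[0] == '(' and node[2] == ')':
        # Assumption: parenthesis tokens may be kept in the AST.
        return evaluate(node[1])
    left, op, right = node
    return OPS[op](evaluate(left), evaluate(right))


AST = ['3', '+', ['5', '*', ['10', '-', '20']]]
value = evaluate(AST)
```

For this input `value` is `-47`, that is, 3 + 5 * (10 - 20). In a real project the same logic would more likely live in a semantic-actions class or a model walker, which the documentation describes.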

Documentation

For a detailed explanation of what 竜 TatSu is capable of, please see the documentation.

Questions?

Please use the [tatsu] tag on Stack Overflow for general Q&A, and limit GitHub issues to bugs, enhancement proposals, and feature requests.

Changes

See the CHANGELOG for details.

License

You may use 竜 TatSu under the terms of the BSD-style license described in the enclosed LICENSE.txt file. If your project requires different licensing, please email.
