MiniParse How To
- Import the miniparse module.
- Construct a
miniparse.MiniParse
object. The constructor accepts one or more "start" symbols. - Use the
.left(...)
,.right(...)
, and.nonassoc(...)
methods to assign precedence levels (see note, below) and associativity as needed to terminal symbols. - Declare production rules using the
.rule(...)
and.renaming(...)
methods. (See "On Rules" below.) - Call
.parse(...)
on a stream of tokens, which should be pairs consisting of(token_type, attribute)
.
No special support exists for location tracking, but you can certainly include that kind of information in the attribute cell from the scanner.
Each production rule is associated with a combining function of some form. By default, it will take one parameter for each element of the right-hand side.
However, you can limit attention to a specific selection of right-hand symbols by prepending .
to the relevant symbols, .like .this
.
That may enable you to simplify your coding significantly.
There is a complete worked example at example\JSON.py
. The first majority of that file is a scanner
definition, but the parser definition is well-marked at the bottom.
The file at tests/test_miniparse
defines a four-function calculator using the operator precedence features and Python's operator module as rule actions.
The miniscan
module defines a grammar for regular expressions. It's in the middle; search for the identifier rex
.
- Precedence declarations are highest-first, which makes more sense to me than the YACC/BISON style of lowest-first. This could be made configurable if there is demand.
- It is possible to declare the precedence of a rule independently, but by default it is that of its first nonterminal for which precedence has been declared.
- If a shift-reduce conflict occurs on a symbol with a precedence declaration, it will be resolved accordingly. Otherwise, the default behavior is to shift and emit a warning.
- There is no "expect" directive: The "dangling
else
problem" may be solved by declaringelse
to be right-associative.
I might like to add the canonical error-recovery algorithm. On the other hand, many modern languages stop parsing at the first error: This makes sense because often the "recovery" leaves too much to be desired. At any event, semantic analysis is shot if the syntax doesn't even parse.