A PEG parser for C#, using incremental code generators
- Support maintainable grammars as much as possible: grammar maintainers should never have to change their grammar to work around parser generator limitations.
- Work exclusively on `ReadOnlyMemory<char>` instead of strings and substrings.
- Implementation as a code generator: consumers define a `.peg` file and we generate the parser code on build with full cross-platform support.
- Be reasonably efficient, both in terms of memory usage and execution speed.
- UTF-8 support, if reasonably possible.
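To illustrate the `ReadOnlyMemory<char>` goal, here is a minimal sketch of how a generated parser might match a literal without allocating a substring. `SliceMatcher` and `MatchLiteral` are hypothetical names chosen for this example, not part of the generated API.

```csharp
using System;

public static class SliceMatcher
{
    // Hypothetical helper: tries to match `literal` at `position` in `input`.
    // On success it returns a slice of the original buffer (a view, not a copy);
    // on failure it returns null.
    public static ReadOnlyMemory<char>? MatchLiteral(
        ReadOnlyMemory<char> input, int position, string literal)
    {
        // Reject positions where the literal cannot fit.
        if (position < 0 || position > input.Length - literal.Length)
            return null;

        // Compare characters in place through spans; no string is created here.
        ReadOnlySpan<char> window = input.Span.Slice(position, literal.Length);
        return window.SequenceEqual(literal.AsSpan())
            ? input.Slice(position, literal.Length)
            : (ReadOnlyMemory<char>?)null;
    }
}
```

A caller would pass `"let x = 1".AsMemory()` as the input. The returned slice shares the input's backing storage, so a string is only materialized if the consumer explicitly calls `ToString()` on it.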
TatSu has great docs. Note that TatSu specifically disallows semantic actions, which in turn prevents code-based matching.
Automatic handling of left-recursive grammars requires memoization. Most PEG generators these days enable memoization by default, but allow turning it off on a per-rule basis.
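The link between memoization and left recursion can be sketched with the "seed growing" idea from Warth et al. The example below is a simplified illustration for a single directly left-recursive rule, `expr <- expr '-' digit / digit`. It is not the full algorithm from the paper (which also handles indirect left recursion), and `LeftRecursiveParser` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed class LeftRecursiveParser
{
    private readonly ReadOnlyMemory<char> _input;

    // Memo table for `expr`: start position -> (success, end position, value).
    private readonly Dictionary<int, (bool Ok, int End, int Value)> _memo = new();

    public LeftRecursiveParser(ReadOnlyMemory<char> input) => _input = input;

    public (bool Ok, int End, int Value) Expr(int pos)
    {
        // A memo hit also covers the in-progress seed while it is being grown.
        if (_memo.TryGetValue(pos, out var cached))
            return cached;

        // Plant a failing seed so the left-recursive call terminates immediately.
        var best = (Ok: false, End: pos, Value: 0);
        _memo[pos] = best;

        while (true)
        {
            var attempt = ExprBody(pos);
            // Stop when the body fails or consumes no more than the current seed.
            if (!attempt.Ok || (best.Ok && attempt.End <= best.End))
                break;
            best = attempt;
            _memo[pos] = best; // grow the seed, then re-evaluate the body
        }
        return best;
    }

    private (bool Ok, int End, int Value) ExprBody(int pos)
    {
        // Alternative 1: expr '-' digit (the recursive call reads the seed).
        var left = Expr(pos);
        if (left.Ok && Peek(left.End) == '-' && TryDigit(left.End + 1, out int rhs))
            return (true, left.End + 2, left.Value - rhs);

        // Alternative 2: digit
        if (TryDigit(pos, out int d))
            return (true, pos + 1, d);

        return (false, pos, 0);
    }

    private char Peek(int pos) => pos < _input.Length ? _input.Span[pos] : '\0';

    private bool TryDigit(int pos, out int value)
    {
        char c = Peek(pos);
        value = c - '0';
        return c >= '0' && c <= '9';
    }
}
```

Without the memo entry, the first alternative would call `Expr` at the same position forever. With it, each pass through the loop extends the previous result by one `'-' digit` step, so `"9-3-2"` parses left-associatively as `(9-3)-2`. This is why a rule that takes part in left recursion cannot simply have memoization switched off.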
Packrat Parsers Can Support Left Recursion (Warth, et al)
Packrat Parsers Can Handle Practical Grammars in Mostly Constant Space (Mizushima, et al)
Pegen: Validate implementation of cut operator
Practical Packrat Parsing (Grimm)
Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking (Ford)
Packrat Parsing: Simple, Powerful, Lazy, Linear Time (Ford)
Parsing Expression Grammars: A Recognition-Based Syntactic Foundation (Ford)
Pika parsing: reformulating packrat parsing as a dynamic programming algorithm solves the left recursion and error recovery problems (Hutchison) - an interesting approach with good error recovery, but it doesn't seem easily extensible via state to cover some context-sensitive grammars, the way PEG is.