bfoz/grammar-python

Grammar

This is an attempt to make a cross-parser grammar specification for parsers written in Python. The result is a very opinionated module that is cranky about the sad state of grammar languages. Grammar doesn't have a grammar file format, or even care what grammar file you use, so long as it can be translated into Grammar objects. The Grammar object hierarchy attempts to support all of the features supported by all of the major parsers. If you create a grammar using features that aren't supported by your favorite parser, then it's up to that parser to complain about the grammar that you tried to feed it.

Grammar doesn't attempt to mitigate left-recursive alternations; a parser is expected to either handle the grammar properly, or to report an appropriate error for any grammar element that it can't handle.

Grammars are Modules

Grammars are Modules, and the grammar rules are symbols defined in the grammar module.

The grammar module must define a function named root that returns the grammar's root rule.

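from grammar import Alternation, Concatenation, Repetition

# Placeholder root rule so that the example is complete; any rule object works here
my_root_rule = Alternation('abc', 'def')

def root():
    """ Return the root rule of the grammar """
    return my_root_rule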
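
Because a grammar is an ordinary module, a parser only needs to load it and call root(). The following is a minimal sketch of that idea, not something Grammar itself provides: the helper name load_root_rule and the inline GRAMMAR_SOURCE are hypothetical, and a plain string stands in for the root rule so that the sketch runs without the grammar package installed.

```python
import types

# Hypothetical grammar source. A plain string stands in for the root rule so
# that this sketch does not depend on the grammar package.
GRAMMAR_SOURCE = '''
my_root_rule = "abc"

def root():
    """Return the root rule of the grammar."""
    return my_root_rule
'''


def load_root_rule(source, name="my_grammar"):
    """Execute grammar source as a module and return its root rule."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)  # the rules become symbols of the module
    root = getattr(module, "root", None)
    if not callable(root):
        raise AttributeError(f"grammar module {name!r} must define root()")
    return root()


root_rule = load_root_rule(GRAMMAR_SOURCE)
```

For a grammar stored in a file, importlib.import_module could take the place of the exec call; the check for a callable root stays the same.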

Rules

In Grammar, rules are composed of Alternations, Concatenations, Repetitions, Strings, and Python regular expression objects.

Strings match themselves, and regular expressions match whatever they match. Alternations match the longest of their alternatives, Concatenations match their elements in order, and Repetitions match some number of occurrences of another element.

choice = Alternation('abc', 'def', 'xyz')           # Matches one of "abc", or "def", or "xyz"
concatenated = Concatenation('abc', 'def', 'xyz')   # Matches "abcdefxyz"
Repetition.any(choice)                              # Matches any number of "abc", or "def", or "xyz"
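
Grammar only describes rules; matching is left to whichever parser consumes them. To make the semantics above concrete, here is a small sketch of a greedy matcher that does no backtracking. The three classes are local stand-ins that merely share names with Grammar's classes, and the match function is hypothetical; none of this is the module's actual implementation.

```python
import re


class Alternation:
    def __init__(self, *elements):
        self.elements = elements


class Concatenation:
    def __init__(self, *elements):
        self.elements = elements


class Repetition:
    def __init__(self, element, minimum=0, maximum=None):
        self.element = element
        self.minimum = minimum
        self.maximum = maximum  # None means unbounded


def match(rule, text, pos=0):
    """Return the end offset of the match of rule at pos, or None on failure."""
    if isinstance(rule, str):
        # Strings match themselves
        return pos + len(rule) if text.startswith(rule, pos) else None
    if isinstance(rule, re.Pattern):
        # Regular expressions match whatever they match
        found = rule.match(text, pos)
        return found.end() if found else None
    if isinstance(rule, Alternation):
        # Try every alternative and keep the longest match
        ends = [match(element, text, pos) for element in rule.elements]
        ends = [end for end in ends if end is not None]
        return max(ends) if ends else None
    if isinstance(rule, Concatenation):
        # Every element must match, in order
        for element in rule.elements:
            pos = match(element, text, pos)
            if pos is None:
                return None
        return pos
    if isinstance(rule, Repetition):
        count = 0
        while rule.maximum is None or count < rule.maximum:
            end = match(rule.element, text, pos)
            if end is None or end == pos:  # stop on failure or on an empty match
                break
            pos, count = end, count + 1
        return pos if count >= rule.minimum else None
    raise TypeError(f"unsupported rule type: {type(rule).__name__}")


choice = Alternation('abc', 'def', 'xyz')
end_of_choice = match(choice, 'def')              # one alternative
end_of_repeat = match(Repetition(choice), 'abcxyzq')  # two repetitions, then stops at "q"
```

A real parser would also build a parse tree and would need a strategy for left-recursive rules, which this sketch would recurse on forever.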

Lists

Many grammars, especially grammars for programming languages, make extensive use of lists of elements. For instance, the argument list of a function definition is a repetition of one element separated by another element. The resulting grammar can be messy, especially when the list elements are complex. To make life easier, Grammar includes a convenience function, List, for generating list-like grammar elements.

List('abc', separator=',')		# => Concatenation('abc', Concatenation(',', 'abc').any)
List('abc', 'def', separator=',')	# => Concatenation(items := Alternation('abc', 'def'), Concatenation(',', items).any)

# Separators are optional
List('abc')				# => Repetition.any('abc')
List('abc', 'def')			# => Alternation('abc', 'def').any
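
The expansions shown in the comments above can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the module's own List: the dataclasses are local stand-ins for Grammar's rule classes (they take a tuple of elements so that they can be compared for equality), and make_list is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Alternation:
    elements: Tuple[object, ...]


@dataclass(frozen=True)
class Concatenation:
    elements: Tuple[object, ...]


@dataclass(frozen=True)
class Repetition:
    element: object
    minimum: int = 0
    maximum: Optional[int] = None  # None means unbounded


def make_list(*items, separator=None):
    """Expand items, and an optional separator, into a list-like rule."""
    if not items:
        raise ValueError("a list needs at least one item")
    # Several items become an alternation; a single item is used as-is
    item = items[0] if len(items) == 1 else Alternation(items)
    if separator is None:
        return Repetition(item)
    # item (separator item)*
    tail = Repetition(Concatenation((separator, item)))
    return Concatenation((item, tail))


arguments = make_list('abc', 'def', separator=',')
```

Note that the separated form requires at least one item, while the form without a separator also accepts zero items, which mirrors the expansions listed above.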

Repetitions

Repetitions are used extensively in many grammars, and the Repetition.any(...) syntax shown above can be rather unwieldy, even when used sparingly. To make things easier, the rule objects created by both Alternation() and Concatenation() have convenience methods for creating repetitions of the underlying grammar elements:

choice = Alternation('abc', 'def', 'xyz')

choice.any		# Matches any number of "abc", or "def", or "xyz"
choice.at_least(2) 	# Matches 2, or more, of "abc", or "def", or "xyz"
choice.one_or_more 	# Equivalent to choice.at_least(1)
choice.optional 	# Matches one of, or none of, "abc", or "def", or "xyz"
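
One way such conveniences could be provided is a small mixin shared by the rule classes. The sketch below is an assumption about a possible design, not a description of how Grammar implements it: RepeatableMixin is a hypothetical name, and the dataclasses are local stand-ins for the real rule classes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class RepeatableMixin:
    """Adds repetition conveniences to any rule class (hypothetical design)."""

    @property
    def any(self):
        return Repetition(self, 0, None)       # zero or more

    def at_least(self, minimum):
        return Repetition(self, minimum, None)  # minimum or more

    @property
    def one_or_more(self):
        return self.at_least(1)

    @property
    def optional(self):
        return Repetition(self, 0, 1)           # zero or one


@dataclass(frozen=True)
class Repetition:
    element: object
    minimum: int = 0
    maximum: Optional[int] = None  # None means unbounded


@dataclass(frozen=True)
class Alternation(RepeatableMixin):
    elements: Tuple[object, ...]


choice = Alternation(('abc', 'def', 'xyz'))
two_or_more = choice.at_least(2)
```

Expressing every variant as a Repetition with a minimum and a maximum keeps the object hierarchy small, so a parser only has to handle one repetition type.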

About

Python grammar creation
