This repository has been archived by the owner on Oct 28, 2023. It is now read-only.

HPG Grammar

Christophe VG edited this page Mar 6, 2017 · 2 revisions

The Human Parser Generator (HPG) accepts EBNF grammars, for which it can generate parsers.

HPG Grammar Design

HPG also supports most BNF syntactic sugar, but always requires a rule terminator, as EBNF does. Beyond that, it adds functionality that makes grammars easier to design:

Optional Sequence Separator

The , (comma) between the parts of a sequence is still supported, but no longer required. Omitting it greatly improves readability.
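As a small illustration, the two spellings below should be equivalent under this convention. The rule names assignment, identifier and value are made up for the example and are not part of HPG. The two forms are alternatives and are not meant to appear together in one grammar.

```ebnf
(* ISO EBNF style: parts of the sequence separated by commas *)
assignment = identifier , "=" , value ;

(* HPG style: the same sequence without separators *)
assignment = identifier "=" value ;
```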

Automatic Whitespace Consumption

All whitespace is consumed automatically, so it never has to appear explicitly in rule definitions. Again, this improves the readability of the grammar.
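A sketch of what this means in practice, using a hypothetical rule called greeting. The listed inputs are what the rule is expected to accept if whitespace is consumed as described. They are not taken from HPG's test suite.

```ebnf
(* no whitespace rule appears anywhere in the definition *)
greeting = "hello" "world" ;

(* inputs this rule is expected to accept: *)
(*   hello world       *)
(*   hello      world  *)
```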

Use of the EBNF Extensions Mechanism for Extractors

Extractors are added through the EBNF extension mechanism, ? ... ?. An extractor consumes input that matches a (regular expression) pattern.
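For example, the sketch below defines a hypothetical number rule with an extractor. It follows the same shape as the identifier rule in HPG's own grammar: the regular expression sits between slashes inside ? ... ? and is wrapped in a group, as HPG's own rules do.

```ebnf
(* hypothetical rule: consume one or more digits *)
number = ? /([0-9]+)/ ? ;
```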

Named Terminal Expressions

Identifiers, strings, and extractors can be given an alternate name using the <name> @ prefix. See upcoming information on the generator's conventions and implicit generation rules.
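A minimal sketch of the prefix applied to each of the three kinds of terminal expression. The rule name pair and the names key, operator and digits are hypothetical, and identifier is assumed to be a rule defined elsewhere in the same grammar.

```ebnf
(* name an identifier, a string and an extractor *)
pair = key      @ identifier
       operator @ "="
       digits   @ ? /([0-9]+)/ ?
     ;
```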

No Support for Spaces in Rule Names

Spaces in rule names (the left-hand side of rules) are not allowed. For now, use e.g. dashes instead. The generator turns these into Pascal-cased names.
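As an illustration, a hypothetical rule with a dashed name is shown below. The generated name in the comment is what Pascal-casing would be expected to produce and has not been checked against the generator's output.

```ebnf
(* "postal code" is not allowed as a rule name; use a dash instead *)
postal-code = ? /([0-9]+)/ ? ;

(* expected Pascal-cased name: PostalCode *)
```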

Alternative Syntax Support

A few alternative syntax options are available. They were added because doing so was easy and because they make it easier to import existing grammars.

To bring BNF closer, it is possible to use ::= instead of = and to surround rule names with angle brackets < ... >.

Because rules may span multiple lines, the rule terminator ; cannot be dropped, but . is accepted as an alternative terminator.
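The sketch below spells one hypothetical rule both ways. The names greeting and person are invented for the example. Both forms follow the rule definition in HPG's grammar, which accepts the optional < and >, either ::= or =, and either ; or . as terminator.

```ebnf
(* EBNF style *)
greeting = "hello" person ;

(* BNF-flavoured spelling of the same rule: angle brackets, ::= and . *)
<greeting> ::= "hello" <person> .
```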

HPG's Grammar

The grammar for the Human Parser Generator EBNF-like notation looks like this:

(* Human Parser Generator grammar *)

grammar                     = { rule } ;

rule                        = [ _ @ "<" ] identifier [ _ @ ">" ]
                              ( _ @ "::=" | _ @ "=" )
                              expression
                              ( _ @ ";" | _ @ "." )
                            ;

expression                  = alternatives-expression
                            | non-alternatives-expression
                            ;

alternatives-expression     = non-alternatives-expression "|" expression ;

non-alternatives-expression = sequential-expression
                            | atomic-expression
                            ; 

sequential-expression       = atomic-expression [ _ @ "," ] non-alternatives-expression ;

atomic-expression           = nested-expression
                            | terminal-expression
                            ;

nested-expression           = optional-expression
                            | repetition-expression
                            | group-expression
                            ;

optional-expression         = "[" expression "]" ;
repetition-expression       = "{" expression "}" ;
group-expression            = "(" expression ")" ;

terminal-expression         = identifier-expression
                            | string-expression
                            | extractor-expression
                            ;

identifier-expression       = [ name ] [ _ @ "<" ] identifier [ _ @ ">" ] ;
string-expression           = [ name ] string ;
extractor-expression        = [ name ] "?" "/" pattern "/" "?" ;

name                        = identifier "@" ;

identifier                  = ? /([A-Za-z_][A-Za-z0-9-_]*)/ ? ;
string                      = ? /"([^"]*)"|^'([^']*)'/ ? ;
pattern                     = ? /(.*?)(?<keep>/\s*\?)/ ? ;

_                           = ? /\(\*.*?\*\)/ ? ;

Check the repository's version of hpg.bnf for the latest version.
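To show the notation in use, here is a small example grammar for a list of key = value settings. It is a sketch only: every rule name and given name in it is hypothetical, and it has not been run through the generator. Each construct it uses (repetition, named terminals, alternatives, extractors) appears in the grammar above.

```ebnf
(* hypothetical example grammar: a list of key = value settings *)

settings   = { setting } ;

setting    = key @ identifier "=" value ";" ;

value      = text @ string
           | number @ ? /([0-9]+)/ ?
           ;

identifier = ? /([A-Za-z_][A-Za-z0-9-_]*)/ ? ;
string     = ? /"([^"]*)"/ ? ;
```

Given automatic whitespace consumption, an input such as `name = "hpg" ; retries = 3 ;` is the kind of text this grammar is meant to describe.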