Skip to content

An implementation of LALR(1) parser generator in Java with synthesized attributes and parsing automaton visualization

Notifications You must be signed in to change notification settings

UnrealEugene/lalr-parser-generator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LARL(1) Parser Generator

Java implementation of LALR(1) parser generator with supported synthesized attributes written as a home assignment for parsing course in ITMO University.

Features

  • Generate parsers for LALR(1) grammars
  • Parsing automaton visualization
  • Regular expressions in tokens
  • Synthesized attributes
  • TODO: Ambiguous grammar check

Examples

Consider a simple grammar for balanced bracket sequences:

S0 -> S;

S  -> '(' S ')' S;
S  -> ε;

The following parsing automaton can be obtained from this grammar:

Grammar parsing automaton

The following AST can be built from the input (())():

Abstract syntax tree

An example of the more complex grammar for arithmetic expressions with usage of synthesized attributes to calculate expression value:

E0 [long val] -> E {val = $0.val;};

E [long val]  -> E '+' T {val = $0.val + $2.val;} | E '-' T {val = $0.val - $2.val;} | T {val = $0.val;};
T [long val]  -> T '*' F {val = $0.val * $2.val;} | T '/' F {val = $0.val / $2.val;} | F {val = $0.val;};
F [long val]  -> P '**' F {val = (long) Math.pow($0.val, $2.val);} | P {val = $0.val;};
P [long val]  -> r'[+-]?([1-9][0-9]*|0)' {val = Long.parseLong($0);} | '(' E ')' {val = $1.val;};

@skip r'[ \n\r\t]+';

Usage

To generate LALR(1) parser based on a context free grammar first you need to specify grammar productions in form described in grammar for grammars. After that the GenerateParser class needs to be called with path to grammar as its first argument and, possibly, more optional arguments:

-o <path>: output path for generated sources (required)
-i <path>: output path for parsing automaton images (by default images are not generated)
-n <name>: grammar name used as generated class prefix (default: grammar file name)
-p <name>: generated classes package (default: empty)

After a successful execution three Java classes will be generated in specified output path: Lexer, Parser and Token.

Note: If you want to generate parsing automaton images, you'll need to install GraphViz and then provide a path to GraphViz executable in config.properties file.

About

An implementation of LALR(1) parser generator in Java with synthesized attributes and parsing automaton visualization

Topics

Resources

Stars

Watchers

Forks