ANTLR4 Camp: getting in touch with Lexer, Parser, Visitor and Listener
In this example you'll learn how to use a lexer and a parser to build an expression grammar that evaluates input such as the following:
a = 5
b = 6
c = 7
((a+b)/3-c)
- Grammar
- Lexer
- Parser
- Listener
- Visitor
- Error Strategy
A grammar is a set of rules that describes a language. It follows the syntax of that language and is limited by it as well.
A language is a set of valid sentences. A sentence is made up of phrases. A phrase is made up of subphrases and vocabulary symbols.
Lexing is the process of grouping characters into words or symbols (tokens). The lexer tokenizes the input.
To define tokens:
INT: [0-9]+;
NEWLINE: '\r'? '\n';
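To make the two token rules above concrete, here is a minimal hand-written sketch of what tokenizing means. It does not use ANTLR: the class `TokenSketch` and the method `tokenize` are hypothetical names chosen for illustration, and the regular expression only mirrors `INT` and `NEWLINE`. A generated ANTLR lexer works differently internally, so treat this as an analogy rather than as the generated code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; ANTLR generates its own lexer class.
class TokenSketch {
    // The named groups mirror the lexer rules INT and NEWLINE.
    private static final Pattern TOKENS =
            Pattern.compile("(?<INT>[0-9]+)|(?<NEWLINE>\\r?\\n)");

    // Groups the characters of the input into a flat list of token descriptions.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKENS.matcher(input);
        int position = 0;
        while (position < input.length()) {
            // Only accept a match that starts exactly at the current position.
            matcher.region(position, input.length());
            if (!matcher.lookingAt()) {
                throw new IllegalArgumentException(
                        "No token rule matches at index " + position);
            }
            if (matcher.group("INT") != null) {
                tokens.add("INT:" + matcher.group("INT"));
            } else {
                tokens.add("NEWLINE");
            }
            position = matcher.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [INT:42, NEWLINE, INT:7, NEWLINE]
        System.out.println(tokenize("42\n7\r\n"));
    }
}
```

Any character that neither rule covers (a letter, for example) raises an exception in this sketch, which is roughly the situation in which a real lexer reports a token recognition error.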
The lexer can be placed in its own `Lexer.g4` file. It is declared with `lexer grammar name`.
Parsing is the process of building up the data structure (the parse tree) of a sentence. The parser recognizes the language.
stat:
expr NEWLINE # printExpr
| ID EQUAL expr NEWLINE # assign
| NEWLINE # blank;
The parser can be placed in its own `parser.g4` file. It is declared with `parser grammar name`. Don't forget to reference the corresponding lexer file.
options {
tokenVocab = ExpressionLexer;
language = Java;
}
The generated listener class provides enter and exit methods for each rule. During the walk, `ParseTreeWalker` calls the enter method when it reaches a rule node and the exit method after it has walked all of that node's children.
public class ExpressionParserListener extends ExpressionParserBaseListener
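The order of enter and exit calls can be illustrated with a small stand-alone sketch. It assumes nothing from the ANTLR runtime: `Node`, `Listener`, `walk` and `trace` are hypothetical stand-ins for the parse tree, the generated listener and `ParseTreeWalker`, written only to show the call order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical analogue of a listener walk; not the ANTLR API.
class WalkSketch {
    // Stand-in for a parse tree node that remembers its rule name.
    static final class Node {
        final String rule;
        final List<Node> children = new ArrayList<>();

        Node(String rule, Node... kids) {
            this.rule = rule;
            this.children.addAll(List.of(kids));
        }
    }

    // Stand-in for the generated listener interface.
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    // The walker decides when enter and exit are called, not the listener.
    static void walk(Listener listener, Node node) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(listener, child);
        }
        listener.exit(node);
    }

    // Records every callback so the order can be inspected.
    static List<String> trace(Node root) {
        List<String> events = new ArrayList<>();
        walk(new Listener() {
            @Override
            public void enter(Node node) { events.add("enter " + node.rule); }

            @Override
            public void exit(Node node) { events.add("exit " + node.rule); }
        }, root);
        return events;
    }

    public static void main(String[] args) {
        // prints [enter stat, enter expr, exit expr, exit stat]
        System.out.println(trace(new Node("stat", new Node("expr"))));
    }
}
```

The listener itself never decides which node comes next. This is the main difference from a visitor.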
The generated Visitor class provides a controlled walk through our tree. It explicitly calls methods to visit children.
public class ExpressionParserVisitor extends ExpressionParserBaseVisitor<Integer>
Implementing visit functions:
@Override
public Integer visitPrintExpr(ExpressionParser.PrintExprContext ctx) {
Integer value = visit(ctx.expr());
log.info("Value:" + value);
return 0;
}
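To show what "explicitly calls methods to visit children" means, the following stand-alone sketch evaluates the sample input `((a+b)/3-c)` with a hand-written visitor. It is not the generated ANTLR code: the types `Expr`, `Num`, `Var`, `Bin` and `Evaluator` are hypothetical, and the tree is built by hand instead of being produced by the parser. It assumes integer arithmetic, in line with the `Integer` type parameter used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical analogue of a visitor-based evaluator; not the ANTLR API.
class VisitorSketch {
    interface Expr { <T> T accept(Visitor<T> visitor); }

    interface Visitor<T> {
        T visitNum(Num num);
        T visitVar(Var var);
        T visitBin(Bin bin);
    }

    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
        public <T> T accept(Visitor<T> visitor) { return visitor.visitNum(this); }
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        public <T> T accept(Visitor<T> visitor) { return visitor.visitVar(this); }
    }

    static final class Bin implements Expr {
        final char op;
        final Expr left;
        final Expr right;
        Bin(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> visitor) { return visitor.visitBin(this); }
    }

    static final class Evaluator implements Visitor<Integer> {
        private final Map<String, Integer> memory;
        Evaluator(Map<String, Integer> memory) { this.memory = memory; }

        public Integer visitNum(Num num) { return num.value; }

        public Integer visitVar(Var var) {
            Integer value = memory.get(var.name);
            if (value == null) {
                throw new IllegalStateException("Undefined variable: " + var.name);
            }
            return value;
        }

        public Integer visitBin(Bin bin) {
            // The visitor decides when, and whether, each child is visited.
            int left = bin.left.accept(this);
            int right = bin.right.accept(this);
            switch (bin.op) {
                case '+': return left + right;
                case '-': return left - right;
                case '*': return left * right;
                case '/': return left / right; // integer division
                default: throw new IllegalArgumentException("Unknown operator: " + bin.op);
            }
        }
    }

    // Evaluates ((a+b)/3-c) with a = 5, b = 6, c = 7.
    static int evaluateExample() {
        Map<String, Integer> memory = new HashMap<>();
        memory.put("a", 5);
        memory.put("b", 6);
        memory.put("c", 7);
        Expr sum = new Bin('+', new Var("a"), new Var("b"));
        Expr tree = new Bin('-', new Bin('/', sum, new Num(3)), new Var("c"));
        return tree.accept(new Evaluator(memory));
    }

    public static void main(String[] args) {
        System.out.println(evaluateExample()); // prints -4
    }
}
```

With integer division, `(5+6)/3` is `3`, and `3-7` gives `-4`.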
Remember to enable visitor generation in the configuration of the ANTLR4 Maven plugin:
<configuration>
<sourceDirectory>${basedir}/src/main/antlr4</sourceDirectory>
<visitor>true</visitor>
</configuration>
To define our own error strategy, we can extend `BaseErrorListener`. In general, only four types of exceptions are thrown during lexing and parsing: `LexerNoViableAltException`, `NoViableAltException`, `InputMismatchException` and `FailedPredicateException`.
public class ExpressionErrorStrategy extends BaseErrorListener
To set a custom error message, we can override the `syntaxError` function and handle each error message individually.
@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object o, int i, int i1, String s, RecognitionException e) {
// o: offending symbol, i: line, i1: char position in line, s: default message
super.syntaxError(recognizer, o, i, i1, "Message", e);
}
Be aware that the `RecognitionException` can be null!
To retrieve the offending and expected tokens of an exception, we can use the `Parser` or the `CommonToken`, see the example below:
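// In syntaxError: both casts are only valid when the error was reported by the parser.
handleInputMismatchException((CommonToken) o, (Parser) recognizer);
// Inside handleInputMismatchException, with the arguments named offendingSymbol and parser:
final String offendingToken = offendingSymbol.getText();
final String expectedToken = parser.getExpectedTokens().toString(parser.getVocabulary());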