Grammar and Parser Debugging

Mathias edited this page Jan 26, 2011 · 6 revisions

There are a number of options for debugging a parboiled parser under development:

  1. The TracingParseRunner
  2. The Parse Tree
  3. The internal debugger of your IDE

The TracingParseRunner

A good first step for finding out “what’s happening” is to use the TracingParseRunner (or its Scala version) and look at its tracing log. You can limit the tracing to the relevant part of the grammar and/or input text by specifying a filter predicate, preferably built from the predicate primitives predefined in the Filters class (Java) or from the Lines, Rules, Matched and/or Mismatched objects (Scala).

Java Example:

import static org.parboiled.common.Predicates.*;
import static org.parboiled.support.Filters.*;
...
CalculatorParser1 parser = Parboiled.createParser(CalculatorParser1.class);
TracingParseRunner<Integer> runner = new TracingParseRunner<Integer>(parser.InputLine(),
    and(rules(parser.Number(), parser.Parens()), not(rulesBelow(parser.Digits())))
);
ParsingResult<Integer> result = runner.run("2*(4+5");
System.out.println(runner.getLog());

Scala Example:

...
val runner = TracingParseRunner(parser.InputLine).filter(
  Lines(10 until 20) && Rules.below(parser.Factor) && !Rules.below(parser.Digits)
)
...

The resulting trace log would look something like this (for the Java example above):

InputLine/Expression/Term/Factor/Number/Digits, matched, cursor at 1:2 after "2"
..(4)../Number/Number_Action1, matched, cursor at 1:2 after "2"
..(4)../Number, matched, cursor at 1:2 after "2"
..(2)../Term/ZeroOrMore/FirstOf/Sequence/Factor/Number/Digits, failed, cursor at 1:3 after "2*"
..(7)../Number, failed, cursor at 1:3 after "2*"
..(6)../Factor/Parens/'(', matched, cursor at 1:4 after "2*("
..(7)../Parens/Expression/Term/Factor/Number/Digits, matched, cursor at 1:5 after "2*(4"
..(11)../Number/Number_Action1, matched, cursor at 1:5 after "2*(4"
..(11)../Number, matched, cursor at 1:5 after "2*(4"
...
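
To make the filter from the Java example more concrete, here is a minimal plain-Java sketch of how such predicates compose. It does not use parboiled: the class FilterSketch and its methods are hypothetical stand-ins, and rule paths are modelled as lists of rule names. It assumes that rules selects a rule together with its sub rules and that rulesBelow selects only the sub rules, which is consistent with the log above, where Digits entries appear but nothing beneath Digits does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of rule-path filtering; none of these names are parboiled API.
final class FilterSketch {

    // True if any of the named rules appears anywhere on the path
    // (assumption: a rule and all of its sub rules are selected).
    static Predicate<List<String>> rules(String... names) {
        return path -> Arrays.stream(names).anyMatch(path::contains);
    }

    // True only for strict sub rules: a named rule must be an ancestor
    // of the last path element.
    static Predicate<List<String>> rulesBelow(String... names) {
        return path -> {
            if (path.isEmpty()) return false;
            List<String> ancestors = path.subList(0, path.size() - 1);
            return Arrays.stream(names).anyMatch(ancestors::contains);
        };
    }

    // Splits "InputLine/Expression/Term" into rule names, outermost first.
    static List<String> path(String slashSeparated) {
        return Arrays.asList(slashSeparated.split("/"));
    }

    // Same shape as and(rules(Number, Parens), not(rulesBelow(Digits))).
    static Predicate<List<String>> tracingFilter() {
        return rules("Number", "Parens").and(rulesBelow("Digits").negate());
    }

    public static void main(String[] args) {
        Predicate<List<String>> filter = tracingFilter();
        for (String p : new String[] {
                "InputLine/Expression/Term",
                "InputLine/Expression/Term/Factor/Number/Digits",
                "InputLine/Expression/Term/Factor/Number/Digits/OneOrMore/Digit"}) {
            System.out.println(p + " -> " + (filter.test(path(p)) ? "traced" : "filtered out"));
        }
    }
}
```

Only the second path passes the combined filter: the first is not inside Number or Parens, and the third lies below Digits.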

The Parse Tree

If your parser is actually matching something but you don’t know why it matches one way rather than the way you expected, you should probably switch on parse tree building (see The Parse Tree) and print the parse tree with printNodeTree. If the tree is too verbose to be easily readable, you can switch off uninteresting branches with the @SuppressNode and/or @SuppressSubnodes annotations (or the respective RuleOption case object in your Scala parser).
Additionally or alternatively, you can use the filtered printNodeTree overload to print only the node levels selected by a custom filter predicate.
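
The idea behind filtered tree printing can be illustrated without parboiled. The following is only a sketch under stated assumptions: Node, print and sampleTree are hypothetical names, not the parboiled parse tree API, and the way filtered-out levels are handled here (the node is skipped, its children are still printed) is one possible behaviour, not necessarily what printNodeTree does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy tree printer with a node filter; a stand-in, not parboiled's printNodeTree.
final class TreePrintSketch {

    // Minimal hypothetical tree node: a label plus child nodes.
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    // Renders the tree, printing only the nodes accepted by nodeFilter.
    static String print(Node root, Predicate<Node> nodeFilter) {
        StringBuilder out = new StringBuilder();
        append(root, nodeFilter, 0, out);
        return out.toString();
    }

    private static void append(Node node, Predicate<Node> nodeFilter, int depth, StringBuilder out) {
        int childDepth = depth;
        if (nodeFilter.test(node)) {
            for (int i = 0; i < depth; i++) out.append("  ");
            out.append(node.label).append('\n');
            childDepth = depth + 1; // indent children only below printed nodes
        }
        for (Node child : node.children) append(child, nodeFilter, childDepth, out);
    }

    static Node sampleTree() {
        return new Node("Expression",
                new Node("Term",
                        new Node("Factor",
                                new Node("Number", new Node("Digits")))));
    }

    public static void main(String[] args) {
        // Hide the Term and Factor levels, keep everything else.
        Predicate<Node> hideWrappers = n -> !n.label.equals("Term") && !n.label.equals("Factor");
        System.out.print(print(sampleTree(), hideWrappers));
    }
}
```

With this filter the output contains only the Expression, Number and Digits levels, which is the kind of reduction a node filter is meant to achieve on a verbose tree.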

IDE Debugging

Since parboiled relies on internal DSLs for rule specification, you can also rely directly on the debugging capabilities of your Java IDE.
You can trace the parsing process by inserting breakpoints at the right points in your grammar and following the matching process manually. However, since parser rule methods often consist of only one large return statement, simply setting a breakpoint at this statement is usually not very helpful. Also, your rule method is executed during rule construction and not during rule execution, so even if you could somehow set a breakpoint “somewhere inside” the rule expression, it would not be hit during the running phase of your parser.
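
The difference between rule construction and rule execution can be sketched in plain Java. This is a toy model and not how parboiled is implemented: the Rule interface and the digits method below are hypothetical stand-ins. The body of the "rule method" runs exactly once, when the rule object is built, while the object it returns runs once per parse.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of construction time versus run time; not parboiled code.
final class ConstructionVsExecution {

    // Records the order in which things happen.
    static final List<String> log = new ArrayList<>();

    // Hypothetical stand-in for a rule: something that is run against input later.
    interface Rule {
        boolean match(String input);
    }

    // Plays the role of a rule method. A breakpoint in this body is hit
    // during construction only, never while input is being matched.
    static Rule digits() {
        log.add("construct");
        return input -> {
            log.add("run:" + input); // this part executes at parse time
            return !input.isEmpty() && input.chars().allMatch(Character::isDigit);
        };
    }

    public static void main(String[] args) {
        Rule rule = digits();   // rule method body executes here, once
        rule.match("42");       // only the returned object runs from now on
        rule.match("4x");
        System.out.println(log);
    }
}
```

The log shows a single "construct" entry followed by one "run" entry per input, which is why a breakpoint in the rule method itself says nothing about what happens during matching.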

Additionally, in many cases parboiled for Java has to rewrite your rule methods in order to inject the proper code for action expression wrapping and so on, which means that your “original” rule method is actually never run and any breakpoint set in it is never hit. The easiest way to work around this problem is to call another simple (action) method and set a breakpoint there.

Consider this example:

Rule SomeRule() {
    Var<Integer> i = new Var<Integer>();
    return Sequence(
        OneOrMore(Digit()),
        i.set(Integer.parseInt(match())),
        SomeOtherRule(i)
    );
}

This method contains an action expression, which means that parboiled has to rewrite the method during parser extension, so any breakpoints in it will not be hit. To still stop program execution when the method is run (i.e. during rule construction), create a debugging method and set a breakpoint there:

Rule SomeRule() {
    Var<Integer> i = new Var<Integer>();
    debug();
    return Sequence(
        OneOrMore(Digit()),
        i.set(Integer.parseInt(match())),
        SomeOtherRule(i)
    );
}

void debug() {
    System.out.println("BREAK"); // set breakpoint here
}

If you want to trace rule execution at parser runtime rather than during rule construction, create debugging actions:

Rule SomeRule() {
    Var<Integer> i = new Var<Integer>();
    return Sequence(
        OneOrMore(Digit()),
        debug(match()),
        i.set(Integer.parseInt(match())),
        debug(i),
        SomeOtherRule(i)
    );
}

boolean debug(String s) {
    System.out.println(s); // set breakpoint here if required
    return true;
}

boolean debug(Var<Integer> i) {
    System.out.println(i.get()); // set breakpoint here if required
    return true;
}

You can pass the values that you would like to inspect directly to your debugging method.
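
The debugging actions above return true for a reason: an action expression that evaluates to false makes the enclosing rule fail, so a debug action must never report failure. The following plain-Java sketch models this with a toy sequence. The names sequence, debug and brokenDebug are hypothetical and the code does not use parboiled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Toy model of a sequence with boolean actions; not parboiled code.
final class DebugActionSketch {

    // Values the debug actions have been called with.
    static final List<String> seen = new ArrayList<>();

    // Runs the steps in order and stops at the first one that returns false.
    static boolean sequence(BooleanSupplier... steps) {
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) return false;
        }
        return true;
    }

    // Debugging action: records the value and returns true so matching continues.
    static boolean debug(Object value) {
        seen.add(String.valueOf(value)); // set breakpoint here if required
        return true;
    }

    // A careless variant that returns false and therefore aborts the sequence.
    static boolean brokenDebug(Object value) {
        seen.add(String.valueOf(value));
        return false;
    }

    public static void main(String[] args) {
        boolean ok = sequence(() -> debug("12"), () -> debug("after"));
        boolean broken = sequence(() -> brokenDebug("12"), () -> debug("never reached"));
        System.out.println(ok + " " + broken + " " + seen);
    }
}
```

In the second call the step after brokenDebug never runs, which is how a debugging action that forgets to return true would silently change the behaviour of the grammar it is meant to observe.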

One simple trick to ease parser debugging during grammar development is to keep a small debug method in the parser with a permanently set breakpoint in it. This way you can drop an action call to this method anywhere in your grammar, and the next debugging run will stop at the right point during parser runtime. If you make your debug method take a Context parameter and call it with debug(getContext()), you have immediate access to the current parsing context right from the breakpoint:

boolean debug(Context context) {
    return true; // set permanent breakpoint here
}