Parser Action Expressions

Mathias edited this page Jan 26, 2011 · 4 revisions

parboiled for Java parsers can include parser actions anywhere in the rule definitions. These parser actions execute arbitrary custom logic at the respective points during the parsing process and can influence the matching process through their boolean result value.
They can be divided into the following four categories.

“Regular” objects implementing the Action interface

If you have large blocks of action code that you would like to encapsulate in their own classes you can do so and have your action classes implement the Action interface. You can then use instances of your action classes directly in your rule definitions, similar to this example:

class MyParser extends BaseParser<Object> {
    Action myAction = new MyActionClass();

    Rule MyRule() {
        return Sequence(
            ...,
            myAction,
            ...
        );
    }
}

If you follow this approach, your action classes can also implement the SkippableAction interface to tell the parsing engine whether the action should be skipped when it is run inside a syntactic predicate.
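As a rough illustration of what such a stand-alone action class might look like, here is a minimal sketch. To keep it compilable without the parboiled jar, the nested Context and Action interfaces are simplified stand-ins for parboiled's own types, not the real ones. The maxLength parameter and the length check are hypothetical and only serve as an example of logic you might want to encapsulate.

```java
public class StandaloneActionSketch {

    // Simplified stand-ins (assumptions) for parboiled's Context and Action types,
    // reduced to the single method each one needs for this sketch.
    public interface Context {
        String getMatch(); // text matched by the preceding rule
    }

    public interface Action {
        boolean run(Context context);
    }

    // A stand-alone action class: the logic and its configuration live here,
    // outside of the rule definition. The length limit is a hypothetical example.
    public static class MyActionClass implements Action {
        private final int maxLength;

        public MyActionClass(int maxLength) {
            this.maxLength = maxLength;
        }

        @Override
        public boolean run(Context context) {
            // Returning false makes the surrounding rule fail at this point.
            return context.getMatch().length() <= maxLength;
        }
    }

    public static void main(String[] args) {
        Action action = new MyActionClass(3);
        System.out.println(action.run(() -> "abc"));  // prints "true"
        System.out.println(action.run(() -> "abcd")); // prints "false"
    }
}
```

In a real parser the class would implement parboiled's Action interface instead, and the engine would supply the Context.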

Anonymous inner classes implementing the Action interface

Many times your parser actions will only consist of a few lines of code, e.g. reading or writing Action Variables, operating on the value stack or calling some action method. Defining separate classes for all of these little pieces of code would likely introduce a lot of overhead and clutter up your project space. You could therefore use anonymous inner classes to move the action code right into the rule definition:

class MyParser extends BaseParser<Object> {

    Rule MyRule() {
        return Sequence(
            ...,
            new Action() {
                public boolean run(Context context) {
                    ...; // arbitrary action code
                    return true; // could also return false to stop matching the Sequence and continue looking for other matching alternatives
                }
            },
            ...
        );
    }
}

Explicit action expressions

Even though anonymous inner classes are an improvement over stand-alone action classes in terms of project space clutter, they are still quite verbose and can make your rule definitions much harder to read. parboiled therefore supports an even shorter action notation that can be used whenever your action code can be reduced to a single boolean expression:

class MyParser extends BaseParser<Object> {

    Rule MyRule() {
        return Sequence(
            ...,
            ACTION(...), // the argument being the boolean expression to wrap
            ...
        );
    }
}

The BaseParser.ACTION method is a special marker method that tells parboiled to wrap the argument expression into a separate, automatically created Action class, similar to the anonymous inner class example from above. Your action expression can include read access to local variables and method parameters, read/write access to non-private parser fields, and calls to non-private parser methods.
Apart from being a lot more concise than an anonymous inner class, an action expression has another important advantage. If the action expression being wrapped contains method calls on objects implementing the ContextAware interface, parboiled will automatically insert setContext calls immediately before invoking the method. For example, you might decide to move all your action code out of the parser class and into a separate action class in order to cleanly separate the two:

class MyParser extends BaseParser<Object> {
    MyActions actions = new MyActions();

    Rule MyRule() {
        return Sequence(
            ...,
            ACTION(actions.someAction()),
            ...
        );
    }
}

If MyActions implements the ContextAware interface parboiled will internally convert your code into something like this:

class MyParser extends BaseParser<Object> {
    MyActions actions = new MyActions();

    Rule MyRule() {
        return Sequence(
            ...,
            new Action() {
                public boolean run(Context context) {
                    actions.setContext(context);
                    return actions.someAction();
                }
            },
            ...
        );
    }
}

Note that BaseParser, the parent class of your parser class, itself inherits from BaseActions, which already implements ContextAware. This means that all action code living in action methods defined directly in your parser class automatically has access to the current context via getContext.

Implicit action expressions

When you use explicit action expressions with the BaseParser.ACTION wrapper, you explicitly tell parboiled which boolean expression you would like to be “action wrapped”. However, in most cases parboiled can determine by itself which constructs in your rule definitions are action expressions, so most of the time you can omit the explicit ACTION wrapper. For example, the following rule contains a legal (albeit quite contrived) implicit action expression (assuming extractIntegerValue(String) and someObj.doSomething() are methods returning a boolean):

Rule Number() {
    return Sequence(
        OneOrMore(CharRange('0', '9')),
        Math.random() < 0.5 ? extractIntegerValue(match()) : someObj.doSomething()
    );
}

parboiled uses the following logic to determine whether an expression is an implicit action expression:
Since the default rule creators defined in BaseParser all have parameters of the general Java Object type, the Java compiler automatically converts the result of a primitive boolean expression passed as an argument into a Boolean object (a process called “boxing”). It does so by invisibly inserting a call to Boolean.valueOf after the code of your boolean action expression. parboiled looks for these calls in the byte code of your rule method and treats the respective expression as an implicit action expression if its result is used directly as an argument to a rule-creating method.
Normally this process is very robust and should allow parboiled to properly recognize implicit action expressions without any problems. However, if for some reason you would like to disable the support for implicit action expressions you can do so with an @ExplicitActionsOnly annotation either on the rule method or the parser class (the latter disabling implicit action support for all rule methods).
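The boxing step that this detection relies on can be observed in plain Java, without parboiled. In the following sketch, argumentTypes is a hypothetical stand-in for a rule creator that takes Object parameters. It only reports the runtime type of each argument, which shows that a primitive boolean expression arrives as a Boolean object.

```java
public class BoxingDemo {

    // Hypothetical stand-in for a rule creator with Object parameters.
    // It returns the simple class name of every argument it receives.
    public static String[] argumentTypes(Object... args) {
        String[] names = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            names[i] = args[i].getClass().getSimpleName();
        }
        return names;
    }

    public static void main(String[] args) {
        int digits = 3;
        // The compiler boxes the primitive result of `digits > 0`,
        // as if Boolean.valueOf(digits > 0) had been written here.
        String[] types = argumentTypes("text", digits > 0);
        System.out.println(String.join(", ", types)); // prints "String, Boolean"
    }
}
```

Running javap -c on the compiled class should show the inserted Boolean.valueOf call in the byte code of main.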

Return Values

Action expressions are boolean expressions, and their return value is how actions influence the parsing progress of the current rule. Whenever an action expression evaluates to false, parsing continues as if a hypothetical parsing rule taking the place of the action expression had failed. You can therefore think of action expressions as “rules” that can either succeed (match) or fail (not match), depending on their boolean return values.
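This "action as rule" view can be illustrated with a toy model in plain Java. The sketch below is not how parboiled is implemented: sequence is a hypothetical helper that merely runs boolean steps in order and stops at the first one returning false, and it does not model input positions or backtracking. The matchesByte example and its range check are assumptions chosen for illustration.

```java
import java.util.function.BooleanSupplier;

public class ActionResultDemo {

    // Toy model of a sequence: every step must succeed, in order.
    // The first step returning false fails the whole sequence and
    // prevents the remaining steps from running.
    public static boolean sequence(BooleanSupplier... steps) {
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) {
                return false;
            }
        }
        return true;
    }

    // A "rule" followed by an "action": the input must consist of one to
    // three digits, and the action then rejects values above 255.
    public static boolean matchesByte(String input) {
        return sequence(
            () -> input.matches("[0-9]{1,3}"),     // plays the role of a rule
            () -> Integer.parseInt(input) <= 255   // plays the role of an action
        );
    }

    public static void main(String[] args) {
        System.out.println(matchesByte("200")); // prints "true"
        System.out.println(matchesByte("300")); // prints "false" (the action fails)
        System.out.println(matchesByte("abc")); // prints "false" (the rule fails, the action never runs)
    }
}
```

From the caller's perspective the second and third cases are indistinguishable: a failing action and a failing rule both make the sequence fail.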