Error handling improvements #48

alexrp · 2012-09-29T02:05:26Z

Just creating this as a sort of meta-bug. A couple of things that come to mind:

A solid error handling interface needs to be devised.
Error handling needs to be documented.
Error handling must give enough contextual information to give useful error messages to the user.
Error handling should preferably be pluggable in some way so that the library user can tell Pegged what to do on a failure (e.g. try to continue parsing at a sync point or similar*).

See http://www.ssw.uni-linz.ac.at/coco/Doc/UserManual.pdf - the documentation on the SYNC and WEAK keywords.

PhilippeSigaud · 2012-09-29T15:06:29Z

On Sat, Sep 29, 2012 at 4:05 AM, Alex Rønne Petersen <
notifications@github.com> wrote:

Just creating this as a sort of meta-bug. A couple of things that come to
mind:

A solid error handling interface needs to be devised.

Any idea?

Error handling needs to be documented.

Yes, once the previous step is done.

Error handling must give enough contextual information to give
useful error messages to the user.

I devised an 'expected' function that, given a rule produce an explanation
of what is (was) expected as input:

For example, "abc"* . gives:

"one or more (literal "abc") followed by (any character)."

Error handling should preferably be pluggable in some way so that
the library user can tell Pegged what to do on a failure (e.g. try to
continue parsing at a sync point or similar*).

See http://www.ssw.uni-linz.ac.at/coco/Doc/UserManual.pdf - the
documentation on the SYNC and WEAK keywords.

OK, I'm reading this now.

Groterik · 2014-04-29T14:36:47Z

Hi,
What about the 4-th point? Has Pegged any error recovery ability now? I have not found any information in the documentation.

PhilippeSigaud · 2014-04-29T17:24:13Z

Hi,

No there is no error recovery implemented. I could not find a generic
algorithm I was satisfied with (many error recovery algos in parsers work
for one language and not any language).

etcimon · 2014-04-29T17:29:55Z

Actually, I found a nice way to do some error handling. It requires changing it from mixin grammar() enum g = grammar() to generate a separate file with the contents of g. The Error handling is done by modifying this generated file by adding:

{
    import std.stdio;
    auto f = File("parser.log", "a");
    f.writeln(result);
}

after

template hooked(alias r, string name)
{
    static ParseTree hooked(ParseTree p)
    {
        ParseTree result;

        if (name in before)
        {
            result = before[name](p);
            if (result.successful)
                return result;
        }

        result = r(p);

It's also better for compile times to have a separate D file with the custom tree builder.

I use static if (__traits(compiles to check if the d file was generated. See here:

https://github.com/globecsys/asn1.d/blob/master/asn1/app.d

Here's an example of the generated log:
https://github.com/etcimon/dub/blob/master/parser.zip

Beware though I changed the interpretation of the Or! on my pegged fork to reflect BNF behavior (with a recursion limitation)

etcimon · 2014-04-29T17:34:04Z

The problem with error handling (which requires a log here) is that there's no way of knowing which kind of branching was really an error, because the grammar is very generic like Philippe stated. So I just read through a log to try and see what pathing pegged was going through.

PhilippeSigaud · 2014-04-29T17:35:01Z

On Tue, Apr 29, 2014 at 7:29 PM, Etienne Cimon notifications@github.comwrote:

Actually, I found a nice way to do some error handling. It requires moving
the mixin into a separate file and adding

{
import std.stdio;
auto f = File("parser.log", "a");
f.writeln(result);
}

after

template hooked(alias r, string name)
{
static ParseTree hooked(ParseTree p)
{
ParseTree result;
    if (name in before)
    {
        result = before[name](p);
        if (result.successful)
            return result;
    }

    result = r(p);
Hey, interesting! Thanks Etienne.
I'll need to reacquaint myself with Pegged, though :-)

It's also better for compile times to have a separate D file with the
custom tree builder.

I use static if (__traits(compiles to check if the d file was generated.
See here:

https://github.com/globecsys/asn1.d/blob/master/asn1/app.d

Here's an example of the generated log:
https://github.com/etcimon/dub/blob/master/parser.zip

I got the log file. Where can we see some error recovery?

Beware though I changed the interpretation of the Or! on my pegged fork
to reflect BNF behavior (with a recursion limitation)

So you introduce ambiguity?

etcimon · 2014-04-29T17:40:26Z

I got the log file. Where can we see some error recovery?

There's no error recovery there, it's a developer helper and reading through it can tell you what part of the grammar should be corrected and, if I were to implement recovery I'd just use semantic actions there to rewind or branch it correctly when I meet certain conditions, or I could modify the generated grammar file.

So you introduce ambiguity?

Yes, I don't do it on CTFE because pegged compiles grammars using a PEG format, but I added a static HashMap to detect recursions and I register a success tree the same way it registers the best failure (longest ParseTree input length), I return with a success once I reach the end of the Or!-checking loop with the best success. I also avoid recording failures when there's already a success (using a simple success flag to detect it).

You can see it here:
https://github.com/globecsys/Pegged/commit/3140bb261c7ba7a49255cb26743a38947a6f410f#diff-9636c02c9bc3adce58c26a22d989fe12

PhilippeSigaud · 2014-04-29T17:43:28Z

Yes, I don't do it on CTFE because pegged compiles grammars using a PEG
format, but I added a static HashMap to detect recursions and I register a
success tree the same way it registers the best failure (longest ParseTree
input length), I return with a success once I reach the end of the Or!-checking
loop with the best success. I also avoid recording failures when there's
already a success (using a simple success flag to detect it).

I suppose it also works for Or's inside And's inside Or's?

You can see it here:

globecsys@3140bb2#diff-9636c02c9bc3adce58c26a22d989fe12https://github.com/globecsys/Pegged/commit/3140bb261c7ba7a49255cb26743a38947a6f410f#diff-9636c02c9bc3adce58c26a22d989fe12

I'll have a look, thanks for the link.

etcimon · 2014-04-29T17:46:26Z

I suppose it also works for Or's inside And's inside Or's?

I don't do it for And because they're similar to && in grammar type (one fails, the rule fails), but I was quite in dismay when the first success of an Or returned and none of the others were checked as best-suited. There's so many scenarios of a Var rule that made an Or! succeed when in fact it was going to make underlying And fail on an unrecognized character that followed because it should have been a Type (but the Type rule was after the Var rule in Or!).

etcimon · 2014-04-29T18:12:35Z

Oh, I needed recursionMap to limit recursions because BNF would have rules such as this:

Type < BuiltinType / ReferencedType / ConstrainedType

ConstrainedType < Type Constraint / TypeWithConstraint

which made Pegged stack overflow, and also BNF would use it to generate lists. I limited it to 2 to allow it minimally but it should be limited to dozens to fully support recursive listing schemes.

BNF:
SymbolList ::= Symbol | SymbolList "," Symbol
which translates to PEG:
SymbolList < Symbol / SymbolList "," Symbol
or without recursion
SymbolList < Symbol ("," Symbol)*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error handling improvements #48

Error handling improvements #48

alexrp commented Sep 29, 2012

PhilippeSigaud commented Sep 29, 2012

Groterik commented Apr 29, 2014

PhilippeSigaud commented Apr 29, 2014

etcimon commented Apr 29, 2014

etcimon commented Apr 29, 2014

PhilippeSigaud commented Apr 29, 2014

etcimon commented Apr 29, 2014

PhilippeSigaud commented Apr 29, 2014

etcimon commented Apr 29, 2014

etcimon commented Apr 29, 2014

Error handling improvements #48

Error handling improvements #48

Comments

alexrp commented Sep 29, 2012

PhilippeSigaud commented Sep 29, 2012

Groterik commented Apr 29, 2014

PhilippeSigaud commented Apr 29, 2014

etcimon commented Apr 29, 2014

etcimon commented Apr 29, 2014

PhilippeSigaud commented Apr 29, 2014

etcimon commented Apr 29, 2014

PhilippeSigaud commented Apr 29, 2014

etcimon commented Apr 29, 2014

etcimon commented Apr 29, 2014