Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error handling improvements #48

Open
alexrp opened this issue Sep 29, 2012 · 10 comments
Open

Error handling improvements #48

alexrp opened this issue Sep 29, 2012 · 10 comments

Comments

@alexrp
Copy link
Contributor

alexrp commented Sep 29, 2012

Just creating this as a sort of meta-bug. A couple of things that come to mind:

  1. A solid error handling interface needs to be devised.
  2. Error handling needs to be documented.
  3. Error handling must give enough contextual information to give useful error messages to the user.
  4. Error handling should preferably be pluggable in some way so that the library user can tell Pegged what to do on a failure (e.g. try to continue parsing at a sync point or similar*).
@PhilippeSigaud
Copy link
Owner

On Sat, Sep 29, 2012 at 4:05 AM, Alex Rønne Petersen <
notifications@github.com> wrote:

Just creating this as a sort of meta-bug. A couple of things that come to
mind:

  1. A solid error handling interface needs to be devised.

Any idea?

  1. Error handling needs to be documented.

Yes, once the previous step is done.

  1. Error handling must give enough contextual information to give
    useful error messages to the user.

I devised an 'expected' function that, given a rule produce an explanation
of what is (was) expected as input:

For example, "abc"* . gives:

"one or more (literal "abc") followed by (any character)."

  1. Error handling should preferably be pluggable in some way so that
    the library user can tell Pegged what to do on a failure (e.g. try to
    continue parsing at a sync point or similar*).

OK, I'm reading this now.

@Groterik
Copy link

Hi,
What about the 4-th point? Has Pegged any error recovery ability now? I have not found any information in the documentation.

@PhilippeSigaud
Copy link
Owner

Hi,

No there is no error recovery implemented. I could not find a generic
algorithm I was satisfied with (many error recovery algos in parsers work
for one language and not any language).

@etcimon
Copy link

etcimon commented Apr 29, 2014

Actually, I found a nice way to do some error handling. It requires changing it from mixin grammar() enum g = grammar() to generate a separate file with the contents of g. The Error handling is done by modifying this generated file by adding:

{
    import std.stdio;
    auto f = File("parser.log", "a");
    f.writeln(result);
}

after

template hooked(alias r, string name)
{
    static ParseTree hooked(ParseTree p)
    {
        ParseTree result;

        if (name in before)
        {
            result = before[name](p);
            if (result.successful)
                return result;
        }

        result = r(p);

It's also better for compile times to have a separate D file with the custom tree builder.

I use static if (__traits(compiles to check if the d file was generated. See here:

https://github.com/globecsys/asn1.d/blob/master/asn1/app.d

Here's an example of the generated log:
https://github.com/etcimon/dub/blob/master/parser.zip

Beware though I changed the interpretation of the Or! on my pegged fork to reflect BNF behavior (with a recursion limitation)

@etcimon
Copy link

etcimon commented Apr 29, 2014

The problem with error handling (which requires a log here) is that there's no way of knowing which kind of branching was really an error, because the grammar is very generic like Philippe stated. So I just read through a log to try and see what pathing pegged was going through.

@PhilippeSigaud
Copy link
Owner

On Tue, Apr 29, 2014 at 7:29 PM, Etienne Cimon notifications@github.comwrote:

Actually, I found a nice way to do some error handling. It requires moving
the mixin into a separate file and adding

{
import std.stdio;
auto f = File("parser.log", "a");
f.writeln(result);
}

after

template hooked(alias r, string name)
{
static ParseTree hooked(ParseTree p)
{
ParseTree result;

    if (name in before)
    {
        result = before[name](p);
        if (result.successful)
            return result;
    }

    result = r(p);

Hey, interesting! Thanks Etienne.
I'll need to reacquaint myself with Pegged, though :-)

It's also better for compile times to have a separate D file with the
custom tree builder.

I use static if (__traits(compiles to check if the d file was generated.
See here:

https://github.com/globecsys/asn1.d/blob/master/asn1/app.d

Here's an example of the generated log:
https://github.com/etcimon/dub/blob/master/parser.zip

I got the log file. Where can we see some error recovery?

Beware though I changed the interpretation of the Or! on my pegged fork
to reflect BNF behavior (with a recursion limitation)

So you introduce ambiguity?

@etcimon
Copy link

etcimon commented Apr 29, 2014

I got the log file. Where can we see some error recovery?

There's no error recovery there, it's a developer helper and reading through it can tell you what part of the grammar should be corrected and, if I were to implement recovery I'd just use semantic actions there to rewind or branch it correctly when I meet certain conditions, or I could modify the generated grammar file.

So you introduce ambiguity?

Yes, I don't do it on CTFE because pegged compiles grammars using a PEG format, but I added a static HashMap to detect recursions and I register a success tree the same way it registers the best failure (longest ParseTree input length), I return with a success once I reach the end of the Or!-checking loop with the best success. I also avoid recording failures when there's already a success (using a simple success flag to detect it).

You can see it here:
https://github.com/globecsys/Pegged/commit/3140bb261c7ba7a49255cb26743a38947a6f410f#diff-9636c02c9bc3adce58c26a22d989fe12

@PhilippeSigaud
Copy link
Owner

Yes, I don't do it on CTFE because pegged compiles grammars using a PEG
format, but I added a static HashMap to detect recursions and I register a
success tree the same way it registers the best failure (longest ParseTree
input length), I return with a success once I reach the end of the Or!-checking
loop with the best success. I also avoid recording failures when there's
already a success (using a simple success flag to detect it).

I suppose it also works for Or's inside And's inside Or's?

You can see it here:

globecsys@3140bb2#diff-9636c02c9bc3adce58c26a22d989fe12https://github.com/globecsys/Pegged/commit/3140bb261c7ba7a49255cb26743a38947a6f410f#diff-9636c02c9bc3adce58c26a22d989fe12

I'll have a look, thanks for the link.

@etcimon
Copy link

etcimon commented Apr 29, 2014

I suppose it also works for Or's inside And's inside Or's?

I don't do it for And because they're similar to && in grammar type (one fails, the rule fails), but I was quite in dismay when the first success of an Or returned and none of the others were checked as best-suited. There's so many scenarios of a Var rule that made an Or! succeed when in fact it was going to make underlying And fail on an unrecognized character that followed because it should have been a Type (but the Type rule was after the Var rule in Or!).

@etcimon
Copy link

etcimon commented Apr 29, 2014

Oh, I needed recursionMap to limit recursions because BNF would have rules such as this:

Type < BuiltinType / ReferencedType / ConstrainedType

ConstrainedType < Type Constraint / TypeWithConstraint

which made Pegged stack overflow, and also BNF would use it to generate lists. I limited it to 2 to allow it minimally but it should be limited to dozens to fully support recursive listing schemes.

BNF:
SymbolList ::= Symbol | SymbolList "," Symbol
which translates to PEG:
SymbolList < Symbol / SymbolList "," Symbol
or without recursion
SymbolList < Symbol ("," Symbol)*

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants