Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repair not following grammar #280

Closed
duskybomb opened this issue Mar 13, 2022 · 5 comments
Closed

Repair not following grammar #280

duskybomb opened this issue Mar 13, 2022 · 5 comments

Comments

@duskybomb
Copy link

Hi, I was playing with your amazing tool and used a grammar very similar to the "calculator evaluator" given in Quickstart Guide.

I tried an arbitrary buggy calculator program: 1 + * ( 1 + ) ( + * 1 + 1 * ) ( * 1 )

These were the suggested repair sequences:

Parsing error at line 1 column 5. Repair sequences found:
   1: Delete *
   2: Insert INT
start (line: 1, col: 5), end (line: 1, col:6)

Parsing error at line 1 column 13. Repair sequences found:
   1: Delete ), Shift (, Delete +, Delete *
   2: Delete ), Shift (, Insert INT, Delete +
   3: Insert INT, Shift ), Delete (, Delete +
   4: Delete ), Shift (, Insert INT, Shift +, Insert INT
   5: Insert INT, Shift ), Delete (, Shift +, Insert INT
   6: Insert INT, Shift ), Delete (, Shift +, Delete *
   7: Delete ), Shift (, Insert INT, Shift +, Delete *

Parsing error at line 1 column 29. Repair sequences found:
   1: Insert INT, Shift ), Delete (
   2: Delete ), Shift (, Insert INT
start (line: 1, col: 29), end (line: 1, col:30)

However, the repaired program we get using the first repair sequence is: 1 + ( 1 + ( 1 + 1 * INT ) * 1 ), which doesn't conform to the grammar and the tool rejects the repair upon sending it through again. (gives error). Is there something that I missed?

calc.y

%start Expr
%%
Expr -> Result<u64, ()>:
      Expr 'PLUS' Term { Ok($1? + $3?) }
    | Term { $1 }
    ;

Term -> Result<u64, ()>:
      Term 'MUL' Factor { Ok($1? * $3?) }
    | Factor { $1 }
    ;

Factor -> Result<u64, ()>:
      'PARENOPEN' Expr 'PARENCLOSE' { $2 }
    | 'INT'
      {
          let v = $1.map_err(|_| ())?;
          parse_int($lexer.span_str(v.span()))
      }
    ;
%%
// Any functions here are in scope for all the grammar actions above.

fn parse_int(s: &str) -> Result<u64, ()> {
    match s.parse::<u64>() {
        Ok(val) => Ok(val),
        Err(_) => {
            eprintln!("{} cannot be represented as a u64", s);
            Err(())
        }
    }
}

calc.l

%%
[0-9]+ "INT"
\+ "PLUS"
\* "MUL"
\( "PARENOPEN"
\) "PARENCLOSE"
[\t ]+ ;
@ltratt
Copy link
Member

ltratt commented Mar 13, 2022

If I try this in lrpar/examples/calc_actions I get:

>>> 1 + * ( 1 + ) ( + * 1 + 1 * ) ( * 1 )
Parsing error at line 1 column 5. Repair sequences found:
   1: Delete *
   2: Insert INT
Parsing error at line 1 column 13. Repair sequences found:
   1: Delete ), Shift (, Delete +, Delete *
   2: Delete ), Shift (, Insert INT, Delete +
   3: Insert INT, Shift ), Delete (, Delete +
   4: Delete ), Shift (, Insert INT, Shift +, Insert INT
   5: Insert INT, Shift ), Delete (, Shift +, Delete *
   6: Insert INT, Shift ), Delete (, Shift +, Insert INT
   7: Delete ), Shift (, Insert INT, Shift +, Delete *
Parsing error at line 1 column 29. Repair sequences found:
   1: Insert INT, Shift ), Delete (
   2: Delete ), Shift (, Insert INT
<evaluation aborted>

as you do. If I then follow its suggestions (noting that "Insert INT" means "insert a value of type INT" so I'll use 2 as an example) I get:

>>> 1 + ( 1 + ( 1 + 1 * 2 ) * 1 )  
Result: 5

which looks OK to me.

I guess it's the "insert INT" bit that causes you to make an input that can't be parsed? There is a UI issue here: sometimes we refer to lexeme values (e.g. Delete )) and sometimes lexeme types (e.g. Insert INT). I flapped around a bit on this originally because there's not an easy way in plain text to make clear that e.g. Insert ''' means "Insert a single quote".

The user can use the %epp feature to make the difference between these clearer though, perhaps, it would be better if the default was something like Insert <INT>. That would solve the "single quote" (and related problem), and I think the angle bracket problem is less of an issue.

@duskybomb
Copy link
Author

Thank you for such an elaborate response. I don't know how did I miss that this was due to inserting INT and not an syntactic error. Extremely sorry for that!

However, I have come across the repaired program not conforming to grammar issue with one of the repairs being:

1 + ( 1 + ( INT + INT * 1 + 1 * ( * 1 )
Input program: 1 + * ( 1 + ) ( + * 1 + 1 * ) ( * 1 )
If I follow the repair sequence: 1, 4, 2 from the one mentioned in my first post.

1: Delete *
4: Delete ), Shift (, Insert INT, Shift +, Insert INT
2: Delete ), Shift (, Insert INT

I hope this isn't another silly mistake from my side.

@duskybomb
Copy link
Author

If I use the %epp feature and mention %epp INT "1", I still get the <evaluation aborted> message. Is it because of timeout or something else?

@ltratt
Copy link
Member

ltratt commented Mar 13, 2022

Ah, you can't intermix repair sequences in that way: depending on which repair sequence you take, the rest of the parse (and the parse errors) can be different. There's no practical way around this, unfortunately, without hitting a combinatorial explosion.

@duskybomb
Copy link
Author

Thank you for the clarification!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants