Some issue with error reporting #332

Open
tailhook opened this issue Nov 29, 2021 · 0 comments
@tailhook

I've stripped my example down to the following grammar, expressed in English: the source text can contain multiple items separated by a newline or a comment (double slash, //); each item is an identifier followed by whitespace-separated numbers.

Here are three versions of the grammar:

use combine::parser::char::{digit, space, letter};
use combine::parser::repeat::{repeat_until};
use combine::{Stream, Parser, EasyParser};
use combine::{eof, token, many1, sep_by, value};
use combine::{many, skip_many1, attempt};


fn id<I: Stream<Token=char>>() -> impl Parser<I, Output=String> {
    many(letter())
}

fn ws<I: Stream<Token=char>>() -> impl Parser<I, Output=()> {
    skip_many1(space())
}

fn num<I: Stream<Token=char>>() -> impl Parser<I, Output=String> {
    many1(digit())
}

fn comment<I: Stream<Token=char>>() -> impl Parser<I, Output=()> {
    attempt((token('/'), token('/')).silent()).with(value(()))
}

fn newline<I: Stream<Token=char>>() -> impl Parser<I, Output=()> {
    token('\n').with(value(())).expected("newline")
}

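fn main() {
    // Variant 1: the numbers are collected with `many`.
    let mut parser1 = many::<Vec<_>, _, _>(
        id()
        .and(many::<Vec<_>, _, _>(ws().with(num())))
        .and(comment().or(newline())),
    );

    // Variant 2: the numbers are collected with `repeat_until`,
    // stopping at a comment or a newline.
    let mut parser2 = many::<Vec<_>, _, _>(
        id()
        .and(repeat_until::<Vec<_>, _, _, _>(
            ws().with(num()),
            comment().or(newline()),
        ))
        .and(comment().or(newline())),
    );

    // Variant 3: the numbers are collected with `sep_by`,
    // using whitespace as the separator.
    let mut parser3 = many::<Vec<_>, _, _>(
        id()
        .skip(ws())
        .and(sep_by::<Vec<_>, _, _, _>(num(), ws()))
        .and(comment().or(newline()))
    );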

    let s = r#"a 123/2"#;
    let err1 = parser1.easy_parse(s)
         .map_err(|e| e.map_position(|p| p.translate_position(s)))
         .unwrap_err();
    let err2 = parser2.easy_parse(s)
         .map_err(|e| e.map_position(|p| p.translate_position(s)))
         .unwrap_err();
    let err3 = parser3.easy_parse(s)
         .map_err(|e| e.map_position(|p| p.translate_position(s)))
         .unwrap_err();
    println!("{}\n{}\n{}", err1, err2, err3);
}

The output is:

Parse error at 6
Unexpected `2`
Unexpected `/`
Expected `whitespace`, `digit` or `newline`

Parse error at 5
Unexpected ` `
Expected `letter`

Parse error at 6
Unexpected `2`
Unexpected `/`
Expected `whitespace` or `newline`
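
To make the reported positions easier to check, here is a small standard-library-only sketch that lists each character of the input with its 0-based offset. The helper name indexed_chars is made up for this illustration, and the sketch assumes that the positions printed above are 0-based offsets into s.

```rust
/// Pairs every character of `s` with its 0-based byte offset.
/// (Hypothetical helper for illustration; not part of combine.)
fn indexed_chars(s: &str) -> Vec<(usize, char)> {
    s.char_indices().collect()
}

fn main() {
    // Same input as in the reproduction above.
    let s = "a 123/2";
    for (offset, ch) in indexed_chars(s) {
        println!("{} {:?}", offset, ch);
    }
}
```

Under that assumption, offset 5 is the / and offset 6 is the trailing 2, while the only space in the input sits at offset 1.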

Notes on variant 1:

  1. There are two "unexpected" entries: / is reported at the wrong position, and 2 is not the erroneous character. Looks like a bug?
  2. The reported position is that of the character after the erroneous one.
  3. Expected digit is wrong. The number needs to be followed by whitespace (or a newline, or a comment, which is silenced).

Notes on variant 2:

  1. The unexpected space is at a different position from the one reported.
  2. The reported error position is (surprisingly) right.
  3. A letter can't appear here. Even if I remove the outermost many (i.e. support only a single item, so that no letters are possible after the initial whitespace), this parser still reports letter.

Notes on variant 3:

  1. Same issues as in variant 1 with the position and the "unexpected" entries.
  2. The "expected" set is fine.

Are these bugs, or am I misunderstanding the parsers somehow? Also, why is there such a difference between sep_by, repeat_until and many?
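
For comparison, here is a rough hand-written sketch (standard library only, not using combine) of the error the notes above seem to ask for on this input: a single "unexpected" entry, at the position of the offending character. The ParseError struct and parse_items function are made up for this illustration. The sketch also assumes that a newline is not treated as a separator between numbers. The expected set follows the notes (whitespace or newline, with the comment silenced); whether a further digit should be listed as well is left open.

```rust
/// Error shape used only in this sketch (hypothetical, not combine's type).
#[derive(Debug, PartialEq)]
struct ParseError {
    position: usize,          // 0-based character index of the offending char
    unexpected: Option<char>, // None means end of input
    expected: Vec<&'static str>,
}

/// Parses items of the form `id (ws num)* (comment | newline)`.
fn parse_items(src: &str) -> Result<Vec<(String, Vec<String>)>, ParseError> {
    let chars: Vec<char> = src.chars().collect();
    let mut pos = 0;
    let mut items = Vec::new();
    while pos < chars.len() {
        // Identifier: zero or more letters, mirroring `many(letter())`.
        let start = pos;
        while matches!(chars.get(pos), Some(c) if c.is_alphabetic()) {
            pos += 1;
        }
        let id: String = chars[start..pos].iter().collect();

        let mut nums: Vec<String> = Vec::new();
        loop {
            match chars.get(pos).copied() {
                // A newline ends the item.
                Some('\n') => {
                    pos += 1;
                    break;
                }
                // `//` ends the item (only the marker is consumed, as in the issue).
                Some('/') if chars.get(pos + 1) == Some(&'/') => {
                    pos += 2;
                    break;
                }
                // Whitespace must be followed by a number.
                Some(c) if c.is_whitespace() => {
                    while matches!(chars.get(pos), Some(c) if c.is_whitespace() && *c != '\n') {
                        pos += 1;
                    }
                    let num_start = pos;
                    while matches!(chars.get(pos), Some(c) if c.is_ascii_digit()) {
                        pos += 1;
                    }
                    if pos == num_start {
                        return Err(ParseError {
                            position: pos,
                            unexpected: chars.get(pos).copied(),
                            expected: vec!["digit"],
                        });
                    }
                    nums.push(chars[num_start..pos].iter().collect());
                }
                // Anything else (including end of input) is an error at `pos`.
                other => {
                    return Err(ParseError {
                        position: pos,
                        unexpected: other,
                        expected: vec!["whitespace", "newline"],
                    });
                }
            }
        }
        items.push((id, nums));
    }
    Ok(items)
}

fn main() {
    println!("{:?}", parse_items("a 123/2"));
}
```

With the input from the reproduction, this sketch stops at the single / and reports only that character, with no second "unexpected" entry for the 2 that follows it.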

@Marwes Marwes added the bug label Nov 29, 2021