Grammar ambiguity between {action} and {count}. #74

kevinmehall · 2015-04-07T01:11:39Z

Actions that consist of returning a single integer are ambiguous with the repetition count syntax introduced by #20.

number -> u32
  = "one" {1}
  / "two" {2}
  / "three" {3}

Those should be actions, but are parsed as repeat counts.

One option that minimally solves the problem is to simply drop x {5} in favor of x {5, 5}. A bare comma is not valid Rust, but the distinction between actions and the remaining { , } forms may be hard for humans to parse.
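To make the overlap concrete, here is a minimal sketch in plain Rust of how the contents of a brace group could be classified. It is not rust-peg's actual parser: `BraceKind`, `is_uint`, and `classify_braces` are hypothetical names, and the sketch assumes a repeat count is written with plain decimal integers only.

```rust
/// How the text inside a `{ ... }` group could be read.
/// (Hypothetical type for illustration; not part of rust-peg.)
#[derive(Debug, PartialEq)]
enum BraceKind {
    /// Only readable as a repeat count, e.g. `{2,5}`, `{2,}`, `{,5}`.
    RepeatCount,
    /// Only readable as a Rust action, e.g. `{ a + b }`.
    Action,
    /// A lone integer such as `{5}`: both readings apply.
    Ambiguous,
}

/// True if `s` is a non-empty run of ASCII decimal digits.
fn is_uint(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Classifies the text between the braces (assumption: counts are decimal).
fn classify_braces(body: &str) -> BraceKind {
    let body = body.trim();
    if is_uint(body) {
        // `{5}` is both a valid count and a valid Rust expression.
        return BraceKind::Ambiguous;
    }
    if let Some((min, max)) = body.split_once(',') {
        let (min, max) = (min.trim(), max.trim());
        let min_ok = min.is_empty() || is_uint(min);
        let max_ok = max.is_empty() || is_uint(max);
        if min_ok && max_ok {
            // A comma with only integers around it is not a Rust expression.
            return BraceKind::RepeatCount;
        }
    }
    BraceKind::Action
}
```

Under these assumptions, `classify_braces("1")` yields `Ambiguous`, while `classify_braces("5, 5")` yields `RepeatCount`, which is why writing `x {5, 5}` sidesteps the problem.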

Does anyone have ideas for a new syntax for the bounded-repeat functionality?


kevinmehall · 2015-04-07T01:24:29Z

An alternate syntax could, but doesn't have to, address #47

buster · 2015-05-19T13:16:46Z

Hi Kevin,

I have another related issue:

An IMAP protocol parser needs the notion of a literal.
A literal is the string {X} followed by CRLF followed by X bytes.
So I would like to use X in a repetition like so:

literal -> &'input str
    = "{" number "}" CRLF (CHAR8{number} { match_str })

Is there a way to do that?

jonas-schievink · 2015-05-19T13:48:25Z

This is not supported, since you can't do variable repetition. But I'm sure there is a way to do this using a conditional capture (which is apparently still undocumented). I'll try to write a grammar that does this when I find time.

jonas-schievink · 2015-05-19T14:36:23Z

Okay, here you go:

#![feature(plugin, str_char, collections)]
#![plugin(peg_syntax_ext)]

peg! grammar(r#"

number -> usize
    = [0-9]+ { match_str.parse().unwrap() }

pos -> usize
    = &. { start_pos }

#[pub]
literal -> &'input str
    = "{" n:number "}\r\n" start:pos ( . {?
        println!("{}", start_pos - start);
        if start_pos - start < n { Ok(()) } else { Err("") }
    } )* {? if match_str.len() - start == n { Ok(match_str) } else { Err("literal") } }

"#);

fn main() {
    println!("{:?}", grammar::literal("{8}\r\n12345678"));
}

buster · 2015-05-20T09:53:58Z

Thanks, that's great!
I had problems with the {} in the middle of a string to be parsed, and I also didn't want the "{8}\r\n" in the output, so I came up with this based on your example:

literalnumber -> usize
    = [0-9]+ { match_str.parse().unwrap() }

pos -> usize
    = &. { start_pos }

literal -> &'input str
    = "{" n:literalnumber "}\r\n" start:pos ( . {?
        println!("{}", start_pos - start);
        if start_pos - start < n { Ok(()) } else { Err("") }
    } )* {?
    let head = format!("{{{}}}\r\n", n).len();
    if match_str.len() - head == n { Ok(&match_str[head..]) } else { Err("literal") } }

This successfully parses my test strings and also removes the {}\r\n part:
"{10}\r\n1234567890"
"{1}\r\n1"
"{0}\r\n"
"{2}\r\n"""
"{4}\r\n 1"

Thanks! 👍
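For comparison, the same length-prefixed rule can be sketched in plain Rust without the parser generator. This is only an illustration under stated assumptions: `parse_literal` is a hypothetical helper, the count is taken to be a byte length, and the function returns the payload together with the unparsed remainder rather than requiring the input to end there.

```rust
/// Parses `{N}\r\n` followed by exactly N bytes at the start of `input`.
/// Returns `(payload, remaining_input)`, or `None` if the input does not
/// start with a well-formed literal. (Hypothetical helper, not rust-peg API.)
fn parse_literal(input: &str) -> Option<(&str, &str)> {
    // Header: an opening brace, decimal digits, a closing brace.
    let rest = input.strip_prefix('{')?;
    let close = rest.find('}')?;
    let digits = &rest[..close];
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let n: usize = digits.parse().ok()?;

    // The header must be followed by CRLF.
    let rest = rest[close + 1..].strip_prefix("\r\n")?;

    // `get` returns None when fewer than `n` bytes remain, or when `n`
    // would split a multi-byte UTF-8 character.
    let payload = rest.get(..n)?;
    Some((payload, &rest[n..]))
}
```

With this sketch, `parse_literal("{10}\r\n1234567890")` returns the ten-character payload with an empty remainder, and an input whose payload is shorter than the announced length is rejected.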

Mingun · 2016-11-12T11:31:42Z

As far as I can see, this project was inspired by pegjs. I proposed an unambiguous and intuitive syntax for that project to describe the number of repetitions. That syntax also supports a separator between elements and the ability to take the number of repetitions from previously parsed data (thus allowing grammars like number_with_quantity_of_elements; element element element...). I hope it will help you solve this problem.

PS. I also noticed that in the grammar the type of the returned value is assigned to rules, although it would be more logical to assign it to actions, since they are what produce the typed result. If you are interested in making the grammar more logical, you can look at how I did it for the pegjs project (at present the repository holds a preliminary version, and at some point I will polish it, but I think the idea should be clear: for code generation we infer the types of all rules from the types of actions, and the types of actions need to be set explicitly).


kevinmehall · 2016-11-20T21:34:31Z

I'm changing the syntax to foo*<n,m> in 0.4 to avoid the ambiguity, while reserving foo<x,y> for OMeta-like template arguments.
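As a rough illustration of what a bounded repeat such as `foo*<n,m>` means, here is a minimal plain-Rust sketch of greedy bounded repetition. It is not the code rust-peg generates: `repeat_bounded` and `digit` are hypothetical names, and the sketch assumes the repeat takes as many elements as it can up to the maximum and fails if fewer than the minimum were matched.

```rust
/// Greedy bounded repetition: applies `parse_one` up to `max` times and
/// succeeds only if it matched at least `min` times.
/// `parse_one` takes the remaining input and returns `(value, rest)`.
/// (Hypothetical helper for illustration; not generated by rust-peg.)
fn repeat_bounded<'a, T>(
    mut input: &'a str,
    min: usize,
    max: usize,
    parse_one: impl Fn(&'a str) -> Option<(T, &'a str)>,
) -> Option<(Vec<T>, &'a str)> {
    let mut items = Vec::new();
    while items.len() < max {
        match parse_one(input) {
            Some((item, rest)) => {
                items.push(item);
                input = rest;
            }
            // The element no longer matches: stop repeating.
            None => break,
        }
    }
    if items.len() >= min {
        Some((items, input))
    } else {
        None
    }
}

/// Example element parser: a single ASCII digit.
fn digit(input: &str) -> Option<(char, &str)> {
    let c = input.chars().next().filter(|c| c.is_ascii_digit())?;
    // An ASCII digit is one byte long, so slicing at 1 is safe.
    Some((c, &input[1..]))
}
```

In this sketch, `repeat_bounded("12345", 2, 3, digit)` consumes three digits and leaves `"45"`, while `repeat_bounded("1a", 2, 3, digit)` fails because only one digit is available.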

I like @buster and @Mingun's suggestion of variable repetition counts from a previously captured variable, and while it is not in the "context free" language class, I don't think it poses any problems for PEG. Opened #143.

kevinmehall mentioned this issue Jul 14, 2015

Add an explanation for expression{n, m} to README.md #107

Closed

kevinmehall added a commit that referenced this issue Nov 20, 2016

Replace ambiguous "foo"{repcount} syntax with "foo"*<repcount>

e2d686e

Closes #74

kevinmehall mentioned this issue Nov 20, 2016

Variable repetition counts for bounded-repeat x*<a> #143

Closed

kevinmehall closed this as completed Nov 20, 2016
