Errors are often suppressed #160
Hi! Backtracking is an essential feature of parser combinators, and the
In the reduced test case you provide, I'm thinking of adding a new combinator that could work like that:
The idea is to provide a list of patterns (so things like
There's also the problem of the format of patterns. Here, I match on specific error codes, but nom parsers can return an error chain instead. As an example, in the second test case, the error returned by
BTW, I saw the parser combinators library you're developing, it looks really cool! The use of associated types is tricky :)
Indeed, controlling backtracking is one of the trickier areas of parser libraries. Something like Prolog's cut is very useful for indicating commitment points, especially for error reporting. Your proposal looks interesting, but I'm a bit concerned that it'll end up being everywhere: since the default is that errors are ignored, any error you want propagated needs a cut. Perhaps errors could be divided into backtracking ones (suppressed by default) and aborting ones (propagated by default). Over in https://github.com/asajeffrey/wasm/blob/master/src/parser/combinators.rs#L46, I made failure explicitly say whether backtracking was allowed or not, but this would be quite a significant rewrite to the code. The way lifetime polymorphism was being used was rewritten by @eddyb, who managed to remove all the associated types, but still make everything lifetime polymorphic, and allow polymorphic functions like map. The library was heavily influenced by nom, as you can probably tell!
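To make the backtracking/aborting split suggested above concrete, here is a minimal sketch in plain Rust. It does not use nom; `ParseError`, `PResult`, `tag`, `cut`, `opt`, and `pair` are all hypothetical names chosen for illustration, and the error codes are arbitrary.

```rust
/// Assumed error split: `Backtrack` may be swallowed by choice combinators,
/// `Abort` always propagates. Neither name comes from nom.
#[derive(Debug, PartialEq)]
enum ParseError {
    Backtrack(u32),
    Abort(u32),
}

type PResult<'a, T> = Result<(&'a str, T), ParseError>;

/// Match a literal prefix; a mismatch is an ordinary, backtrackable error.
fn tag<'a>(lit: &str, code: u32, input: &'a str) -> PResult<'a, &'a str> {
    match input.strip_prefix(lit) {
        Some(rest) => Ok((rest, &input[..lit.len()])),
        None => Err(ParseError::Backtrack(code)),
    }
}

/// Commitment point in the spirit of Prolog's cut: upgrade a backtrackable
/// error to an aborting one and leave everything else unchanged.
fn cut<'a, T>(result: PResult<'a, T>) -> PResult<'a, T> {
    result.map_err(|e| match e {
        ParseError::Backtrack(code) => ParseError::Abort(code),
        abort => abort,
    })
}

/// Optional parser: swallows `Backtrack`, but lets `Abort` through.
fn opt<'a, T>(input: &'a str, result: PResult<'a, T>) -> PResult<'a, Option<T>> {
    match result {
        Ok((rest, value)) => Ok((rest, Some(value))),
        Err(ParseError::Backtrack(_)) => Ok((input, None)),
        Err(abort) => Err(abort),
    }
}

/// pair = "a" then "b"; once "a" has matched we are committed to "b".
fn pair(input: &str) -> PResult<'_, ()> {
    let (rest, _) = tag("a", 1, input)?;
    let (rest, _) = cut(tag("b", 42, rest))?;
    Ok((rest, ()))
}
```

With this split, `opt("xyz", pair("xyz"))` quietly yields `None` because nothing was committed, while `opt("ac", pair("ac"))` surfaces `Abort(42)`. Only the commitment point needs a `cut`; the choice combinators themselves stay unchanged apart from letting `Abort` through.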
I split off my parser combinator library and released it as a crate (https://crates.io/crates/parsell/). It was interesting to explore a slightly different part of the design space from nom. For streaming parsers, there seems to be a trade-off between minimizing memory allocation and allowing backtracking. It looks to me like parsell went down the route of reducing backtracking by only handling LL(1) parsers, whereas nom allows arbitrary backtracking at the cost of more space usage. Does this seem like a fair summary to you?
Hey, I have given some thought to your problem, and found a way. It is a bit hackish, and I know you got your own library to worry about now, but you might find it interesting.
The basic idea is to change the result type of parsers from
It is doable by replacing
Then you have the
At last, you have
So, as I said, it is a bit hackish, but the ugly macros can be stored away from the eyes, it does not change the code much, and the overhead of
What do you think?
I have the same issue as @asajeffrey. The solution proposed by @Geal works well but since you have to wrap into
What do you think of introducing a kind of
Same problem.
In the context of #356, if I changed the error type, I could probably fit in a "non backtrackable error" that can contain the same thing as a regular error.
Over in peresil, I believe I've had similar issues and ideas. Two main points of interest:
There is an error accumulation feature in nom, but it's now optional, since a lot of formats won't use it and it makes parsers slower. From what I saw, whether an error is recoverable depends on where it was generated in the parsing tree.
That's wonderful! Perhaps this is just an inability of me to find the appropriate documentation or macros. I can only find references to keeping around a single error (as part of an error chain, perhaps) when reading the error management. Searching through master's API docs for
Given a parser like this:
    named!(part<&str, &str>, recognize!(do_parse!(
        add_error!(ErrorKind::Custom(10), tag!("b")) >>
        add_error!(ErrorKind::Custom(11), tag!("c")) >>
        ()
    )));
    named!(example<&str, &str>, recognize!(do_parse!(
        opt!(add_error!(ErrorKind::Custom(1), tag!("a"))) >>
        opt!(add_error!(ErrorKind::Custom(2), part)) >>
        add_error!(ErrorKind::Custom(3), tag!("z")) >>
        ()
    )));
    assert!(example("az").is_done());
    assert!(example("bcz").is_done());
    assert!(example("abcz").is_done());
    assert!(example("z").is_done());
When given the input
When given the input
Would you kindly point me to some examples or implementation code I could read to achieve this? I'd be happy to provide a draft of what I implement for the error handling document.
That's believable. I might have "solved" that by making separate error variants for errors that occur in one location or the other, trading off annotating the grammar in one way for another.
So now there are two ways to return an error; one of those is called "failure", and will stop going through other branches in
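As a rough illustration of how a "failure" can stop a choice from trying its remaining branches, here is a small self-contained sketch. It is not nom's implementation: the `Outcome` enum, the `Parser` alias, and the functions `alt`, `letter_then_digit`, `any_letter`, and `choice` are hypothetical names, and the error codes are arbitrary.

```rust
/// Hypothetical three-way result; not nom's actual types.
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    Done(&'a str), // success, carrying the remaining input
    Error(u32),    // recoverable: a choice may try its next branch
    Failure(u32),  // unrecoverable: a choice must stop immediately
}

type Parser = for<'a> fn(&'a str) -> Outcome<'a>;

/// Try each branch in order. Only `Error` moves on to the next branch;
/// `Done` and `Failure` both end the choice at once.
fn alt<'a>(branches: &[Parser], input: &'a str) -> Outcome<'a> {
    let mut last = Outcome::Error(0);
    for branch in branches {
        match branch(input) {
            Outcome::Error(code) => last = Outcome::Error(code),
            other => return other,
        }
    }
    last
}

/// Matches "x1". After the "x" has been seen, a missing "1" is a failure.
fn letter_then_digit(input: &str) -> Outcome<'_> {
    match input.strip_prefix('x') {
        None => Outcome::Error(1),
        Some(rest) => match rest.strip_prefix('1') {
            Some(rest) => Outcome::Done(rest),
            None => Outcome::Failure(42),
        },
    }
}

/// Matches any single ASCII letter.
fn any_letter(input: &str) -> Outcome<'_> {
    match input.chars().next() {
        Some(c) if c.is_ascii_alphabetic() => Outcome::Done(&input[1..]),
        _ => Outcome::Error(2),
    }
}

/// A choice between the two parsers above.
fn choice(input: &str) -> Outcome<'_> {
    let branches: [Parser; 2] = [letter_then_digit, any_letter];
    alt(&branches, input)
}
```

On the input "xy", `any_letter` would succeed, but `choice` reports `Failure(42)` because the first branch had already committed. On "y", the first branch returns a plain `Error`, so the second branch still gets its turn.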
If I create a parser which uses any of the choice combinators (alt!, opt!, many0!, etc.) then errors are suppressed from the child parser. For example:
If you feed this the input "(ac)" then the error 42 from foo is suppressed, and instead you get the error when tag!(")") fails to match "ac)". Here are tests I'd expect to pass, but which don't:
This is a problem with backtracking: there's no way to stop backtracking outside of the current named! function. The program behaves properly if we inline foo into foos, but this doesn't scale too well.
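The snippet this report refers to is not preserved in the text, so the following is only a hypothetical stand-in that reproduces the described behaviour without nom. The grammar of `foo` (an "a" followed by "b"), the helper names `tag`, `many0_foo`, and `foos`, and every error code other than 42 are assumptions made for illustration.

```rust
// Err carries a bare error code; all names here are hypothetical.
type PResult<'a, T> = Result<(&'a str, T), u32>;

/// Match a literal prefix, or fail with the given error code.
fn tag<'a>(lit: &str, code: u32, input: &'a str) -> PResult<'a, &'a str> {
    match input.strip_prefix(lit) {
        Some(rest) => Ok((rest, &input[..lit.len()])),
        None => Err(code),
    }
}

/// foo = "a" then "b"; a missing "b" is reported as error 42.
fn foo(input: &str) -> PResult<'_, ()> {
    let (rest, _) = tag("a", 1, input)?;
    let (rest, _) = tag("b", 42, rest)?;
    Ok((rest, ()))
}

/// Naive many0: any child error ends the repetition and is discarded.
fn many0_foo(mut input: &str) -> PResult<'_, usize> {
    let mut count = 0;
    while let Ok((rest, ())) = foo(input) {
        input = rest;
        count += 1;
    }
    Ok((input, count))
}

/// foos = "(" foo* ")"
fn foos(input: &str) -> PResult<'_, usize> {
    let (rest, _) = tag("(", 2, input)?;
    let (rest, n) = many0_foo(rest)?;
    let (rest, _) = tag(")", 3, rest)?;
    Ok((rest, n))
}
```

Here `foos("(ac)")` returns the error from the closing parenthesis (code 3 in this sketch) rather than 42: the repetition discards the error from `foo`, backtracks to "ac)", and the more informative error is lost.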