Error reporting and recovery #16
Comments
This was referenced Aug 21, 2018
For custom diagnostics, @eternaleye is suggesting using "spines", made up of vertebrae, each standing in for an incomplete rule/SPPF node and providing access to its completed children (to the left of the error) and its child vertebra (overlapping the error).

We can even provide some typed APIs, although the level of detail (i.e. type safety) has to be balanced against ergonomics. We could have, for each child, something like:

    enum /*partial::*/Child<T: ?Sized> {
        Complete(super::Handle<T>),
        Partial(/*partial::*/Handle<T> /* aka Vertebra<T> */),
        NotStarted,
    }

This way, the custom diagnostics could pattern-match on e.g. `Expr::Cast { expr: Child::Complete(expr), ty: Child::Partial(ty) }`.
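To make the pattern-matching idea concrete, here is a minimal self-contained sketch. Everything besides the shape of `Child` is an assumption made for illustration: `Handle` and `Vertebra` are empty placeholder types, `Expr` and `Type` are marker types standing in for grammar rules, `PartialExpr` plays the role of `partial::Expr`, and the diagnostic messages are invented. It is not the actual gll API.

```rust
use std::marker::PhantomData;

// Hypothetical placeholders: a handle to a complete node and a handle
// to a partial node (a "vertebra"). Real handles would carry SPPF data.
struct Handle<T: ?Sized>(PhantomData<T>);
struct Vertebra<T: ?Sized>(PhantomData<T>);

// Same shape as the `Child` enum proposed above.
enum Child<T: ?Sized> {
    Complete(Handle<T>),
    Partial(Vertebra<T>),
    NotStarted,
}

// Marker types standing in for grammar rules (assumed names).
struct Expr;
struct Type;

// Hypothetical partial counterpart of an `Expr` rule with a `Cast` variant.
enum PartialExpr {
    Cast { expr: Child<Expr>, ty: Child<Type> },
}

// A custom diagnostic picks a message based on how far each child got.
fn diagnose(e: &PartialExpr) -> &'static str {
    match e {
        PartialExpr::Cast { expr: Child::Complete(_), ty: Child::Partial(_) } => {
            "incomplete type in cast"
        }
        PartialExpr::Cast { expr: Child::Complete(_), ty: Child::NotStarted } => {
            "missing type in cast"
        }
        PartialExpr::Cast { .. } => "malformed cast expression",
    }
}
```

The trade-off mentioned above shows up here: every field wrapped in `Child` adds a layer to each pattern, so a real API would likely need to decide how deep the typed view goes.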
Potentially useful paper to look into: https://arxiv.org/abs/1804.07133
eddyb commented Aug 21, 2018
One of the simplest things we could do is keep a buffer of "attempted input matches" at the "most advanced input location", as we parse, keeping only the entries with the largest starting point.
Then they could be reported as an "expected one of ..." error.
`rustc` itself does something similar (e.g. for `use x.`).

An optimization over this would be to not keep that buffer until there's an error, and then redo only the part of the parse that errored, this time buffering input matches.
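The "attempted input matches" buffer could look roughly like the sketch below. The `Expected` type, its `record`/`message` methods, and the message format are all hypothetical names chosen for illustration. It implements only the simple always-on buffer, not the "re-parse on error" optimization.

```rust
use std::collections::BTreeSet;

// Hypothetical tracker: remembers what the parser tried to match at the
// most advanced input position seen so far.
#[derive(Default)]
struct Expected {
    furthest: usize,
    items: BTreeSet<&'static str>,
}

impl Expected {
    // Called whenever an input match is attempted at `pos`.
    fn record(&mut self, pos: usize, what: &'static str) {
        if pos > self.furthest {
            // A more advanced position makes older entries irrelevant.
            self.furthest = pos;
            self.items.clear();
        }
        if pos == self.furthest {
            self.items.insert(what);
        }
    }

    // Renders the buffer as an "expected one of ..." message.
    fn message(&self) -> String {
        let list: Vec<String> = self.items.iter().map(|s| format!("`{}`", s)).collect();
        format!("expected one of {} at position {}", list.join(", "), self.furthest)
    }
}
```

Using a set rather than a list deduplicates entries, which matters for a GLL parser where several alternatives may attempt the same match at the same position.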
Another technique that would help localize and constrain an error is to use backward parsing (from #13) after an error in forward parsing, to find the "longest valid prefix and suffix" of the input. If the most advanced failures in both directions get close together, the syntax error can be localized to as little as one token/character.
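As a rough illustration of the prefix/suffix idea, the sketch below uses a toy grammar, `Digit ('+' Digit)*`, which reads the same in both directions so that one scanner can serve as both the forward and the backward "parser". The grammar and the `first_failure` and `localize` functions are assumptions made for this example; a real implementation would run the actual grammar backwards as discussed in #13.

```rust
use std::ops::Range;

// Scans bytes in the given order for the toy grammar `Digit ('+' Digit)*`
// and returns the index of the first byte that cannot extend a valid parse.
// Returns `None` if no byte is rejected (running out of input is ignored).
fn first_failure(bytes: impl Iterator<Item = (usize, u8)>) -> Option<usize> {
    let mut expect_digit = true;
    for (i, b) in bytes {
        let ok = if expect_digit { b.is_ascii_digit() } else { b == b'+' };
        if !ok {
            return Some(i);
        }
        expect_digit = !expect_digit;
    }
    None
}

// Combines the most advanced forward and backward failures into a byte
// range covering the suspected error. Only handles the case where both
// directions reject some byte.
fn localize(input: &str) -> Option<Range<usize>> {
    let bytes = input.as_bytes();
    let fwd = first_failure(bytes.iter().copied().enumerate())?;
    let bwd = first_failure(bytes.iter().copied().enumerate().rev())?;
    Some(fwd.min(bwd)..fwd.max(bwd) + 1)
}
```

For `1+2+*3+4` both directions stop at the `*`, so the range covers that single character. For `1+23+4` the valid prefix `1+2` and valid suffix `3+4` are adjacent, and the range spans the two characters around the missing `+`.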
Error recovery could be done for a localized error by either:
`f(x.)` could recover as `Call(Var("f"), Field(Var("x"), ""))`, instead of showing up as `f(x.a)` or similar (from picking a character that'd work) - this would be useful to IDEs

All approaches to recovery for a GLL parser can involve some amount of non-determinism, allowing multiple recovery possibilities to continue through, and picking the best outcome through heuristics.
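The `f(x.)` recovery can be sketched with a tiny hand-written parser. This is a deterministic recursive-descent toy, not a GLL parser, so it explores only one recovery possibility and uses no heuristics. The `Expr` shape mirrors the `Call`/`Field`/`Var` example above, while `Parser` and `parse_with_recovery` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Field(Box<Expr>, String),
    Call(Box<Expr>, Box<Expr>),
}

struct Parser<'a> {
    input: &'a [u8],
    pos: usize,
    errors: Vec<usize>, // byte positions where something was missing
}

impl<'a> Parser<'a> {
    // Reads a possibly empty identifier; emptiness signals an error to the caller.
    fn ident(&mut self) -> String {
        let start = self.pos;
        while self.pos < self.input.len()
            && (self.input[self.pos].is_ascii_alphanumeric() || self.input[self.pos] == b'_')
        {
            self.pos += 1;
        }
        String::from_utf8_lossy(&self.input[start..self.pos]).into_owned()
    }

    fn eat(&mut self, b: u8) -> bool {
        if self.input.get(self.pos) == Some(&b) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    fn expr(&mut self) -> Expr {
        let name = self.ident();
        if name.is_empty() {
            self.errors.push(self.pos);
        }
        let mut e = Expr::Var(name);
        loop {
            if self.eat(b'.') {
                // Recovery: keep an empty field name instead of inventing one.
                let field = self.ident();
                if field.is_empty() {
                    self.errors.push(self.pos);
                }
                e = Expr::Field(Box::new(e), field);
            } else if self.eat(b'(') {
                let arg = self.expr();
                if !self.eat(b')') {
                    self.errors.push(self.pos);
                }
                e = Expr::Call(Box::new(e), Box::new(arg));
            } else {
                return e;
            }
        }
    }
}

fn parse_with_recovery(src: &str) -> (Expr, Vec<usize>) {
    let mut p = Parser { input: src.as_bytes(), pos: 0, errors: Vec::new() };
    let e = p.expr();
    (e, p.errors)
}
```

Parsing `f(x.)` this way yields `Call(Var("f"), Field(Var("x"), ""))` plus one recorded error at the position right after the `.`, which is the kind of tree an IDE could still use for completion.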