Earley predict on complete input
An original (maybe overeager!) optimization was included with Earley
parsing. If there was no remaining input text, then no more predictions
from the grammar were attempted. In most cases, this saves Earley
some work. But the BNF crate supports empty production
rules. That invalidates the optimization, because an empty production
may still match successfully even with no remaining input text.

This commit only removes this optimization built on that false
assumption. It would be possible to reintroduce this improvement, but
*only* for grammars without any empty productions.
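
A minimal sketch of how such a guard could look, assuming a simplified grammar representation. The `Term` and `Production` types and both functions below are hypothetical and written only for illustration; they are not the BNF crate's actual API. The check is deliberately conservative: it treats any `""` terminal anywhere in the grammar as disabling the shortcut.

```rust
// Hypothetical, simplified grammar types for illustration only.
// These are NOT the BNF crate's real types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Terminal(String),
    Nonterminal(String),
}

/// One rule: a left-hand nonterminal name and its alternatives.
#[allow(dead_code)]
struct Production {
    lhs: String,
    alternatives: Vec<Vec<Term>>,
}

/// Conservative check: returns true if an empty terminal (`""`)
/// appears anywhere in the grammar. If none appears, no nonterminal
/// can derive the empty string.
fn has_empty_terminal(grammar: &[Production]) -> bool {
    grammar.iter().any(|production| {
        production.alternatives.iter().any(|alternative| {
            alternative
                .iter()
                .any(|term| matches!(term, Term::Terminal(text) if text.is_empty()))
        })
    })
}

/// Prediction may be skipped only when the input is fully consumed
/// AND the grammar cannot match the empty string anywhere.
fn can_skip_predict(input_is_complete: bool, grammar_has_empty: bool) -> bool {
    input_is_complete && !grammar_has_empty
}
```

With this sketch, `has_empty_terminal` would be computed once per grammar, and the predict step would break early only when `can_skip_predict` returns true. For the `<start> ::= "hi" <empty>` grammar in the new test, the shortcut stays disabled.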
CrockAgile committed Sep 13, 2022
1 parent e5a894b commit e31217e
Showing 1 changed file with 23 additions and 4 deletions.
27 changes: 23 additions & 4 deletions src/earley.rs
@@ -502,10 +502,6 @@ impl<'gram> Iterator for ParseIter<'gram> {
match matching {
// predict
Some(matching @ Term::Nonterminal(_)) => {
// no need to predict for more input if input is complete
if input_range.is_complete() {
break;
}
let predictions = predict(matching, &input_range, &self.grammar);
self.state_arena.alloc_extend(predictions);
}
@@ -590,6 +586,29 @@ mod tests {
assert_eq!(parses.count(), 2);
}

#[test]
fn parse_complete_empty() {
let grammar: Grammar = "<start> ::= \"hi\" <empty>
<empty> ::= \"\""
.parse()
.unwrap();

let input = "hi";

let parses = parse(&grammar, input);
assert_eq!(parses.count(), 1);
}

#[test]
fn parse_empty() {
let grammar: Grammar = "<start> ::= \"\"".parse().unwrap();

let input = "";

let parses = parse(&grammar, input);
assert_eq!(parses.count(), 1);
}

// (source: <https://loup-vaillant.fr/tutorials/earley-parsing/recogniser>)
// Sum -> Sum [+-] Product
// Sum -> Product