Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a test case for issue #290 #343

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 54 additions & 0 deletions lrtable/src/lib/statetable.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,7 @@ pub struct StateTable<StorageT> {
tokens_len: TIdx<StorageT>,
conflicts: Option<Conflicts<StorageT>>,
final_state: StIdx<StorageT>,
rule_usage: Vec<(RIdx<StorageT>, bool)>,
}

#[derive(Clone, Copy, Debug, PartialEq)]
Expand Down Expand Up @@ -191,6 +192,10 @@ where
let mut reduce_reduce = Vec::new();
let mut shift_reduce = Vec::new();
let mut final_state = None;
let mut rule_usage = grm
.iter_rules()
.map(|ridx| (ridx, false))
.collect::<Vec<_>>();

for (stidx, state) in sg
.iter_closed_states()
Expand Down Expand Up @@ -226,6 +231,8 @@ where
match pidx.cmp(&r_pidx) {
Ordering::Less => {
reduce_reduce.push((pidx, r_pidx, stidx));
let ridx = grm.prod_to_rule(pidx);
*rule_usage.get_mut(usize::from(ridx)).unwrap() = (ridx, true);
actions[off] = StateTable::encode(Action::Reduce(pidx));
}
Ordering::Greater => reduce_reduce.push((r_pidx, pidx, stidx)),
Expand Down Expand Up @@ -261,6 +268,7 @@ where
let off = (usize::from(stidx) * usize::from(nt_len)) + usize::from(s_ridx);
debug_assert!(gotos[off] == 0);
// Since 0 is reserved for no entry, encode states by adding 1
*rule_usage.get_mut(usize::from(s_ridx)).unwrap() = (s_ridx, true);
gotos[off] = usize::from(*ref_stidx) + 1;
}
Symbol::Token(s_tidx) => {
Expand All @@ -279,6 +287,7 @@ where
*ref_stidx,
&mut shift_reduce,
stidx,
&mut rule_usage,
);
}
Action::Accept => panic!("Internal error"),
Expand Down Expand Up @@ -370,6 +379,7 @@ where
tokens_len: grm.tokens_len(),
conflicts,
final_state: final_state.unwrap(),
rule_usage,
})
}

Expand Down Expand Up @@ -484,6 +494,21 @@ where
pub fn conflicts(&self) -> Option<&Conflicts<StorageT>> {
self.conflicts.as_ref()
}

pub fn rules_used(&self) -> Vec<RIdx<StorageT>> {
self.rule_usage
.iter()
.filter(|(_, used)| *used)
.map(|(ridx, _)| *ridx)
.collect::<Vec<RIdx<StorageT>>>()
}
pub fn rules_unused(&self) -> Vec<RIdx<StorageT>> {
self.rule_usage
.iter()
.filter(|(_, used)| !*used)
.map(|(ridx, _)| *ridx)
.collect::<Vec<RIdx<StorageT>>>()
}
}

fn actions_offset<StorageT: PrimInt + Unsigned>(
Expand Down Expand Up @@ -541,6 +566,7 @@ fn resolve_shift_reduce<StorageT: 'static + Hash + PrimInt + Unsigned>(
stidx: StIdx<StorageT>, // State we want to shift to
shift_reduce: &mut Vec<(TIdx<StorageT>, PIdx<StorageT>, StIdx<StorageT>)>,
conflict_stidx: StIdx<StorageT>, // State in which the conflict occured
rule_usage: &mut [(RIdx<StorageT>, bool)],
) where
usize: AsPrimitive<StorageT>,
{
Expand All @@ -551,6 +577,8 @@ fn resolve_shift_reduce<StorageT: 'static + Hash + PrimInt + Unsigned>(
// If the token and production don't both have precedences, we use Yacc's default
// resolution, which is in favour of the shift.
actions[off] = StateTable::encode(Action::Shift(stidx));
let ridx = grm.prod_to_rule(pidx);
*rule_usage.get_mut(usize::from(ridx)).unwrap() = (ridx, true);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this marking things as used when there is a conflict resolved in their favor may not be right,
in cases when there is some subsequent conflict with the production that it was resolved in favor of

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit that I have no idea!

Copy link
Collaborator Author

@ratmice ratmice Aug 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm going to try and avoid relying on that, if we look at what is still reachable after conflicts have been removed.
and use the forward pass merely for rules which are plain unused. It seems like a less fragile than trying to rely the above.

The actual problem I was running into with the case above, is the wrong pidx, and it seems much more difficult (if actually possible) to go from stidx back to a pidx, which working based on what was discarded it appears doesn't need to happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From memory, a StIdx can represent multiple productions. I'm not sure if that helps you either way!

shift_reduce.push((tidx, pidx, conflict_stidx));
}
(Some(token_prec), Some(prod_prec)) => {
Expand All @@ -565,6 +593,8 @@ fn resolve_shift_reduce<StorageT: 'static + Hash + PrimInt + Unsigned>(
}
(AssocKind::Right, AssocKind::Right) => {
// Right associativity is resolved in favour of the shift.
let ridx = grm.prod_to_rule(pidx);
*rule_usage.get_mut(usize::from(ridx)).unwrap() = (ridx, true);
actions[off] = StateTable::encode(Action::Shift(stidx));
}
(AssocKind::Nonassoc, AssocKind::Nonassoc) => {
Expand All @@ -579,6 +609,8 @@ fn resolve_shift_reduce<StorageT: 'static + Hash + PrimInt + Unsigned>(
}
Ordering::Greater => {
// The token has higher level precedence, so resolve in favour of shift.
let ridx = grm.prod_to_rule(pidx);
*rule_usage.get_mut(usize::from(ridx)).unwrap() = (ridx, true);
actions[off] = StateTable::encode(Action::Shift(stidx));
}
Ordering::Less => {
Expand Down Expand Up @@ -990,4 +1022,26 @@ D : D;
Err(e) => panic!("Incorrect error returned {:?}", e),
}
}

#[test]
fn rule_usage() {
let grm = YaccGrammar::new(
YaccKind::Original(YaccOriginalActionKind::GenericParseTree),
"
%start A
%%
A: | 'a' | C;
B: 'b';
C: 'c';
D: B;
",
)
.unwrap();
let sg = pager_stategraph(&grm);
let st = StateTable::new(&grm, &sg).unwrap();
assert!(st.rules_used().contains(&grm.rule_idx("A").unwrap()));
assert!(st.rules_used().contains(&grm.rule_idx("C").unwrap()));
assert!(st.rules_unused().contains(&grm.rule_idx("B").unwrap()));
assert!(st.rules_unused().contains(&grm.rule_idx("D").unwrap()));
}
}
4 changes: 4 additions & 0 deletions nimbleparse/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -20,3 +20,7 @@ lrlex = { path="../lrlex", version="0.12" }
lrpar = { path="../lrpar", version="0.12" }
lrtable = { path="../lrtable", version="0.12" }
num-traits = "0.2"

[features]
default = []
test_known_failures = []
80 changes: 80 additions & 0 deletions nimbleparse/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -229,3 +229,83 @@ fn main() {
process::exit(1);
}
}

#[cfg(test)]
mod test {
use cfgrammar::{
yacc::{YaccGrammar, YaccKind, YaccOriginalActionKind},
RIdx, Span,
};
use lrlex::{DefaultLexeme, LRNonStreamingLexerDef, LexerDef};
use lrpar::{lex_api::NonStreamingLexer, parser::AStackType, RTParserBuilder};
use num_traits::ToPrimitive;
use std::sync::atomic::{AtomicU8, Ordering};

static COUNT: AtomicU8 = AtomicU8::new(0);

type TestAction<'a> = &'a dyn Fn(
RIdx<u32>,
&dyn NonStreamingLexer<DefaultLexeme, u32>,
Span,
std::vec::Drain<AStackType<DefaultLexeme, ()>>,
(),
);

fn test_action(
_ridx: RIdx<u32>,
_lexer: &dyn NonStreamingLexer<DefaultLexeme, u32>,
_span: Span,
_astack: std::vec::Drain<AStackType<DefaultLexeme, ()>>,
_param: (),
) {
let x = COUNT.load(Ordering::SeqCst);
if x < std::u8::MAX {
COUNT.store(x + 1, Ordering::SeqCst);
} else {
panic!("actions executed too many times")
}
}

#[test]
fn test_issue_290_recursive_rules() {
let mut lexerdef = LRNonStreamingLexerDef::<DefaultLexeme, u32>::from_str(
r"
%%
a 'a'
[\t\n ] ;",
)
.unwrap();
// These rules have conflicts which when ignored and passed input
// can lead to calling the action in an infinite loop.
//
// Yacc seems to ignore these silently, while bison produces a "useless rule" warning.
let srcs = [
r"%%
Start: Bar;
Foo: 'a' | ;
Bar: Foo | Foo Bar;
",
r"%%
Start: Bar;
Foo: 'a' | ;
Bar: Foo | Foo Baz;
Baz: Foo | Foo Bar;
",
];
for src in srcs {
let grm = YaccGrammar::new(YaccKind::Original(YaccOriginalActionKind::NoAction), src)
.unwrap();
let (_, state_tbl) = lrtable::from_yacc(&grm, lrtable::Minimiser::Pager).unwrap();
let rule_ids = grm
.tokens_map()
.iter()
.map(|(&n, &i)| (n, usize::from(i).to_u32().unwrap()))
.collect();
lexerdef.set_rule_ids(&rule_ids);
let lexer = lexerdef.lexer("a");
let pb = RTParserBuilder::new(&grm, &state_tbl);
let actions: &[TestAction] = &[&test_action, &test_action, &test_action, &test_action];
pb.parse_actions(&lexer, actions, ());
}
}
}