-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the compiler produce minimal DFAs #91
Comments
Tagging @V0ldek for notifications |
Looking into this issue. So far:
It seems to me that the minimzation will only take place within checkpoint-created boundaries. Does this seem correct? I intend to look further into this issue in the coming weeks. |
Current checkpoint-based code is basically a hack to produce "somewhat-minimal" automata. In particular it can be proven that without wildcards the checkpoints are enough for the result to be truly minimal (Lemma 5.1. in https://v0ldek.com/masters). The property is that once you reach a checkpoint you can forget all the other states, which means that a checkpoint will always be a unique state in any resulting DFA. So minimizing between them should be enough. BTW, you can make your life testing this easier with https://paperman.name/semigroup/. It's a tool for automata properties, and allows you to minimize them. If you run For example: just r -c '$..a.*.*..b.*.*' -v will output as one of the logs: [rsonpath_lib::query::automaton] NFA: s0.a -> s0, s1;
s0.b -> s0;
s0.X -> s0;
s1.a -> s2;
s1.b -> s2;
s1.X -> s2;
s2.a -> s3;
s2.b -> s3;
s2.X -> s3;
s3.b -> s3, s4;
s3.a -> s3;
s3.X -> s3;
s4.a -> s5;
s4.b -> s5;
s4.X -> s5;
s5.a -> s6;
s5.b -> s6;
s5.X -> s6; You can copy-paste it into semigroup to get this nice graph: (By convention we mark the wildcard as capital X, since the tool only accepts letters) So if you come up with an interesting case to test you can easily find the minimal automaton to verify the result. |
Is your feature request related to a problem? Please describe.
After #7 the compiler no longer necessarily outputs a minimal DFA. It's still a correct DFA, just not minimal.
Describe the solution you'd like
The algorithm in
query::automaton::minimizer
needs to be rewritten to minimize the automaton.Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Example query that produces a non-minimal automaton:
$..a.*.*..b.*.*
. In the resulting automaton below states 3 and 4 are equivalent.The text was updated successfully, but these errors were encountered: