Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative matching strategies for some patterns #5

Open
kaya3 opened this issue Sep 4, 2022 · 2 comments
Open

Alternative matching strategies for some patterns #5

kaya3 opened this issue Sep 4, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@kaya3
Copy link
Owner

kaya3 commented Sep 4, 2022

The current implementation uses a pair of DFAs to keep track of all matches of all input patterns in the grid. This is suboptimal for 1x1 patterns, because each 1x1 pattern is matched by many different DFA states, causing duplication of the match handling code; something similar occurs for patterns of height 1.

Additionally, the DFAs can get quite big when they have to match patterns larger than 4x4; for example the compiled output for the BasicDungeonGrowth model is about 833KB in size, simply because it has a single 5x5 pattern with 8 symmetries. Likewise, the PacMan model compiles to about 872KB, about 75% of which is attributable to just three input patterns. This is also a performance issue, since the time required to update the DFA states scales with the size of the largest input pattern, even if only one cell in the grid is changed.

Some better strategies:

  • For 1x1 input patterns, there is no need for DFAs at all; matching them is trivial.
  • For other patterns of height 1, matches can be handled directly by the "row DFA", reducing the number of patterns that the "col DFA" needs to recognise.
  • For large input patterns, revert to either a direct search or something like the algorithm used in the original MarkovJunior project (source link).

In the latter case, control-flow analysis may allow such input patterns to be ignored in parts of the program where matches will no longer be used.

@kaya3 kaya3 added the enhancement New feature or request label Sep 4, 2022
@kaya3
Copy link
Owner Author

kaya3 commented Sep 27, 2022

Commit b5130a560f8809fcdf5a13707e000b5fbb29a6eb handles patterns of height 1 in the row DFA, which allows the column DFA to recognise fewer patterns but also allows its alphabet to be reduced (since there are fewer pattern-rows that it needs to distinguish). This includes 1x1 patterns; it may or may not still be worth handling those separately.

The compiled outputs for BasicDungeonGrowth.mjr and PacMan.mjr are now 481KB (42% smaller) and 290KB (67% smaller) respectively.

@kaya3
Copy link
Owner Author

kaya3 commented Sep 29, 2022

With commit 7e70d8313674668a76419148bbc220b7db091030, duplication in the match-handling code is now avoided; the algorithm now uses bitmasks to determine which match handlers to dispatch. This also avoids having to remove and re-add matches which weren't broken by a change. This reduces the compiled output sizes for the sample programs by a little bit more, though the DFA transition table sizes are still the main issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant