Use FastInteractiveParser Subclass to Improve CFGFSM Performance #622
Improves performance of lark_lark_self_grammar.lark.test-True (the -True suffix means the cache is already populated, simulating a second run):

39 tokens / second -> 63 tokens / second
TODO:
- apply functools.lru_cache to RegexFSM

Related issue: Improve CFGFSM performance #621

Problem
Substantial time is spent in InteractiveParser.accepts() / InteractiveParser.feed_token() because they deepcopy Token objects along with all their properties. That metadata is never mutated, however, so the copies are unnecessary (see https://github.com/lark-parser/lark/blob/master/lark/parsers/lalr_interactive_parser.py).
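To illustrate the overhead with a minimal stand-in (not lark's actual Token class): deepcopy allocates a fresh object and copies every attribute, even though none of that metadata will ever change afterwards:

```python
import copy

class Token:
    """Minimal stand-in for lark's Token: a value plus immutable metadata."""
    def __init__(self, type_, value, start_pos, end_pos, line, column):
        self.type = type_
        self.value = value
        self.start_pos = start_pos
        self.end_pos = end_pos
        self.line = line
        self.column = column

tok = Token("NAME", "x", 0, 1, 1, 0)
clone = copy.deepcopy(tok)

# A brand-new object was allocated and all six attributes copied,
# even though the metadata is never mutated after construction.
assert clone is not tok
assert clone.__dict__ == tok.__dict__
```

Done once this is cheap, but accepts()/feed_token() run on every candidate token, so the allocations add up.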
Solution

Create a global (class-variable) copy_memo which stores the Tokens that would otherwise be deepcopied. Instead of deepcopying Tokens, retrieve them from the copy_memo.

New Benchmark
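The copy_memo idea from the Solution can be sketched as follows; the class and method names here are hypothetical stand-ins, not the PR's actual code. The trick is that copy.deepcopy accepts a memo dict mapping id(obj) -> obj, and returns the memoized object instead of copying it:

```python
import copy

class Token:
    """Minimal stand-in for lark's Token (metadata is never mutated)."""
    def __init__(self, type_, value):
        self.type = type_
        self.value = value

class ParserState:
    """Hypothetical stand-in for the interactive parser's copyable state."""
    # Class-level memo shared by every copy: id(token) -> token.
    copy_memo = {}

    def __init__(self, stack, tokens):
        self.stack = stack      # mutable: must really be copied
        self.tokens = tokens    # immutable Tokens: safe to share

    def fast_deepcopy(self):
        # Pre-seed deepcopy's memo so it returns the cached Token
        # instead of allocating and copying a new one.
        memo = {}
        for t in self.tokens:
            cached = ParserState.copy_memo.setdefault(id(t), t)
            memo[id(t)] = cached
        return copy.deepcopy(self, memo)

state = ParserState([1, 2], [Token("NAME", "x")])
clone = state.fast_deepcopy()

assert clone.tokens[0] is state.tokens[0]  # Token shared, not copied
assert clone.stack == state.stack and clone.stack is not state.stack
```

Because the memo keys are object ids, the class-level copy_memo also keeps the Tokens alive, which prevents a recycled id from aliasing a dead Token.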
Profile: