Switch FST to DoubleArrayTrie #76
Need rust version >= 1.46.0 to build
This branch passes
Here is the result of cargo bench. This improves the speed of the tokenize methods :)
After upgrading to yada 0.3.1, there are no errors!!
This is the result of
And dict.fst data sizes.
Also remove the FST name from this repository
I got an error building lindera-tantivy. The Tokenizer has to be cloneable. Related: takuyaa/yada#8
Tokenizer needs to be cloneable
Fixed Cloneable.
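For readers following the Clone discussion, here is a minimal sketch of the general Rust constraint involved: a struct can only derive Clone if every field is Clone, so a tokenizer that owns its dictionary needs a cloneable dictionary type. The names `DictData` and `SketchTokenizer` are hypothetical and are not lindera's or yada's actual types. Wrapping the bytes in `Arc` is just one possible design, assumed here for illustration.

```rust
use std::sync::Arc;

/// Hypothetical dictionary holder; NOT lindera's or yada's real type.
/// The serialized trie bytes sit behind an Arc, so cloning copies a
/// pointer and bumps a reference count instead of copying the data.
#[derive(Clone)]
struct DictData {
    bytes: Arc<Vec<u8>>,
}

/// Hypothetical tokenizer. `#[derive(Clone)]` compiles only because
/// every field (here, `DictData`) implements Clone itself.
#[derive(Clone)]
struct SketchTokenizer {
    dict: DictData,
}

impl SketchTokenizer {
    /// Builds a tokenizer from already-serialized dictionary bytes.
    fn from_bytes(bytes: Vec<u8>) -> Self {
        SketchTokenizer {
            dict: DictData { bytes: Arc::new(bytes) },
        }
    }
}
```

With this layout, cloning the tokenizer for each consumer (for example, one per indexing thread) shares a single copy of the dictionary bytes.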
I think it's ready for review. I'll check the other builders after the new version is released.
LGTM
Thanks for the good work!
Switch FST library to yada (Double Array Trie) in PrefixDict.
Need rust version >= 1.46.0 to build.
This PR breaks compatibility with the dictionary data structure of the previous version.
Still a work in progress. Remaining tasks:
.fst file to .da
Check/test lindera-server
Change other builders, e.g. neologd, unidic
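As background for the switch described above, here is a rough sketch of the kind of lookup a prefix dictionary performs: given input text, return every dictionary key that is a prefix of it. This uses a plain HashMap-based trie from the standard library, not a double-array trie and not yada's API. `SimplePrefixDict` and its methods are hypothetical names for illustration only. A double-array trie answers the same query but stores the nodes in flat arrays.

```rust
use std::collections::HashMap;

/// One trie node: child nodes keyed by the next byte, plus an optional
/// value if a dictionary key ends at this node.
#[derive(Default)]
struct TrieNode {
    children: HashMap<u8, TrieNode>,
    value: Option<u32>,
}

/// Hypothetical stand-in for a prefix dictionary (NOT lindera's PrefixDict).
#[derive(Default)]
struct SimplePrefixDict {
    root: TrieNode,
}

impl SimplePrefixDict {
    /// Inserts `key` byte by byte and stores `value` at the final node.
    fn insert(&mut self, key: &str, value: u32) {
        let mut node = &mut self.root;
        for &b in key.as_bytes() {
            node = node.children.entry(b).or_default();
        }
        node.value = Some(value);
    }

    /// Returns (value, matched length in bytes) for every non-empty key
    /// that is a prefix of `text`, shortest match first.
    fn common_prefix_search(&self, text: &str) -> Vec<(u32, usize)> {
        let mut hits = Vec::new();
        let mut node = &self.root;
        for (i, b) in text.as_bytes().iter().enumerate() {
            match node.children.get(b) {
                Some(next) => node = next,
                None => break, // no key continues with this byte
            }
            if let Some(v) = node.value {
                hits.push((v, i + 1));
            }
        }
        hits
    }
}
```

A tokenizer would typically run this search at each candidate start position in the input and feed the hits into its lattice. Lengths are in bytes, so a three-character Japanese key encoded as UTF-8 reports a length of 9.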