Improve performance changes (draft) #18
Merged
I was playing around with the library and made a few changes, which I could split into multiple PRs if needed.
I would like to hear your thoughts on this.
Changes for improved performance
- `add_document`: use `Cow` for hash keys.
- `hashbrown` `HashMap` instead of `std::collections`. Faster for small hash keys, though it does not provide the same level of HashDoS resistance.
- `TermData`: the `query_terms` property is replaced with `query_terms_len` to prevent unnecessary copies of all query terms during a query (not benchmarked yet). (Breaking change for the `ScoreCalculator` trait)

Bench results on my computer
| | add_100k_docs |
| --- | --- |
| Master | 301.19 ms |
| This branch | 221.24 ms |
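
To illustrate the `Cow` hash-key idea from the performance changes, here is a minimal sketch. It is not the code from this branch: `normalize` and `count_terms` are hypothetical helpers, and it uses `std::collections::HashMap` so that it compiles without extra crates. The point is only that a `Cow<str>` key borrows from the input when a token needs no changes and allocates only when it does.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

/// Hypothetical normalizer: lowercases a token, allocating only when the
/// token actually contains an uppercase character.
fn normalize(token: &str) -> Cow<'_, str> {
    if token.chars().any(|c| c.is_uppercase()) {
        Cow::Owned(token.to_lowercase())
    } else {
        Cow::Borrowed(token)
    }
}

/// Hypothetical term counter keyed by `Cow<str>`; keys that are already
/// normalized are stored as borrowed slices of the input.
fn count_terms<'a>(tokens: &[&'a str]) -> HashMap<Cow<'a, str>, u32> {
    let mut counts = HashMap::new();
    for &token in tokens {
        *counts.entry(normalize(token)).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_terms(&["rust", "Rust", "index"]);
    // `Cow<str>` implements `Borrow<str>`, so lookups take a plain `&str`.
    println!("{:?}", counts.get("rust"));
    println!("{:?}", counts.get("index"));
}
```

Because lookups go through `Borrow<str>`, callers never need to build a `Cow` just to query the map.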
API change ideas
- `Filter` is removed, since the `Tokenizer` can be seen as a preprocessing step for both indexing and querying and can do anything that the `Filter` does. One argument less to worry about. (Breaking change)
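
As a rough sketch of why a separate `Filter` may be unnecessary, the tokenizer below splits the text and also does the per-token cleanup a filter would otherwise perform. The `Tokenizer` type alias and the `tokenize_and_filter` function are assumptions made for illustration; the library's actual signatures may differ.

```rust
use std::borrow::Cow;

/// Hypothetical tokenizer signature, assumed for this sketch.
type Tokenizer = for<'a> fn(&'a str) -> Vec<Cow<'a, str>>;

/// Splits on whitespace, strips surrounding punctuation, drops empty
/// tokens, and lowercases. The last three steps are the kind of work a
/// separate filter stage would otherwise do.
fn tokenize_and_filter(text: &str) -> Vec<Cow<'_, str>> {
    text.split_whitespace()
        .map(|word| word.trim_matches(|c: char| !c.is_alphanumeric()))
        .filter(|word| !word.is_empty())
        .map(|word| {
            if word.chars().any(|c| c.is_uppercase()) {
                Cow::Owned(word.to_lowercase())
            } else {
                Cow::Borrowed(word)
            }
        })
        .collect()
}

fn main() {
    // The same function can be handed to both the indexing and query paths.
    let tokenizer: Tokenizer = tokenize_and_filter;
    let tokens = tokenizer("Hello, World! --");
    println!("{:?}", tokens);
}
```

With this shape, callers pass a single function wherever a tokenizer and a filter were previously passed together.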