Bug: tolerance option not behaving as hoped #480
Comments
I fear that's a known issue. We're performing the Levenshtein edit distance on words living in the same prefix bucket, rather than performing the edit distance calculation on trees. For instance, searching for
I'll be putting a bounty on this bug, thanks for opening it!
/bounty 500
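To make the limitation described in the comment above concrete, here is a minimal TypeScript sketch. It is not Orama's actual implementation: the function names, the flat word list, and the choice of "first `prefixLen` characters" as the bucket key are all hypothetical simplifications. It only illustrates why restricting the Levenshtein check to words that share a prefix with the search term misses typos made early in the word.

```typescript
// Plain dynamic-programming Levenshtein distance
// (insertions, deletions and substitutions each cost 1).
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical model of the reported behaviour: only words in the same
// "prefix bucket" as the term are compared against it.
// Assumption: the bucket key is the first `prefixLen` characters.
function searchWithinPrefixBucket(
  words: string[],
  term: string,
  tolerance: number,
  prefixLen = 1
): string[] {
  const prefix = term.slice(0, prefixLen);
  return words.filter(
    (w) => w.startsWith(prefix) && levenshtein(w, term) <= tolerance
  );
}

// Reference behaviour for comparison: every indexed word is a candidate.
function searchAllWords(
  words: string[],
  term: string,
  tolerance: number
): string[] {
  return words.filter((w) => levenshtein(w, term) <= tolerance);
}

const indexedWords = ["chris", "christopher", "matrix"];

// The typo is in the first character, so the bucket for "k" is empty.
const bucketed = searchWithinPrefixBucket(indexedWords, "khris", 1);
// Comparing against every word finds "chris" at distance 1.
const exhaustive = searchAllWords(indexedWords, "khris", 1);

console.log(bucketed, exhaustive);
```

In this toy model, a typo later in the word (for example "chirs") still lands in the right bucket and matches with a tolerance of 2, which is consistent with typo tolerance appearing to work for some queries and not for others.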
Hey @micheleriva, I would like to work on this issue. Can you please assign this issue to me?
Hey, I have a solution
Note: The user @mnmt7 is already attempting to complete issue #480 and claim the bounty. If you attempt to complete the same issue, there is a chance that @mnmt7 will complete the issue first, and be awarded the bounty. We recommend discussing with @mnmt7 and potentially collaborating on the same solution versus creating an alternate solution.
@ttillberg is this open to work?
@ogil7190 yes
/attempt #480
/attempt #480
💡 @SP321 submitted a pull request that claims the bounty. You can visit your org dashboard to reward.
🎉🎈 @SP321 has been awarded $500! 🎈🎊
Fixed with v1.2.11
Thanks for the amazing lib and clear documentation! I'm looking at using Orama to search local chat messages (typically involving a few words up to several sentences).
Using @orama/orama ^1.2.3, I'm getting fast and correct results for exact and prefix matching; however, typos don't seem to work the way I was hoping. I'm probably missing the obvious, but testing the tolerance parameter against an example in the docs returns poor results, so I'm wondering what could be wrong.
Looking at the following example:
https://docs.oramasearch.com/usage/search/introduction#typo-tolerance
If I grab a slightly bigger database:
https://github.com/erik-sytnyk/movies-list/blob/master/db.json
Here's my playground (all output is in the console):
https://codesandbox.io/p/sandbox/keen-knuth-9wql22?file=/src/main.ts:65,28
I've played with other options, such as the tokenizer, stemming, relevance, and threshold, but without luck. What am I missing?