How to tweak handling of spaces? #5

Closed
tomas opened this issue Oct 5, 2022 · 5 comments
Labels
question Further information is requested

Comments

@tomas

tomas commented Oct 5, 2022

Hi, your library looks awesome! Kudos to you.

I was trying the demo, but I couldn't find a way to tell uFuzzy to include results that contain spaces in between the search query. I want to get results like "Super Markup Man" or "Super Mario" when querying for "superma" (just remove the space between super and ma in the example query).

image

Is this possible?

@leeoniya
Owner

leeoniya commented Oct 5, 2022

thanks!

  • intraIns controls how many extra chars are allowed between each char within a term, so you can set it to some big number, or Infinity. deleting the limit in the demo sets it to Infinity
  • intraChars controls which chars can occur between chars within a term, so you can add a space to the regex: [a-z\d ]

https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=superma&intraIns=inf&intraChars=[a-z\d%20]
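
One way to picture what these two settings do is to treat the term as a regex in which every pair of adjacent query chars is separated by "up to intraIns chars drawn from intraChars". The sketch below is only a simplified model of that idea, not uFuzzy's actual implementation; `buildTermRegex` and the sample strings are hypothetical and exist only for illustration.

```javascript
// Simplified model only; NOT uFuzzy's real matching code.
// buildTermRegex is a hypothetical helper name.
function buildTermRegex(term, intraChars, intraIns) {
  // Escape regex metacharacters in each query char.
  const escaped = [...term].map((ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  // Infinity means "no upper limit", otherwise use a bounded quantifier.
  const gap = intraIns === Infinity ? `${intraChars}*` : `${intraChars}{0,${intraIns}}`;
  return new RegExp(escaped.join(gap), "i");
}

// Made-up sample data.
const sampleTitles = ["Super Mario", "Super Markup Man", "Supermarket", "such a pure romance"];
const hits = (re) => sampleTitles.filter((s) => re.test(s));

// No space allowed inside a term, no extra chars: only the contiguous match.
const strictHits = hits(buildTermRegex("superma", "[a-z\\d]", 0));
// Space allowed, at most 1 extra char between query chars.
const tightHits = hits(buildTermRegex("superma", "[a-z\\d ]", 1));
// Space allowed, unlimited extra chars: the scattered match gets in too.
const looseHits = hits(buildTermRegex("superma", "[a-z\\d ]", Infinity));

console.log(strictHits); // ["Supermarket"]
console.log(tightHits);  // ["Super Mario", "Super Markup Man", "Supermarket"]
console.log(looseHits);  // all four titles
```

With the library itself, the equivalent settings would presumably be passed as options along the lines of `{ intraIns: 1, intraChars: "[a-z\\d ]" }`; check the README for the exact option format.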

depending on your corpus, this may or may not be great. effectively it treats the whole thing as one big term with a bunch of junk in between. you'll also lose the benefit of boosting multiple terms that fall at expected bounds (whitespace, punctuation, etc). you can do a bit better by swapping out the sorting function for one that prioritizes other aspects of matches to maybe get better ordering than you get in the demo.
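
As a rough illustration of what a replacement sort could prioritize, the sketch below ranks matches by how few chars were inserted inside the term, then by how early the match starts. The record shape (`text`, `start`, `extra`), the `rankMatches` helper, and the sample values are all assumptions made up for this example; they are not uFuzzy's info object or its sort signature, so check the README before wiring anything like this into the library.

```javascript
// Hypothetical match records, not uFuzzy's info object.
// `start` = index of the first matched char,
// `extra` = total chars inserted between matched chars.
function rankMatches(matches) {
  // Copy first so the caller's array is not mutated.
  return [...matches].sort(
    (a, b) =>
      a.extra - b.extra ||              // fewer inserted chars first
      a.start - b.start ||              // then earlier matches
      a.text.localeCompare(b.text)      // then alphabetical as a tie-breaker
  );
}

// Made-up sample values for the query "superma".
const sampleMatches = [
  { text: "such a pure romance", start: 0, extra: 9 },
  { text: "Super Mario", start: 0, extra: 1 },
  { text: "Supermarket", start: 0, extra: 0 },
];

const ranked = rankMatches(sampleMatches).map((m) => m.text);
console.log(ranked); // ["Supermarket", "Super Mario", "such a pure romance"]
```

Ordering like this pushes the scattered matches to the bottom without filtering them out.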

there is currently no counter in the info object for how many matched chars landed on word boundaries (only full terms). i'll see if this can be added without too much overhead, which should help with what a custom sorting function can consider.

but you also get stuff like this, which i don't think is avoidable with the settings necessary for the behavior you want.

image

for cases where you just want to mash keys without much thought, uFuzzy is probably not going to satisfy. fuzzysort, QuickScore and others do better in this case.

@leeoniya leeoniya added the enhancement (New feature or request) and question (Further information is requested) labels Oct 5, 2022
@leeoniya
Owner

leeoniya commented Oct 5, 2022

if you don't allow intraIns to be arbitrarily large, you get much better results:

https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=superma&intraIns=1&intraChars=%5Ba-z%5Cd%20%5D

image
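
The two demo links in this thread write the same character class differently (`[a-z\d%20]` vs `%5Ba-z%5Cd%20%5D`); the second is simply the fully percent-encoded form. A small sketch for building such a link follows. The query parameter names are copied from the links above, and `demoUrl` is a hypothetical helper.

```javascript
// demoUrl is a hypothetical helper; the parameter names (libs, search,
// intraIns, intraChars) are taken from the demo links in this thread.
function demoUrl(search, intraIns, intraChars) {
  const base = "https://leeoniya.github.io/uFuzzy/demos/compare.html";
  const query = [
    ["libs", "uFuzzy"],
    ["search", search],
    ["intraIns", String(intraIns)],
    ["intraChars", intraChars],
  ]
    // encodeURIComponent escapes [, ], \ and the space.
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${base}?${query}`;
}

const tightUrl = demoUrl("superma", 1, "[a-z\\d ]");
console.log(tightUrl);
// https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=superma&intraIns=1&intraChars=%5Ba-z%5Cd%20%5D
```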

@leeoniya leeoniya removed the enhancement (New feature or request) label Oct 5, 2022
@tomas
Author

tomas commented Oct 5, 2022

I see. intraIns: 1 plus adding a space to the regex did the trick. Thanks for the quick response!

@tomas tomas closed this as completed Oct 5, 2022
@arctica

arctica commented Jul 24, 2023

@leeoniya It seems that with intraMode=1 the tolerance for spaces does not work.
Searching for supermariobros does not find anything when SingleError mode is selected.

https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=supermariobros&intraIns=1&intraMode=1&intraChars=[a-z\d%20]
vs
https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=supermariobros&intraIns=1&intraMode=0&intraChars=[a-z\d%20]

@yan42685

@leeoniya It seems that with intraMode=1 the tolerance for spaces does not work. Searching for supermariobros does not find anything when SingleError mode is selected.

https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=supermariobros&intraIns=1&intraMode=1&intraChars=[a-z\d%20]
vs
https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uFuzzy&search=supermariobros&intraIns=1&intraMode=0&intraChars=[a-z\d%20]

Same here. And thanks for the info, which helped me find the cause of the unexpected results :)
