Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Fuzzy Search (DONE) #57

Closed
LuukvE opened this issue Apr 7, 2012 · 7 comments
Closed

Feature Request: Fuzzy Search (DONE) #57

LuukvE opened this issue Apr 7, 2012 · 7 comments

Comments

@LuukvE
Copy link

LuukvE commented Apr 7, 2012

Dotdreaming edited the List wiki asking for Fuzzy Search to be added.
Fuzzy search is a feature which allows users to make mistakes when typing their search query.
The application will still find close matches. You can tweak the threshold by adding fuzzyThreshold to your list options.

I used code provided by the Diff Match Patch project at http://code.google.com/p/google-diff-match-patch/

You can see a live demo of this commit at bijtel.com/Sociale-Kaart.
However, that app sorts on Organisation name & keywords. Keywords are only viewable when you click on a record.
(don't get confused ^^)

@LuukvE LuukvE closed this as completed Apr 7, 2012
@javve
Copy link
Owner

javve commented Apr 8, 2012

Wow, this looks really good! I'll take a deeper look into the code and then get back to you @LuukvE !

@LuukvE
Copy link
Author

LuukvE commented Apr 8, 2012

One problem with fuzzySearch, at this moment, is that it does not sort the list on relevance. Which is quite important, especially when many records are found.

Another option is to only use fuzzysearch when no other records are found. A "notFound" event could also be added to the list object, in order to make developers able to notify the app users.

@javve
Copy link
Owner

javve commented Apr 9, 2012

Would it be possible to make the fuzzy search to a plugin?

And one more extra feature that would be awesome is that it could be possible to match a search over many columns at once? (I guess its just to concatenate the values and search). The reason for this request is the case where you have items looking like this

var item = {
   firstName: "Jonny",
   lastName: "Stromberg"
};

And then searches for "Jonny St" eg.

@LuukvE
Copy link
Author

LuukvE commented Apr 9, 2012

I added a commit which (optionally) splits the searchstring into arguments and iterates over all selected collumns. But this approach is I think not performant enough. I added a debounce to the search method to improve performance.

I agree, it is best to move all of this to a plugin, the plugin will probably just overwrite the default search method and perhaps do some pre-processing on all values in the list. A more advanced feature would be to sort on relevance.

@joeybaker
Copy link

Thanks @LuukvE – this is great!

I hadn't heard of the google diff-match tool before so I did some quick research: It uses the bitap algorithm which implements Levenshtein distance. I'm not a expert on this field, but from having played with some of these before, seems to me that this should be both speedy and pretty accurate.

Thanks!

@LuukvE
Copy link
Author

LuukvE commented Apr 9, 2012

Yes, my search actually went the other way around, first finding the Levenshtein distance and then the Bitap algorithm, and finally the Diff-match-patch project by Neil Fraser.

I ran some tests and the fuzzySearch method seems to be quicker than a regular expression search (using Chrome). Besides that we can use the Bitap score to sort the list based on relevance, which will make searching tremendously better. At this moment, especially with large (1000+) recordsets and short search strings the algoritm gets false-positives. Those should be lower in the list. Just like going to page 1000 of any Google search will only return nonsense.

@joeybaker
Copy link

sweet. I'm using list.js with datasets that can potentially load 1000s of records and hadn't gotten around to fixing the search problem yet. You've saved me a ton of time – thanks!

I think I'd vote with @javve – this might be better as a plugin. I can see a good use case for not wanting search to be sorted by relevancy and/or the extra power isn't needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants