Hi, I do the following using fuzzywuzzy:
from fuzzywuzzy import process
choices = ["BestBuy", "ebay", "overstock", "rakuten", "sears"]
match1 = process.extractOne("ebay - asdlfjlksj ", choices)
match2 = process.extractOne("thebay - asdlfjlksj ", choices)
If I print match1 and match2, I get the following:
match1: ('ebay', 90)
match2: ('ebay', 90)
As you can see, match1 should be the closer match, but both get a score of 90. Is there any way to use this library to give more weight to whole words, and thereby a higher score to match1, or is this a bug?
process.extractOne takes a keyword parameter, scorer, for a custom scoring function. By default it uses WRatio, which may just not be great for your specific application. You can experiment with others. This is not a bug in the library, so I'm going to close the issue.
There are definitely some scoring algorithms (not implemented) that will tokenize a string and then only give 'credit' for a complete token match. If you implemented one I'd happily consider adding it to the library.
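For anyone landing here, a rough sketch of what such a token-based scorer might look like. This is not part of the library; the names tokens and whole_token_score are made up for illustration, and "score = percentage of the choice's tokens that appear as whole tokens in the query" is only one possible definition.

```python
import re


def tokens(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def whole_token_score(query, choice):
    """Hypothetical scorer: 0-100, credit only for complete token matches.

    The score is the percentage of the choice's tokens that also occur
    as whole tokens in the query. Partial overlaps such as "ebay"
    inside "thebay" earn nothing.
    """
    choice_tokens = tokens(choice)
    if not choice_tokens:
        return 0
    matched = choice_tokens & tokens(query)
    return round(100 * len(matched) / len(choice_tokens))


choices = ["BestBuy", "ebay", "overstock", "rakuten", "sears"]

for query in ("ebay - asdlfjlksj ", "thebay - asdlfjlksj "):
    # Score every choice and keep the highest-scoring one
    # (ties resolve to the first choice in the list).
    scored = [(choice, whole_token_score(query, choice)) for choice in choices]
    best = max(scored, key=lambda pair: pair[1])
    print(repr(query), "->", best)
```

With this definition the first query scores 100 against "ebay" and the second scores 0 against it, so the two inputs are no longer tied. Since the scorer keyword takes a callable of two strings that returns a number, something like process.extractOne(query, choices, scorer=whole_token_score) should work, though I have only tried the standalone version above. An all-or-nothing score is probably too harsh for real data with typos, so blending it with one of the existing ratios may be a better fit.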