Issue with scores. #19

Closed
altruist123 opened this Issue Mar 13, 2013 · 1 comment

Hi, I do the following using fuzzywuzzy:

from fuzzywuzzy import process

choices = ["BestBuy", "ebay", "overstock", "rakuten", "sears"]
match1 = process.extractOne("ebay - asdlfjlksj ", choices)
match2 = process.extractOne("thebay - asdlfjlksj ", choices)

If I print match1 and match2, I get the following:

match1: ('ebay', 90)
match2: ('ebay', 90)

As you can see, match1 should be the closer match, but both have a score of 90. Is there any way to use this library to give more weight to whole words, and thereby a higher score to match1, or is this a bug?

Thanks.

Owner

acslater00 commented Mar 13, 2013

process.extractOne takes a keyword parameter for a custom scorer. By default it uses WRatio, which may just not be great for your specific application. You can experiment with others. This is not a bug in the library, so I'm going to close the issue.

There are definitely some scoring algorithms (not implemented) that will tokenize a string and then only give 'credit' for a complete token match. If you implemented one I'd happily consider adding it to the library.
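
For illustration, here is a minimal sketch of what such a whole-token scorer might look like. It is not part of fuzzywuzzy: the names `tokens` and `whole_token_score` are hypothetical, and the scoring rule (the share of the shorter string's tokens found as complete tokens in the other string) is just one possible choice. It uses only the standard library.

```python
import re


def tokens(s):
    """Lowercase the string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", s.lower())


def whole_token_score(s1, s2):
    """Hypothetical scorer (not a fuzzywuzzy function).

    Returns 0-100: the percentage of the shorter string's tokens that
    appear as complete tokens in the other string. Substrings inside a
    longer token (e.g. "ebay" inside "thebay") earn no credit.
    """
    t1, t2 = set(tokens(s1)), set(tokens(s2))
    if not t1 or not t2:
        return 0
    shorter, longer = (t1, t2) if len(t1) <= len(t2) else (t2, t1)
    return int(round(100.0 * len(shorter & longer) / len(shorter)))


print(whole_token_score("ebay - asdlfjlksj ", "ebay"))    # 100
print(whole_token_score("thebay - asdlfjlksj ", "ebay"))  # 0
```

Since the function takes two strings and returns a number, it should be usable as the custom scorer, e.g. process.extractOne(query, choices, scorer=whole_token_score), assuming the installed version calls the scorer with just the two strings. Note that extractOne may preprocess the strings before the scorer sees them.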

acslater00 closed this Mar 13, 2013
