Extract matched phrase #259

spooknik opened this issue Jan 10, 2020 · 2 comments

@spooknik spooknik commented Jan 10, 2020

It would be really handy to be able to return the matched phrase from the extract functions.

For example:

>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]

>>> process.extractMatch("New york Jets are a sportball team.", choices)
        ['New York Jets', 'New york Jets', '91'] 

@lutzen101 lutzen101 commented Jan 13, 2020

In my project I use spaCy to classify named entities in my data. To recognize "custom" entities properly, I would like to match against a dictionary using fuzzywuzzy. If the matching result included the "matched phrase", I could use it to create an entity directly. Right now I have to build custom logic to get the matched phrase, which is not very efficient.
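
As an illustration of the kind of custom logic this currently requires, here is a minimal sketch using only the standard library's `difflib`. The function name `extract_match` is hypothetical and is not part of fuzzywuzzy; the case-insensitive comparison and word-window strategy are assumptions, so the score will not match fuzzywuzzy's scorers.

```python
from difflib import SequenceMatcher


def extract_match(query, choices):
    """Return (choice, matched_phrase, score) for the best-scoring word window.

    Hypothetical helper, not a fuzzywuzzy API.
    """
    words = query.split()
    best = (None, None, 0)
    for choice in choices:
        n = len(choice.split())
        # Slide a window with the same word count as the choice over the query.
        for i in range(max(len(words) - n + 1, 1)):
            window = " ".join(words[i:i + n])
            # Assumption: compare case-insensitively, score on a 0-100 scale.
            ratio = SequenceMatcher(None, choice.lower(), window.lower()).ratio()
            score = round(100 * ratio)
            if score > best[2]:
                best = (choice, window, score)
    return best


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
result = extract_match("New york Jets are a sportball team.", choices)
print(result)
```

The window is limited to the word count of each choice, which keeps the sketch short but means it will miss matches that span a different number of words.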


@spooknik spooknik commented Jan 13, 2020

Thanks for the reply and explaining your workflow.

For my purposes, I just want a match so I can replace the found phrase with the search phrase. I found that fuzzysearch does exactly what I wanted. It returns the start and end indices of each match, which I can use to extract the matched phrase and pass it to the replace function.

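from fuzzysearch import find_near_matches

term = "New York Jets"
text = "New york Jets are a sportball team."
max_distance = 1  # maximum Levenshtein distance allowed (example value)

matches = find_near_matches(term, text, max_l_dist=max_distance)
phrase = [text[m.start:m.end] for m in matches]  # slice out each matched span
print(phrase)  # ['New york Jets']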
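
The replace step mentioned above could look roughly like the sketch below. `replace_matches` and the `Match` tuple are hypothetical; `Match` only stands in for any object exposing `.start` and `.end`, such as the matches in the snippet above, so that the example runs without third-party packages.

```python
from collections import namedtuple

# Hypothetical stand-in for match objects that expose .start and .end.
Match = namedtuple("Match", ["start", "end"])


def replace_matches(text, matches, replacement):
    """Replace every matched span in text with the canonical replacement."""
    # Work from right to left so earlier indices stay valid after each splice.
    for m in sorted(matches, key=lambda m: m.start, reverse=True):
        text = text[:m.start] + replacement + text[m.end:]
    return text


text = "New york Jets are a sportball team."
matches = [Match(start=0, end=13)]  # span covering "New york Jets"
fixed = replace_matches(text, matches, "New York Jets")
print(fixed)  # New York Jets are a sportball team.
```

Splicing from the end of the string backwards avoids having to adjust offsets when the replacement differs in length from the matched phrase.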