PhuzzyMatcher

Combines the RapidFuzz library with spaCy's PhraseMatcher.

I needed a way to match phrases fuzzily in order to prepare an annotated dataset for training a model. After searching forums, I realized I was not the only one looking for this, so I am sharing my solution here.

Installation

Clone the repo:

git clone https://github.com/jackmen/fuzzy_spacy.git

Create a virtual environment and activate it:

python3 -m venv your_env_name
source your_env_name/bin/activate

Install the requirements:

cd /your_path/fuzzy_spacy/
pip3 install -r requirements.txt

Start Jupyter Notebook:

jupyter notebook

Usage

Follow the steps presented in the Jupyter notebook. To run the PhuzzyMatcher on a batch of documents, see the spaCy documentation (https://spacy.io).
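For readers who want the general idea before opening the notebook, here is a minimal sketch of fuzzy phrase matching over token windows. It is not the code from this repo: it assumes whitespace tokenization in place of a spaCy Doc, and it swaps in the standard library's difflib.SequenceMatcher for RapidFuzz so that it runs without extra installs. The function name fuzzy_phrase_matches, the threshold value, and the sample text are all hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_phrase_matches(text, phrases, threshold=0.85):
    """Return (start_token, end_token, phrase, score) tuples for fuzzy hits.

    Tokens are whitespace-separated and end_token is exclusive.
    This is an illustrative stand-in, not the PhuzzyMatcher implementation.
    """
    tokens = text.split()
    matches = []
    for phrase in phrases:
        width = len(phrase.split())
        # Slide a window with the phrase's token count across the text.
        for start in range(len(tokens) - width + 1):
            candidate = " ".join(tokens[start:start + width])
            # difflib similarity in [0, 1]; assumed stand-in for a RapidFuzz score.
            score = SequenceMatcher(None, candidate.lower(), phrase.lower()).ratio()
            if score >= threshold:
                matches.append((start, start + width, phrase, round(score, 2)))
    return matches


# Misspelled terms in the text should still be found.
text = "The patient was given acetylsalicilic acid and ibuprofin yesterday"
hits = fuzzy_phrase_matches(text, ["acetylsalicylic acid", "ibuprofen"])
for start, end, phrase, score in hits:
    print(start, end, phrase, score)
```

Each hit carries token offsets, which is the shape needed to turn matches into entity annotations. Comparing every window against every phrase is quadratic in practice, which is one reason an approach like this is slow on large match lists.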

Comment

The code is not fast, but it was sufficient for my use case: annotating documents to train an NER model.
