PhuzzyMatcher

A combination of the FuzzyWuzzy library with spaCy's PhraseMatcher

I needed a way to match phrases fuzzily in order to prepare an annotated dataset for training a model. After searching forums, I realized that I was not the only one looking for this, so I am sharing my solution here.

Installation

Clone the repo:

git clone https://github.com/jackmen/fuzzy_spacy.git

Create a virtual environment and activate it:

python3 -m venv your_env_name
cd your_env_name
source bin/activate

Install the requirements:

cd /your_path/fuzzy_spacy/
pip3 install -r requirements.txt

Start Jupyter Notebook:

jupyter notebook

Usage

Follow the steps presented in the Jupyter notebook. To run PhuzzyMatcher on a batch of documents, check the spaCy documentation (https://spacy.io).
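For orientation, here is a minimal sketch of the general idea behind fuzzy phrase matching. It is not the code from the notebook. It slides a window over the tokens and scores each window against every phrase. To stay dependency-free it uses `difflib.SequenceMatcher` from the standard library as a stand-in for a FuzzyWuzzy score, and a plain whitespace split as a stand-in for spaCy tokenization. The function name `fuzzy_phrase_matches`, the threshold of 85, and the sample sentence are all made up for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_phrase_matches(tokens, phrases, threshold=85):
    """Return (start, end, phrase, score) for token windows that resemble a phrase.

    start and end are token offsets with end exclusive, the same convention
    spaCy uses for spans. This is an illustrative helper, not part of the repo.
    """
    matches = []
    for phrase in phrases:
        # The window is as wide as the phrase has whitespace-separated words.
        n = len(phrase.split())
        for start in range(len(tokens) - n + 1):
            window = " ".join(tokens[start:start + n])
            # Similarity scaled to 0-100 (stand-in for a FuzzyWuzzy ratio).
            ratio = SequenceMatcher(None, window.lower(), phrase.lower()).ratio()
            score = round(100 * ratio)
            if score >= threshold:
                matches.append((start, start + n, phrase, score))
    return matches


# Hypothetical input: "diabtes" is a deliberate misspelling.
tokens = "She was diagnosed with diabtes mellitus last year".split()
matches = fuzzy_phrase_matches(tokens, ["diabetes mellitus"])
print(matches)
```

With real spaCy documents, the token list would come from a `Doc`, and the returned offsets could be used to slice it (`doc[start:end]`). For many texts, spaCy's `nlp.pipe` is the documented way to process them as a stream.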

Comment

The code is not fast, but it was sufficient for my use case: annotating documents to train an NER model.
