FuzzyWuzzy

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

Requirements

  • Python 2.4 or higher
  • difflib
  • python-Levenshtein (optional, provides a 4-10x speedup in string matching)

Installation

Using PIP via PyPI

pip install fuzzywuzzy

Using PIP via GitHub

pip install git+https://github.com/seatgeek/fuzzywuzzy.git@0.11.0#egg=fuzzywuzzy

Adding to your requirements.txt file (run pip install -r requirements.txt afterwards)

git+ssh://git@github.com/seatgeek/fuzzywuzzy.git@0.11.0#egg=fuzzywuzzy

Manually via Git

git clone https://github.com/seatgeek/fuzzywuzzy.git fuzzywuzzy
cd fuzzywuzzy
python setup.py install

Usage

>>> from fuzzywuzzy import fuzz
>>> from fuzzywuzzy import process

Simple Ratio

>>> fuzz.ratio("this is a test", "this is a test!")
    97
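
To give a feel for where a score like this comes from, here is a minimal sketch built on the standard library's difflib. The name simple_ratio_sketch is hypothetical and this is not FuzzyWuzzy's own code; it only assumes that the score is a similarity ratio scaled to 0-100.

```python
from difflib import SequenceMatcher


def simple_ratio_sketch(s1, s2):
    """Hypothetical helper: similarity of two strings as an integer 0-100."""
    # SequenceMatcher.ratio() returns 2 * M / T, where M is the number of
    # matching characters and T is the combined length of both strings.
    return round(100 * SequenceMatcher(None, s1, s2).ratio())


# 14 matching characters out of 14 + 15 total: 2 * 14 / 29, about 0.966
print(simple_ratio_sketch("this is a test", "this is a test!"))  # 97
```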

Partial Ratio

>>> fuzz.partial_ratio("this is a test", "this is a test!")
    100
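
One way to picture a partial ratio is as the best score the shorter string gets against any same-length slice of the longer one. The sketch below is a brute-force illustration under that assumption; partial_ratio_sketch is a hypothetical name, and the library may choose candidate slices differently.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1, s2):
    """Hypothetical helper: best match of the shorter string inside the longer."""
    shorter, longer = sorted((s1, s2), key=len)
    width = len(shorter)
    if width == 0:
        return 0
    best = 0.0
    # Slide a window of the shorter string's length across the longer string
    # and keep the highest similarity seen.
    for start in range(len(longer) - width + 1):
        window = longer[start:start + width]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return round(100 * best)


# The first 14 characters of the longer string equal the shorter string.
print(partial_ratio_sketch("this is a test", "this is a test!"))  # 100
```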

Token Sort Ratio

>>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    91
>>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    100
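
The idea behind a token sort comparison can be sketched as: split each string into words, sort the words, rejoin them, and then compare. The helper below, token_sort_ratio_sketch, is a hypothetical simplification that only lowercases and splits on whitespace; the library's own preprocessing may do more.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(s1, s2):
    """Hypothetical helper: compare strings after sorting their words."""
    def normalize(s):
        # Lowercase, split on whitespace, sort the tokens, and rejoin them
        # so that word order no longer affects the comparison.
        return " ".join(sorted(s.lower().split()))

    return round(100 * SequenceMatcher(None, normalize(s1), normalize(s2)).ratio())


# Both strings normalize to "a bear fuzzy was wuzzy".
print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
```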

Token Set Ratio

>>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    84
>>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    100
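
A token set comparison ignores repeated words as well as word order. A rough sketch, assuming the approach is to compare the shared words against each string's shared-plus-unique words and keep the best score, might look like this. token_set_ratio_sketch is a hypothetical name and the details are illustrative only.

```python
from difflib import SequenceMatcher


def token_set_ratio_sketch(s1, s2):
    """Hypothetical helper: compare strings as sets of words."""
    def ratio(a, b):
        return round(100 * SequenceMatcher(None, a, b).ratio())

    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())
    common = " ".join(sorted(tokens1 & tokens2))
    only1 = " ".join(sorted(tokens1 - tokens2))
    only2 = " ".join(sorted(tokens2 - tokens1))
    # Rebuild each side as "shared words + words unique to that side".
    combined1 = (common + " " + only1).strip()
    combined2 = (common + " " + only2).strip()
    return max(ratio(common, combined1),
               ratio(common, combined2),
               ratio(combined1, combined2))


# The duplicate "fuzzy" collapses, so both sides reduce to the same word set.
print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))  # 100
```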

Process

>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
>>> process.extract("new york jets", choices, limit=2)
    [('New York Jets', 100), ('New York Giants', 78)]
>>> process.extractOne("cowboys", choices)
    ('Dallas Cowboys', 90)
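
Conceptually, extracting the best matches comes down to scoring every choice against the query, sorting by score, and keeping the top few. The sketch below illustrates that flow with a plain difflib-based scorer. extract_sketch and its scorer are assumptions made for illustration, so the scores it produces will not necessarily match the ones shown above.

```python
from difflib import SequenceMatcher


def extract_sketch(query, choices, limit=5):
    """Hypothetical helper: return the top `limit` (choice, score) pairs."""
    def score(a, b):
        # Case-insensitive similarity, scaled to 0-100.
        return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())

    scored = [(choice, score(query, choice)) for choice in choices]
    # Highest score first; sort() is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
top_two = extract_sketch("new york jets", choices, limit=2)
print(top_two[0])  # ('New York Jets', 100)
```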

Known Ports

FuzzyWuzzy is being ported to other languages too! Here is one port we know about: