Finisher

Finisher is a lightweight autocompletion library for Python. Use it when you want autocompletion without adding a dependency on a search service such as Solr or CloudSearch.

Installation

pip install finisher

Up Front Example

import json

from finisher import DictStorageAutoCompleter


# Any list of strings works as training data; here it is loaded from a JSON file
with open("movie_titles.json", "r", encoding="utf-8") as f:
    movie_titles = json.load(f)

arbitrary_cache = {}
autocompleter = DictStorageAutoCompleter(arbitrary_cache)
autocompleter.train_from_strings(movie_titles)
corrected_tokens = autocompleter.correct_phrase("big lebewski")
guessed_phrases = autocompleter.guess_full_strings(corrected_tokens)
'''
    Results are:
        [
            "big lebowski (usa)",
            "big, bigger, and biggest trucks and diggers (book w/ dvd)",
            "build it bigger: biggest warship",
            "big foot (a.k.a. bigfoot)",
            "bigger than big (sumo mack)",
            "bigger than big (activate)",
            "big 3",
            "biggest fan",
            "big lebowski (universal/ widescreen)",
            "build it bigger: big easy rebuild"
        ]
'''

Practical Example

The example above caches the trained mappings in a dictionary, because recomputing them is expensive. A dictionary works within a single process, or on a web server running a single worker, but to share the trained model between processes you need external storage such as Redis. A Redis example follows:

from finisher import RedisStorageAutoCompleter
import redis

# redis.from_url expects a redis://, rediss://, or unix:// URL
redis_client = redis.from_url("redis://localhost:6379")
autocompleter = RedisStorageAutoCompleter(redis_client)
autocompleter.train_from_strings(movie_titles)
corrected_tokens = autocompleter.correct_phrase("big lebewski")
guessed_phrases = autocompleter.guess_full_strings(corrected_tokens)

Training a model

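A model can be trained incrementally. Training data is stored in the cache, so a second autocompleter built on the same cache adds to what the first one stored: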

arbitrary_cache = {}

autocompleter = DictStorageAutoCompleter(arbitrary_cache)
autocompleter.train_from_strings(movie_titles[:1000])

autocompleter = DictStorageAutoCompleter(arbitrary_cache)
autocompleter.train_from_strings(movie_titles[1000:])

To retrain the model from scratch, either clear the cache:

autocompleter.bust_cache()

or switch to a new, empty cache object.

Spellcheck

Spellcheck is also supported. Using the Redis client from the example above:

autocompleter = RedisStorageAutoCompleter(redis_client)
corrected_tokens = autocompleter.correct_phrase(
    "my fvaorite moive is expandables bcause schwarzanagger is cool"
)
'''
    output:
        [
            'my',
            'favorite',
            'moive',
            'is',
            'expendables',
            'cause',
            'schwarzenegger',
            'is',
            'cool'
        ]
'''
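To see which words were actually changed, the corrected tokens can be compared with the original input. This is a small sketch under one assumption: that the corrected list has exactly one token per whitespace-separated word of the input, as it does in the output above. Finisher may not guarantee this, so the helper raises if the counts differ. `changed_tokens` is a hypothetical helper name, not part of the library.

```python
def changed_tokens(original_phrase, corrected_tokens):
    """Return (original, corrected) pairs for every word that was altered."""
    original_tokens = original_phrase.split()
    if len(original_tokens) != len(corrected_tokens):
        # Assumption violated: tokens cannot be compared position by position.
        raise ValueError("token counts differ; cannot align tokens")
    return [
        (before, after)
        for before, after in zip(original_tokens, corrected_tokens)
        if before != after
    ]


phrase = "my fvaorite moive is expandables bcause schwarzanagger is cool"
corrected = [
    "my", "favorite", "moive", "is", "expendables",
    "cause", "schwarzenegger", "is", "cool",
]
for before, after in changed_tokens(phrase, corrected):
    print(before, "->", after)
```

Applied to the output above, this lists four changed words and shows that "moive" was left as typed.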
