Description: Spellcorrector app for Django
Clone URL: git://github.com/peterbe/django-spellcorrector.git
Peter Bengtsson (author)
Sun Nov 08 11:38:43 -0800 2009
name age message
file .gitignore Wed May 27 01:37:22 -0700 2009 adding the fluff around the interesting app [Peter Bengtsson]
file README.md Sun Nov 08 11:38:43 -0800 2009 added is_loaded() [Peter Bengtsson]
file __init__.py Wed May 27 01:37:22 -0700 2009 adding the fluff around the interesting app [Peter Bengtsson]
directory exampleapp/ Wed May 27 01:37:22 -0700 2009 adding the fluff around the interesting app [Peter Bengtsson]
file manage.py Wed May 27 01:37:22 -0700 2009 adding the fluff around the interesting app [Peter Bengtsson]
file settings.py Wed May 27 01:37:22 -0700 2009 adding the fluff around the interesting app [Peter Bengtsson]
directory spellcorrector/ Sun Nov 08 11:38:43 -0800 2009 added is_loaded() [Peter Bengtsson]
file urls.py Wed May 27 01:37:22 -0700 2009 adding the fluff around the interesting app [Peter Bengtsson]
README.md

About django-spellcorrector

The spellcorrector app is basically a class that you instantiate and then use to do very fast spelling corrections, similar to how Google does it. The inspiration came from Peter Norvig, who released his code under the MIT license, so this is released under that license too.

You don't need a word file to get started. You just train on the content you already have on your site. There's no point training on words your site doesn't contain, because there's no point in suggesting a correction for something that doesn't exist.
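
To make the "train on your own content" idea concrete, here is a minimal standalone sketch of building a word-frequency vocabulary from site text. It does not use this app's API: `tokenize`, `build_vocabulary` and `site_content` are hypothetical names, and how `Spellcorrector` stores its words internally is not shown here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its ASCII alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Count how often each word occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


# Hypothetical stand-in for text pulled from your own site
site_content = [
    "Peter wrote a spellcorrector app for Django",
    "The app trains on words Peter already uses",
]
vocabulary = build_vocabulary(site_content)
```

Only words that actually occur in `site_content` end up in `vocabulary`, so a correction can only ever point at something the site really contains. The `[a-z]+` pattern is a simplification that ignores non-ASCII letters.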

The exampleapp included in this repo shows how you can use the spellcorrector. For the impatient, here's some code that sums it up:

    from spellcorrector.views import Spellcorrector

    sc = Spellcorrector()
    sc.load() # nothing will happen the first time

    sc.train(u"peter")
    print sc.correct(u"petter") # will print peter
    sc.save()

    sc2 = Spellcorrector()
    sc2.load()
    print sc2.correct(u"petter") # will print peter

Do note that the spellcorrector only attempts corrections at an edit distance of 2 for words shorter than 9 characters. E.g. "xetter" -> "peter" works but "xengtssun" -> "bengtsson" does not. The reason is that working out all those combinations for long words becomes very slow. For really long words it would take up to a second.
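
The length cap can be illustrated with a rough sketch in the style of Peter Norvig's corrector. This is not the app's actual code: `edits1`, `edits2`, `correct`, `MAX_LEN_FOR_EDITS2` and the `known` dictionary are assumptions made for illustration, with the threshold taken from the "shorter than 9 characters" rule above.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
MAX_LEN_FOR_EDITS2 = 9  # assumption: mirrors "shorter than 9 characters"


def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    """All strings two edits away; this set grows quickly with word length."""
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1))


def correct(word, known):
    """Return the most frequent known word within reach, else word itself.

    known maps each trained word to its frequency.
    """
    if word in known:
        return word
    candidates = set(w for w in edits1(word) if w in known)
    # Only pay for the expensive second round on short words
    if not candidates and len(word) < MAX_LEN_FOR_EDITS2:
        candidates = set(w for w in edits2(word) if w in known)
    return max(candidates, key=known.get) if candidates else word


known = {"peter": 3, "bengtsson": 1}
print(correct("xetter", known))     # two edits, 6 characters: corrected
print(correct("xengtssun", known))  # two edits, 9 characters: left alone
```

Each word of length n yields roughly 54n + 25 one-edit strings before deduplication, and `edits2` repeats that for every one of them, which is why the second round is skipped for longer words.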

A good idea is to create a Spellcorrector instance in, for example, your views.py but not load it until it's actually needed. Example views.py:

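    from spellcorrector.views import Spellcorrector

    # created at import time, but not loaded yet
    spellcorrector = Spellcorrector()
    # ...
    def search(request):
        # load lazily, on the first request that needs it
        if not spellcorrector.is_loaded():
            spellcorrector.load()
        print spellcorrector.correct(request.GET['q'])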