Every repository with this icon (
Every repository with this icon (
| name | age | message | |
|---|---|---|---|
| |
.gitignore | Wed May 27 01:37:22 -0700 2009 | |
| |
README.md | Sun Nov 08 11:38:43 -0800 2009 | |
| |
__init__.py | Wed May 27 01:37:22 -0700 2009 | |
| |
exampleapp/ | Wed May 27 01:37:22 -0700 2009 | |
| |
manage.py | Wed May 27 01:37:22 -0700 2009 | |
| |
settings.py | Wed May 27 01:37:22 -0700 2009 | |
| |
spellcorrector/ | Sun Nov 08 11:38:43 -0800 2009 | |
| |
urls.py | Wed May 27 01:37:22 -0700 2009 |
About django-spellcorrector
The spellcorrector app is basically a class that you instanciate and
with it you can do very fast spellcorrections similar to how Google
does it. The inspiration of this came from Peter
Norvig and he release his code
under the MIT
license so this
is released under that license too.
You don't need a word file to load up to start with. You'll basically just use the content you have in your site. There's no point training on words that you don't have because there's no point in being able to do a spellcorrection for something that doesn't exist.
The exampleapp included in this repo should basically show how you can use the spellcorrector. For the impatient, here's some code that wraps that up:
from spellcorrector.views import Spellcorrector
sc = Spellcorrector()
sc.load() # nothing will happen the first time
sc.train(u"peter")
print sc.correct(u"petter") # will print peter
sc.save()
sc2 = Spellcorrector()
sc2.load()
print sc2.correct(u"petter") # will print peter
Do note, the spellcorrector is only able to do spellcorrection of 2 edit-distances of words shorter than 9 characters. E.g. "xetter"->"peter" but !"xengtssun"->"bengtsson". The reason for this is that working out all those combinations for large words becomes very slow. For really long words it would take up to a second.
A good idea is to create a Spellcorrector instance in for example your views.py but not loading it until it's actually needed. Example views.py:
spellcorrector = Spellcorrector()
...
def search(request):
if not spellcorrector.is_loaded():
spellcorrector.load()
print spellcorrector.correct(request.GET['q'])







