How to add new language #26
Comments
Sure, the steps to generate a new language are fairly straightforward:
If your data is in dictionary form, you can load it like so:
from spellchecker import SpellChecker
spell = SpellChecker(language=None)
spell.word_frequency.load_dictionary(file_to_dictionary)
spell.export(location_for_export)
If you only have plain-text files of words, you can load those words directly and have spellchecker build the word frequency for you:
from spellchecker import SpellChecker
spell = SpellChecker(language=None)
spell.word_frequency.load_text_file(path_to_text_file)
spell.export(location_for_export)
Once you have exported the dictionary (really a word frequency list), you can load it whenever you wish to use spellchecker:
from spellchecker import SpellChecker
spell = SpellChecker(language=None, local_dictionary=location_from_export)
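To make "really a word frequency list" concrete, here is a minimal standard-library sketch of what building such a list from raw text amounts to. It is not the library's own code: the helper names `build_word_frequency` and `write_frequency_file` are hypothetical, the `\w+` tokenizer is deliberately naive, and the gzipped-JSON layout is an assumption about what an exported dictionary roughly looks like.

```python
import gzip
import json
import re
from collections import Counter


def build_word_frequency(text: str) -> dict:
    """Count how often each lowercased word occurs in `text` (hypothetical helper)."""
    # Naive tokenizer: runs of word characters; a real language may need more care.
    words = re.findall(r"\w+", text.lower())
    return dict(Counter(words))


def write_frequency_file(freq: dict, path: str) -> None:
    """Write the counts as gzipped JSON (assumed layout, not confirmed by the thread)."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        # ensure_ascii=False keeps non-ASCII words readable in the file
        json.dump(freq, fh, ensure_ascii=False)


freq = build_word_frequency("the cat sat on the mat")
print(freq["the"])  # 2
```

In practice `load_text_file` does the counting for you; the sketch only illustrates the shape of the data that ends up in the dictionary.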
Thanks for the clear instructions, I have successfully loaded my text file.
That is likely due to one of a few possible issues.
# return those that are within the specified distance
print(spell.candidates(word))
Honestly, I have never tried this with languages that use non-Latin characters, so I am unsure how it will perform.
@MukhtarShaima Let me know if you are still having issues, otherwise, I am going to close this one! Thanks!
Hi, from my understanding, we can load JSON-formatted dictionaries or text documents, which are then used to build the frequency list. I would like to directly use the word frequency lists available here (Word Frequency): https://github.com/hermitdave/FrequencyWords/tree/master/content/2018/fi These are txt files containing frequencies. Is there a way to load such files directly, or do I need to convert them to JSON first? Thanks for your help!
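One possible route for frequency files like these, sketched under assumptions: each line is taken to hold a word and an integer count separated by whitespace (worth checking against the actual files), and the helper names `parse_frequency_lines` and `save_as_json` are hypothetical. The sketch converts such lines into a JSON word-to-count mapping, which could then be tried with `spell.word_frequency.load_dictionary(...)` as described earlier in the thread.

```python
import json


def parse_frequency_lines(lines) -> dict:
    """Turn 'word count' lines into a {word: count} dict (assumed file layout)."""
    freq = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines rather than guessing
        word, raw_count = parts
        try:
            count = int(raw_count)
        except ValueError:
            continue  # second field is not an integer count
        key = word.lower()
        # merge counts if the same word appears in different casings
        freq[key] = freq.get(key, 0) + count
    return freq


def save_as_json(freq: dict, path: str) -> None:
    """Write the mapping as plain UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(freq, fh, ensure_ascii=False)


sample = ["on 120", "ja 95", "", "broken line here", "On 5"]
print(parse_frequency_lines(sample))  # {'on': 125, 'ja': 95}
```

Whether the conversion step is needed at all depends on the library version, so treat this as a fallback rather than the recommended path.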
Will you please give me clear instructions or steps so that I can add the Urdu language? I'm not able to download the Urdu file from the link you mentioned.