Finding per-dictionary character classes #17
Comments
@JMLas, any comments?
As far as our use in LyX is concerned, we have a bug open somewhere saying that we should use this kind of information ;) But it never occurred to me that the information was available in the dictionaries. I always thought that Emacs used its own per-language list of special characters.
Thanks, that's interesting. Emacs assumes [:alpha:] for ispell and aspell dictionaries, but for hunspell it parses the dictionary files to get the information. However, hunspell does not make this information available via their APIs, as far as I can tell. [Comment edited to fix a couple of errors.]
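For readers unfamiliar with the hunspell case: a hunspell affix (.aff) file can contain a WORDCHARS line listing extra characters that count as part of a word. Below is a minimal sketch of extracting that line from affix text already loaded into memory. `find_wordchars` is a hypothetical helper written for this illustration, not code from Emacs or hunspell, and it ignores the encoding declared by the file's SET line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of the first WORDCHARS line found in
   aff_text into out. Returns 1 on success, 0 if no such line exists or the
   value is empty or does not fit in out. */
static int find_wordchars(const char *aff_text, char *out, size_t out_size)
{
    static const char key[] = "WORDCHARS";
    const size_t key_len = sizeof key - 1;

    assert(aff_text != NULL && out != NULL);

    for (const char *line = aff_text; line != NULL && *line != '\0'; ) {
        const char *eol = strchr(line, '\n');
        size_t line_len = eol ? (size_t)(eol - line) : strlen(line);

        /* Match "WORDCHARS" followed by a space or tab. */
        if (line_len > key_len && strncmp(line, key, key_len) == 0
            && (line[key_len] == ' ' || line[key_len] == '\t')) {
            const char *val = line + key_len;
            const char *end = line + line_len;

            /* Skip leading blanks, then trim trailing blanks and CR. */
            while (val < end && (*val == ' ' || *val == '\t'))
                val++;
            while (end > val && (end[-1] == '\r' || end[-1] == ' ' || end[-1] == '\t'))
                end--;

            size_t n = (size_t)(end - val);
            if (n == 0 || n + 1 > out_size)
                return 0;
            memcpy(out, val, n);
            out[n] = '\0';
            return 1;
        }
        line = eol ? eol + 1 : NULL;
    }
    return 0;
}
```

A caller would read the .aff file into a buffer, call `find_wordchars(buffer, result, sizeof result)`, and fall back to plain alphabetic characters when it returns 0.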
Hunspell has |
Use C99-style declarations. Remove check of whether text to be checked is in Hebrew, as hspell already does this (and in fact it’s not what we want: words in non-Hebrew are treated as “empty” and therefore correct; this will have to be dealt with by having the Enchant back-end reject words not in Hebrew, but probably it’s better to have generic code to do this which detects words that contain non-word characters for the given dictionary; however, that will require the implementation of issue AbiWord#17).
Add enchant_dict_get_extra_word_characters, which returns a string of non-letter characters that may occur in words, and enchant_dict_is_word_character, which checks whether the given character is valid as the first, last, or internal character in a word.
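To illustrate what a position-aware check of this kind lets a client do, here is a self-contained sketch of finding word boundaries with one. It does not call Enchant: `extra_word_chars`, `is_word_character`, and the rule that extra characters are valid only mid-word are hypothetical stand-ins chosen for the example, and it handles single-byte characters only, whereas a real dictionary deals in Unicode code points.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a dictionary's extra word characters:
   non-letters that may occur inside a word. */
static const char *extra_word_chars = "'-";

/* Position of a character within a candidate word. */
enum word_pos { WORD_START = 0, WORD_MIDDLE = 1, WORD_END = 2 };

/* Return 1 if c may appear at the given position, else 0.
   Assumed rule: letters are valid anywhere, extra characters only
   in the middle of a word. */
static int is_word_character(unsigned char c, enum word_pos pos)
{
    if (isalpha(c))
        return 1;
    /* strchr would match the terminator for c == '\0', so exclude it. */
    return pos == WORD_MIDDLE && c != '\0'
           && strchr(extra_word_chars, c) != NULL;
}

/* Return the length of the word starting at s, or 0 if none starts there. */
static size_t word_length(const char *s)
{
    assert(s != NULL);

    if (!is_word_character((unsigned char)s[0], WORD_START))
        return 0;

    size_t len = 1;
    while (is_word_character((unsigned char)s[len], WORD_MIDDLE))
        len++;

    /* Trim trailing characters that are not valid at the end of a word. */
    while (len > 1 && !is_word_character((unsigned char)s[len - 1], WORD_END))
        len--;

    return len;
}
```

With these rules, `word_length("don't stop")` covers "don't" in full, while a trailing hyphen or apostrophe is left outside the word.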
Fix issue #17: add new APIs for per-dictionary character classes
Fixed by PR #139.
Enchant seems to provide no way to find a given dictionary's definition of code points that can appear in a word (lazily, its word character classes).
Emacs's spelling code requires two essential pieces of information:
For most of the underlying spelling engines, such as aspell and hunspell, this information can be found in the underlying dictionaries (and Emacs does so when using them directly).
However, Enchant does not seem to expose this information. This raises the question: what do other Enchant clients do?