When making suggestions for languages like Estonian, many improbable words are suggested because very short compounds are possible. These suggestions can even prevent the correct suggestion from being shown.
There are two issues here:
The number of compounds allowed is not currently limited.
Short compound segments are too cheap (longer words should be preferred).
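To make the two issues concrete, here is a minimal sketch of a cost function that caps the number of compound segments and charges extra for short ones. All names and constants (compoundCost, MAX_SEGMENTS, SEGMENT_COST, SHORT_LENGTH, SHORT_PENALTY) are hypothetical and chosen only for illustration; this is not how the spell checker currently scores suggestions.

```typescript
// Hypothetical constants, assumed for illustration only.
const MAX_SEGMENTS = 3;   // cap on the number of compound parts
const SEGMENT_COST = 50;  // flat cost for each extra segment
const SHORT_LENGTH = 4;   // segments shorter than this count as "short"
const SHORT_PENALTY = 25; // extra cost per character below SHORT_LENGTH

// Returns a cost for a candidate split into compound segments.
// Lower is better; Infinity means the candidate is rejected outright.
function compoundCost(segments: string[]): number {
    if (segments.length === 0 || segments.length > MAX_SEGMENTS) {
        return Infinity;
    }
    // Every segment after the first adds a flat cost.
    let cost = (segments.length - 1) * SEGMENT_COST;
    // Short segments add a penalty proportional to how short they are.
    for (const seg of segments) {
        cost += Math.max(0, SHORT_LENGTH - seg.length) * SHORT_PENALTY;
    }
    return cost;
}
```

With a scheme like this, a single long word always costs less than a chain of two-letter pieces, so the longer word would be ranked first.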
Missing a letter from a word is a very common type of typo, but in that case it seems that enabling compound words works against us.
I have other examples but I considered this relevant. Can we do something to avoid this category of problems?
I was thinking that an extra setting that allows compound words only if they contain at least one word of 4 or more characters would considerably reduce the number of errors.
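As a rough sketch of what such a setting could check (the function name isCompoundAllowed and the parameter minLongSegment are made up here, not an existing option):

```typescript
// Hypothetical check: accept a compound only if at least one of its
// segments has minLongSegment or more characters (default 4).
function isCompoundAllowed(segments: string[], minLongSegment = 4): boolean {
    // Array.from counts code points, so accented letters count once.
    return segments.some((seg) => Array.from(seg).length >= minLongSegment);
}
```

Under this rule a split like "ta" + "on" would be rejected, while "error" + "id" would still be accepted.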
Sorry for not being clear. This issue stems from Hunspell dictionaries that have flags for marking words as compoundable (begin, end, middle). The spell checker supports them, but if the dictionary writer is not careful, unusual and unhelpful suggestions can result.
allowCompoundWords is something else: it allows any words found in the dictionary to be combined. It was designed to reduce the number of false positives caused by the common practice among programmers of gluing words together when naming variables and functions.
allowCompoundWords has been a never-ending source of issues. As a result, it has been removed from all standard dictionary definitions and its use is discouraged. It is better to create a "common word compounds for programmers" dictionary with an explicit list of allowed compounds.
Please note, custom dictionaries support compound annotations.
compound-word-list.txt
*error*
*code*
This allows things like: errorcode and codeerror. But also errorerrorerror.
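To illustrate why errorerrorerror gets accepted, here is a simplified model of these annotations. It assumes a leading * means the word may attach to something on its left and a trailing * means it may attach to something on its right, and that a join needs both sides to permit it. The names parseEntry and accepts are hypothetical, and the real dictionary implementation differs.

```typescript
interface Entry {
    word: string;
    joinLeft: boolean;  // entry started with "*"
    joinRight: boolean; // entry ended with "*"
}

// Parse one dictionary line such as "*error*" (simplified, "*" only).
function parseEntry(line: string): Entry {
    const joinLeft = line.startsWith("*");
    const joinRight = line.length > 1 && line.endsWith("*");
    const word = line.slice(joinLeft ? 1 : 0, joinRight ? line.length - 1 : line.length);
    return { word, joinLeft, joinRight };
}

// True if `word` is an entry on its own or a chain of joinable entries.
function accepts(word: string, entries: Entry[], isFirst = true): boolean {
    for (const e of entries) {
        if (e.word.length === 0 || !word.startsWith(e.word)) continue;
        // A non-initial segment must be allowed to attach on its left.
        if (!isFirst && !e.joinLeft) continue;
        const rest = word.slice(e.word.length);
        if (rest === "") return true;
        // More text follows, so this segment must allow a right-hand join.
        if (e.joinRight && accepts(rest, entries, false)) return true;
    }
    return false;
}

const entries = ["*error*", "*code*"].map(parseEntry);
console.log(accepts("errorcode", entries), accepts("errorerrorerror", entries));
```

Nothing in this model limits how many segments are chained, which is why repeated compounds pass unless the list is written with care.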