Affix Rules: Undocumented Hunspell Constraints

TrnsltLife edited this page May 21, 2015 · 10 revisions

Undocumented Hunspell Constraints

Combining Prefixes and Suffixes

  1. If a single suffix and a single prefix are combined, they can be combined in any order.
  2. No double-crossing! The affixation path may not bounce back and forth across the root word. That is to say, the following two affixation paths are invalid:
    1. WORD -> PREFIX -> SUFFIX -> PREFIX
    2. WORD -> SUFFIX -> PREFIX -> SUFFIX
  3. As long as <needAffix...> and/or <circumfix.../> flags are not involved, two prefixes and one suffix, or two suffixes and one prefix, can be combined in any order, except for double-crossing as described in rule 2 above.
  4. When starting with a prefix rule and crossing over to two suffix rules, or starting with a suffix rule and crossing over to two prefix rules, all of the rules must bear the cross="true" attribute.

Summary of Valid Affixation Paths for Complex Suffixes

  1. WORD -> PREFIX
  2. WORD -> SUFFIX
  3. WORD -> PREFIX -> SUFFIX
  4. WORD -> SUFFIX -> PREFIX
  5. WORD -> INNER-SUFFIX -> OUTER-SUFFIX -> PREFIX
  6. WORD -> PREFIX -> INNER-SUFFIX -> OUTER-SUFFIX

Summary of Valid Affixation Paths for Complex Prefixes

  1. WORD -> PREFIX
  2. WORD -> SUFFIX
  3. WORD -> PREFIX -> SUFFIX
  4. WORD -> SUFFIX -> PREFIX
  5. WORD -> INNER-PREFIX -> OUTER-PREFIX -> SUFFIX
  6. WORD -> SUFFIX -> INNER-PREFIX -> OUTER-PREFIX
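The two summaries above can be restated as a small check. The following Python sketch is only a model of the constraints listed on this page, not Hunspell's actual algorithm. The function name `is_valid_path` and the `complex_prefixes` parameter are hypothetical. It assumes that the mode decides which affix type may be doubled: suffixes by default, prefixes in complex-prefix mode.

```python
from itertools import groupby


def is_valid_path(path, complex_prefixes=False):
    """Check an affixation path such as ["SUFFIX", "SUFFIX", "PREFIX"].

    Hypothetical helper: it models only the constraints listed on this
    page and does not reproduce Hunspell's own matching logic.
    """
    if any(step not in ("PREFIX", "SUFFIX") for step in path):
        return False
    # Collapse consecutive steps on the same side of the root word.
    runs = [(side, len(list(group))) for side, group in groupby(path)]
    # More than two runs means the path crossed the root word twice.
    if len(runs) > 2:
        return False
    # Assumption: only one affix type may be doubled, depending on the mode.
    doubled = "PREFIX" if complex_prefixes else "SUFFIX"
    for side, count in runs:
        limit = 2 if side == doubled else 1
        if count > limit:
            return False
    return True
```

For example, `is_valid_path(["SUFFIX", "PREFIX", "SUFFIX"])` returns `False` because of double-crossing, while `is_valid_path(["PREFIX", "PREFIX", "SUFFIX"], complex_prefixes=True)` returns `True`.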

HunspellXML Sample Demonstrating Valid Affixation Paths for Complex Suffixes

<hunspell>
<suppress blankLines="true" comments="true" metadata="true" myBlankLines="true" myComments="true"/>

	<affixFile>
		<settings>
			<languageCode>en_US</languageCode>
			<characterSet>UTF-8</characterSet>
			<flagType>long</flagType>
		</settings>
		
		<affixes>
			<!-- 1. WORD -> PREFIX -->
			<prefix flag="1p">
				<rule add="re" />
			</prefix>
			
			<!-- 2. WORD -> SUFFIX -->
			<suffix flag="2s">
				<rule add="es" />
			</suffix>
			
			<!-- 3. WORD -> PREFIX -> SUFFIX -->
			<prefix flag="3p" cross="true">
				<rule add="re" combineFlags="3s"/>
			</prefix>
			<suffix flag="3s" cross="true">
				<rule add="s"/>
			</suffix>
			
			<!-- 4. WORD -> SUFFIX -> PREFIX -->
			<suffix flag="4s" cross="true">
				<rule add="s" combineFlags="4p"/>
			</suffix>
			<prefix flag="4p" cross="true">
				<rule add="re"/>
			</prefix>
			
			<!-- 5. WORD -> INNER-SUFFIX -> OUTER-SUFFIX -> PREFIX -->
			<suffix flag="5i">
				<rule add="able" remove="e" combineFlags="5o"/>
			</suffix>
			<suffix flag="5o" cross="true">
				<rule add="y" remove="e" combineFlags="5p"/>
			</suffix>
			<prefix flag="5p" cross="true">
				<rule add="un"/>
			</prefix>
			
			<!-- 6. WORD -> PREFIX -> INNER-SUFFIX -> OUTER-SUFFIX -->
			<prefix flag="6p" cross="true">
				<rule add="dis" combineFlags="6i"/>
			</prefix>
			<suffix flag="6i" cross="true">
				<rule add="ful" combineFlags="6o"/>
			</suffix>
			<suffix flag="6o" cross="true">
				<rule add="ly" />
			</suffix>
			
			<!-- Invalid Rule Set! 7. WORD -> INNER-SUFFIX -> PREFIX -> OUTER-SUFFIX -->
			<suffix flag="7i" cross="true">
				<rule add="able" combineFlags="7p"/>
			</suffix>
			<prefix flag="7p" cross="true">
				<rule add="in" combineFlags="7o"/>
			</prefix>
			<suffix flag="7o" cross="true">
				<rule add="y" remove="e"/>
			</suffix>
		</affixes>
	</affixFile>
	
	<tests>
		<!-- 1. WORD -> PREFIX -->
		<good>redo</good>
		<!-- 2. WORD -> SUFFIX -->
		<good>does</good>
		<!-- 3. WORD -> PREFIX -> SUFFIX -->
		<good>react reacts</good>
		<bad>acts</bad>
		<!-- 4. WORD -> SUFFIX -> PREFIX -->
		<good>plays replays</good>
		<bad>replay</bad>
		<!-- 5. WORD -> INNER-SUFFIX -> OUTER-SUFFIX -> PREFIX -->
		<good>believe believable believably unbelievably</good>
		<bad>unbelieve unbelievable</bad>
		<!-- 6. WORD -> PREFIX -> INNER-SUFFIX -> OUTER-SUFFIX -->
		<good>respect disrespect disrespectful disrespectfully</good>
		<bad>respectful respectfully</bad>
		
		<!-- Invalid Rule Set! 7. WORD -> INNER-SUFFIX -> PREFIX -> OUTER-SUFFIX -->
		<!-- Based on rule set 7, you might expect these words to be identified as correctly spelled:
		     tract, tractable, intractable, intractably
		     In actuality, intractably will be flagged as misspelled.
		     Rule set 7 incorrectly attempts to cross from suffix to prefix and back to suffix, which is invalid.
		     It could be fixed by rearranging the rules to INNER-SUFFIX -> OUTER-SUFFIX -> PREFIX.
		-->

		<good>tract tractable intractable intractably</good>
		<bad>intract</bad>
	</tests>	

	<dictionaryFile>
		<words>
			do/1p
			do/2s
			act/3p
			play/4s
			believe/5i
			respect/6p
			tract/7i
		</words>
	</dictionaryFile>
</hunspell>
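To see how the test words in rule set 5 arise from the rules, the strip-and-add steps can be traced by hand. The Python sketch below is an illustration only. The helpers `apply_suffix` and `apply_prefix` are hypothetical names, and the behavior (the `remove` string is stripped from the end of the word for suffix rules and from the start for prefix rules before `add` is attached) is an assumption inferred from the sample's test words, not taken from the HunspellXML source.

```python
def apply_suffix(word, add, remove=""):
    """Strip `remove` from the end of `word`, then append `add`."""
    if remove:
        if not word.endswith(remove):
            raise ValueError(f"{word!r} does not end with {remove!r}")
        word = word[: -len(remove)]
    return word + add


def apply_prefix(word, add, remove=""):
    """Strip `remove` from the start of `word`, then prepend `add`."""
    if remove:
        if not word.startswith(remove):
            raise ValueError(f"{word!r} does not start with {remove!r}")
        word = word[len(remove):]
    return add + word


# Rule set 5: WORD -> INNER-SUFFIX -> OUTER-SUFFIX -> PREFIX
inner = apply_suffix("believe", add="able", remove="e")  # flag 5i
outer = apply_suffix(inner, add="y", remove="e")         # flag 5o
full = apply_prefix(outer, add="un")                     # flag 5p
print(inner, outer, full)  # believable believably unbelievably
```

Each step feeds the output of the previous rule into the next one, which mirrors how `combineFlags` chains 5i to 5o to 5p in the sample.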

Combining Prefixes and Suffixes While Using the <needAffix...> Flag

  1. Three levels of <needAffix...> flags can be applied along only two paths, one per mode:
    1. In two-suffix mode (complex suffixes), which is enabled by default: WORD -> INNER-SUFFIX -> OUTER-SUFFIX -> PREFIX
    2. In two-prefix mode, enabled with the <complexPrefixes/> setting: WORD -> INNER-PREFIX -> OUTER-PREFIX -> SUFFIX

Use of the Circumfix Flag

  1. The circumfix flag can be used with a prefix followed by a suffix, or with a suffix followed by a prefix.
  2. Any use of additional suffixes or prefixes is invalid.
  3. Any combination with the <needAffix...> flag is invalid.
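The three circumfix constraints above reduce to a short check. This Python sketch is a model of the rules as stated here, not of Hunspell internals. The function name `circumfix_path_ok` and its `uses_need_affix` parameter are hypothetical.

```python
def circumfix_path_ok(path, uses_need_affix=False):
    """Return True if `path` is allowed when the circumfix flag is involved.

    Hypothetical helper modelling the three constraints listed above.
    """
    # Constraint 3: circumfix cannot be combined with needAffix.
    if uses_need_affix:
        return False
    # Constraints 1 and 2: exactly one prefix and one suffix, in either order.
    return sorted(path) == ["PREFIX", "SUFFIX"]
```

Both `["PREFIX", "SUFFIX"]` and `["SUFFIX", "PREFIX"]` pass, while any path with a third affix, or any use together with needAffix, is rejected.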