Name                  Latest commit message                  Commit time
AmbiguousPairs.txt
CorrectionRules.txt   Added a few rules to CorrectionRules.  Jul 6, 2015
DisambigTwograms.txt  first commit                           Feb 14, 2013
FusingRules.txt       Added nowhere.                         Jul 8, 2015
HyphenRules.txt       first commit                           Feb 14, 2013
MainDictionary.txt
PersonalNames.txt     Updated rules.                         Dec 8, 2013
PlaceNames.txt        Adding Placenames file to rulesets.    Jun 1, 2014
ReadMe.txt
SyncopeRules.txt      first commit                           Feb 14, 2013
VariantSpellings.txt  Removed hav -> have.                   Jul 22, 2015
logvalues.tsv         first commit                           Feb 14, 2013
romannumerals.txt     first commit                           Feb 14, 2013

ReadMe.txt

I'm not going to try to explain all the rulesets; it would be a lot of work, and they're mostly either self-explanatory from the filename or explained implicitly in the associated code.

I will warn you that I know there are errors in CorrectionRules.txt; you won't have to look hard to find a few incorrect rules. I've tried to maximize the number of correct rules and minimize the number of dubious ones, but the best balance I could find increased the number of accurate corrections at the cost of letting a *few* incorrect rules in. Just so you're aware.

Also, some rules in CorrectionRules.txt turn one word into two, so applying them can change the number of tokens in a text. For example:
thatthe		=>	that the
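
For anyone writing their own code against these rules, here is a minimal sketch of how a one-to-two rule might be applied to a list of tokens. It is not the code from this repository. It assumes each rule is laid out like the example line above (a misspelling, then "=>", then the replacement, separated by whitespace), which may not match the real layout of CorrectionRules.txt. The names parse_rule and apply_rules are made up for illustration.

```python
def parse_rule(line):
    """Split a line like 'thatthe\t\t=>\tthat the' into (wrong, replacement_tokens).

    Assumed format, inferred from the single example in this ReadMe;
    the actual file may be laid out differently.
    Returns None for lines that do not look like a rule.
    """
    left, sep, right = line.partition("=>")
    if not sep:
        return None
    wrong = left.strip()
    replacement = right.strip().split()
    if not wrong or not replacement:
        return None
    return wrong, replacement


def apply_rules(tokens, rules):
    """Return a new token list with every matching token replaced."""
    corrected = []
    for token in tokens:
        # extend, not append: one input token may become several output tokens
        corrected.extend(rules.get(token, [token]))
    return corrected


example_lines = ["thatthe\t\t=>\tthat the"]
rules = dict(r for r in map(parse_rule, example_lines) if r is not None)

print(apply_rules(["so", "thatthe", "king"], rules))
# prints ['so', 'that', 'the', 'king']
```

Because the output can be longer than the input, anything that is indexed by token position (word counts, alignments back to the original text) has to be recomputed after correction rather than carried over.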