Word analysis of the 'I before E except after C' rule


Summary of word files:

.words was generated using:
grep -iP -e '(ie|ei)' /usr/share/dict/words > .words

.wordsNoDupe was then generated from .words by dropping words that end in common inflectional suffixes:
grep -ivP -e '(ier|iest|ed|ing|s|tion)$' .words > .wordsNoDupe

.wordsNoCaps was then generated from .wordsNoDupe by dropping capitalized words:
grep -vP -e '^[A-Z]' .wordsNoDupe > .wordsNoCaps

.wordsNoHyphen was then generated from .wordsNoCaps by dropping hyphenated words:
grep -vP -e '-' .wordsNoCaps > .wordsNoHyphen
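
The filtering steps above could also be chained into one pipeline. This is only a sketch: filter_words is a made-up name, not something in this repo, and it assumes GNU grep (for -P), like the commands above.

```shell
# filter_words: hypothetical helper. Reads a word list on stdin and applies
# the same four grep filters as above, in the same order.
filter_words() {
    # keep words containing "ie" or "ei" (case-insensitive)
    grep -iP -e '(ie|ei)' |
    # drop words ending in a suffix that suggests an inflected form
    grep -ivP -e '(ier|iest|ed|ing|s|tion)$' |
    # drop words starting with a capital letter
    grep -vP -e '^[A-Z]' |
    # drop words containing a hyphen
    grep -vP -e '-'
}
```

Running filter_words < /usr/share/dict/words > .wordsNoHyphen should give the same final list without writing the intermediate files. The intermediate files are still handy when checking what each step removed.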

Known issues:
Some deeper analysis still needs to happen.
I realise that the dupe omissions also remove some legitimate words that are not actually duplicates of anything.
I'm not sure how big that number is, though.

There is also a large chunk of words that legitimately follow 'I before E' but are excluded by omitting 'ier' and 'iest'.
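
One rough way to size these issues would be to count how many words each suffix removes. This is a sketch under assumptions: count_removed is a hypothetical helper, not part of this repo, and it assumes GNU grep. The counts are only an upper bound, because the helper cannot tell a real duplicate (such as 'believed') from a standalone word (such as 'frontier' or 'soldier').

```shell
# count_removed FILE: hypothetical helper. For each suffix used in the
# dupe filter, print the suffix and the number of words in FILE ending in it.
count_removed() {
    for suffix in ier iest ed ing s tion; do
        # grep -c prints the number of matching lines (0 if there are none)
        printf '%s %s\n' "$suffix" "$(grep -ciP -e "${suffix}\$" "$1")"
    done
}
```

For example, count_removed .words prints one line per suffix. Listing the words themselves with grep -iP -e '(ier|iest)$' .words would then show which of them are worth keeping.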