Simple table of word frequencies, derived from Google Ngram corpora.
words-all.txt is a tab-separated file with one entry per line: a word, a tab, and the total number of times that word was seen in Google's scanned books from the past century. Any word containing a capital letter was ignored.
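As an illustration of the format, here is a minimal Python sketch that reads such lines into a dictionary. The function name load_frequencies and the sample counts are made up for this example; they are not part of the repository or the real data.

```python
import io


def load_frequencies(lines):
    """Parse 'word<TAB>count' lines into a dict mapping word -> int count."""
    freqs = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        word, count = line.split("\t")
        freqs[word] = int(count)
    return freqs


# Invented sample data in the same shape as words-all.txt.
sample = io.StringIO("the\t100\nof\t50\n")
freqs = load_frequencies(sample)
```

To read the real file, pass an open file handle instead of the sample, for example load_frequencies(open("words-all.txt", encoding="utf-8")).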
words.txt is the subset of words-all.txt whose words also appear in /usr/share/dict/words on Mac OS X 10.7. Because words containing capital letters were ignored when the counts were built, capitalized dictionary entries will not be found in this file. The program selecter.pl can be used to compile similar subsets.
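The idea behind such a subset can be sketched in Python as follows. This is not how selecter.pl is implemented (its source is not reproduced here); select_subset is a hypothetical helper that assumes the frequency lines are 'word<TAB>count' and the dictionary has one word per line.

```python
def select_subset(freq_lines, dictionary_words):
    """Yield only the frequency lines whose word is in the dictionary."""
    wanted = {w.strip() for w in dictionary_words}
    for line in freq_lines:
        word = line.split("\t", 1)[0]
        if word in wanted:
            yield line


# Invented sample data: "Aaron" is in the dictionary but has no
# lowercase entry in the frequency lines, so it is not selected.
kept = list(select_subset(["apple\t5\n", "zzz\t1\n"], ["apple\n", "Aaron\n"]))
```

With real files, freq_lines could be an open handle on words-all.txt and dictionary_words an open handle on /usr/share/dict/words.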
These files are based on the 1-gram files in the 20120701 release of the Google Ngram corpora. Individual files for each letter were created with the freqaz script, which calls the freq.pl script. These per-letter files were then sorted and merged with sort -m into words-all.txt.
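The pipeline can be approximated in Python as a rough sketch. It assumes each 1-gram row has the form 'ngram<TAB>year<TAB>match_count<TAB>volume_count', and it is not a port of freq.pl or freqaz. The names total_counts and merge_sorted, the min_year cutoff of 1912, and the sample rows are all assumptions made for this example.

```python
import heapq
from collections import Counter


def total_counts(rows, min_year=1912):
    """Sum match counts per word, skipping old years and capitalized words."""
    totals = Counter()
    for row in rows:
        ngram, year, match_count, _volumes = row.rstrip("\n").split("\t")
        if int(year) < min_year:
            continue  # assumed cutoff for "the past century"
        if any(ch.isupper() for ch in ngram):
            continue  # words with a capital letter are ignored
        totals[ngram] += int(match_count)
    return totals


def merge_sorted(*sorted_line_lists):
    """Merge already-sorted line lists, similar in spirit to sort -m."""
    return list(heapq.merge(*sorted_line_lists))


# Invented sample rows, not real corpus data.
rows = ["cat\t1900\t5\t2\n", "cat\t1950\t7\t3\n",
        "cat\t2000\t1\t1\n", "Cat\t1950\t9\t4\n"]
totals = total_counts(rows)
merged = merge_sorted(["apple\t3\n", "avocado\t1\n"], ["banana\t2\n"])
```

Note that Python compares strings by code point, while the ordering used by sort depends on the locale, so the two can disagree on some inputs.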