This program looks up the etymologies of words in a text file and color-codes the words according to their origin. It allows a writer to view the register of her writing at a glance.
This code was born out of an interest in the relationship between etymology – the origin of a word – and the register of people's writing. You can find more information about the rationale behind Etymology Marker here:

The main script takes "usertext.txt" as its input. Make sure to save the file with UTF-16 encoding so that special characters like em dashes don't get turned into gibberish like this: –.
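As a rough illustration of why the encoding matters, here is a minimal sketch of reading the input file with an explicit encoding. The function name read_user_text is made up for this example and is not taken from the actual script; the only details from above are the file name and the UTF-16 encoding.

```python
from pathlib import Path


def read_user_text(path="usertext.txt", encoding="utf-16"):
    """Read the input text, decoding it with the stated encoding.

    Hypothetical helper: the real script may open the file differently.
    Passing the encoding explicitly avoids relying on the platform default,
    which is a common source of garbled dashes and quotes.
    """
    return Path(path).read_text(encoding=encoding)
```

If the file was saved with a different encoding than the one passed in, Python will either raise a UnicodeError or produce garbled characters, so the two have to agree.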

Etymology Marker strips words of their punctuation, prefixes, and suffixes, then looks each word up in etymologyDictionary.json. If it finds a match, it adds HTML tags to the word that will color-code it according to its etymology.
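The lookup step described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: the dictionary layout (word mapped to an origin label), the prefix and suffix lists, the colors, and the names stems and mark_word are all invented for the example. The real etymologyDictionary.json may be structured differently.

```python
import html
import string

# Hypothetical sample data; the real JSON file may use another layout.
ETYMOLOGIES = {"king": "Old English", "royal": "French", "regal": "Latin"}
COLORS = {"Old English": "green", "French": "blue", "Latin": "red"}
PREFIXES = ("un", "re")            # assumed, not the real prefix list
SUFFIXES = ("ly", "ing", "ed", "s")  # assumed, not the real suffix list


def stems(token):
    """Return candidate forms of a token: bare, minus prefix, minus suffix."""
    bare = token.strip(string.punctuation + "“”‘’").lower()
    forms = [bare]
    forms += [bare[len(p):] for p in PREFIXES if bare.startswith(p)]
    forms += [f[:-len(s)] for f in list(forms) for s in SUFFIXES if f.endswith(s)]
    return [f for f in forms if f]


def mark_word(token):
    """Wrap the whole token in a colored span if any candidate form is known."""
    for form in stems(token):
        origin = ETYMOLOGIES.get(form)
        if origin:
            return '<span style="color: {}">{}</span>'.format(
                COLORS[origin], html.escape(token))
    return html.escape(token)  # no match: leave the word unmarked


print(mark_word("kings,"))  # <span style="color: green">kings,</span>
```

The original token, punctuation included, is what ends up in the output, so the marked-up text reads the same as the input apart from the color.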

If it doesn't find an etymology, then it looks for Greek roots from GreekRootsList.json inside the word. If it still doesn't find a match, it leaves the word alone.
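The fallback described above could look something like this sketch. It assumes GreekRootsList.json holds a flat list of root strings, which may not match the real file, and the function name find_greek_root and the sample roots are made up for illustration.

```python
# Hypothetical sample; the real GreekRootsList.json may be structured differently.
GREEK_ROOTS = ["chrono", "graph", "phil", "log"]


def find_greek_root(word, roots=GREEK_ROOTS):
    """Return the longest listed root that occurs inside the word, or None."""
    lowered = word.lower()
    matches = [root for root in roots if root in lowered]
    return max(matches, key=len) if matches else None


print(find_greek_root("Chronology"))  # chrono
print(find_greek_root("table"))       # None
```

A plain substring search like this can produce false positives. For example, "uphill" contains the letters "phil" even though it has nothing to do with the Greek root, so a real implementation may need stricter matching.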

Etymology Marker outputs the HTML file "markedUp.html".

The companion data scripts contain the data inside their respective JSON files, plus code to dump that data to the JSON files. I found these were a convenient way to view and edit the JSON files.
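Keeping the data in a script and dumping it to JSON might look like the sketch below. The variable name, the sample entries, and the dump_json helper are assumptions for illustration, not the contents of the real files.

```python
import json

# Hypothetical sample data, edited by hand in the script itself.
greek_roots = ["chrono", "graph", "phil"]


def dump_json(data, path):
    """Write data to a JSON file, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    dump_json(greek_roots, "GreekRootsList.json")
```

Editing the Python literal and re-running the script regenerates the JSON file, which can be easier than editing the JSON by hand.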

I used the following resources to write this code:

Word etymologies


Lists of common English words to check Etymology Marker's performance

What needs to be done

I plan to make Etymology Marker into a Web app. If you want to take a crack at this yourself, that's cool. Go for it! If you have any other improvements or suggestions, that is also cool.