This is a work-in-progress tool for analysing the etymological background of entire sections of text. Instead of looking up a single word at a time, it scans whole documents. The library processes large texts quickly: for example, Ulysses by James Joyce can be processed in about a tenth of a second on a modern laptop.
The tool does not distinguish between cases where the same spelling has several different origins. For example, 'or' can be the conjunction in "tea or coffee" or the word for gold/yellow. The first has a Germanic etymology and the second a Latin one. In general, this is a hard problem to get around.
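The limitation above can be illustrated with a minimal Python sketch. It is not this library's API: the `ORIGINS` table and the `summarise` function are hypothetical stand-ins for a lookup against the etymological database, and the origin labels are simplified for illustration. Because the lookup is keyed on spelling alone, both uses of 'or' fall into the same bucket.

```python
import re
from collections import Counter

# Hypothetical toy lookup table (an assumption for illustration only).
# Each spelling maps to exactly one origin, so homographs cannot be told apart.
ORIGINS = {
    "king": "Germanic",
    "queen": "Germanic",
    "shield": "Germanic",
    "or": "Germanic",  # the heraldic 'or' (gold) is Latin-derived, but shares this key
    "lion": "Latin",
    "royal": "Latin",
}

def summarise(text, origins=ORIGINS):
    """Count words per origin across a whole text, using spelling-only lookup."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(origins.get(word, "unknown") for word in words)

# The first 'or' is the conjunction, the second is the heraldic colour.
summary = summarise("A king or a queen; a lion or on a royal shield.")
print(summary)
```

Both occurrences of 'or' are counted as Germanic here. Telling them apart would require information about context or part of speech, which a per-word lookup does not have.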
- Database connection to Etymological Database
- Summary of language use
- Base Performance
- Command Line Interface (CLI)
- Word Filtering
- Summary Statistics
- Detailed Statistics
- More Performance
The bundled data is licensed under either the Creative Commons ShareAlike 3.0 License or the Project Gutenberg License. Please see the data
directory for more information.