Confusion Words

The Python code in this repository is part of my master's thesis.

The code is distributed under the MIT license (see the LICENSE file for more information).
Every subfolder contains an individual algorithm.
For detailed instructions, see the README.md files in the subfolders.
For further explanations, see the master's thesis "Evaluation computerlinguistischer Verfahren zur Erkennung von Confusion-Word-Fehlern" (Christoph Jansen, August 2015, HTW Berlin).

Dependencies

All algorithms can be trained on the Brown corpus. Run prepare_data.py to automatically download the corpus to your home directory and generate a word2vec model.

cd confusion-words
python3 prepare_data.py
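
For orientation, here is a minimal sketch of what a preparation step of this kind could look like. It is not the actual content of prepare_data.py. It assumes NLTK for the corpus download and gensim (version 4.0 or later) for word2vec, and the function names, hyperparameters and output file name are all hypothetical.

```python
from pathlib import Path


def normalize(sentences):
    """Lowercase every token and drop sentences that are empty."""
    cleaned = ([token.lower() for token in sentence] for sentence in sentences)
    return [sentence for sentence in cleaned if sentence]


def build_word2vec(output_path):
    """Download the Brown corpus and train a word2vec model on it (illustrative only)."""
    # Imported lazily so that normalize() can be used without the third-party packages.
    import nltk
    from gensim.models import Word2Vec  # assumption: gensim >= 4.0 (uses vector_size)
    from nltk.corpus import brown

    nltk.download("brown")  # stores the corpus in NLTK's data directory
    sentences = normalize(brown.sents())
    # Hyperparameters below are placeholders, not the values used in the thesis.
    model = Word2Vec(sentences=sentences, vector_size=100, window=5, min_count=1, workers=1)
    model.save(str(output_path))
    return model


if __name__ == "__main__":
    # Hypothetical output location; the real script may use a different path.
    build_word2vec(Path.home() / "brown_word2vec.model")
```

The corpus handling and training are kept in separate functions so that the text normalization can be checked without a network connection or the external packages.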
