tmw - Topic Modeling Workflow
What is tmw?
tmw is a Python module for topic modeling. It also covers some preprocessing of texts and some postprocessing of topic model data. This set of functions is experimental, both in nature and in quality.
- tmw has been developed for and tested only on Linux (Ubuntu 14.04).
- tmw requires Python 3 (tested with 3.4), Mallet (tested with 2.0.7), and TreeTagger with the parameter files for the languages you want to process.
- tmw also requires the following Python 3 packages: numpy, pandas, matplotlib, lxml, scipy, seaborn, wordcloud.
- Download the module files tmw.py (the module) and tmw_config.py (the configuration file) to a convenient location.
- Follow the steps outlined in the tmw tutorial: https://www.penflip.com/c.schoech/tmw-tutorial
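
Before working through the tutorial, it may help to confirm that the Python packages listed above can be found by your interpreter. The following is a minimal sketch and not part of tmw itself: the function name `find_missing_packages` and the constant `REQUIRED_PACKAGES` are hypothetical, and the script does not check for Mallet or TreeTagger, which are installed separately.

```python
import importlib.util
import sys

# Package names copied from the requirements list in this README.
REQUIRED_PACKAGES = [
    "numpy", "pandas", "matplotlib", "lxml",
    "scipy", "seaborn", "wordcloud",
]


def find_missing_packages(names):
    """Return the subset of names that the import system cannot locate.

    Hypothetical helper for illustration; tmw does not provide it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # tmw targets Python 3, so report the major version in use.
    print("Python major version:", sys.version_info[0])
    missing = find_missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required Python packages were found.")
```

Run the script with the same Python 3 interpreter you intend to use for tmw. Any package it reports as missing needs to be installed before the module is used.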
Most of the code was written by Christof Schöch, with significant contributions by Daniel Schlör. The project is inspired by Allen Riddell's TAToM tutorials. We are grateful to the authors of all the existing packages and code that this project reuses.