A sample workflow using LDA, CorEx, anchored CorEx, and word embeddings. For the core of the workflow, see topic_model_example; the other files cover the preprocessing and helper details it relies on.
This repository includes the following files:
- preprocessing_example: morphological processing and phrase detection via collocations
- preprocessing_word2vec: trains a Word2Vec model on the grocery reviews corpus
- topic_model_example: topic modelling using Latent Dirichlet Allocation (LDA), basic Correlation Explanation (CorEx), and anchored CorEx, plus topic enrichment using word embeddings and topic aggregation
- helpers/helper_base: helper functions for reading and rendering data
- helpers/helper_prep: helper functions for text preprocessing
- helpers/helper_model: helper functions for topic modelling
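
As a rough illustration of what "phrase detection via collocations" means in preprocessing_example, the sketch below scores adjacent word pairs by pointwise mutual information (PMI) and joins high-scoring pairs into single tokens. It uses only the standard library and is not the code from this repository: the function names `find_collocations` and `merge_phrases`, the `min_count` and `threshold` parameters, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def find_collocations(docs, min_count=2, threshold=1.0):
    """Return the set of bigrams whose PMI exceeds `threshold`.

    Hypothetical helper for illustration; not part of this repository.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in docs:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    phrases = set()
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # ignore rare pairs, whose PMI estimates are unreliable
        p_pair = count / n_bi
        p_first = unigrams[first] / n_uni
        p_second = unigrams[second] / n_uni
        pmi = math.log(p_pair / (p_first * p_second))
        if pmi > threshold:
            phrases.add((first, second))
    return phrases


def merge_phrases(tokens, phrases):
    """Join each detected bigram into one token, e.g. peanut_butter."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in phrases:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy, made-up documents standing in for tokenized reviews.
docs = [
    ["peanut", "butter", "tastes", "great"],
    ["this", "peanut", "butter", "is", "creamy"],
    ["great", "price", "for", "creamy", "peanut", "butter"],
]

phrases = find_collocations(docs)
merged_docs = [merge_phrases(tokens, phrases) for tokens in docs]
```

Merging phrases before topic modelling lets multi-word terms act as single vocabulary entries. The thresholds here are arbitrary and would need tuning on a real corpus.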