SeaNMF

This is the implementation of the paper

Tian Shi, Kyeongpil Kang, Jaegul Choo and Chandan K. Reddy, "Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations", In Proceedings of the International Conference on World Wide Web (WWW), Lyon, France, April 2018.

Requirements

Tokenize with NLTK, spaCy or CoreNLP.
Remove special characters.
Remove stop-words.
Edit the arguments of data_process.py.
Run python3 data_process.py to prepare the document-term matrix and vocabulary.
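
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code in data_process.py: the function names preprocess and build_doc_term, the tiny STOP_WORDS set, and the whitespace tokenizer are all assumptions made for the example. In practice a tokenizer from NLTK, spaCy or CoreNLP would replace the simple split.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for"}


def preprocess(text):
    """Lowercase, replace special characters with spaces, split on
    whitespace, and drop stop-words."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def build_doc_term(docs):
    """Return (vocab, rows): a sorted vocabulary and, for each document,
    a dict mapping term index -> term count (a sparse document-term row)."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = [dict(Counter(index[tok] for tok in toks)) for toks in tokenized]
    return vocab, rows


docs = ["The cat sat on the mat!", "A dog chased the cat."]
vocab, rows = build_doc_term(docs)
print(vocab)  # ['cat', 'chased', 'dog', 'mat', 'sat']
print(rows)
```

Each row is kept sparse because short texts contain only a handful of distinct terms relative to the vocabulary size.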

Run python3 vis_topic.py to calculate the PMI and visualize the top keywords in each topic.
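
For readers unfamiliar with PMI (pointwise mutual information), the sketch below shows one common way to score a topic: average log(p(w1, w2) / (p(w1) p(w2))) over all pairs of its top keywords, with probabilities estimated from document-level co-occurrence. It is not taken from vis_topic.py. The function name average_pmi, the document-level estimate, and the choice to score never-co-occurring pairs as 0 are assumptions for illustration.

```python
import math
from itertools import combinations


def average_pmi(top_words, docs):
    """Average pairwise PMI of top_words.

    docs is a list of token lists; probabilities are the fraction of
    documents containing a word (or both words of a pair).
    """
    n_docs = len(docs)
    doc_sets = [set(d) for d in docs]

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in s for w in words) for s in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p12 = prob(w1, w2)
        if p12 > 0:
            scores.append(math.log(p12 / (prob(w1) * prob(w2))))
        else:
            # Assumption: pairs that never co-occur contribute 0.
            scores.append(0.0)
    return sum(scores) / len(scores) if scores else 0.0


docs = [["cat", "dog"], ["cat", "dog"], ["fish"], ["bird"]]
score = average_pmi(["cat", "dog"], docs)
print(score)  # log(0.5 / (0.5 * 0.5)) = log(2)
```

Higher averages indicate that a topic's top keywords tend to appear together in the same documents.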
