A Dirichlet Multinomial Mixture Model-based Approach for Short Text Clustering

junyachen/GSDMM

GSDMM

Python 3.7

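The datasets are stored as JSON, one object per line, with a "text" field (the document's words separated by spaces) and a "cluster" field (the integer cluster label), as follows: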
{"text": "centrepoint winter white gala london", "cluster": 65}
{"text": "mourinho seek killer instinct", "cluster": 96}
{"text": "roundup golden globe won seduced johansson voice", "cluster": 72}
{"text": "travel disruption mount storm cold air sweep south florida", "cluster": 140}
{"text": "wes welker blame costly turnover", "cluster": 89}
......
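
The format above can be read with the standard library alone. The following is a minimal sketch, not code from this repository: it assumes each line of a dataset file is one standalone JSON object with the "text" and "cluster" fields shown above, and that splitting the text on whitespace is an acceptable tokenization. The function name `load_dataset` is hypothetical.

```python
import json
from collections import Counter


def load_dataset(lines):
    """Parse an iterable of JSON lines into (tokens, cluster) pairs.

    Assumption: every non-blank line is a JSON object with a "text"
    string and an integer "cluster" label, as in the README sample.
    """
    docs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: whitespace splitting is enough to recover the tokens.
        docs.append((record["text"].split(), record["cluster"]))
    return docs


# Two records copied from the sample above.
sample = [
    '{"text": "centrepoint winter white gala london", "cluster": 65}',
    '{"text": "mourinho seek killer instinct", "cluster": 96}',
]
docs = load_dataset(sample)

# Count how many documents carry each ground-truth cluster label.
cluster_sizes = Counter(cluster for _, cluster in docs)
```

An open file object can be passed in place of the list, since iterating over a file yields its lines, for example `load_dataset(open(path, encoding="utf-8"))` with `path` pointing at one of the dataset files.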

Citation

If you use the data, please cite the following paper:

@article{chen2019nonparametric,
  title={A nonparametric model for online topic discovery with word embeddings},
  author={Chen, Junyang and Gong, Zhiguo and Liu, Weiwen},
  journal={Information Sciences},
  volume={504},
  pages={32--47},
  year={2019},
  publisher={Elsevier}
}
