A Python implementation of latent Dirichlet allocation (LDA) using the variational EM algorithm
Python
images
README.md
dataset.txt
dataset2.txt
infer.txt
infer2.txt
main.py
stopwords.dic

LDA (Latent Dirichlet Allocation)

This is a Python implementation of LDA using the variational EM algorithm.
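For readers new to variational EM, the per-document E-step can be sketched as follows. This is a minimal illustration of the standard mean-field updates for LDA, not the code in main.py: the names `digamma`, `e_step`, `alpha`, `beta`, `gamma` and `phi` are assumptions made for this sketch, `alpha` is assumed to be a symmetric scalar prior, and the loop runs a fixed number of iterations instead of checking convergence.

```python
import math

def digamma(x):
    """Approximate the digamma function for x > 0.

    Uses the recurrence psi(x) = psi(x + 1) - 1/x to shift x upward,
    then an asymptotic series. This helper is an illustrative stand-in
    for a library routine.
    """
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def e_step(doc, alpha, beta, n_iter=50):
    """Run mean-field updates for one document.

    doc   : list of word ids (one entry per token)
    alpha : symmetric Dirichlet prior on topic proportions (assumed scalar)
    beta  : K x V nested list, beta[k][w] = P(word w | topic k), all > 0
    Returns (gamma, phi): the variational Dirichlet parameters and the
    per-token topic responsibilities.
    """
    K = len(beta)
    N = len(doc)
    gamma = [alpha + N / K] * K
    phi = [[1.0 / K] * K for _ in doc]
    for _ in range(n_iter):
        exp_psi = [math.exp(digamma(g)) for g in gamma]
        for n, w in enumerate(doc):
            # phi[n][k] is proportional to beta[k][w] * exp(digamma(gamma[k]))
            weights = [beta[k][w] * exp_psi[k] for k in range(K)]
            total = sum(weights)
            phi[n] = [x / total for x in weights]
        # gamma[k] = alpha + sum over tokens of phi[n][k]
        gamma = [alpha + sum(phi[n][k] for n in range(N)) for k in range(K)]
    return gamma, phi
```

In the M-step, the topic-word matrix would then be re-estimated from the `phi` values collected over all documents, and the two steps alternate until the bound stops improving.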

The following picture shows the top 10 words in each of the 10 topics generated by this algorithm over 16 sentences about one piece on Wikipedia.

[Figure: res]
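A top-words listing like the one pictured can be produced from a fitted topic-word matrix by sorting each row. The sketch below assumes a hypothetical `beta` (one row of word probabilities per topic) and a hypothetical `vocab` list mapping word ids to strings; neither name is taken from main.py.

```python
def top_words(beta, vocab, n=10):
    """Return the n highest-probability words for each topic.

    beta  : K x V nested list of word probabilities per topic (assumed layout)
    vocab : list of V word strings, indexed by word id (assumed layout)
    """
    result = []
    for row in beta:
        # Rank word ids by descending probability within this topic.
        ranked = sorted(range(len(row)), key=lambda w: row[w], reverse=True)
        result.append([vocab[w] for w in ranked[:n]])
    return result
```

Calling `top_words(beta, vocab, n=10)` returns one list of words per topic, which can then be printed or plotted.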

The code covers both training the model and predicting the topics of new documents.
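One common way to turn inference results into a prediction is to normalise a document's variational Dirichlet parameters into expected topic proportions and take the largest. The README does not say how main.py reports predictions, so the following is only a sketch; `gamma`, `topic_proportions` and `predict_topic` are hypothetical names.

```python
def topic_proportions(gamma):
    """Expected topic proportions under a Dirichlet with parameters gamma."""
    total = sum(gamma)
    return [g / total for g in gamma]

def predict_topic(gamma):
    """Index of the topic with the largest expected proportion."""
    props = topic_proportions(gamma)
    return max(range(len(props)), key=lambda k: props[k])
```

For a new document, `gamma` would come from running the E-step with the trained topic-word matrix held fixed.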

The following picture shows the top 10 words in each of the 30 topics (K = 30) generated by this algorithm over 5000 Chinese social news articles from Sina.

[Figure: res2]

Author