PLSA implementation via EM algorithm
test
texts
README.md
document-topic.txt
main.py
pip-log.txt
plsa.py
plsa.pyc
stopwords.txt
stopwords_shortlist.txt
topic-word.txt
utils.py
utils.pyc


This is a PLSA (Probabilistic Latent Semantic Analysis) implementation via the EM (Expectation-Maximization) algorithm.
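For orientation, here is a minimal, dependency-free sketch of the EM updates that PLSA relies on. It is not taken from `plsa.py`; the function names (`normalize`, `plsa_em`), the parameters, and the toy count matrix are all assumptions made for illustration. The E-step computes P(z|d,w) proportional to P(z|d) * P(w|z), and the M-step re-estimates P(z|d) and P(w|z) from the expected counts.

```python
import math
import random


def normalize(values):
    """Scale non-negative values so they sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa_em(counts, n_topics, n_iters=50, seed=0):
    """Fit PLSA to a document-word count matrix with plain EM.

    counts[d][w] is the number of times word w occurs in document d.
    Returns (doc_topic, topic_word, log_likelihoods), where
    doc_topic[d][z] = P(z|d) and topic_word[z][w] = P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random, strictly positive initialisation of P(z|d) and P(w|z).
    doc_topic = [normalize([rng.random() + 0.01 for _ in range(n_topics)])
                 for _ in range(n_docs)]
    topic_word = [normalize([rng.random() + 0.01 for _ in range(n_words)])
                  for _ in range(n_topics)]
    log_likelihoods = []

    for _ in range(n_iters):
        new_doc_topic = [[0.0] * n_topics for _ in range(n_docs)]
        new_topic_word = [[0.0] * n_words for _ in range(n_topics)]
        log_likelihood = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [doc_topic[d][z] * topic_word[z][w]
                         for z in range(n_topics)]
                log_likelihood += n_dw * math.log(sum(joint))
                posterior = normalize(joint)
                # Accumulate expected counts used by the M-step.
                for z in range(n_topics):
                    new_doc_topic[d][z] += n_dw * posterior[z]
                    new_topic_word[z][w] += n_dw * posterior[z]
        # M-step: renormalise expected counts into probabilities.
        doc_topic = [normalize(row) for row in new_doc_topic]
        topic_word = [normalize(row) for row in new_topic_word]
        # Log-likelihood of the parameters used in this iteration's E-step.
        log_likelihoods.append(log_likelihood)
    return doc_topic, topic_word, log_likelihoods


# Toy corpus (made up): 4 documents over a 4-word vocabulary.
toy_counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 1, 4, 3]]
doc_topic, topic_word, log_likelihoods = plsa_em(toy_counts, n_topics=2)
for d, row in enumerate(doc_topic):
    print(d, [round(p, 3) for p in row])
```

The log-likelihood recorded at each iteration should never decrease, which is a handy sanity check when debugging an EM implementation. The nested loops keep the sketch readable, but a real implementation would normally vectorise them.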

Current issues:

  1. The code is not well tested, so it may contain bugs. The test texts are in the folders ./texts and ./test.
  2. The code does not seem to work well with small datasets, such as ./test.

References:

EM introduction: http://blog.tomtung.com/2011/10/em-algorithm

PLSA introduction: http://blog.tomtung.com/2011/10/plsa

My lda-with-gibbs repo

Note:

A Tutorial on Probabilistic Latent Semantic Analysis by Liangjie Hong is not a very good introduction to PLSA; it contains some known errors.
