  1. Mallet Extension

The Mallet package contains only two topic models: LDA and Hierarchical LDA. So I implemented some additional topic modeling methods on top of it.

Model:

  • Hierarchical Dirichlet Process with Gibbs Sampling. (in HDP folder)
  • Inference part for hLDA. (in hLDA folder)

Usage:

  1. This is an extension for Mallet, so you need Mallet's source code first.
  2. Put HDP.java, HDPInferencer.java, and HierarchicalLDAInferencer.java in the src/cc/mallet/topics folder.
  3. If you are going to run HDP, make sure the knowceans package is included in your project.
  4. Running HDPTest.java or hLDATest.java gives a demo on a small dataset in the data folder.
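
Step 2 can be scripted. The following is a minimal sketch, not part of this repository: the function name `install_extension` and both path arguments are hypothetical, and it assumes the three Java files sit somewhere under the extension checkout, so it searches for them by name instead of hard-coding their folders.

```python
import shutil
from pathlib import Path

# File names and target package are taken from the usage steps above.
EXTENSION_FILES = ["HDP.java", "HDPInferencer.java", "HierarchicalLDAInferencer.java"]
TOPICS_PACKAGE = Path("src") / "cc" / "mallet" / "topics"


def install_extension(extension_root, mallet_root):
    """Copy the extension sources into a Mallet source checkout.

    extension_root and mallet_root are hypothetical local paths to this
    repository and to Mallet's source tree. Returns the copied file paths.
    """
    extension_root = Path(extension_root)
    target_dir = Path(mallet_root) / TOPICS_PACKAGE
    if not target_dir.is_dir():
        raise FileNotFoundError(f"{target_dir} not found; is this a Mallet source tree?")

    copied = []
    for name in EXTENSION_FILES:
        # Search by file name so the sketch does not depend on the folder layout.
        matches = sorted(extension_root.rglob(name))
        if not matches:
            raise FileNotFoundError(f"{name} not found under {extension_root}")
        destination = target_dir / name
        shutil.copy2(matches[0], destination)
        copied.append(destination)
    return copied
```

Adding the knowceans package and compiling Mallet are still manual steps after the copy.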


  2. Scikit-learn Extension

Scikit-learn doesn't have any topic models yet, so I adapted Matthew D. Hoffman's onlineldavb to follow the scikit-learn format.

Model:

  • Online LDA with variational EM. (in LDA folder)
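
To illustrate what online variational inference for LDA does, here is a minimal pure-Python sketch of the update loop. It is not the code in lda.py and is far slower than a NumPy version. The function names are hypothetical, the hyperparameter names (`alpha`, `eta`, `tau0`, `kappa`) follow the notation of the online LDA paper, and the E-step runs a fixed number of passes instead of checking convergence.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: recurrence up to x >= 6, then an asymptotic series."""
    shift = 0.0
    while x < 6.0:
        shift -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return shift + math.log(x) - 0.5 / x - series


def expected_log(params):
    """E[log p_i] under Dirichlet(params): digamma(a_i) - digamma(sum(a))."""
    total = digamma(sum(params))
    return [digamma(a) - total for a in params]


def responsibilities(word, elog_theta, elog_beta):
    """Normalized topic responsibilities (phi) for one word of one document."""
    phi = [math.exp(t + b[word]) for t, b in zip(elog_theta, elog_beta)]
    norm = sum(phi)
    return [p / norm for p in phi]


def online_lda(docs, vocab_size, n_topics, batch_size=2, alpha=0.1, eta=0.01,
               tau0=1.0, kappa=0.7, e_steps=20, seed=0):
    """Fit topic-word parameters lambda; docs are {word_id: count} dicts."""
    rng = random.Random(seed)
    lam = [[rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
           for _ in range(n_topics)]
    for t, start in enumerate(range(0, len(docs), batch_size)):
        batch = docs[start:start + batch_size]
        elog_beta = [expected_log(row) for row in lam]
        stats = [[0.0] * vocab_size for _ in range(n_topics)]
        for doc in batch:
            # E-step: update the document-topic parameters gamma.
            gamma = [1.0] * n_topics
            for _ in range(e_steps):
                elog_theta = expected_log(gamma)
                gamma = [alpha] * n_topics
                for word, count in doc.items():
                    phi = responsibilities(word, elog_theta, elog_beta)
                    for k in range(n_topics):
                        gamma[k] += count * phi[k]
            # Accumulate expected word-topic counts for this mini-batch.
            elog_theta = expected_log(gamma)
            for word, count in doc.items():
                phi = responsibilities(word, elog_theta, elog_beta)
                for k in range(n_topics):
                    stats[k][word] += count * phi[k]
        # M-step: blend old lambda with the mini-batch estimate, weight rho.
        rho = (tau0 + t) ** -kappa
        scale = len(docs) / len(batch)
        for k in range(n_topics):
            for w in range(vocab_size):
                target = eta + scale * stats[k][w]
                lam[k][w] = (1.0 - rho) * lam[k][w] + rho * target
    return lam


# Toy corpus (made up for illustration): two documents per word group.
toy_docs = [{0: 3, 1: 2}, {0: 2, 1: 3}, {2: 4, 3: 1}, {2: 1, 3: 4}]
toy_lambda = online_lda(toy_docs, vocab_size=4, n_topics=2)
```

Each row of the returned matrix holds the Dirichlet parameters of one topic over the vocabulary; normalizing a row gives an estimate of that topic's word distribution.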

Usage:

  1. Make sure scikit-learn is installed.
  2. The onlineLDA model is in lda.py.
  3. For a quick example, running python lda_example.py fits a 10-topic model on the 20 Newsgroups dataset.
