Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model

Hierarchical Latent Dirichlet Allocation

Hierarchical Latent Dirichlet Allocation (hLDA) addresses the problem of learning topic hierarchies from data. The model relies on a non-parametric prior called the nested Chinese restaurant process, which allows for arbitrarily large branching factors and readily accommodates growing data collections. The hLDA model combines this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation.
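To make the prior concrete, here is a minimal sketch of drawing document paths from a nested Chinese restaurant process with a fixed tree depth. It is not taken from this package: the names `Node`, `sample_path`, and the concentration parameter `gamma` are illustrative assumptions. At each level a document follows an existing branch with probability proportional to the number of earlier documents that took it, or opens a new branch with probability proportional to `gamma`, which is why the branching factor is unbounded.

```python
import random


class Node:
    """One restaurant (topic) in the nCRP tree. Hypothetical helper class."""

    def __init__(self):
        self.customers = 0  # number of documents whose path passes through this node
        self.children = []


def sample_path(root, depth, gamma, rng):
    """Draw a root-to-leaf path of `depth` nodes and update customer counts."""
    root.customers += 1
    path = [root]
    node = root
    for _ in range(depth - 1):
        # Existing child k has weight n_k (documents already routed to it);
        # a brand-new child has weight gamma.
        weights = [child.customers for child in node.children] + [gamma]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(node.children):
            node.children.append(Node())  # open a new branch
        node = node.children[choice]
        node.customers += 1
        path.append(node)
    return path


# Usage: route 20 documents through a tree of depth 3.
rng = random.Random(0)
root = Node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
```

Larger values of `gamma` make new branches more likely, so the tree grows wider; the tree also keeps growing as more documents arrive, without the number of topics being fixed in advance.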

The model is described in the following papers:

  • Hierarchical Topic Models and the Nested Chinese Restaurant Process
  • The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies

Implementation

  • hlda/sampler.py implements the Gibbs sampler for hLDA inference. It is based on the implementation in Mallet, with a fixed depth for the nCRP tree.

Installation

  • Install the package with pip install hlda.
  • An example notebook that infers the hierarchical topics on the BBC Insight corpus can be found in notebooks/bbc_test.ipynb.