microscopes-lda

A Python package for finding unobserved structure in unstructured data.

This package contains an implementation of the nonparametric (HDP) latent Dirichlet allocation (LDA) model described by Teh et al. in "Hierarchical Dirichlet Processes" (Journal of the American Statistical Association 101: pp. 1566–1581). Unlike the original LDA model, nonparametric LDA does not require the user to select a number of topics. Instead, the number of topics is inferred from the data using a hierarchical Dirichlet process prior.

The current kernel follows the sampling scheme described in Section 5.1 of that paper, "Posterior sampling in the Chinese restaurant franchise". In the future, we may support the other kernels described by Teh et al.
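
To give some intuition for why the number of topics does not have to be fixed in advance, here is a minimal sketch of a single-level Chinese restaurant process, the building block of the Chinese restaurant franchise. It is an illustration only, not this package's sampler or API: the function name `crp_table_counts` and its parameters are hypothetical, and the full franchise adds a second, shared level of tables that is omitted here.

```python
import random


def crp_table_counts(num_customers, alpha, seed=0):
    """Seat customers one at a time under a Chinese restaurant process.

    Hypothetical helper for illustration; not part of microscopes-lda.
    Returns a list whose k-th entry is the number of customers at table k.
    `alpha` must be positive.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(num_customers):
        # An existing table k is chosen with weight counts[k];
        # a new table is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new table
        else:
            counts[k] += 1     # join an existing table
    return counts


if __name__ == "__main__":
    counts = crp_table_counts(num_customers=1000, alpha=1.0)
    print(len(counts), "tables for", sum(counts), "customers")
```

The number of occupied tables is an outcome of the process rather than an input, and larger values of `alpha` tend to produce more tables. In HDP-LDA the same mechanism, applied at two levels, lets the number of topics be inferred from the data.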

Numerical computation is implemented in C++ for efficiency.

Installation

OS X and Linux builds of microscopes-lda are released to Anaconda.org. Installing them requires Conda. To install the current release version, run:

$ conda install -c datamicroscopes -c distributions microscopes-lda