This repository has been archived by the owner. It is now read-only.
[hibernating] Dynamic topic models

On hiatus. I am currently evaluating the best way to approach the dynamic topic model (DTM). Combining the Pólya-Gamma augmentation with the original DTM formulation is complicated and might not give better performance than simpler models (e.g., models using truncated Pitman-Yor processes).

NOTE: The implementation of LDA has been broken out (and refined) into a separate package, lda.

NOTE: If you're interested in implementing the dynamic topic model using Pólya-Gamma, most of the hard work has been done: https://github.com/HIPS/pgmult

horizont: Topic models in Python

horizont implements a number of topic models. Conventions from scikit-learn are followed.

The following models are implemented using Gibbs sampling.

  • Latent Dirichlet allocation (Blei et al., 2003; Pritchard et al., 2000)
  • (Coming soon) Logistic normal topic model
  • (Coming soon) Dynamic topic model (Blei and Lafferty, 2006)
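To make the Gibbs sampling approach concrete, here is a minimal sketch of a collapsed Gibbs sampler for LDA written in plain NumPy. It is not horizont's implementation; the function name ``lda_gibbs``, the hyperparameter names ``alpha`` and ``eta``, and their default values are assumptions chosen for illustration.

```python
import numpy as np


def lda_gibbs(X, n_topics, n_iter=100, alpha=0.1, eta=0.01, random_state=0):
    """Illustrative collapsed Gibbs sampler for LDA (not horizont's code).

    X is a document-term matrix of non-negative integer counts.
    alpha and eta are symmetric Dirichlet hyperparameters (assumed names).
    """
    rng = np.random.RandomState(random_state)
    X = np.asarray(X)
    n_docs, n_words = X.shape

    # Expand the count matrix into one (document, word) pair per token.
    doc_idx, word_idx = np.nonzero(X)
    counts = X[doc_idx, word_idx]
    docs = np.repeat(doc_idx, counts)
    words = np.repeat(word_idx, counts)
    n_tokens = len(docs)

    # Random initial topic assignments and the count tables they imply.
    z = rng.randint(n_topics, size=n_tokens)
    ndz = np.zeros((n_docs, n_topics))   # tokens per (document, topic)
    nzw = np.zeros((n_topics, n_words))  # tokens per (topic, word)
    nz = np.zeros(n_topics)              # tokens per topic
    np.add.at(ndz, (docs, z), 1)
    np.add.at(nzw, (z, words), 1)
    np.add.at(nz, z, 1)

    for _ in range(n_iter):
        for i in range(n_tokens):
            d, w, k = docs[i], words[i], z[i]
            # Remove token i from the count tables.
            ndz[d, k] -= 1
            nzw[k, w] -= 1
            nz[k] -= 1
            # Full conditional over topics for token i, up to a constant.
            p = (ndz[d] + alpha) * (nzw[:, w] + eta) / (nz + n_words * eta)
            k = rng.choice(n_topics, p=p / p.sum())
            # Record the new assignment and restore the counts.
            z[i] = k
            ndz[d, k] += 1
            nzw[k, w] += 1
            nz[k] += 1

    # Point estimates from the final state of the sampler.
    doc_topic = (ndz + alpha) / (ndz + alpha).sum(axis=1, keepdims=True)
    topic_word = (nzw + eta) / (nzw + eta).sum(axis=1, keepdims=True)
    return doc_topic, topic_word
```

The pure-Python inner loop is slow on real corpora; the sketch is meant to show the count bookkeeping rather than to be used as is.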

Getting started

horizont.LDA implements latent Dirichlet allocation (LDA) using Gibbs sampling. The interface follows conventions in scikit-learn.

>>> import numpy as np
>>> from horizont import LDA
>>> X = np.array([[1,1], [2, 1], [3, 1], [4, 1], [5, 8], [6, 1]])
>>> model = LDA(n_topics=2, random_state=0, n_iter=100)
>>> doc_topic = model.fit_transform(X)  # estimate of document-topic distributions
>>> model.components_  # estimate of topic-word distributions
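A common next step is to list the highest-weight words in each topic. The sketch below assumes that ``components_`` has shape ``(n_topics, n_features)``, as in scikit-learn's conventions. It uses a small stand-in array and a hypothetical vocabulary so that it runs without horizont installed; ``top_words`` is a helper written for this example, not part of horizont.

```python
import numpy as np

# Stand-in for model.components_ (assumed shape: n_topics x n_features).
components = np.array([[0.70, 0.20, 0.10],
                       [0.05, 0.15, 0.80]])
# Hypothetical vocabulary, one entry per column of the matrix above.
vocab = np.array(["alpha", "beta", "gamma"])


def top_words(components, vocab, n_top=2):
    """Return the n_top highest-weight words for each topic."""
    # Sort each row ascending, reverse it, and keep the first n_top columns.
    order = np.argsort(components, axis=1)[:, ::-1][:, :n_top]
    return [vocab[row].tolist() for row in order]


for k, topic in enumerate(top_words(components, vocab)):
    print("Topic {}: {}".format(k, " ".join(topic)))
```

With a fitted model, ``components`` would be replaced by ``model.components_`` and ``vocab`` by the vocabulary used to build ``X``.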

Requirements

Python 2.7 or Python 3.3+ is required. The following packages are also required:

GSL is required for random number generation inside the Pólya-Gamma random variate generator. On Debian-based systems, GSL may be installed with the command sudo apt-get install libgsl0-dev. horizont looks for GSL headers and libraries in /usr/include and /usr/lib/ respectively.
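Before building, it can help to confirm that GSL is present in the locations named above. The following is a rough sketch, not part of horizont: ``find_gsl`` is a hypothetical helper, and both the choice of ``gsl/gsl_rng.h`` as the header to look for and the check of one level of subdirectories under the library directory (for multiarch layouts) are assumptions.

```python
import glob
import os


def find_gsl(include_dir="/usr/include", lib_dir="/usr/lib"):
    """Return (header_found, library_found) for a GSL installation.

    Hypothetical helper: checks for the GSL random number generator
    header and for any libgsl shared library in lib_dir or in one of
    its immediate subdirectories.
    """
    header = os.path.join(include_dir, "gsl", "gsl_rng.h")
    libs = glob.glob(os.path.join(lib_dir, "libgsl.so*"))
    libs += glob.glob(os.path.join(lib_dir, "*", "libgsl.so*"))
    return os.path.isfile(header), bool(libs)


header_found, library_found = find_gsl()
print("GSL header found:", header_found)
print("GSL library found:", library_found)
```

If either value is false, installing the GSL development package is the first thing to try.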

Cython is needed if compiling from source.

Important links

License

horizont is licensed under Version 3.0 of the GNU General Public License. See the LICENSE file for the text of the license or visit http://www.gnu.org/copyleft/gpl.html.
