forked from nanjunxiao/LDA

Three open source versions of LDA with collapsed Gibbs Sampling, modified by nanjunxiao


GerHobbelt/LDA

 
 


LDA: Latent Dirichlet Allocation

This repository includes three open source versions of LDA with collapsed Gibbs Sampling, modified by nanjunxiao.

GibbsLDA++: single-threaded, written in C++

ompi-lda: multi-node/multi-threaded, written in C++

online_twitter_lda: multi-threaded, written in Python

Collapsed Gibbs LDA reference: my blog

What's New

1. GibbsLDA++

fixed bugs:

1). Memory leak: use 'delete[] p' instead of 'delete p' when p points to an array.

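2). Array index out of bounds. (double)random() / RAND_MAX lies in the closed interval [0,1], so when it equals 1 the computed topic is K, one past the last valid index, and u can equal p[K - 1]. Dividing by RAND_MAX + 1 (computed as a double, with the parentheses shown) keeps the ratio in [0,1):

int topic = (int)(((double)random() / RAND_MAX) * K);  -->  int topic = (int)(((double)random() / ((double)RAND_MAX + 1)) * K);
double u = ((double)random() / RAND_MAX) * p[K - 1];   -->  double u = ((double)random() / ((double)RAND_MAX + 1)) * p[K - 1];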
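
The two GibbsLDA++ fixes above can be illustrated with a short sketch. This is not the repository's code: the helper names (make_topic_buffer, free_topic_buffer, unit_interval, uniform_topic, uniform_topic_buggy, sample_topic) are hypothetical, the raw random draw is passed in as a parameter so the boundary case can be exercised directly, and the cumulative-weight scan is only an assumption about how p and u are used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Fix 1: memory allocated with new[] must be released with delete[].
// Hypothetical helper: allocate K zero-initialized topic weights.
double* make_topic_buffer(std::size_t K) {
    assert(K > 0);
    return new double[K]();
}

// Matching release. Writing 'delete p' here would be undefined behavior,
// which in practice can show up as a leak or heap corruption.
void free_topic_buffer(double* p) {
    delete[] p;
}

// Fix 2: map a raw draw r in [0, rand_max] to the half-open interval [0, 1).
// The divisor is formed in double so rand_max + 1 cannot overflow an integer.
double unit_interval(long r, long rand_max) {
    assert(r >= 0 && r <= rand_max);
    return static_cast<double>(r) / (static_cast<double>(rand_max) + 1.0);
}

// Uniform topic index in [0, K).
int uniform_topic(long r, long rand_max, int K) {
    assert(K > 0);
    return static_cast<int>(unit_interval(r, rand_max) * K);
}

// The original expression, kept for comparison: returns K when r == rand_max.
int uniform_topic_buggy(long r, long rand_max, int K) {
    return static_cast<int>((static_cast<double>(r) / rand_max) * K);
}

// Assumed usage of u: p holds cumulative (unnormalized) topic weights, so
// p.back() is the total. Return the first topic whose cumulative weight
// exceeds u; the loop bound also clamps the result to the last topic.
int sample_topic(const std::vector<double>& p, long r, long rand_max) {
    assert(!p.empty() && p.back() > 0.0);
    const double u = unit_interval(r, rand_max) * p.back();
    int topic = 0;
    const int last = static_cast<int>(p.size()) - 1;
    while (topic < last && p[topic] <= u) {
        ++topic;
    }
    return topic;
}
```

For example, uniform_topic(std::rand(), RAND_MAX, K) stays inside [0, K) even when the draw equals RAND_MAX, whereas uniform_topic_buggy returns K in that case. In new code, std::vector<double> avoids the delete/delete[] distinction altogether.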

2. ompi-lda

fixed bugs:

1). fixed bugs in infer.cc.

2). removed the 'sampler.UpdateModel(corpus)' call in lda.cc.

added features:

1). added theta and twords file output.

2). added the required subset of Boost hpp/cpp files to the include dir, so the project can be built with make directly.

3. online_twitter_lda

added features:

1). added theta and phi mat file output.

TODO

ompi-lda

1). make twordsnum configurable.

2). rewrite cmd_flag without Boost, so the include dir can be removed.

3). rewrite the makefile.
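
As a rough illustration of the first TODO item, the sketch below selects the top words of one topic with the count passed in as a parameter. It assumes that twordsnum means the number of most probable words written per topic and that each topic's word distribution is available as one row of phi. The function name top_words and its signature are hypothetical and do not come from ompi-lda.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Return the indices of the `twordsnum` most probable words in one topic's
// word distribution (one row of phi), most probable first. If twordsnum
// exceeds the vocabulary size, all word indices are returned.
std::vector<std::size_t> top_words(const std::vector<double>& phi_row,
                                   std::size_t twordsnum) {
    std::vector<std::size_t> idx(phi_row.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});

    const std::size_t n = std::min(twordsnum, idx.size());
    // Only the first n positions need to be ordered.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(n),
                      idx.end(),
                      [&phi_row](std::size_t a, std::size_t b) {
                          return phi_row[a] > phi_row[b];
                      });
    idx.resize(n);
    return idx;
}
```

The value passed as twordsnum could then come from a command-line flag or a config file instead of a compile-time constant.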
