Topic Modeling using Latent Dirichlet Allocation

A parallel implementation of Stochastic Collapsed Variational Bayesian inference for LDA (SCVB0) in C++ using OpenMP.

This project is a parallel implementation of the SCVB0 algorithm proposed by James Foulds et al. in the paper "Stochastic collapsed variational Bayesian inference for latent Dirichlet allocation". All notation follows that paper.

We use the New York Times dataset available from the UCI Machine Learning Repository.

The dataset is divided into minibatches of 100 documents each.
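The minibatch split described above can be sketched as follows. This is a minimal illustration, not the repository's code: the helper name `makeMinibatches` and the use of contiguous index ranges are assumptions; only the batch size of 100 comes from this README.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the repository): split document indices
// [0, numDocs) into contiguous half-open ranges [begin, end), each holding
// at most batchSize documents. The last range may be shorter.
std::vector<std::pair<std::size_t, std::size_t>>
makeMinibatches(std::size_t numDocs, std::size_t batchSize = 100) {
    assert(batchSize > 0);
    std::vector<std::pair<std::size_t, std::size_t>> batches;
    for (std::size_t begin = 0; begin < numDocs; begin += batchSize) {
        batches.emplace_back(begin, std::min(begin + batchSize, numDocs));
    }
    return batches;
}
```

With 250 documents this would yield three ranges, the last one covering the remaining 50 documents.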

We use OpenMP to parallelize the algorithm. The minibatches are divided among the available processors and processed in parallel. Results are written to the global matrices nPhi, nTheta and nZ.
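One way to picture the pattern of minibatches being processed in parallel and merged into a shared matrix is the sketch below. It is deliberately simplified and is an assumption about structure, not the repository's code: it only accumulates raw (word, topic) counts instead of performing the SCVB0 update, and the `Token` and `Batch` types, the function name `accumulateCounts`, and the flat row-major layout of `nPhi` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
struct Token { int word; int topic; };
struct Batch { std::vector<Token> tokens; };

// Adds one count per token into the shared matrix nPhi, stored row-major
// as nPhi[word * numTopics + topic]. Each minibatch builds a private
// buffer first, then merges it inside a critical section so that threads
// never write to nPhi concurrently.
void accumulateCounts(const std::vector<Batch>& batches,
                      std::vector<double>& nPhi, int numTopics) {
    const long numBatches = static_cast<long>(batches.size());
    #pragma omp parallel for schedule(static)
    for (long b = 0; b < numBatches; ++b) {
        std::vector<double> local(nPhi.size(), 0.0);
        for (const Token& t : batches[b].tokens) {
            const std::size_t idx =
                static_cast<std::size_t>(t.word) * numTopics + t.topic;
            assert(idx < local.size());
            local[idx] += 1.0;
        }
        #pragma omp critical
        {
            for (std::size_t i = 0; i < nPhi.size(); ++i) {
                nPhi[i] += local[i];
            }
        }
    }
}
```

Compiled without OpenMP support the pragmas are ignored and the loop runs serially with the same result. Merging a private buffer once per minibatch keeps synchronization cost low compared with locking on every token.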

We analyzed perplexity convergence on the KOS and NIPS datasets, which are available from the same page as the NYT dataset.
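This README does not state how perplexity is computed. The sketch below assumes the standard definition, the exponential of the negative average log-likelihood per token, with the likelihood of a word in a document taken as the sum over topics of theta[d][k] * phi[k][w]. The names `Entry`, `theta`, `phi`, and `perplexity`, and the assumption that both matrices are already normalized to probabilities, are illustrative and not taken from the repository.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sparse corpus entry: word `word` occurs `count` times in
// document `doc` (both indices 0-based here).
struct Entry { int doc; int word; int count; };

// Perplexity = exp(-(sum over entries of count * log p(word | doc)) / N),
// where N is the total token count and
// p(word | doc) = sum_k theta[doc][k] * phi[k][word].
// Assumes each row of theta (D x K) and of phi (K x W) sums to 1.
double perplexity(const std::vector<Entry>& entries,
                  const std::vector<std::vector<double>>& theta,
                  const std::vector<std::vector<double>>& phi) {
    double logLik = 0.0;
    long totalTokens = 0;
    for (const Entry& e : entries) {
        double p = 0.0;
        for (std::size_t k = 0; k < phi.size(); ++k) {
            p += theta[e.doc][k] * phi[k][e.word];
        }
        assert(p > 0.0);
        logLik += e.count * std::log(p);
        totalTokens += e.count;
    }
    assert(totalTokens > 0);
    return std::exp(-logLik / static_cast<double>(totalTokens));
}
```

As a sanity check, a model that assigns every word in a vocabulary of size W the same probability has perplexity W under this definition; lower values indicate a better fit.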

Use the following commands to build and run the code:

Compile: make

Execute: $ ./fastLDA docword.txt iterations NumOfTopics
