Latent Dirichlet Allocation (LDA)

Latent Dirichlet allocation (LDA) is a generative probabilistic model for collections of discrete data such as text corpora. It is commonly used for topic modeling: rather than assigning a text to a single class, it learns a distribution over topics for each document and a distribution over words for each topic. LDA is a three-level hierarchical Bayesian model in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. Read more in the "Latent Dirichlet Allocation" paper.

Fig.1 - The intuitions behind latent Dirichlet allocation. (image taken here)
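
The generative story described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code from the notebook: the toy topics, the `alpha` values, and the helper names `sample_dirichlet` and `generate_document` are assumptions invented for this example, and the topics are fixed by hand instead of being learned or drawn from a prior.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate one document following LDA's generative story.

    topics: list of dicts mapping word -> probability (one dict per topic)
    alpha:  Dirichlet prior over topic proportions (one value per topic)
    """
    # 1. Draw the document's topic proportions: theta ~ Dirichlet(alpha).
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # 2. Draw a topic index for this word: z ~ Categorical(theta).
        z = rng.choices(range(len(topics)), weights=theta)[0]
        # 3. Draw the word from the chosen topic: w ~ Categorical(topics[z]).
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        words.append(w)
    return theta, words


# Hypothetical toy topics, invented for illustration only.
TOPICS = [
    {"election": 0.5, "minister": 0.3, "vote": 0.2},
    {"rain": 0.4, "flood": 0.4, "storm": 0.2},
]

rng = random.Random(0)
theta, doc = generate_document(TOPICS, alpha=[0.5, 0.5], length=8, rng=rng)
print("topic proportions:", [round(p, 3) for p in theta])
print("document:", " ".join(doc))
```

Fitting LDA to real text runs this process in reverse: given only the observed words, inference estimates the per-document topic proportions and the per-topic word distributions.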

Result

We used a dataset of news reports. You can see the topics generated for the given report here:

Fig.2 - The output result for generated topics.

