Support asymmetric Dirichlet prior optimization #5

feiga · 2015-11-13T12:27:38Z

The currently released LightLDA doesn't support asymmetric Dirichlet prior optimization. However, our internal practice shows that this feature would help produce a better model (also see this).

If anyone is interested in contributing this feature, please reply or contact us through email. We can collaborate on this.

hiyijian · 2016-01-04T13:17:41Z

Hi, guys. Thank you for your amazing work on large-scale LDA.
That said, I think model quality is as important as scalability, so I am very interested in improving it. It is exciting to know that an asymmetric Dirichlet prior could help. Would you please share some of your experience with this? I will try my best to contribute.

hiyijian · 2016-01-19T04:32:31Z

Hi, guys,
I have made a first attempt at adding this feature in PR#22.
The PR supports asymmetric alpha through the following steps:

1. Add two extra tables to Multiverso. One is the topic frequency table, a matrix that counts each topic's frequency. The other is the doc length table, a single row that counts how many documents have length k.
2. Initialize the two extra tables from the randomly initialized documents.
3. Learn the alpha distribution from the two extra tables every 5 iterations.
4. Build an alias table for the learned alpha distribution.
5. Sample topics with the learned alpha distribution and the alias table. Meanwhile, update the counts in the topic frequency table if necessary.
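
For readers unfamiliar with this kind of update, the alpha-learning and alias-table steps can be sketched roughly as below. This is not the code from PR#22. It assumes the alpha update is a Minka-style fixed-point iteration computed from count histograms, which is one common approach that fits the two tables described, and it assumes the topic frequency table stores, for each topic, how many documents contain that topic exactly n times. The names `update_alpha`, `build_alias_table`, `sample_alias`, `topic_hist` and `doc_len_hist` are hypothetical, and the sketch is in plain Python rather than LightLDA's C++.

```python
import random

def update_alpha(alpha, topic_hist, doc_len_hist, iterations=5):
    """Fixed-point update of an asymmetric Dirichlet prior from histograms.

    Assumed layout (hypothetical, not taken from PR#22):
      topic_hist[k][n] = number of documents in which topic k occurs n times
      doc_len_hist[n]  = number of documents of length n
    """
    alpha = list(alpha)
    for _ in range(iterations):
        alpha_sum = sum(alpha)
        # Denominator: sum over documents of
        # digamma(len + alpha_sum) - digamma(alpha_sum),
        # expanded as a running sum of 1 / (f - 1 + alpha_sum).
        denom, running = 0.0, 0.0
        for n in range(1, len(doc_len_hist)):
            running += 1.0 / (n - 1 + alpha_sum)
            denom += doc_len_hist[n] * running
        for k in range(len(alpha)):
            # Numerator: the same expansion with the per-topic counts.
            numer, running = 0.0, 0.0
            for n in range(1, len(topic_hist[k])):
                running += 1.0 / (n - 1 + alpha[k])
                numer += topic_hist[k][n] * running
            # The floor is an illustrative safeguard so that an unused
            # topic does not collapse to alpha = 0.
            alpha[k] = max(alpha[k] * numer / denom, 1e-6)
    return alpha

def build_alias_table(weights):
    """Build a Walker alias table so one draw costs O(1)."""
    k = len(weights)
    total = sum(weights)
    scaled = [w * k / total for w in weights]
    prob = [0.0] * k
    alias = [0] * k
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # The large bucket donates the mass needed to fill bucket s.
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:
        # Leftover buckets are full, up to rounding error.
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def sample_alias(prob, alias, rng=random):
    """Draw one index: pick a bucket uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Under these assumptions, a training loop would call `update_alpha` every few iterations, rebuild the alias table from the returned values, and then use `sample_alias` wherever a topic has to be proposed from the prior. The histogram form avoids calling the digamma function for every document, which is the usual reason for keeping such tables.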

To use this new feature, just run with the extra option "-num_alpha_iterations".

Please note that there are two TODOs: one is evaluation in asymmetric prior mode, the other is inference with an asymmetric prior.

feiga · 2016-01-19T08:44:04Z

Thanks, Jianyi! I will review the code.

hiyijian · 2016-01-20T06:23:58Z

@feiga, I am sorry, I made a mistake when updating the topic frequency table. I have fixed it and committed the fix to PR#22.

feiga added enhancement help wanted labels Nov 13, 2015

feiga mentioned this issue Dec 31, 2015

Inferencing of new/unseen documents #14

Open

hiyijian mentioned this issue Jan 7, 2016

doc2vec vs lda hiyijian/doc2vec#1

Open

bitdjg mentioned this issue Jun 30, 2016

Program received segment fault in configuration and initialization #35

Open
