
Improvement in Learning Rate and Topics Learned? #54

Open
dbl001 opened this issue Jun 21, 2019 · 0 comments


dbl001 commented Jun 21, 2019

I have experimented with adjustments to the `lda_loss` function in Lda2vec.py, e.g.:

            # Penalize pairwise cosine similarity between topic vectors to push the learned topics apart
            normalized = tf.nn.l2_normalize(self.mixture.topic_embedding, axis=1)
            similarity = tf.matmul(normalized, normalized, adjoint_b=True, name="topic_matrix")
            loss_lda = self.lmbda * fraction * self.prior() + self.learning_rate * tf.reduce_sum(similarity)

This change adds a penalty on the pairwise cosine similarity of the topic vectors to the `lda_loss` objective, which reduces the correlation between topics in the `topic_embedding` matrix.
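
To see why this term decorrelates the topics, here is a minimal standalone sketch (written TF2-style for convenience; `n_topics` and `embed_size` are illustrative values, not the repo's). Minimizing the summed pairwise cosine similarity yields a gradient that pushes the topic vectors apart:

    import tensorflow as tf

    # Illustrative shapes; not the values used in this repo
    n_topics, embed_size = 20, 128
    topic_embedding = tf.Variable(tf.random.normal([n_topics, embed_size]))

    with tf.GradientTape() as tape:
        # Cosine similarity between every pair of topic vectors
        normalized = tf.nn.l2_normalize(topic_embedding, axis=1)
        similarity = tf.matmul(normalized, normalized, adjoint_b=True)
        # The diagonal contributes a constant n_topics; the off-diagonal
        # entries are the inter-topic correlations being penalized
        penalty = tf.reduce_sum(similarity)

    # Descending this gradient decorrelates the topic vectors
    grad = tape.gradient(penalty, topic_embedding)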

Also, this NIPS paper ("Reading Tea Leaves: How Humans Interpret Topic Models", Chang et al., 2009) discusses a methodology for quantifying LDA performance, specifically by measuring word intrusion and topic intrusion:

http://users.umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf
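
As a rough sketch of the word-intrusion task from that paper (the function name and example topics below are made up for illustration): a set of high-probability words from one topic is shown alongside an "intruder" word drawn from a different topic, and if annotators can reliably pick out the intruder, the topic is semantically coherent.

    import random

    def word_intrusion_item(topic_top_words, other_topic_top_words, n_shown=5):
        """Build one word-intrusion item: n_shown high-probability words
        from a topic plus one intruder that belongs to another topic."""
        shown = random.sample(topic_top_words, n_shown)
        intruder = random.choice(
            [w for w in other_topic_top_words if w not in topic_top_words])
        item = shown + [intruder]
        random.shuffle(item)
        return item, intruder

    # Made-up example topics
    item, intruder = word_intrusion_item(
        ["dog", "cat", "horse", "pig", "cow", "sheep"],
        ["loan", "bank", "credit", "interest", "mortgage"])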

Please experiment and let me know what you find.

Topic Similarity Matrix after 33 Epochs:
[image: heatmap of pairwise topic similarities]
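
For anyone who wants to reproduce a plot like the one above, here is a minimal numpy/matplotlib sketch (`topic_vectors` is a random placeholder standing in for the evaluated `self.mixture.topic_embedding`):

    import numpy as np
    import matplotlib.pyplot as plt

    # Placeholder for the trained topic embedding pulled out of the session
    topic_vectors = np.random.randn(20, 128)

    # Cosine similarity between every pair of topic vectors
    normalized = topic_vectors / np.linalg.norm(topic_vectors, axis=1, keepdims=True)
    similarity = normalized @ normalized.T

    plt.imshow(similarity, cmap="RdBu_r", vmin=-1, vmax=1)
    plt.colorbar(label="cosine similarity")
    plt.title("Topic similarity matrix")
    plt.show()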
