[MRG] Topic coherence update 3 #793
@tmylk should I add the benchmark testing notebooks for 20NG and Movies dataset too?
Correcting the tests.
@tmylk I've added the benchmark testing notebook on movies dataset.
@@ -91,7 +101,7 @@ def __init__(self, model=None, topics=None, texts=None, corpus=None, dictionary=
        else:
            self.dictionary = dictionary
What is the point of checking if isinstance(model.id2word, FakeDict):
above? Why is it not enough to check for None?
Checking for None doesn't actually work here. When no word->id mapping is provided while creating an LdaModel, this line is called, which returns a FakeDict from here. So this code:
    tm1 = LdaModel(corpus=corpus, num_topics=2)
    if tm1.id2word is None:
        print('aye')
    else:
        print('naye')
actually prints naye, but if I change the check to isinstance(tm1.id2word, FakeDict) it outputs correctly.
Am I correct here?
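The behaviour described in this reply can be reproduced without gensim installed. The sketch below is only an illustration: `FakeDict` here is a hypothetical stand-in for `gensim.utils.FakeDict`, and `ToyModel` and `has_real_vocabulary` are made-up names that mimic the fallback, not code from this PR.

```python
class FakeDict(object):
    """Hypothetical stand-in for gensim.utils.FakeDict: maps id -> str(id)."""

    def __init__(self, num_terms):
        self.num_terms = num_terms

    def __getitem__(self, key):
        if 0 <= key < self.num_terms:
            return str(key)
        raise KeyError(key)


class ToyModel(object):
    """Hypothetical model that mirrors the fallback described in the reply."""

    def __init__(self, num_terms, id2word=None):
        # With no mapping supplied, store a placeholder mapping instead of None.
        self.id2word = FakeDict(num_terms) if id2word is None else id2word


def has_real_vocabulary(model):
    """True only when the model carries a user-supplied id -> word mapping."""
    return model.id2word is not None and not isinstance(model.id2word, FakeDict)


no_vocab = ToyModel(num_terms=3)
with_vocab = ToyModel(num_terms=3, id2word={0: "human", 1: "graph", 2: "tree"})

# The `is None` test never fires for no_vocab, so the isinstance test is needed.
print(no_vocab.id2word is None)
print(has_real_vocabulary(no_vocab), has_real_vocabulary(with_vocab))
```

Because the placeholder is a real object, a bare `is None` comparison cannot tell a missing vocabulary from a supplied one, which is the point being made in the reply.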
@tmylk I've addressed your initial comments. I hope I've addressed them correctly.
@tmylk should I change all
05a1e4d to 20e2d6d
@@ -8,12 +8,12 @@
This module contains functions to compute confirmation on a pair of words or word subsets.
Explain how the indirect confirmation measures work and why they are useful, similar to the competing car brands explanation in the paper.
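As a rough illustration of what such an explanation could cover, here is a minimal sketch of an indirect measure, assuming NPMI over whole documents as the direct measure and cosine similarity between context vectors as the indirect one. The toy corpus, the `EPS` value, and the helper names `doc_probability`, `npmi`, `context_vector` and `cosine` are all assumptions of this sketch, not the module's code.

```python
import math

EPS = 1e-12  # smoothing constant; the value is an assumption of this sketch


def doc_probability(docs, *words):
    """Fraction of documents that contain every word in `words`."""
    hits = sum(1 for doc in docs if all(w in doc for w in words))
    return hits / float(len(docs))


def npmi(docs, w1, w2):
    """Direct confirmation: normalized pointwise mutual information."""
    p1 = doc_probability(docs, w1)
    p2 = doc_probability(docs, w2)
    p12 = doc_probability(docs, w1, w2)
    return math.log((p12 + EPS) / (p1 * p2)) / -math.log(p12 + EPS)


def context_vector(docs, word, topic):
    """Direct confirmation of `word` against every word of the topic."""
    return [npmi(docs, word, other) for other in topic]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy corpus: the two brands never share a document but share their context.
shared = {"engine", "wheel", "car", "drive"}
docs = ([shared | {"audi"}] * 2 + [shared | {"bmw"}] * 2
        + [{"weather", "rain"}] * 36)
topic = ["audi", "bmw", "engine", "wheel", "car", "drive"]

direct = npmi(docs, "audi", "bmw")
indirect = cosine(context_vector(docs, "audi", topic),
                  context_vector(docs, "bmw", topic))
print(direct, indirect)
```

On this toy corpus the direct score for the two brands is strongly negative (about -0.78), while the indirect score is positive (about 0.20), because both brands are confirmed by the same surrounding words.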
for top_words in topics:
    s_one_one_t = []
    for w_prime in top_words:
        w_prime_index = int(np.where(top_words == int(w_prime))[0])  # To get index of w_prime in top_words
same as above
@tmylk I've addressed your comments.
a15e5c3 to 0ca2672
Merged in 6f53b31
Changes:
- c_uci coherence measure.
- c_npmi coherence measure.
- window_size parameter to CoherenceModel init.
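To make the three items above concrete, here is a minimal sketch of how a window size can feed PMI-style (UCI) and NPMI-style scores, assuming probabilities are estimated from boolean sliding windows. It is not gensim's implementation: `sliding_windows`, `pairwise_coherence`, the `normalize` flag and `EPS` are hypothetical names, and the sketch assumes every topic word occurs somewhere in the texts.

```python
import math
from itertools import combinations

EPS = 1e-12  # smoothing constant; the value is an assumption of this sketch


def sliding_windows(texts, window_size):
    """Yield each run of `window_size` consecutive tokens as a set."""
    for tokens in texts:
        if len(tokens) <= window_size:
            yield set(tokens)  # a short text forms a single window
            continue
        for start in range(len(tokens) - window_size + 1):
            yield set(tokens[start:start + window_size])


def pairwise_coherence(topic, texts, window_size, normalize=False):
    """Mean PMI (UCI-style) or NPMI over all word pairs of one topic."""
    windows = list(sliding_windows(texts, window_size))
    total = float(len(windows))

    def prob(*words):
        # Share of windows that contain all of the given words.
        return sum(1 for w in windows if all(x in w for x in words)) / total

    scores = []
    for w1, w2 in combinations(topic, 2):
        joint = prob(w1, w2)
        pmi = math.log((joint + EPS) / (prob(w1) * prob(w2)))
        scores.append(pmi / -math.log(joint + EPS) if normalize else pmi)
    return sum(scores) / len(scores)


texts = [["human", "computer", "interface", "system"],
         ["graph", "trees", "minors", "survey"]]
topic = ["human", "computer"]

uci_small = pairwise_coherence(topic, texts, window_size=2)
npmi_small = pairwise_coherence(topic, texts, window_size=2, normalize=True)
uci_large = pairwise_coherence(topic, texts, window_size=4)
print(uci_small, npmi_small, uci_large)
```

Changing `window_size` changes which tokens count as co-occurring, so the same topic can score differently under a narrow and a wide window. The normalized variant divides each PMI by `-log` of the joint probability, which bounds the score.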