When I get the top topic terms for my LDA model, I get numbers instead of words. The corpus doesn't contain any numbers. Other models return the top words correctly, but for my corpus the results are always numbers.
Code for training:
from gensim.corpora import Dictionary
from gensim.models import LdaMulticore

dictionary = Dictionary(texts)  # texts list has documents split by words
dictionary.filter_extremes(no_above=0.8, no_below=3)
dictionary.compactify()
corpus = [dictionary.doc2bow(text) for text in texts]
model = LdaMulticore(corpus=corpus, num_topics=num_topics, passes=5)
Versions
version 3.8.1
I used this code to get the top words, which works for other models.
import numpy as np

def get_topic_top_words(lda_model, topic_id, nr_top_words=5):
    """ Returns the top words for topic_id from lda_model. """
    id_tuples = lda_model.get_topic_terms(topic_id, topn=nr_top_words)
    word_ids = np.array(id_tuples)[:, 0]
    words = map(lambda id_: lda_model.id2word[id_], word_ids)
    return words
arshad115 changed the title from "get_topic_terms returns weird numbers instead of words" to "id2word returns weird numbers instead of words with get_topic_terms" on May 21, 2020
I forgot to pass the dictionary to the LDA model (the id2word argument), so it prints the ids of the tokens instead of the words. If the model is already trained, you can still get the words by looking up each id in the dictionary.