Machine Reading Tea Leaves #10

Closed
amritbhanu opened this issue Apr 11, 2016 · 0 comments

Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality

BibTeX:

@inproceedings{lau2014machine,
  title={Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality},
  author={Lau, Jey Han and Newman, David and Baldwin, Timothy},
  booktitle={EACL},
  pages={530--539},
  year={2014}
}

General:

  • A good paper that gives a rationale for topic instability

Measures:

  • Notion of topic “coherence”, with an automatic method for estimating it based on pairwise pointwise mutual information (PMI) between the topic words
  • Direct approach: asking people about the topics. Indirect approach: evaluating PMI, CP.
  • To create gold-standard coherence judgements, they used Amazon Mechanical Turk
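
To make the PMI-based measure concrete, here is a minimal sketch of pairwise PMI coherence for one topic. It is not the paper's exact setup: the function name `pmi_coherence`, the document-level co-occurrence counts, and the `eps` smoothing term are all assumptions made for illustration, and a real evaluation would estimate the probabilities from a large reference corpus.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Mean pairwise PMI over the top words of a single topic.

    `documents` is a list of token lists. Probabilities are estimated
    from document-level co-occurrence (an assumption of this sketch).
    `eps` is a hypothetical smoothing term that avoids log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        scores.append(math.log((p12 + eps) / (p1 * p2 + eps)))
    return sum(scores) / len(scores)


# Toy corpus: "cat"/"dog" always co-occur, "cat"/"car" never do.
docs = [["cat", "dog"], ["cat", "dog"], ["car", "road"], ["car", "road"]]
coherent = pmi_coherence(["cat", "dog"], docs)
incoherent = pmi_coherence(["cat", "car"], docs)
```

On this toy corpus the pair that always co-occurs scores higher than the pair that never does, which is the behaviour a coherence measure is expected to show.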

Problems:

  • perplexity correlates negatively with topic interpretability

Research Question:

  • word intrusion measures topic interpretability differently to observed coherence

Terminologies:

  • topic coherence, the semantic interpretability of the top terms usually used to describe discovered topics
  • “intruder word”, which has low probability in the topic of interest, but high probability in other topics
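
The "intruder word" definition above can be sketched as a small selection routine. This is an illustration only: the function `pick_intruder`, the dict-based topic representation, and the scoring rule (probability in another topic minus probability in the topic of interest) are assumptions of this sketch, not the procedure used in the paper.

```python
def pick_intruder(topic_word_probs, topic_id, n_top=5):
    """Pick a word with low probability in topic `topic_id` but high
    probability in some other topic.

    `topic_word_probs` is a list of dicts mapping word -> probability,
    one dict per topic. The difference-based score is a hypothetical
    heuristic chosen for this sketch.
    """
    target = topic_word_probs[topic_id]
    # Words that describe the topic of interest cannot be intruders.
    top_target = set(sorted(target, key=target.get, reverse=True)[:n_top])

    best_word, best_score = None, float("-inf")
    for other_id, other in enumerate(topic_word_probs):
        if other_id == topic_id:
            continue
        for word, p_other in other.items():
            if word in top_target:
                continue
            score = p_other - target.get(word, 0.0)
            if score > best_score:
                best_word, best_score = word, score
    return best_word


topics = [
    {"cat": 0.5, "dog": 0.4, "car": 0.05, "road": 0.05},
    {"car": 0.5, "road": 0.4, "cat": 0.05, "dog": 0.05},
]
intruder_for_0 = pick_intruder(topics, 0, n_top=2)
intruder_for_1 = pick_intruder(topics, 1, n_top=2)
```

In a word intrusion task the chosen word would be shuffled in with the topic's top words and annotators asked to spot it. The easier it is to spot, the more interpretable the topic.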