Read this #4

amritbhanu · 2016-04-05T16:52:05Z

Reading tea leaves Machine Reading Tea Leaves #10
http://gradworks.umi.com/37/24/3724413.html Improving the Usability of Topic Models #12
- is there software available?
http://irserver.ucd.ie/bitstream/handle/10197/6482/insight_publication.pdf?sequence=1 Coherence of Descriptors #11
- coherence
http://jis.sagepub.com/content/early/2015/12/05/0165551515617393.full
- does this poaper has method to evaluate cluster?
look up hyperparameter optimization in lda (whats the wei paper says?) LDA-GA #8
https://linis.hse.ru/data/2014/02/21/1331648934/websci_01.pdf issue no Stability of topics #9
paper

IN SE, do people tune lda parameters or even check stability? Find top 5 cited papers?
https://github.com/joelgrus/data-science-from-scratch/blob/master/code/natural_language_processing.py
http://cdn.oreillystatic.com/oreilly/booksamplers/9781491901427_sampler.pdf

amritbhanu · 2016-04-11T15:05:17Z

do people tune lda parameters or even check stability?

Yes they do that, and in some of the papers they try to have good topic coherence. Not the word "stability". But it is in very initial phase, not much work seen.
Papers:

timm · 2016-04-12T02:54:48Z

But it is in very initial phase, not much work seen.

ok. now you've read the papers...

how to measure "coherence"
list 10 ways to improve them sorted by ease of implementation, who has tried what before, what might work best

amritbhanu · 2016-04-12T04:16:14Z

Will have to code the LDA from scratch so that we have flexibility. This will make any implementation easier.
Possible Approaches sorted by ease of implementation are:

Cohesion (intra) and separation (inter).
direct appraoch, asking people about topics using Amazon Mechanical Turk Machine Reading Tea Leaves #10
Perplexity shows how well topic-word and word-document distributions predict new test samples.
Computing the harmonic mean of posterior distribution.
pointwise mutual information (PMI) between the topic words
Each term was represented as a vector in a semantic space, with topic coherence calculated as mean pairwise vector similarity. (Word2vec)
symmetric Kullback–Leibler divergence

amritbhanu added the Work label Apr 5, 2016

amritbhanu closed this as completed Jul 14, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Read this #4

Read this #4

amritbhanu commented Apr 5, 2016 •

edited

Loading

amritbhanu commented Apr 11, 2016 •

edited

Loading

timm commented Apr 12, 2016

amritbhanu commented Apr 12, 2016

Read this #4

Read this #4

Comments

amritbhanu commented Apr 5, 2016 • edited Loading

amritbhanu commented Apr 11, 2016 • edited Loading

timm commented Apr 12, 2016

amritbhanu commented Apr 12, 2016

amritbhanu commented Apr 5, 2016 •

edited

Loading

amritbhanu commented Apr 11, 2016 •

edited

Loading