Swapping phi, token.frequency, vocab, and topic.proportion rda files removes some visualization features #53

Graybosch · 2016-02-02T21:25:02Z

Hello Carson,

I'd appreciate any thoughts on what might be causing an issue with your otherwise great visualization package.

One of your tutorials generates a beautiful Shiny application. I replaced your RDA files (phi, topic.proportion, token.frequency, and vocab) with my own. I now get the picture of the topic regions, but I do not get lists of relevant terms for the topics I click on. I also do not get bar charts of the token breakdown for each topic, only a list of the overall most salient terms for the corpus.

Initially, I got a NaN error when the Shiny application tried to build. I built my model using Vowpal Wabbit's very fast implementation of LDA. VW requires the vocabulary size to be a power of 2, plus 1, so if |Vocabulary| does not equal 2^N + 1 you will have some rows of zeros in phi. My guess is that those zeros made the Kullback-Leibler divergence blow up. When I forced the zero entries in phi to equal 10^-6, the app ran and gave me a beautiful picture of the overlapping topics. However, when I selected a region, I no longer automatically got bar charts of relevant terms for that cluster. That feature worked beautifully before I replaced your RDA files. The app does still tell me how much of the corpus comes from each topic and still lists the overall most salient terms.
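To illustrate the guess above, here is a minimal Python sketch of how a token with zero probability in every topic can turn a naively computed Kullback-Leibler divergence into NaN, and how flooring the zeros at 10^-6 avoids it. The function names (`naive_kl`, `smooth`) and the toy topics are hypothetical. The renormalization step is an assumption added so that each topic still sums to 1. Python raises an error on log(0), so the NaN that plain floating-point arithmetic would produce is emulated explicitly. This is not the package's actual distance code.

```python
import math

def naive_kl(p, q):
    """KL(p || q) computed term by term with no special handling of zeros."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            # 0 * log(0 / qi) is 0 * -inf (or 0 * log(0 / 0)): NaN in IEEE arithmetic.
            return float("nan")
        if qi == 0.0:
            # pi * log(pi / 0) diverges to +inf.
            return float("inf")
        total += pi * math.log(pi / qi)
    return total

def smooth(dist, eps=1e-6):
    """Replace exact zeros with eps, then renormalize (renormalizing is an assumption)."""
    floored = [x if x > 0.0 else eps for x in dist]
    norm = sum(floored)
    return [x / norm for x in floored]

# Two toy topics over three tokens; the third token is unused padding.
topic_a = [0.5, 0.5, 0.0]
topic_b = [0.25, 0.75, 0.0]

print(naive_kl(topic_a, topic_b))                  # NaN because of the all-zero token
print(naive_kl(smooth(topic_a), smooth(topic_b)))  # finite once zeros are floored
```

Flooring keeps the padding tokens in the model with a tiny probability, which may explain why the app built but still did not behave as expected. Removing those tokens altogether is the other option.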

I've checked my phi, topic.proportion, token.frequency, and vocab. I'd appreciate any thoughts on what might be causing the issue. Thanks again for the great visualization package.

Anthony

Graybosch · 2016-02-16T21:36:39Z

This has been resolved by adding row and column headings to phi and by removing the rows that correspond to the extra tokens added by VW's implementation of LDA.
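For anyone hitting the same problem, the fix can be sketched roughly as follows in Python. This is an illustration only. It assumes phi has one row per token and one column per topic, and that the padding rows sit after the real vocabulary. It also assumes tokens are unique. The helper name `trim_and_label` and the `topic_N` labels are hypothetical.

```python
def trim_and_label(phi_rows, vocab, n_topics):
    """Drop padding rows beyond the vocabulary and attach token/topic labels."""
    if len(phi_rows) < len(vocab):
        raise ValueError("phi has fewer rows than there are vocabulary tokens")
    kept = phi_rows[:len(vocab)]  # assumes padding rows come last
    for row in kept:
        if len(row) != n_topics:
            raise ValueError("every row of phi needs one value per topic")
    topic_names = [f"topic_{k + 1}" for k in range(n_topics)]
    # Row labels are the tokens, column labels are the topic names.
    return {token: dict(zip(topic_names, row)) for token, row in zip(vocab, kept)}

# Toy example: two real tokens, two topics, two all-zero padding rows.
vocab = ["apple", "banana"]
phi_rows = [
    [0.7, 0.2],
    [0.3, 0.8],
    [0.0, 0.0],
    [0.0, 0.0],
]

labelled = trim_and_label(phi_rows, vocab, n_topics=2)
print(labelled["banana"]["topic_2"])
```

With the padding rows gone, no token is left with zero probability in every topic.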
