
Paper comparison #105

Merged
merged 213 commits into master on Feb 12, 2019
Conversation

@diazrenata (Contributor) commented Jan 26, 2019

Here's a draft of the paper comparison vignette. Feedback would be welcome! But I'll also continue to comb through it, so no worries if it's not at the top of your list.

@diazrenata (Contributor, Author) commented on 8c49916, Dec 13, 2018

I compared the paper data to the LDATS data. It looks like the LDATS rodents table is the raw table (not adjusted for incomplete censuses). For now I'm working with the adjusted table, to avoid artificially low abundances for incompletely-trapped periods.

diazrenata added 3 commits Dec 13, 2018
LDATS chooses 6 topics but the paper had 4.
@diazrenata (Contributor, Author) commented on 6f29b93, Dec 14, 2018

@juniperlsimonis, I know we’ve talked about the LDA model selection criterion and we changed it from the paper to LDATS. If I recall correctly, the paper used AIC and this summer we switched to AICc. Do you think that’s why they’re giving different #’s of topics?

I looked into testing that theory by running LDATS::LDA_select but switching to AIC. I think I need to change the measurer argument of LDA_controls_list(). I’ve run into two things there - 1) it actually looks to me like the default is measurer=AIC, so is that actually AICc? and 2) measurer = 'AICc' does not work:

Error in get(as.character(FUN), mode = "function", envir = envir) :
  object 'AICc' of mode 'function' was not found

When we were doing this in LDA-kratplots I think LDA_select was set up differently, so this was the call:

#### Select the best LDA (AICc) ####
  selected = LDATS:::LDA_select(lda_models = LDA_models, LDA_eval = quote(AIC), correction = TRUE,
                                LDA_selector = quote(min))

So - tl;dr - I’m wondering if you can help with

  1. how do I tell what the measurer function really is set to in LDA_controls_list?
  2. how would I change it, say, to AICc vs. AIC?
  3. if both the paper and the LDATS default are selecting based on AIC without correction, then I’m looking for a different explanation as to why the paper came up with 4 topics and LDATS finds 6, given the same data...

@juniperlsimonis (Contributor) replied Dec 14, 2018

thanks for digging into this!
the default of the current version of LDATS is also set up using standard AIC (not corrected AICc) again.
if you want to see what the measurer function is, LDA_controls_list()$measurer will get it for you
and if you want to change it, if the function doesn't exist already, you'd have to create it, like this (using a really silly measuring function of the number of the seed used):

data(rodents)
lda_data <- rodents$document_term_table
r_LDA <- LDA_set(lda_data, topics = 2, nseeds = 2)

okee <- function(x){x@control@seed}                    # silly measurer: just returns the seed used
cl <- LDA_controls_list()
cl$measurer <- okee
select_LDA(r_LDA)                                      # default control list (AIC measurer)
select_LDA(r_LDA, control = cl)                        # pass a pre-built control list
select_LDA(r_LDA, LDA_controls_list(measurer = okee))  # or define the measurer inline

which means you can either create a control list object (cl here) and then pass it in, or define it as part of the arguments
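(untested sketch, in case you do want an actual AICc measurer: something like the following should be close, with the caveat that using the number of documents as the sample size n is my assumption, not something LDATS defines)

# sketch of a custom AICc measurer; n = number of documents is an assumption
AICc <- function(x){
  k <- attr(logLik(x), "df")   # parameter count reported by the fitted LDA
  n <- nrow(x@gamma)           # number of documents, used here as the sample size
  AIC(x) + (2 * k^2 + 2 * k) / (n - k - 1)
}
select_LDA(r_LDA, LDA_controls_list(measurer = AICc))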

i'm going to create a new comment to paste in some slack conversations about your 3rd question

@juniperlsimonis (Contributor) replied Dec 14, 2018

ok, so for the comparison, there was a fair amount of back-and-forth in slack messages about this, and it's kind of obnoxious that i can't just add you in to that thread (as far as i can tell), so this is just going to be me pasting in a bunch of stuff. tl;dr, the answer has to do with how the number of parameters used to penalize the logLik in the AIC calculation is counted. the version in the paper counted the parameters by hand and over-counted (it included the gamma parameters in the count). the version in LDATS uses the count generated by the LDA function, which is a proper accounting of the number of parameters in the model.
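(to make that concrete, here's a rough sketch of the two accountings for a fitted topicmodels LDA, calling it ldamodel here; the specific counts are my own illustration of the idea, not lifted from either codebase)

ll <- as.numeric(logLik(ldamodel))       # variational approximation of the log-likelihood
k  <- ldamodel@k                         # number of topics
V  <- ncol(ldamodel@beta)                # number of terms
D  <- nrow(ldamodel@gamma)               # number of documents

p_hand  <- k * (V - 1) + D * (k - 1)     # hand count that also includes the gamma parameters (over-counts)
p_ldats <- attr(logLik(ldamodel), "df")  # count reported by the fitted LDA (beta only)

AIC_hand  <- -2 * ll + 2 * p_hand
AIC_ldats <- -2 * ll + 2 * p_ldats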

Me:
It's not a concern that the LDA returns an approximation of the marginal likelihood. In the general EM approach, and in the variational flavor of it, the MLE of the unknown parameters is determined by the marginal likelihood of the observations. That's pretty standard.

The marginal likelihood doesn’t inherently include a complexity penalty, it’s just the likelihood with some parameters integrated out. And the output of logLik isn’t at all useless, it just needs to be couched in its “variational approximation of the lower bound” caveat.

While the marginal likelihood doesn’t have a complexity penalty itself, the approximated likelihood that’s returned by the variational EM is necessarily lower than the “true” marginal likelihood (the likelihood of the model without the variational parameters), so I guess you could consider it in some sense “penalized”.

However, the degree of difference ("penalty") between the true likelihood and the variational approximation is the KL divergence between the variational distribution used and the true distribution. Notably here, the “true” distribution is not the unknown truth as it is during standard model comparisons (wherein, it’s a constant across the models). Rather, it is “the model without all of the variational parameters”, which does change across the different models (number of topics) in the VEM method.

Thus, while there is a level of “penalizing” of the likelihood via the variational approximation, that’s not a bad thing in itself; it’s only a partial accounting for the complexity of the model and not sufficient for comparing the resulting models when the number of topics changes.

To fully penalize for the model complexity in comparison across models of different numbers of topics, the variational approximation of the lower bound of the marginal likelihood needs to be corrected for the remaining among-model variation in the number of parameters.

You all approached the remaining correction by penalizing the models for the full suite of variational parameters (rather than just those remaining after the variational approximation), which is an excessive correction. However, you all didn’t account for the relatively small sample size, which works in the opposite direction (leads to overly complex models, i.e. isn’t a stringent enough correction). It looks like the correction for the number of parameters is more impactful than the correction for the number of data points, so it’s likely that you all are more stringent than necessary/ideal overall.

For reference, a selection procedure that only penalizes for the remaining parameters and includes a small sample size correction picks 5 topics.

Dave
Juniper is correct: logLik is only marginalizing some of the parameters out (gamma) and not all of them (i.e. not beta)

  • This has to do with the distinction between “vanilla” variational inference and VEM, which I was missing.
  • The upshot is that AIC-like penalties for models having more parameters in beta are totally legitimate, but the extra complexity in gamma is already accounted for.
  • The good news is that we can still use an AIC-like penalty.
  • The bad news is that when Juniper tried penalizing beta-but-not-gamma, they ended up with an optimum of 6 topics (5 if they do AICc instead of AIC), which is different from what’s in the manuscript.

Morgan

Hi everyone. This is great. Thanks to both of you for working through this. Looks like things are exactly in the state that we thought when we made the manuscript changes and wrote the letter to the editor. As for how to proceed, nothing in this conversation changes my mind about the best route forward. Selecting the most appropriate number of topics is an evolving area. As @Juniper Simonis pointed out in our discussions with them, what we’ve done in this manuscript is still a step forward from Valle et al. As the paper has been going through the review process, Juniper has been thinking hard about improved approaches to ours, and their work highlights the shortcomings with our approach. This is what will always happen in fast-moving fields – it’s just normally we wouldn’t know about it at this phase because that would probably be happening in a different lab (the awesomeness and the curse of being the lab both developing and using a tool!). Since the manuscript has already been accepted and the advances are based on Juniper’s work, I don’t think it’s appropriate to change the analyses at this phase. My suggestion is that we continue with the original plan as implemented in the revised manuscript and letter to the editor – we have caveated our (already improved) AIC approach appropriately, and we know it is at worst conservative in selecting the number of topics. Finding more topics at worst means there are more changepoints, which doesn’t alter our core message – change happens in a complicated pattern that includes multiple rapid transitions. In our next LDA paper, we will update the AIC method based on Juniper’s ongoing work on this issue. That’s my preference; the ultimate decision is up to @echriste as the lead author on this. @dave.harris are you ok with this?

Dave
How does something like this sound?

We used AIC to select the appropriate number of community-types needed to parsimoniously describe the data (Appendix S2: Fig. S1). We calculated this value using the log-likelihood returned by the LDA function in the topicmodels package, penalized according to the number of free beta and gamma parameters. By penalizing the variational gamma parameters, we likely overestimated the number of parameters and therefore could have underestimated the optimal number of topics, but our qualitative time series results are robust to including more topics than the number selected.
I’m not thrilled with the way that I had to refer to beta and gamma by their awkward Greek names, but I’m not sure what the alternative is if we want to be specific.

Morgan
I like the revised response to the reviewer in the letter. Nicely done. I like the extra precision in the manuscript changes, but we haven't mentioned beta and gamma before, and without additional cascading edits to introduce exactly what those are, our comments are just focused on explaining things to Gavin and those of similar technical expertise levels. Everyone else will be seriously lost. I say we move Dave's proposed manuscript language into the Appendix (associate it with Fig S1). We return to the version Erica pasted above but we add an explicit appendix reference to the sentence: We used a full penalty for all variational parameters (for details see Appendix 2; Fig S1). Erica, let me know when you've done this and then send me updated copies. Then I'll email Scott and send him the new materials as well so he can look at them if he wants. Great job everyone. I know this has been stressful for everyone and I appreciate everyone's patience - both with the discussion and with each other!

What's in the first paper now
We used AIC to select the appropriate number of community-types needed to parsimoniously describe the data (Appendix S2: Fig. S1). We calculated this value using the log-likelihood returned by the LDA function in the topicmodels package, penalized according to the number of free parameters. As described in Appendix S2, this procedure likely underestimated the optimal number of topics, but our qualitative time series results are robust to including extra topics.

@diazrenata (Contributor, Author) replied Dec 15, 2018

Ohhhh thanks!

So my take-away from the slack conversations is that the 6 v. 4 outcomes, specifically, can with reasonable certainty be attributed to the differences in how the parameters are being counted (without even getting into AICc vs AIC)? If that's accurate, I'll keep going with a side-by-side comparison of changepoint with both selected LDAs.

Thanks again - this is super helpful!!

@juniperlsimonis (Contributor) replied Dec 15, 2018

you're welcome!
to answer your question: yes your take away is correct!
for the side-by-side of the TS part of things, I'd be most curious to see what the new TS model does given the old selection of LDA model. which is to say "ok, we know they select different LDAs. pretend that they actually selected the same LDA (the one the original model did), do they give the same TS output?"

diazrenata added 2 commits Dec 17, 2018
@diazrenata (Contributor, Author) commented on 6de9787, Dec 17, 2018

Hey @juniperlsimonis, I'm turning up an error when trying to feed LDA_on_TS the paper LDA and maybe you can help?

LDA_on_TS seems to think the paper LDA isn't the right kind of object. paper-comparison/changepoint_error.r should reproduce the errors I'm getting, & they're also in the comments.

The paper LDA code generates an LDA_VEM S4 object, which LDA_on_TS rejects because it's 'not an LDA object'. But I think it is? I also tried making it into a list of length one, but no luck.

Interestingly,
mode(ldamodel) == mode(ldats_ldas[[1]])
returns TRUE, where ldats_ldas is the output of LDATS::LDA_set.

Is there a way that I've missed to make the paper LDA_VEM (look like) an LDA/LDA_set object?

Thanks!!

@juniperlsimonis (Contributor) replied Dec 17, 2018

ay! good catch! so, to make ldamodel into a list that will work for TS_on_LDA, you should do

ldamodel_list<- list(ldamodel)
class(ldamodel_list) <- c("LDA_set", "list")

in place of where you did

ldamodel_list <- c(ldamodel)
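and then the wrapped object should drop into the TS step just like the LDATS-selected models do. sketch only -- i'm writing this call from memory, so double-check the argument names, and whether the covariate table and timename match your setup, against the LDATS version you have installed:

paper_lda_ts <- TS_on_LDA(LDA_models = ldamodel_list,
                          document_covariate_table = rodents$document_covariate_table,
                          formulas = ~ 1,
                          nchangepoints = 0:6,
                          timename = "newmoon")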
diazrenata added 20 commits Dec 17, 2018
5 changepoints is still the lowest-deviance
Saving the changepoint objects in the repo breaks git (?).

Setting up to save them outside the repo also lets me work in other branches without getting tangled up.
I'm saving the changepoint models to my computer, because if I put them in the repo it breaks git on my computer. I'm thinking of putting the .Rds files in Dropbox or some other workaround for large files.
Paper changepoint on paper lda vs LDATS lda.

At this point I find it easiest to open the image files in 'paper-changepoint-files' and put the LDATS lda ones side by side with the paper lda ones. I know this is... not ideal.

Both give the same numbers of changepoints (4 or 6 - 6 has lower deviance, but I'm not sure if the changepoint is to be trusted for 6 cpts?) and roughly the same locations.

The LDATS LDA w/ 4 cpts has less bimodality in the 1990s than the paper LDA.
6 cpts is not, in fact, ever the best model.
start to compare ldas with adjusted/nonadjusted data
store model objects in dropbox
Adjusted v. nonadjusted data frames give qualitatively but not exactly the same results with the paper LDA
diazrenata and others added 25 commits Jan 25, 2019
conflict resolution to facilitate rebase
conflict resolution for rebasing
conflict resolution for rebasing
…re to consolidate all .Rds files into one file to load
…ssary load/rm. For the most part display rather than run code in paper-comparison vignette. Keep load/rm (unevaluated) in changepoint code, to demo memory management.
@juniperlsimonis (Contributor) commented Feb 12, 2019

addresses #78

@juniperlsimonis merged commit 3b078ae into master on Feb 12, 2019
4 checks passed
codecov/patch 100% of diff hit (target 96.77%)
codecov/project 97.18% (+0.4%) compared to 10bceb1
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed