LRT for comparing non-mixed to mixed models #508

palday · 2021-04-12T17:48:47Z

Closes #491.

This only allows one non-mixed model and it has to be the first argument. That's fairly restrictive, but generally speaking, I want to avoid testing a massive series of increasingly complicated fixed-effects models and then adding random-effects onto that series.

palday · 2021-04-12T17:57:24Z

src/likelihoodratiotest.jl

+!!! note
+    For comparisons between mixed and non-mixed models, the deviance for the non-mixed
+    model is taken to be -2 log likelihood, i.e. omitting the additive constant for the
+    fully saturated model. This is in line with the computation of the deviance for mixed
+    models.


@dmbates I can feel you looking over my shoulder and shaking your head when I write something like "the deviance isn't really the deviance". 🙂 Do you have a better way of handling / labelling this?

The important point is that the difference in -2 log likelihood is the same as the difference in deviances would be if they could be calculated. Of course, now that I have written that I wonder if it makes sense when comparing a LinearModel and a LinearMixedModel. We know what the saturated model for a LinearModel is but we don't for a LinearMixedModel. My statement seems to indicate that the deviance for the saturated model in a LinearMixedModel would then have to be the same as that of a LinearModel but I am not sure that makes sense. I am starting to confuse myself.

I'm glad I'm not the only one who struggles with some of this stuff. 😄

I think the relevant bit here is actually that the test statistic for LRT is -2 log(L_1/L2) which then is just the difference of -2 log likelihood. Using the true deviance was always just a convenience because the additive constants cancel out, when both models have the same constant.

All that said, my question is: should we change the label of that column and/or the internal fields? I would prefer to keep the internal fieldnames, but changing the show method's output is trivial.

Hmm. For a linear mixed model it would be more accurate to label the column as -2 loglik. For a GLMM I think it should be deviance. I believe that for a Binomial GLMM that is not also Bernoulli (e.g. cbpp) the quantity in that column will not be -2 loglik

Ah, yes, that is true. And indeed the loglikelihood and deviance for GLMM are defined independently.

Let me see what it would take to add a conditional to the show method --I think we'll need to add something to the LikelihoodRatioTest struct because right now there is no notion of which model it came from.

codecov · 2021-04-12T18:03:46Z

Codecov Report

Merging #508 (cdd1369) into main (715fac5) will increase coverage by 0.14%.
The diff coverage is 98.63%.

❗ Current head cdd1369 differs from pull request most recent head dae16e2. Consider uploading reports for the commit dae16e2 to get more accurate results

@@            Coverage Diff             @@
##             main     #508      +/-   ##
==========================================
+ Coverage   93.99%   94.14%   +0.14%     
==========================================
  Files          26       26              
  Lines        2164     2236      +72     
==========================================
+ Hits         2034     2105      +71     
- Misses        130      131       +1

Impacted Files	Coverage Δ
src/linearmixedmodel.jl	`95.47% <96.15%> (+0.02%)`	⬆️
src/likelihoodratiotest.jl	`98.47% <100.00%> (+0.79%)`	⬆️
src/optsummary.jl	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 715fac5...dae16e2. Read the comment docs.

kleinschmidt

I think I understand why true nesting isn't checked (it's hard to do in general) but it feels like a potential footgun...I suggested a few places where you could be clearer in the docstrings and/or add some additional heuristics for non-formula-bearing GLMs.

src/likelihoodratiotest.jl

…nmixedlrt

dmbates · 2021-04-13T17:48:06Z

I think I understand why true nesting isn't checked (it's hard to do in general) but it feels like a potential footgun...I suggested a few places where you could be clearer in the docstrings and/or add some additional heuristics for non-formula-bearing GLMs.

There is a way of checking nesting - approximately checking due to floating point, of course. The columns of the model matrix for the linear model should be in the column space of m.X where m is the linear mixed model. If you take a QR decomposition of m.X and multiply the linear model's model matrix by Q' then in each of the columns of the product, the squared length of the last n - rank(m.X) positions should be negligible compared to the squared length of the first rank(m.X) positions.

palday · 2021-04-13T17:54:27Z

There is a way of checking nesting - approximately checking due to floating point, of course. The columns of the model matrix for the linear model should be in the column space of m.X where m is the linear mixed model. If you take a QR decomposition of m.X and multiply the linear model's model matrix by Q' then in each of the columns of the product, the squared length of the last n - rank(m.X) positions should be negligible compared to the squared length of the first rank(m.X) positions.

I like this idea (and that's what the tolerance kwarg is for, I suppose), but I think that should be a different PR because we can add that to the nesting checks for two MixedModels.

…nmixedlrt

dmbates · 2021-04-15T17:17:43Z

The CI failures on Julia v1.4 are because the newly added code calls contains which doesn't exist in v1.4.

Should we continue to test on 1.4? Is it because that is currently the LTS release?

kleinschmidt

I didn't look closely at the matrix checking stuff because it's a bit over my head but otherwise looks good.

src/likelihoodratiotest.jl

Co-authored-by: Dave Kleinschmidt <dave.f.kleinschmidt@gmail.com>

…into pa/unmixedlrt

NEWS.md

src/likelihoodratiotest.jl

Co-authored-by: Douglas Bates <dmbates@gmail.com>

palday · 2021-04-15T17:33:27Z

@dmbates The current LTS is 1.0! I'm happy to drop 1.4 and 1.5 with MixedModels 4.0, if for no other reason than CI speed. If/when 1.6 is declared LTS, my preference would be to have to LTS to change out tiered support architecture to "LTS" and "current release".

…into pa/unmixedlrt

@kleinschmidt

* LRT for comparing non-mixed to mixed models * nesting for LM/GLMM and GLM/LMM * NEWS entry * use _iscomparable instead of isnested * add missing dep * inner has no more preds than outer, thx @kleinschmidt * column span check * oops * julia 1.4 compat * condition LRT show method on G/LMM * use deviance for GLM and -2 loglikelihood for LMM * contains -> occursin (Julia 1.4 compat) * Apply suggestions from code review Co-authored-by: Dave Kleinschmidt <dave.f.kleinschmidt@gmail.com> * copy-paste error * reenable a shortcut * Apply suggestions from code review Co-authored-by: Douglas Bates <dmbates@gmail.com> Co-authored-by: Dave Kleinschmidt <dave.f.kleinschmidt@gmail.com> Co-authored-by: Douglas Bates <dmbates@gmail.com> (cherry picked from commit 365e5f7)

@kleinschmidt

* LRT for comparing non-mixed to mixed models (#508) * LRT for comparing non-mixed to mixed models * nesting for LM/GLMM and GLM/LMM * NEWS entry * use _iscomparable instead of isnested * add missing dep * inner has no more preds than outer, thx @kleinschmidt * column span check * oops * julia 1.4 compat * condition LRT show method on G/LMM * use deviance for GLM and -2 loglikelihood for LMM * contains -> occursin (Julia 1.4 compat) * Apply suggestions from code review Co-authored-by: Dave Kleinschmidt <dave.f.kleinschmidt@gmail.com> * copy-paste error * reenable a shortcut * Apply suggestions from code review Co-authored-by: Douglas Bates <dmbates@gmail.com> Co-authored-by: Dave Kleinschmidt <dave.f.kleinschmidt@gmail.com> Co-authored-by: Douglas Bates <dmbates@gmail.com> (cherry picked from commit 365e5f7) * patch bump * add Distribution and Link methods

palday added 3 commits April 12, 2021 19:42

LRT for comparing non-mixed to mixed models

4d23eb1

nesting for LM/GLMM and GLM/LMM

5accab7

NEWS entry

f8614cd

palday requested a review from kleinschmidt April 12, 2021 17:55

palday commented Apr 12, 2021

View reviewed changes

kleinschmidt reviewed Apr 12, 2021

View reviewed changes

src/likelihoodratiotest.jl Show resolved Hide resolved

src/likelihoodratiotest.jl Outdated Show resolved Hide resolved

src/likelihoodratiotest.jl Outdated Show resolved Hide resolved

palday added 4 commits April 12, 2021 23:08

use _iscomparable instead of isnested

f2c25d4

add missing dep

b8ea015

inner has no more preds than outer, thx @kleinschmidt

231f0e0

Merge branch 'main' of github.com:JuliaStats/MixedModels.jl into pa/u…

04bbda4

…nmixedlrt

palday requested a review from kleinschmidt April 12, 2021 21:49

palday added 6 commits April 13, 2021 23:02

column span check

01763e2

oops

b777afc

julia 1.4 compat

dae16e2

condition LRT show method on G/LMM

0371a0e

use deviance for GLM and -2 loglikelihood for LMM

b952a67

Merge branch 'main' of github.com:JuliaStats/MixedModels.jl into pa/u…

ecc9de3

…nmixedlrt

contains -> occursin (Julia 1.4 compat)

82636ff

kleinschmidt approved these changes Apr 15, 2021

View reviewed changes

src/likelihoodratiotest.jl Outdated Show resolved Hide resolved

src/likelihoodratiotest.jl Show resolved Hide resolved

src/likelihoodratiotest.jl Outdated Show resolved Hide resolved

src/likelihoodratiotest.jl Outdated Show resolved Hide resolved

palday and others added 4 commits April 15, 2021 17:26

Apply suggestions from code review

6c29191

Co-authored-by: Dave Kleinschmidt <dave.f.kleinschmidt@gmail.com>

copy-paste error

31cff45

Merge branch 'pa/unmixedlrt' of github.com:JuliaStats/MixedModels.jl …

af7b2e3

…into pa/unmixedlrt

reenable a shortcut

070cc73

dmbates approved these changes Apr 15, 2021

View reviewed changes

NEWS.md Outdated Show resolved Hide resolved

src/likelihoodratiotest.jl Outdated Show resolved Hide resolved

Apply suggestions from code review

eb6b59b

Co-authored-by: Douglas Bates <dmbates@gmail.com>

Merge branch 'pa/unmixedlrt' of github.com:JuliaStats/MixedModels.jl …

0d70a7b

…into pa/unmixedlrt

palday merged commit 365e5f7 into main Apr 15, 2021

palday deleted the pa/unmixedlrt branch April 15, 2021 18:06

palday mentioned this pull request Apr 15, 2021

BACKPORT: LRT for comparing non-mixed to mixed models #510

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LRT for comparing non-mixed to mixed models #508

LRT for comparing non-mixed to mixed models #508

palday commented Apr 12, 2021 •

edited

palday Apr 12, 2021

dmbates Apr 13, 2021

palday Apr 13, 2021

dmbates Apr 15, 2021

palday Apr 15, 2021

codecov bot commented Apr 12, 2021 •

edited

kleinschmidt left a comment

dmbates commented Apr 13, 2021

palday commented Apr 13, 2021

dmbates commented Apr 15, 2021

kleinschmidt left a comment

palday commented Apr 15, 2021

LRT for comparing non-mixed to mixed models #508

LRT for comparing non-mixed to mixed models #508

Conversation

palday commented Apr 12, 2021 • edited

palday Apr 12, 2021

Choose a reason for hiding this comment

dmbates Apr 13, 2021

Choose a reason for hiding this comment

palday Apr 13, 2021

Choose a reason for hiding this comment

dmbates Apr 15, 2021

Choose a reason for hiding this comment

palday Apr 15, 2021

Choose a reason for hiding this comment

codecov bot commented Apr 12, 2021 • edited

Codecov Report

kleinschmidt left a comment

Choose a reason for hiding this comment

dmbates commented Apr 13, 2021

palday commented Apr 13, 2021

dmbates commented Apr 15, 2021

kleinschmidt left a comment

Choose a reason for hiding this comment

palday commented Apr 15, 2021

palday commented Apr 12, 2021 •

edited

codecov bot commented Apr 12, 2021 •

edited