
D1 and D2 incorrectly remove additional interaction terms if main effects are in different order #420

Closed
DavidLukeThiessen opened this issue Jul 15, 2021 · 3 comments


@DavidLukeThiessen

If we use D1() or D2() to test a null model that includes interaction terms against a full model, and the null model lists the main effects in a different order than the full model does, then D1() and D2() give incorrect results. What is reported instead is a test of the full model against a null model with additional interaction terms removed.

library(mice)
#> 
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following objects are masked from 'package:base':
#> 
#>     cbind, rbind
gen_dat <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  x3 <- rnorm(n)
  y <- rnorm(n)
  x1[rbinom(n,1,0.1)==1] <- NA
  x2[rbinom(n,1,0.1)==1] <- NA
  x3[rbinom(n,1,0.1)==1] <- NA
  data.frame(y=y,x1=x1,x2=x2,x3=x3)
}
set.seed(1)
dat <- gen_dat(100)
imps <- mice(dat,printFlag = FALSE)
mods_full <- with(imps, lm(y ~ (x1 + x2 + x3)^3))
mods_null1 <- with(imps, lm(y ~ x1 + x2 + x3 + (x1 + x2 + x3)^2))
mods_null2 <- with(imps, lm(y ~ x1 + x3 + x2 + (x1 + x2 + x3)^2))
# The two null models are equal except for the order of the terms
summary(pool(mods_null1))
#>          term    estimate  std.error  statistic       df    p.value
#> 1 (Intercept)  0.03987886 0.09726925  0.4099843 88.12181 0.68281249
#> 2          x1 -0.11491420 0.12690347 -0.9055245 43.21742 0.37020844
#> 3          x2 -0.01970677 0.10498217 -0.1877154 79.60035 0.85157747
#> 4          x3  0.06618384 0.10316874  0.6415106 31.77301 0.52579216
#> 5       x1:x2 -0.45955084 0.17344060 -2.6496152 17.73090 0.01644391
#> 6       x1:x3 -0.15321083 0.11105064 -1.3796483 67.55958 0.17224433
#> 7       x2:x3  0.05877360 0.10224793  0.5748146 62.19560 0.56749021
summary(pool(mods_null2))
#>          term    estimate  std.error  statistic       df    p.value
#> 1 (Intercept)  0.03987886 0.09726925  0.4099843 88.12181 0.68281249
#> 2          x1 -0.11491420 0.12690347 -0.9055245 43.21742 0.37020844
#> 3          x3  0.06618384 0.10316874  0.6415106 31.77301 0.52579216
#> 4          x2 -0.01970677 0.10498217 -0.1877154 79.60035 0.85157747
#> 5       x1:x2 -0.45955084 0.17344060 -2.6496152 17.73090 0.01644391
#> 6       x1:x3 -0.15321083 0.11105064 -1.3796483 67.55958 0.17224433
#> 7       x3:x2  0.05877360 0.10224793  0.5748146 62.19560 0.56749021
D1(mods_full,mods_null1) # Correct test
#>    test statistic df1 df2 dfcom   p.value       riv
#>  1 ~~ 2 0.6580279   1   4    92 0.4627646 0.4758117
D1(mods_full,mods_null2) # Incorrect test, note df1 = 2
#>    test statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.5156992   2 28.09346    92 0.6026262 0.3332141
D2(mods_full,mods_null1) # Correct
#>    test statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.2299279   1 19.60542    NA 0.6368829 0.8237909
D2(mods_full,mods_null2) # Incorrect
#>    test  statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.02217649   2 16.25775    NA 0.9780971 0.6747445

# The reported incorrect results are equivalent to testing the reduced
# model with additional interaction terms removed. The number of
# additional terms removed depends on where the final main effect from
# the full model appears in the null model.
mods_null3 <- with(imps, lm(y ~ x1 + x2 + x3 + x1*x2 + x1*x3))
# Equal to output from above
D1(mods_full,mods_null3)
#>    test statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.5156992   2 28.09346    92 0.6026262 0.3332141
D2(mods_full,mods_null3)
#>    test  statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.02217649   2 16.25775    NA 0.9780971 0.6747445
@gerkovink
Member

Thanks for your question.

This is expected behaviour and is related to the internal model formula handling in lm() and the nature of the coefficient tests that are used. The models in your mods_null2 are not nested in mods_full in the low-level sense that the parameter names are different. This also explains the drop of 2 df: the interaction x1:x2:x3 counts as dropped because it does not appear in the null model, and x2:x3 counts as dropped as well because the null model's x3:x2 does not match any parameter name in the full model.

The likelihood ratio test D3, unlike the multivariate Wald test (D1) and the combining test statistic (D2), does not rely on strict parameter naming. This behaviour is consistent with model testing via anova().

@stefvanbuuren I propose that we add some lines to the help pages of D1() and D2() detailing this strict requirement.

All the best,

Gerko

library(mice)
gen_dat <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  x3 <- rnorm(n)
  y <- rnorm(n)
  x1[rbinom(n,1,0.1)==1] <- NA
  x2[rbinom(n,1,0.1)==1] <- NA
  x3[rbinom(n,1,0.1)==1] <- NA
  data.frame(y=y,x1=x1,x2=x2,x3=x3)
}
set.seed(1)
dat <- gen_dat(100)

# check default model comparison consistency
fit <- lm(y ~ x1*x2*x3, data = dat)
fit0 <- lm(y~x1 + x2 + x3 + (x1 + x2 + x3)^2, data = dat)
fit2 <- lm(y~x1 + x3 + x2 + (x1 + x2 + x3)^2, data = dat)

anova(fit, fit0, test = "LRT")
#> Analysis of Variance Table
#> 
#> Model 1: y ~ x1 * x2 * x3
#> Model 2: y ~ x1 + x2 + x3 + (x1 + x2 + x3)^2
#>   Res.Df    RSS Df Sum of Sq Pr(>Chi)
#> 1     61 46.723                      
#> 2     62 47.907 -1   -1.1841   0.2137
anova(fit, fit2, test = "LRT")
#> Analysis of Variance Table
#> 
#> Model 1: y ~ x1 * x2 * x3
#> Model 2: y ~ x1 + x3 + x2 + (x1 + x2 + x3)^2
#>   Res.Df    RSS Df Sum of Sq Pr(>Chi)
#> 1     61 46.723                      
#> 2     62 47.907 -1   -1.1841   0.2137

# Name differences
names(coef(fit0)) %in% names(coef(fit))
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
names(coef(fit2)) %in% names(coef(fit))
#> [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
coef(fit)
#> (Intercept)          x1          x2          x3       x1:x2       x1:x3 
#> -0.07553160 -0.15939596 -0.09645861 -0.07801288 -0.42659727 -0.21350269 
#>       x2:x3    x1:x2:x3 
#>  0.13094054 -0.17232897
coef(fit2)
#> (Intercept)          x1          x3          x2       x1:x2       x1:x3 
#> -0.05249388 -0.15748075 -0.04227937 -0.07545730 -0.46607063 -0.16750329 
#>       x3:x2 
#>  0.15378880

# Use D3 - Likelihood Ratio Test
imp <- mice(dat, 
            printFlag = FALSE)
fit <- with(imp, lm(y ~ x1*x2*x3))
fit0 <- with(imp, lm(y~x1 + x2 + x3 + (x1 + x2 + x3)^2))
fit2 <- with(imp, lm(y~x1 + x3 + x2 + (x1 + x2 + x3)^2))

D3(fit, fit0)
#>    test statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.6261685   1 18.43791    92 0.4388258 0.8718629
D3(fit, fit2)
#>    test statistic df1      df2 dfcom   p.value       riv
#>  1 ~~ 2 0.6261685   1 18.43791    92 0.4388258 0.8718629

Created on 2021-07-16 by the reprex package (v2.0.0)

@DavidLukeThiessen
Author

Thank you for the explanation. That makes sense, but I'm surprised this non-nesting isn't detected and doesn't trigger an error or warning. Something like:

if (!all(names(coef(fit_null)) %in% names(coef(fit_full)))) stop("not nested")

But since the other code that tests whether models are nested seems to come from mitml, I guess the error/warning should be added there instead.
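
The check above could be wrapped into a small standalone helper. A minimal sketch (the function name check_nested_names is hypothetical, not part of mice or mitml; the data are simulated here just for illustration):

```r
# Hypothetical helper: warn when the null model's coefficient names are
# not a subset of the full model's names, i.e. the models are not nested
# in the name-matching sense that D1()/D2() rely on.
check_nested_names <- function(fit_full, fit_null) {
  missing <- setdiff(names(coef(fit_null)), names(coef(fit_full)))
  if (length(missing) > 0) {
    warning("Null model terms not matching any full model parameter: ",
            paste(missing, collapse = ", "))
  }
  invisible(length(missing) == 0)
}

# Illustration with plain lm() fits on simulated complete data:
set.seed(1)
dat <- data.frame(y = rnorm(50), x1 = rnorm(50),
                  x2 = rnorm(50), x3 = rnorm(50))
fit  <- lm(y ~ x1 * x2 * x3, data = dat)
fit0 <- lm(y ~ x1 + x2 + x3 + (x1 + x2 + x3)^2, data = dat)  # nested
fit2 <- lm(y ~ x1 + x3 + x2 + (x1 + x2 + x3)^2, data = dat)  # x3:x2 mismatch

check_nested_names(fit, fit0)  # silent
check_nested_names(fit, fit2)  # warns about x3:x2
```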

@stefvanbuuren
Member

Thanks for noting this issue. Now added a warning.
