When working with mi(), brms can add three way interactions to the Stan model when only 2-way interactions are specified in the formula #1608

Closed
gbiele opened this issue Mar 2, 2024 · 5 comments

@gbiele
Sponsor

gbiele commented Mar 2, 2024

This is a bug report.

The following model specifies only 2-way interactions: one between an imputed variable and a complete variable (mi(X):B), and one between two imputed variables (mi(X):mi(C)).

f =
  bf(A | mi() ~ mi(X) + mi(X):B + mi(X):mi(C)) +
  bf(X | mi() ~ 1) +
  bf(C | mi() ~ 1) +
  set_rescor (FALSE)

However, inspecting the Stan code that brms generates for the model reveals a three-way interaction (I formatted the next lines to make the problem easy to spot):

mu_A[n] =
(bsp_A[1]) * Yl_X[n] +
(bsp_A[2]) * Yl_X[n] * Csp_A_1[n] +
(bsp_A[3]) * Yl_X[n] * Yl_C[n] * Csp_A_1[n];

brms multiplies the interaction term that was specified in the brms model by the variable without missing data (here * Csp_A_1[n], which is B in the formula).

To see the bug, one has to inspect the Stan code: if one fits the model, brms simply reports the posterior of bsp_A[3] as the coefficient for the interaction mi(X):mi(C), even though in the Stan code it is the coefficient for the interaction mi(X):mi(C):B.
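The inspection described above can be partly automated. The following is a minimal base-R sketch, not part of brms: `term_factors` is a hypothetical helper, and the input string is the formatted `mu_A` expression from this report rather than fresh `make_stancode()` output. It assumes the expression is a plain sum of products and that coefficients are named `bsp_...`.

```r
# Hypothetical helper (not a brms function): split a Stan linear-predictor
# expression into additive terms and list the data factors in each term.
term_factors <- function(expr) {
  rhs <- sub("^[^=]*=", "", expr)   # drop the left-hand side ("=" or "+=")
  rhs <- sub(";\\s*$", "", rhs)     # drop the trailing semicolon
  terms <- trimws(strsplit(rhs, "+", fixed = TRUE)[[1]])
  lapply(terms, function(term) {
    factors <- trimws(strsplit(term, "*", fixed = TRUE)[[1]])
    # Keep only data factors, i.e. drop coefficients such as (bsp_A[3])
    factors[!grepl("^\\(?bsp_", factors)]
  })
}

# The expression as shown in this report, collapsed to one line
stan_line <- paste(
  "mu_A[n] = (bsp_A[1]) * Yl_X[n] +",
  "(bsp_A[2]) * Yl_X[n] * Csp_A_1[n] +",
  "(bsp_A[3]) * Yl_X[n] * Yl_C[n] * Csp_A_1[n];"
)

factors <- term_factors(stan_line)
n_factors <- lengths(factors)   # number of data factors per additive term
print(n_factors)

# Terms with more than two data factors are higher-order than the
# 2-way interactions specified in the formula
suspicious <- which(n_factors > 2)
print(factors[suspicious])
```

Since the formula only specifies 2-way interactions, any term with more than two data factors deserves a closer look. With real `make_stancode()` output the `mu_A` line would first have to be extracted, for example with `grep` as in the reproducible example in this report.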

This error only occurs if one specifies two 2-way interactions with an imputed variable where one of these interactions involves another imputed variable and the other involves a complete variable. If one specifies two 2-way interactions with an imputed variable and two other variables that are either both imputed or both complete, the correct Stan code is generated (it is probably good to check this again).

Here is a full reproducible example that prints the problematic part of the model and shows that it is variable B that is added to the specified 2-way interaction.

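library(brms)
library(magrittr)  # provides the %>% pipe

# Simulate complete data, then set one value each of X, A, and C to missing
N = 100
df = data.frame (A = rnorm(N), B = rnorm(N), C = rnorm(N), X = rnorm(N))
df$X[1] = NA; df$A[1] = NA; df$C [1] = NA

# Only 2-way interactions are specified; B is fully observed
f =
  bf (A | mi() ~ mi(X) + mi(X):B + mi(X):mi(C)) +
  bf (X | mi() ~ 1) +
  bf (C | mi() ~ 1) +
  set_rescor (FALSE)

# Extract the line(s) of the generated Stan code that define mu_A
mu_A = make_stancode(f,df) %>% strsplit("\n") %>% .[[1]] %>% grep ("mu_A\\[n",., value = TRUE)
# Print one additive term per line
mu_A %>% strsplit("\\+") %>% .[[1]] %>% paste (collapse = "\n") %>% cat()

# Confirm that Csp_A_1 in the Stan data is the complete variable B
sdata = make_standata(f,df)
sum(sdata$Csp_A_1 == df$B) == N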
@paul-buerkner
Owner

paul-buerkner commented Mar 2, 2024 via email

@gbiele
Sponsor Author

gbiele commented Mar 4, 2024

I am not sure what you mean by "if none of the predictors are observed?"

If one specifies interactions only with variables that have missing data, the correct Stan code is generated:

N = 100
df = data.frame (A = rnorm(N), B = rnorm(N), C = rnorm(N), X = rnorm(N))
df$X[1] = NA; df$A[1] = NA; df$C [1] = NA
f =
  bf (A | mi() ~ mi(X) + mi(X):mi(B) + mi(X):mi(C)) +
  bf (B | mi() ~ 1) +
  bf (X | mi() ~ 1) +
  bf (C | mi() ~ 1) +
  set_rescor (FALSE)
mu_A = make_stancode(f,df) %>% strsplit("\n") %>% .[[1]] %>% grep ("mu_A\\[n",., value = TRUE)
mu_A %>% strsplit("\\+") %>% .[[1]] %>% paste (collapse = "\n") %>% cat()

@paul-buerkner
Owner

I am sorry. That was a typo. I meant if all of the predictors are observed, i.e. if no mi() terms are used. Let me quickly check it.

@paul-buerkner
Owner

Okay, I can reproduce the problem. Let me check what is going on.

@paul-buerkner paul-buerkner added this to the 2.21.0 milestone Mar 4, 2024
paul-buerkner added a commit that referenced this issue Mar 4, 2024
@paul-buerkner
Owner

Should now be fixed. Thanks again!
