When working with mi(), brms can add three way interactions to the Stan model when only 2-way interactions are specified in the formula #1608

gbiele · 2024-03-02T13:35:25Z

This is a bug-report.

The following model specifies only 2-way interactions: one between an imputed variable and a complete variable (mi(X):B), and one between two imputed variables (mi(X):mi(C)).

f =
  bf(A | mi() ~ mi(X) + mi(X):B + mi(X):mi(C)) +
  bf(X | mi() ~ 1) +
  bf(C | mi() ~ 1) +
  set_rescor (FALSE)

However, if one inspects the Stan code that brms generates for the model, one finds a three-way interaction (I reformatted the following lines to make the problem easy to spot):

mu_A[n]  = 
(bsp_A[1]) * Yl_X[n]  + 
(bsp_A[2]) * Yl_X[n] * Csp_A_1[n]  + 
(bsp_A[3]) * Yl_X[n] * Yl_C[n] * Csp_A_1[n];

brms multiplies the interaction term specified in the formula by the variable without missing data (here * Csp_A_1[n], which is B in the formula).

To see the bug, one has to inspect the Stan code: if one fits the model, brms simply reports the posterior of bsp_A[3] as the coefficient for the interaction mi(X):mi(C), even though in the Stan code it is the coefficient for the three-way interaction mi(X):mi(C):B.

This error only occurs if one specifies two 2-way interactions with an imputed variable, where one of these interactions involves another imputed variable and the other involves a complete variable. If the two other variables are both imputed or both complete, the correct Stan code is generated (probably good to check this again).

Here is a full reproducible example that prints the problematic part of the model and shows that it is variable B that is added to the specified 2-way interaction.

library(brms)
library(magrittr)
N = 100
df = data.frame (A = rnorm(N), B = rnorm(N), C = rnorm(N), X = rnorm(N))
df$X[1] = NA; df$A[1] = NA; df$C [1] = NA
f =
  bf (A | mi() ~ mi(X) + mi(X):B + mi(X):mi(C)) +
  bf (X | mi() ~ 1) +
  bf (C | mi() ~ 1) +
  set_rescor (FALSE)
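# Extract the Stan line that defines mu_A[n]
mu_A = make_stancode(f, df) %>% strsplit("\n") %>% .[[1]] %>% grep("mu_A\\[n", ., value = TRUE)
# Print one additive term per line
mu_A %>% strsplit("\\+") %>% .[[1]] %>% paste(collapse = "\n") %>% cat()
# Confirm that Csp_A_1 in the Stan data is the complete variable B
sdata = make_standata(f, df)
sum(sdata$Csp_A_1 == df$B) == N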
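
The visual inspection above could also be automated. The following is only a sketch in base R: count_factors is a hypothetical helper (not part of brms), and stan_line is the mu_A line as reported in this issue, pasted in as a string so that the example runs without brms. It counts how many variables multiply each coefficient; any count above 2 points to a term of higher order than the formula asked for.

```r
# The mu_A line as reported in this issue, hard-coded so the sketch
# does not depend on brms being installed.
stan_line <- paste(
  "mu_A[n] = (bsp_A[1]) * Yl_X[n] +",
  "(bsp_A[2]) * Yl_X[n] * Csp_A_1[n] +",
  "(bsp_A[3]) * Yl_X[n] * Yl_C[n] * Csp_A_1[n];"
)

# Hypothetical helper: for each additive term, count the variables
# that are multiplied with the coefficient.
count_factors <- function(line) {
  rhs <- sub("^[^=]*=", "", line)   # drop everything up to the first "="
  rhs <- sub(";\\s*$", "", rhs)     # drop the trailing semicolon
  terms <- trimws(strsplit(rhs, "+", fixed = TRUE)[[1]])
  # pieces separated by "*" minus one (the coefficient itself)
  n_vars <- lengths(strsplit(terms, "*", fixed = TRUE)) - 1L
  # name each count by the coefficient inside the leading parentheses
  setNames(n_vars, sub("^\\(([^)]*)\\).*$", "\\1", terms))
}

n_vars <- count_factors(stan_line)
print(n_vars)
# Coefficients whose term involves more than two variables
print(names(n_vars)[n_vars > 2])
```

The helper assumes that each term has the form "(coefficient) * variable * ..." and that terms are joined by "+", as in the line shown above; Stan code with a different layout would need a different parser.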


paul-buerkner · 2024-03-02T14:07:02Z

Thank you for the report! Can you check what happens if none of the predictors are observed?


gbiele · 2024-03-04T07:53:32Z

I am not sure what you mean by "if none of the predictors are observed?"

If one specifies interactions only with variables that have missing data, the correct Stan code is generated:

N = 100
df = data.frame (A = rnorm(N), B = rnorm(N), C = rnorm(N), X = rnorm(N))
df$X[1] = NA; df$A[1] = NA; df$C [1] = NA
f =
  bf (A | mi() ~ mi(X) + mi(X):mi(B) + mi(X):mi(C)) +
  bf (B | mi() ~ 1) +
  bf (X | mi() ~ 1) +
  bf (C | mi() ~ 1) +
  set_rescor (FALSE)
mu_A = make_stancode(f,df) %>% strsplit("\n") %>% .[[1]] %>% grep ("mu_A\\[n",., value = TRUE)
mu_A %>% strsplit("\\+") %>% .[[1]] %>% paste (collapse = "\n") %>% cat()

paul-buerkner · 2024-03-04T07:56:35Z

I am sorry, that was a typo. I meant: what happens if all of the predictors are observed, i.e. if no mi() terms are used? Let me quickly check it.
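
For reference, the fully observed case can be illustrated without brms. The sketch below uses only base R formula handling, so it shows what the formula means under standard R semantics rather than what brms generates; the data frame d is made up for illustration.

```r
set.seed(1)
N <- 100
# Made-up, fully observed data (no NA values, no mi() terms)
d <- data.frame(B = rnorm(N), C = rnorm(N), X = rnorm(N))

f_obs <- ~ X + X:B + X:C

# Columns of the design matrix that base R builds for this formula
design_cols <- colnames(model.matrix(f_obs, data = d))
print(design_cols)

# Interaction order of each term: 1 = main effect, 2 = 2-way interaction
term_order <- attr(terms(f_obs), "order")
print(term_order)
```

No term of order 3 appears here, which is the behaviour one would also expect for the mi() version of the formula.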

paul-buerkner · 2024-03-04T08:00:16Z

Okay, I can reproduce the problem. Let me check what is going on.

paul-buerkner · 2024-03-04T08:32:24Z

Should now be fixed. Thanks again!

paul-buerkner added the bug label Mar 4, 2024

paul-buerkner added this to the 2.21.0 milestone Mar 4, 2024

paul-buerkner added a commit that referenced this issue Mar 4, 2024

fix issue #1608

9e7d825

paul-buerkner closed this as completed Mar 4, 2024
