posterior predictive check for binomial glm with matrix response #644

Closed
richardjtelford opened this issue Oct 28, 2023 · 5 comments · Fixed by #645
@richardjtelford

I've noticed that the posterior predictive check for a binomial glm is different depending on whether the response variable is a vector of proportions or a matrix of successes and failures.

Minimal reprex

tot <- rep(10, 100)
suc <- rbinom(100, prob = 0.9, size = tot)

df <- data.frame(tot, suc)

df$prop <- suc/tot

mod1 <- glm(cbind(suc, tot - suc) ~ 1,
            family = binomial,
            data = df)

# performance::check_model(mod1)
performance::check_posterior_predictions(mod1)

mod2 <- glm(prop ~ 1,
            family = binomial,
            data = df,
            weights = tot)

performance::check_posterior_predictions(mod2)

I was, perhaps mistakenly, expecting these plots to look the same.

In performance::pp_check.glm, the result is calculated as

sapply(matrix_sim, function(i) i[, 1]/i[, 2], simplify = TRUE)

This generates Inf values wherever the simulated number of failures is zero.

Should this code not be calculating the proportion instead, with something like

sapply(matrix_sim, function(i) i[, 1]/rowSums(i), simplify = TRUE)
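
To see the difference concretely, here is a minimal sketch with a hypothetical simulated draw (not the actual internals of pp_check.glm): dividing successes by failures blows up whenever the failure count is zero, whereas dividing by the row sums always gives a proportion between 0 and 1.

i <- cbind(successes = c(9, 10, 8), failures = c(1, 0, 2))

i[, 1] / i[, 2]      # current code:  9.0 Inf 4.0 -- Inf where failures == 0
i[, 1] / rowSums(i)  # suggested fix: 0.9 1.0 0.8 -- proportion of successes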
@strengejacke

This comment was marked as outdated.

@richardjtelford
Author

Thanks for having a look at this. It seems my suggestion was only a partial fix. Looking again, line 288

  out$y <- response[, 1] / response[, 2]

would also need to become

  out$y <- response[, 1] / rowSums(response)

so that the observed response is calculated in the same way as the simulations.
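
As a small illustration (assuming response is the cbind(successes, failures) matrix from the model, as in the reprex above), the two denominators give quite different observed values:

response <- cbind(suc, tot - suc)
response[, 1] / response[, 2]      # successes / failures: Inf wherever suc == tot
response[, 1] / rowSums(response)  # successes / trials: identical to df$prop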

@strengejacke

This comment was marked as outdated.

@richardjtelford
Author

Thank you. That looks like what I would expect.

It would be nice to improve the x-axis label as well, but I'm not sure what would work and still be easy to implement.

@strengejacke
Member

strengejacke commented Oct 29, 2023

After fixing a bug in insight, this is how it looks with the current implementation and with your suggested fix.

set.seed(1)
tot <- rep(10, 100)
suc <- rbinom(100, prob = 0.9, size = tot)
df <- data.frame(tot, suc)
df$prop <- suc / tot

mod1 <- glm(cbind(suc, tot - suc) ~ 1,
  family = binomial,
  data = df
)

mod2 <- glm(prop ~ 1,
  family = binomial,
  data = df,
  weights = tot
)

mod3 <- glm(cbind(suc, tot) ~ 1,
  family = binomial,
  data = df
)

mod4 <- glm(am ~ 1,
  family = binomial,
  data = mtcars
)
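
Each model can then be checked the same way as in the reprex above (a sketch; I'm assuming this is how the plots below were generated):

performance::check_posterior_predictions(mod1)
performance::check_posterior_predictions(mod2)
performance::check_posterior_predictions(mod3)
performance::check_posterior_predictions(mod4)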

[Posterior predictive check plots for mod1-mod4, each shown for the current implementation and for the new implementation with the suggested fix]
