
Test get_variance for zero-inflation models #893

Merged (22 commits) on Jun 20, 2024
Conversation

@strengejacke commented Jun 14, 2024

Related to #889

TL;DR

We have two questions:

  1. How should the distribution-specific variance be calculated for zero-inflated models? (Or at least: which approach is more accurate? See this comment.)
  2. Is it expected that the R2 is lower for zero-inflated models when the full model (zero-inflation and conditional parts) is taken into account, as opposed to only the conditional component of zero-inflated models? See the following example.

You can ignore the detailed changes in this PR; only the comments and examples shown here are relevant to the discussion!

What did this PR change?

In the current main branch, the dispersion parameter for both Poisson and zero-inflated (ZI) Poisson models is set to 1. This PR changes the behaviour for ZI Poisson, returning a different dispersion / distributional variance:

# For zero-inflated poisson models, the
# distributional variance is based on Zuur et al. 2012
# ----------------------------------------------
.variance_zip <- function(model, faminfo, family_var) {
  if (inherits(model, "glmmTMB")) {
    p <- stats::predict(model, type = "zprob")
    mu <- stats::predict(model, type = "conditional")
    pvar <- (1 - p) * (mu + p * mu^2)
  } else if (inherits(model, "MixMod")) {
    p <- stats::plogis(stats::predict(model, type_pred = "link", type = "zero_part"))
    mu <- suppressWarnings(stats::predict(model, type = "mean_subject"))
    pvar <- (1 - p) * (mu + p * mu^2)
  } else {
    pvar <- family_var
  }

  mean(pvar)
}
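The pointwise variance formula used in `.variance_zip()` can be illustrated outside R. The following Python sketch (a hypothetical standalone translation for illustration, not part of the PR) computes the Zuur et al. (2012) zero-inflated Poisson variance `(1 - p) * (mu + p * mu^2)` from vectors of predicted zero-inflation probabilities and conditional means, then averages it, mirroring the `mean(pvar)` step above:

```python
# Sketch of the distributional variance for a zero-inflated Poisson model,
# following Zuur et al. (2012): Var(Y) = (1 - p) * (mu + p * mu^2),
# where p is the zero-inflation probability and mu the conditional mean.
# Hypothetical Python translation, for illustration only.

def zip_distributional_variance(p, mu):
    """Average pointwise ZIP variance over all observations."""
    pointwise = [(1 - pi) * (mi + pi * mi**2) for pi, mi in zip(p, mu)]
    return sum(pointwise) / len(pointwise)

# Example with made-up predicted values:
p = [0.1, 0.3, 0.5]   # predicted zero-inflation probabilities
mu = [2.0, 1.5, 1.0]  # predicted conditional means
print(zip_distributional_variance(p, mu))
```

Note that with `p = 0` the formula reduces to `Var(Y) = mu`, i.e. the ordinary Poisson variance, which is why the plain-Poisson case keeps a dispersion of 1.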

Taking the following model as an example, this PR comes closer to the result of a Bayesian model than insight from the main branch.

m <- glmmTMB::glmmTMB(count ~ mined + cover + (1 + cover | site),
  ziformula = ~mined,
  family = poisson(), data = Salamanders
)
performance::r2_nakagawa(m)

Current main branch returns

# R2 for Mixed Models

  Conditional R2: 0.650
     Marginal R2: 0.525

This PR returns

# R2 for Mixed Models

  Conditional R2: 0.414
     Marginal R2: 0.334

brms returns (marginal R2 only)

library(brms)
m2 <- brms::brm(bf(count ~ mined + (1 | site), zi ~ mined),
  family = brms::zero_inflated_poisson(), data = Salamanders, backend = "rstan"
)
brms::bayes_R2(m2)
    Estimate  Est.Error      Q2.5     Q97.5
R2 0.1686378 0.01874732 0.1334217 0.2070608

Any ideas on how to validate these results? @bbolker @bwiernik?

@strengejacke commented

What R2 would be expected for zero-inflated models in comparison to non-zero-inflated ones? Would the R2 be higher or lower for a simple Poisson model as opposed to a zero-inflated Poisson, for example?

@strengejacke commented Jun 18, 2024

Revisiting this code, the question is whether the last line should use mean() or var():

# For zero-inflated poisson models, the
# distributional variance is based on Zuur et al. 2012
# ----------------------------------------------
.variance_zip <- function(model, faminfo, family_var) {
  if (inherits(model, "glmmTMB")) {
    p <- stats::predict(model, type = "zprob")
    mu <- stats::predict(model, type = "conditional")
    pvar <- (1 - p) * (mu + p * mu^2)
  }
  mean(pvar) # var() better here?
}
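To see what the two choices measure, here is a small Python sketch (with made-up pointwise variances, for illustration only): mean() of the pointwise variances gives the average distributional variance across observations, while var() gives the variability of those variances between observations, which is a different quantity on a different scale.

```python
# Compare mean() vs. var() applied to the pointwise ZIP variances.
# The values below are made-up pointwise variances for illustration.

pvar = [2.16, 1.5225, 0.75]

# mean(): average distributional variance across observations (a level)
mean_pvar = sum(pvar) / len(pvar)

# var(): sample variance of the pointwise variances (a spread, not a level)
m = mean_pvar
var_pvar = sum((x - m) ** 2 for x in pvar) / (len(pvar) - 1)

print(mean_pvar)  # how large the distributional variance is on average
print(var_pvar)   # how much it differs between observations
```

Since the Nakagawa R2 decomposition needs a single distributional-variance term on the same scale as the other variance components, the level (mean) rather than the spread (var) seems to be the relevant quantity here.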

Using mean()

# using mean()
data(fish, package = "insight")
model <- glmmTMB::glmmTMB(
  count ~ child + camper + (1 | persons),
  ziformula = ~ child + camper,
  data = fish,
  family = poisson()
)
insight::get_variance(model)
#> $var.fixed
#> [1] 1.141135
#> 
#> $var.random
#> [1] 1.133478
#> 
#> $var.residual
#> [1] 1.03198
#> 
#> $var.distribution
#> [1] 1.03198
#> 
#> $var.dispersion
#> [1] 0
#> 
#> $var.intercept
#>  persons 
#> 1.133478

performance::r2_nakagawa(model)
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.688
#>      Marginal R2: 0.345

Using var()

# using var()
data(fish, package = "insight")
model <- glmmTMB::glmmTMB(
  count ~ child + camper + (1 | persons),
  ziformula = ~ child + camper,
  data = fish,
  family = poisson()
)
insight::get_variance(model)
#> $var.fixed
#> [1] 1.141135
#> 
#> $var.random
#> [1] 1.133478
#> 
#> $var.residual
#> [1] 4.911707
#> 
#> $var.distribution
#> [1] 4.911707
#> 
#> $var.dispersion
#> [1] 0
#> 
#> $var.intercept
#>  persons 
#> 1.133478

performance::r2_nakagawa(model)
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.317
#>      Marginal R2: 0.159

The same model fitted with brms returns (using bayes_R2()) a conditional R2 of about 0.20 (and a marginal R2 below 0.1).

@strengejacke strengejacke added the 3 investigators ❔❓ Need to look further into this issue label Jun 18, 2024
@strengejacke commented

I think mean() is better. The important parameters are mu and p: averaging the predicted mu and p first and then applying the formula gives a result similar to taking mean() of the pointwise variances at the end.

p <- stats::predict(model, type = "zprob")
mu <- stats::predict(model, type = "conditional")
pvar <- (1 - p) * (mu + p * mu^2)
mean(pvar)

# similar to
p <- mean(stats::predict(model, type = "zprob"))
mu <- mean(stats::predict(model, type = "conditional"))
pvar <- (1 - p) * (mu + p * mu^2)
pvar
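This near-equivalence can be checked numerically. A Python sketch (with hypothetical predicted values, for illustration) compares the mean of the pointwise variances with the variance evaluated at the averaged predictions; the two agree exactly when the predictions are constant, and stay close when they vary moderately:

```python
# Compare "average the pointwise variances" with "apply the formula to
# averaged predictions" for the ZIP variance (1 - p) * (mu + p * mu^2).
# All input values are made up for illustration.

def pointwise_then_mean(p, mu):
    """Mean of (1 - p) * (mu + p * mu^2) over observations."""
    vals = [(1 - pi) * (mi + pi * mi**2) for pi, mi in zip(p, mu)]
    return sum(vals) / len(vals)

def mean_then_formula(p, mu):
    """Formula evaluated at the averaged p and mu."""
    pbar = sum(p) / len(p)
    mubar = sum(mu) / len(mu)
    return (1 - pbar) * (mubar + pbar * mubar**2)

p = [0.25, 0.30, 0.35]
mu = [1.8, 2.0, 2.2]
print(pointwise_then_mean(p, mu))
print(mean_then_formula(p, mu))
```

Because the formula is nonlinear in p and mu, the two quantities are only approximately equal for varying predictions; the gap grows with the spread of the predicted values.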

@strengejacke strengejacke merged commit 9759453 into main Jun 20, 2024
18 of 26 checks passed
@strengejacke strengejacke deleted the var_zi branch June 20, 2024 12:39