Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

draw.gam errors with a valid HGAM and grouped_by = TRUE & parametric = TRUE #219

Closed
gavinsimpson opened this issue Mar 23, 2023 · 1 comment
Assignees
Labels
bug Something isn't working
Milestone

Comments

@gavinsimpson
Copy link
Owner

From the rate hormone example in my GAM course, m3_hgam fails:

pkgs <- c("mgcv", "lme4", "ggplot2", "readr", "dplyr", "forcats", "tidyr",
          "gratia")

# load data
vapply(pkgs, library, logical(1), character.only = TRUE, logical.return = TRUE,
       quietly = TRUE)

rats_url <- "https://bit.ly/rat-hormone"
rats <- read_table(rats_url, col_types = "dddddddddddd-")
# ignore the warning - it"s due to trailing white space at the ends of each
#   row in the file

rats <- rats %>%
    mutate(treatment = fct_recode(factor(group, levels = c(1, 2, 3)),
                                  Low = "1",
                                  High = "2",
                                  Control = "3"),
           treatment = fct_relevel(treatment, c("Control", "Low", "High")),
           subject = factor(subject))

draw() throws an error with:

m3_hgam <- gam(response ~ treatment +
                 s(time, by = treatment, k = K) +
                 s(subject, bs = "re"),
               data = rats, method = "REML")
draw(m3_hgam, residuals = TRUE, rug = FALSE, grouped_by = TRUE, parametric = TRUE)

The error is

> draw(m3_hgam, residuals = TRUE, rug = FALSE, grouped_by = TRUE, parametric = TRUE)
Error in `map()`:
ℹ In index: 1.
Caused by error in `bind_cols()`:
! Can't recycle `level` (size 350) to match `partial` (size 252).
Run `rlang::last_error()` to see where the error occurred.
@gavinsimpson gavinsimpson added the bug Something isn't working label Mar 23, 2023
@gavinsimpson gavinsimpson self-assigned this Mar 23, 2023
@gavinsimpson gavinsimpson added this to the 0.9 milestone Mar 23, 2023
@gavinsimpson
Copy link
Owner Author

The issue here is that the data contains NAs and hence the way parametric_effects() works to reconstruct the fitting data (along the lines of how termplot() does it) produces original data that contains the NA values. However, because the default in mgcv is to use na.action = na.omit, we have the observed incompatibility between the sizes of the variables being combined. If we fit the model with na.action = na.exclude, the issue goes away.

Options to consider to address this:

  1. could run na.omit or complete.cases on the reconstructed data before predict()?

    For factor parametric effects this is fine, but what about continuous terms? Are the reconstructed data the full thing passed to the data argument? In which case we only want to run those on the variables used in fitting the model.

  2. could run distinct() on the reconstructed data and pass that to predict()?

  3. could provide a better error message to suggest the source of the problem and suggest refitting with na.action = na.exclude?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant