smooth_samples() throws error messages for gam models with random effects #121

isabellaghement · 2021-12-03T00:37:25Z

Hi Gavin,

I'm trying to use gratia's smooth_samples() function as part of a "by hand" derivatives calculation but am getting error messages for gam models which include random effects.

Using the dataset normtimedata.csv available at https://uofi.app.box.com/s/jtzww0bmgo9216qgpxu6d7avpaa15sey via the Download button, I put together some R code which illustrates one type of error message that I get. The models used in this code were inspired by this post: https://marissabarlaz.github.io/portfolio/gam/.

library(readr)
library(dplyr)
library(magrittr)
library(gratia)
library(mgcv)

setwd("G:\\")

normtimeBP <- read_csv("normtimedata.csv",
                       col_types = 
                         c(Speaker = col_character(),
                          Word = col_character(),
                          Vowel = col_character(),
                          Nasality = col_character(),
                          RepNo = col_double(),
                          NormTime = col_double(),
                          Type = col_character(),
                          label = col_character(),
                          Time = col_double(),
                          F1 = col_double(),
                          F2 = col_double(),
                          F3 = col_double(),
                          F2_3 = col_double()))

BPa = normtimeBP %>% 
      filter(Vowel=="a", 
         Speaker %in% c("BP02", "BP04", "BP05", "BP10", "BP14"))

BPa$Speaker <- factor(BPa$Speaker)
BPa$Nasality <- factor(BPa$Nasality)


model1 <- gam(F1 ~ Nasality + s(NormTime, by = Nasality), 
             data = BPa, 
             method = "REML", 
             family = gaussian())


model2a <- gam(F1 ~ Nasality + s(NormTime, by = Nasality) + 
                    s(Speaker, bs = "re"), 
             data = BPa, 
             method = "REML", 
             family = gaussian())


model2b <- gam(F1 ~ Nasality + s(NormTime, by = Nasality)+ 
                    s(Speaker, Nasality, bs = "re"),
              data = BPa, 
              method = "REML", 
              family = gaussian())

range(BPa$F1)

BPa$F1bin <- ifelse(BPa$F1 > median(BPa$F1, na.rm = TRUE), 1, 0)

model3 <- gam(F1bin ~ Nasality + s(NormTime, by = Nasality) + s(Speaker, bs = "re"), 
              data = BPa, 
              method = "REML", 
              family = binomial(link = "logit"))

model1_smooth_samples <- 
  smooth_samples(model1, 
                 select = "s(NormTime)", 
                 partial_match = TRUE,
                 n = 5, seed = 42)

model2a_smooth_samples <- 
  smooth_samples(model2a, 
                 select = "s(NormTime)", 
                 partial_match = TRUE,
                 n = 5, seed = 42)

model2b_smooth_samples <- 
  smooth_samples(model2b, 
                 select = "s(NormTime)", 
                 partial_match = TRUE,
                 n = 5, seed = 42)

model3_smooth_samples <- 
  smooth_samples(model3, 
                 select = "s(NormTime)", 
                 partial_match = TRUE,
                 n = 5, seed = 42)

The error message I get from R for each of the above gam models expect for the one without random effects is:

Error in names(summ_names) <- paste0(".x", seq_along(summ_names)) :
'names' attribute [1] must be the same length as the vector [0]

For my own dataset, I actually get a different error message, which I could not reproduce with the examples listed above:

Error: New columns must be compatible with .data.
x New column has 15000 rows.
i .data has 7000 rows.

But it seems to me that smooth_samples() is struggles with gam models that include random effects and, depending on the model specification, it throws different types of error messages.

If you could look into this and let me know if there is a fix, that would be greatly appreciated.

Thanks very much!

Isabella

The text was updated successfully, but these errors were encountered:

gavinsimpson · 2021-12-03T06:03:10Z

The issue with these data is annoyingly simple (and you can avoid the bug by specifying the four smooths arising from the factor by term as the only smooths to evaluate). The function is working; all the hard work gets done, but it falls over when adding information about the variables involved in the smooth because in handling factor-by terms I am ignoring factors at some point in the code and for random effects the factor is the data hence I'm also ignoring the very thing I need.

Now I just need to figure out what the right fix is...

rroyaute · 2021-12-20T10:59:20Z

Thanks for the clarification Gavin,
I am still struggling with some aspects of selecting the proper term with smooth_samples(). I hope you may clarify further. Here's some example code following the "factor by" example from the documentation:

library(mgcv); library(gratia)

set.seed(2)
dat <- gamSim(4)

m2 <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat)

sms <- smooth_samples(m2, n = 5, seed = 42)
draw(sms) # Works!

sms <- smooth_samples(m2, n = 5, seed = 42, term = "s(x2)")
draw(sms)
Returns: "Error in which_smooths.bam(model, term) :
None of the terms matched a smooth."

smooths(m2) # Select smooths specific to each level of "fac"
sms.1 <- smooth_samples(m2, n = 5, seed = 42, term = "s(x2):fac1")
draw(sms.1) # Works!

sms.2 <- smooth_samples(m2, n = 5, seed = 42, term = "s(x2):fac2")
draw(sms.2) # Still returns fac1

sms.3 <- smooth_samples(m2, n = 5, seed = 42, term = "s(x2):fac2")
draw(sms.3) # Still returns fac1

I hope I didn't make any mistake here, but it looks like the lines above only works for the first level of "fac". I am not able to select and plot the next levels however.

Hope this can be solved easily, I really like the package otherwise and it's really improved my understanding of these complicated classes of model!

Thanks!

Raphaël

gavinsimpson · 2021-12-20T12:27:27Z

Those last two should work; that they don't is a bug. Looks like I'm not pulling out the correct factor level or label somewhere.

I will take a look at this and @isabellaghement's problem in the next few days or over Christmas when I have some time away from work work.

… is now different; add tests for main issue in #121

gavinsimpson · 2022-01-15T15:57:18Z

@rroyaute I have fixed the problem you reported here - that was caused by a silly thinko in some of the code.

gavinsimpson · 2022-01-15T15:58:30Z

@isabellaghement I have now fixed the main issue you reported; I have decided to just ignore random effect smooths for now as they aren't really what I had in mind for this function and would be tricky to fit into the current paradigm.

gavinsimpson · 2022-01-15T16:24:29Z

@isabellaghement The example data from your issue are all working with the GitHub version of {gratia} now. You mentioned another issue with your real data but that isn't raised in the reproducible example you provided and from the error message you mention seems unrelated to the other error messages.

I'm going to close this one as fixed now, but can you install from GitHub and try your actual data example again and file a new issue if the other problem persists?

gavinsimpson self-assigned this Dec 3, 2021

gavinsimpson added the bug Something isn't working label Dec 3, 2021

Aariq mentioned this issue Jan 10, 2022

smooth_samples() errors when models have both a 1D and 2D smooth #126

Closed

gavinsimpson added a commit that referenced this issue Jan 15, 2022

fix wrong factor level problem aspect of #121 reported by Raphaël

62ab73b

gavinsimpson added a commit that referenced this issue Jan 15, 2022

GAM with ranefs is now in test setup; update snapshots as random data…

15cdb86

… is now different; add tests for main issue in #121

gavinsimpson added this to the 0.7 milestone Jan 15, 2022

gavinsimpson closed this as completed Jan 15, 2022

gavinsimpson added a commit that referenced this issue Jan 15, 2022

document fixes to smooth_samples from #121

fef4f2c

gavinsimpson added a commit that referenced this issue Jan 28, 2022

document changes to implement #71 and part of #121: closes #71

dff75a3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

smooth_samples() throws error messages for gam models with random effects #121

smooth_samples() throws error messages for gam models with random effects #121

isabellaghement commented Dec 3, 2021

gavinsimpson commented Dec 3, 2021 •

edited

rroyaute commented Dec 20, 2021 •

edited

gavinsimpson commented Dec 20, 2021 •

edited

gavinsimpson commented Jan 15, 2022

gavinsimpson commented Jan 15, 2022

gavinsimpson commented Jan 15, 2022

smooth_samples() throws error messages for gam models with random effects #121

smooth_samples() throws error messages for gam models with random effects #121

Comments

isabellaghement commented Dec 3, 2021

gavinsimpson commented Dec 3, 2021 • edited

rroyaute commented Dec 20, 2021 • edited

gavinsimpson commented Dec 20, 2021 • edited

gavinsimpson commented Jan 15, 2022

gavinsimpson commented Jan 15, 2022

gavinsimpson commented Jan 15, 2022

gavinsimpson commented Dec 3, 2021 •

edited

rroyaute commented Dec 20, 2021 •

edited

gavinsimpson commented Dec 20, 2021 •

edited