penalized GAMs? #704

Open
bbolker opened this issue Jun 2, 2021 · 12 comments

@bbolker
Contributor

bbolker commented Jun 2, 2021

Maybe we can add syntax to allow mgcv-type smooths (as brms does, or like gamm4 - but hopefully less clunky than gamm4! https://gist.github.com/dsjohnson/9d66aa47557ad56438aaf75dd25910ea)

@skaug
Contributor

skaug commented Jun 3, 2021

As shown in the link you include, both the design matrix X and the penalty matrix S can easily be obtained from mgcv. I have used this approach in the past with plain TMB, and the fitted model matches the mgcv output closely. I think the mgcv package will provide a lot of infrastructure.
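
For reference, the equivalence this approach relies on can be sketched as follows. This assumes the usual textbook convention for a single smooth with design matrix $X$, penalty matrix $S$, smoothing parameter $\lambda$ and scale parameter $\phi$. The symbols $X_f$, $X_r$, $\beta_f$, $b$ and $\sigma_b$ are notation introduced here for illustration. mgcv may rescale $S$ internally, so the correspondence should be read as holding up to that scaling.

```latex
% Penalized log-likelihood for a single smooth with coefficients \beta
\ell_p(\beta) = \ell(\beta) - \frac{\lambda}{2\phi}\, \beta^\top S \beta

% Reparameterize so that the penalty acts on b only, with an identity penalty matrix
X\beta = X_f \beta_f + X_r b, \qquad \beta^\top S \beta = b^\top b

% The penalty term is then the log-density kernel of a Gaussian random effect
b \sim \mathcal{N}\!\left(0,\ \sigma_b^2 I\right), \qquad \sigma_b^2 = \frac{\phi}{\lambda}
```

Under these assumptions, the unpenalized part $X_f$ enters as fixed effects and $X_r$ as a random-effects design matrix with a single shared variance.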

@bbolker
Contributor Author

bbolker commented Jun 3, 2021

I agree that the basic idea is simple; designing things so that the integration into glmmTMB is simple and transparent is the harder part (we could start by adding this example to a vignette, with permission ... then making a helper function to automate some parts of it ...)

@skaug
Contributor

skaug commented Jun 4, 2021

Sounds like a good idea. We could even write a small wrapper function that calculates Xf and Xr (and hides details that the user does not need to understand). The most difficult thing for the user will be the map argument, especially when there are other random terms in the linear predictor. A diagonal covariance structure should be used for spline random effects.

I assume the order of RE-terms in the linear predictor determines the order of terms in theta. A list
of different cases (different number of splines, splines in combinations with ordinary random effects)
is the best way to explain how to set up the map.

The vignette should also contain a line of R code showing how to turn theta
into the mgcv penalty parameter.
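
A hedged sketch of that conversion follows. It assumes that the spline term's single free theta is stored as a log standard deviation (the parameterization glmmTMB uses for its diagonal covariance structures), and that the penalty has been rescaled to the identity in the random-effects parameterization. Here $\phi$ is the scale (dispersion) parameter: the residual variance for a Gaussian model, and fixed at 1 for families such as Poisson or binomial. $\lambda$ is the smoothing parameter. mgcv may apply its own rescaling to the penalty matrix, so the result could differ from mgcv's reported value by a constant factor.

```latex
\sigma_b = \exp(\theta), \qquad
\lambda = \frac{\phi}{\sigma_b^{2}} = \phi \, \exp(-2\theta)
```

In R this amounts to `phi * exp(-2 * theta)` once `theta` and `phi` have been extracted from the fit.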

I hope/believe that AIC values calculated by glmmTMB will agree with stuff returned by mgcv,
but this should be looked into.

@bbolker
Contributor Author

bbolker commented Jun 4, 2021

new improved example from Devin Johnson: https://gist.github.com/dsjohnson/9d66aa47557ad56438aaf75dd25910ea

@skaug
Contributor

skaug commented Jun 5, 2021

That definitely makes the code easier.

But how to handle multiple splines (different covariates), and splines in combination with other fixed and random effects?
My feeling here is that it is best to use Devin's code to obtain Xr and Xf for each spline, and then insert these explicitly into the glmmTMB linear predictor. The user must then remove the intercept column of the Xf's.

The real challenge is how to set up the map argument in these more complicated situations. One can either try to write a function that automatically figures out "map" based on the formula for the linear predictor, or one can list a lot of special cases in the vignette and let the user figure it out.

@skaug
Contributor

skaug commented Jun 5, 2021

... or, if a new structured covariance matrix type could be introduced in glmmTMB:

diaghomosced (diagonal, homogeneous variance)

it would remove the need to use "map" entirely. It would also have the benefit of giving a nicer summary, as the output from this term should be only a single line, which would then represent the entire spline term.
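
The difference between the two routes can be written out as below. This is a sketch, assuming a log-standard-deviation parameterization of theta and a spline term with $k$ random-effect columns; the subscripts "diag" and "hom" are labels used here for illustration only.

```latex
% Current workaround: diagonal structure with k log-SD parameters, tied together via map
\Sigma_{\text{diag}} = \operatorname{diag}\!\left(e^{2\theta_1}, \dots, e^{2\theta_k}\right),
\qquad \theta_1 = \theta_2 = \dots = \theta_k \ \text{(imposed through map)}

% Proposed structure: a single parameter, so no map is needed
\Sigma_{\text{hom}} = e^{2\theta}\, I_k
```

Both describe the same covariance matrix; the second simply makes the single shared variance part of the structure rather than a constraint supplied by the user.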

@bbolker
Contributor Author

bbolker commented Jun 7, 2021

a diagonal/homogeneous variance type is a good idea in any case, and should be super-easy to code. For example, MCMCglmm has 'idv' (diag, constant-variance), 'idh' (diag, heterogeneous variance), 'us' (unstructured/general pos-def) (and cor[] and ante[]). diaghomosced feels too long and clunky though - cdiag for "constant diagonal"? (too bad "homoscedastic" and "heteroscedastic" start with the same letter ...)

@skaug
Contributor

skaug commented Jun 8, 2021

Yes, a short name is better. I had in mind 'dhom' for "diagonal homogeneous".

@jclaesen

Is the diagonal homogeneous variance option already available? I would like to try it in combination with splines for multiple covariates for beta-binomial distributed data.

@skaug
Contributor

skaug commented May 18, 2022

No, it has not been implemented yet (as far as I know).

@jclaesen

Pity, it would be very useful. But thanks for the answer.

@bbolker bbolker mentioned this issue Feb 19, 2023
@bbolker
Contributor Author

bbolker commented Feb 20, 2023

I implemented the homogeneous diagonal stuff. The next issue is that, if we use the mgcv stuff and do everything internally/magically for the user, we will have to include a separate chunk of code to transform the estimated b values (which are in a transformed space) back into the 'prediction' b values.
