Added adj.r.squared and npar as outputs of glance.gam for mgcv::gam #1172

Merged
simonpcouch merged 4 commits into tidymodels:main from tripartio:mgcv
Sep 6, 2023

@tripartio
Contributor

When using glance.gam for mgcv::gam, I was unpleasantly surprised to find that the output does not include adjusted R-squared. This is one of the most important model evaluation statistics for statistical inference. My colleagues and I need it, and most other users of glance.gam probably do as well.

On examining the code, it seems that the only outputs provided by the current glance method are those directly available from the gam model object. However, summary.gam provides some additional valuable outputs. So, I have modified the code to call summary(x) on the model object (x) and then add the other useful outputs (at least those that are scalar numeric values).
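
As a minimal sketch of the idea described above (not the code merged in this PR), the scalar values can be read from the list returned by summary() on a fitted gam. It assumes that the r.sq and np components of mgcv's summary.gam hold the adjusted R-squared and the number of model coefficients; the data frame name extra is hypothetical and used only for illustration.

```r
library(mgcv)

# Simulate data and fit an additive model with four smooth terms.
set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

# summary.gam returns a list whose components include r.sq and np.
s <- summary(b)

# Collect the two scalar outputs into a one-row data frame
# (`extra` is a hypothetical name used only for this illustration).
extra <- data.frame(
  adj.r.squared = s$r.sq,  # adjusted R-squared
  npar = s$np              # number of model coefficients
)
print(extra)
```

A glance method could bind columns like these onto its existing one-row output, at the cost of one extra summary() call per model.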

That said, when examining modeltests::column_glossary, I could not find most of the additional outputs. So, I have added only the two outputs that are available in modeltests::column_glossary (adj.r.squared and npar). The other outputs are listed in the code but commented out, so that the code can easily be updated if modeltests::column_glossary supports them in the future.

I have tested the updates with the unit tests in tests/testthat/test-mgcv.R; my modifications pass the tests. I hope you can accept this pull request.

@simonpcouch
Collaborator

Thanks for the PR! This looks solid.

Wanted to make sure we wouldn't be increasing the time-to-tidy too drastically by introducing the additional summary() call, but it looks like that's not an issue:

library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.9-0. For overview type 'help("mgcv-package")'.
library(broom)

set.seed(2) ## simulate some data... 
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
#> Gu & Wahba 4 term additive model

b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

bench::mark(
  gam = gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat),
  tidy = tidy(b),
  summary = summary(b),
  check = FALSE,
  relative = TRUE
)
#> # A tibble: 3 × 6
#>   expression   min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <dbl>  <dbl>     <dbl>     <dbl>    <dbl>
#> 1 gam        36.1   36.1       1         34.9     1   
#> 2 tidy        4.62   4.57      7.78      32.8     2.78
#> 3 summary     1      1        35.8        1       2.67

Created on 2023-09-06 with reprex v2.0.2

Pushing some small changes before merging.

@simonpcouch simonpcouch merged commit cc679ed into tidymodels:main Sep 6, 2023
@github-actions

This pull request has been automatically locked. If you believe the issue addressed here persists, please file a new PR (with a reprex: https://reprex.tidyverse.org) and link to this one.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 21, 2023