Support for car::lht and better term handling #1106

Merged: 8 commits, Jun 21, 2022


grantmcdermott
Contributor

(Mostly) fixes #1090.

I was very tempted to create a new, dedicated glance.anova method and move to it some columns that are currently returned as part of the tidy.anova object (most obviously, the "df.residual" and "rss" columns). But for now I'll stick to a more conservative fix that just supports car::linearHypothesis and returns more information about the model contrasts. OTOH, the fact that glance.anova currently returns a not very sensible data frame is probably something that needs to be fixed at some point.

Some examples:

devtools::load_all("~/Documents/Projects/broom")
#> ℹ Loading broom

library(broom)

a <- lm(mpg ~ wt + qsec + disp, mtcars)
b <- lm(mpg ~ wt + qsec, mtcars)

ab <- anova(a, b)

# new change: term column shows comparison models
tidy(ab)
#> # A tibble: 2 × 7
#>   term                   df.residual   rss    df    sumsq statistic p.value
#>   <chr>                        <dbl> <dbl> <dbl>    <dbl>     <dbl>   <dbl>
#> 1 mpg ~ wt + qsec + disp          28  195.    NA NA       NA         NA    
#> 2 mpg ~ wt + qsec                 29  195.    -1 -0.00102  0.000147   0.990

library(car)
#> Loading required package: carData
#> 
#> Attaching package: 'car'
#> The following object is masked from 'package:broom':
#> 
#>     recode
tidy(lht(a, "wt = disp"))
#> # A tibble: 1 × 10
#>   term     null.value estimate std.error statistic p.value df.residual   rss    df
#>   <chr>         <dbl>    <dbl>     <dbl>     <dbl>   <dbl>       <dbl> <dbl> <dbl>
#> 1 wt -disp          0    -5.03      1.23      16.6 3.39e-4          28  195.     1
#> # … with 1 more variable: sumsq <dbl>

Created on 2022-06-06 by the reprex package (v2.0.1)
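
For readers wondering where the model formulas in the new term column could come from: a minimal base R sketch, assuming they are recovered from the "heading" attribute that stats::anova() attaches when comparing lm fits. This is an illustration only and not necessarily how the PR implements it; `model_terms` is a hypothetical name.

```r
a <- lm(mpg ~ wt + qsec + disp, mtcars)
b <- lm(mpg ~ wt + qsec, mtcars)
ab <- anova(a, b)

# For a model comparison, the second element of the heading holds one
# "Model k: <formula>" line per model, separated by newlines.
hdr <- attr(ab, "heading")
model_lines <- unlist(strsplit(hdr[2], "\n", fixed = TRUE))

# Strip the "Model k: " prefix to leave just the formula text
# (hypothetical helper variable, not part of broom).
model_terms <- sub("^Model +[0-9]+: ", "", model_lines)
model_terms
```

Under this assumption, each element of `model_terms` lines up with one row of the anova table.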

@simonpcouch
Collaborator

simonpcouch commented Jun 7, 2022

@grantmcdermott, thank you for this PR! I apologize for not getting back to your original issue—as always, very well-considered and -argued.

I'm on board for this as well as the glance.anova addition. If you have the energy for it, feel free to add that to this PR or in a separate one, whichever you find more fitting.

I appreciate you linking out to the modelsummary issue; it's good to know where both of y'all are at in terms of how you think about broom's lifecycle and reliability.

I pushed one small change to pass the "hard" check. [edit: I had said I would request a change, but I see why your logic is the way it is now.🙂]

@simonpcouch
Collaborator

Feel free to ignore the pkgdown checks—that machinery needs updating.🌝🌚

@grantmcdermott
Contributor Author

grantmcdermott commented Jun 7, 2022

Thanks Si!

I can add the glance.anova method to this PR if you'd prefer. Here's what that would entail:

  • Dropping any "df.residual" and "rss" columns from the returned tidy.anova object.
  • Porting those columns to the glance method instead. (And maybe renaming "rss" to "deviance" to be consistent with other glance methods?)
  • In some default cases—e.g. anova(lm(mpg ~ wt, mtcars))—the glance method would just be an empty data frame, since the return object doesn't produce appropriate glance-like columns.
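
A rough base R sketch of the behaviour outlined in these bullets, assuming the glance-like values are taken from the last row of the anova table. `sketch_glance_anova()` is a hypothetical helper for illustration, not broom's actual glance.anova() implementation.

```r
# Hypothetical helper: NOT the real broom method, just the idea in the bullets.
sketch_glance_anova <- function(x) {
  # Drop the "anova" class so we work with a plain data frame
  x <- as.data.frame(x)

  # anova(lm, lm) tables carry "Res.Df" and "RSS"; a single-model
  # anova(lm) table does not, so there is nothing glance-like to return.
  if (!all(c("Res.Df", "RSS") %in% names(x))) {
    return(data.frame())
  }

  # Assumption: report the values for the last model in the comparison,
  # renaming "RSS" to "deviance" as proposed above.
  last <- x[nrow(x), , drop = FALSE]
  data.frame(deviance = last$RSS, df.residual = last$Res.Df)
}

a <- lm(mpg ~ wt + qsec + disp, mtcars)
b <- lm(mpg ~ wt + qsec, mtcars)

g <- sketch_glance_anova(anova(a, b))
empty <- sketch_glance_anova(anova(lm(mpg ~ wt, mtcars)))
```

Here `g` has one row with the two glance-style columns, while `empty` is a zero-column data frame, mirroring the "empty data frame" case in the last bullet.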

Lmk your thoughts and I'll try to submit ASAP.

@simonpcouch
Collaborator

> Porting those columns to the glance method instead. (And maybe renaming "rss" to "deviance" to be consistent with other glance methods?)

On board!

> In some default cases—e.g. anova(lm(mpg ~ wt, mtcars))—the glance method would just be an empty data frame, since the return object doesn't produce appropriate glance-like columns.

On board!

> Dropping any "df.residual" and "rss" columns from the returned tidy.anova object.

I think I'd hold off on this. These columns have been around, at least by position, since some of the first commits to broom. I see the argument for why they shouldn't be there, but I'd imagine this would affect a good few reverse dependencies.

@grantmcdermott
Contributor Author

Just to let you know I haven't forgotten this... need to get grading done first, though :'-|

@simonpcouch
Collaborator

@grantmcdermott No rush. :)

@grantmcdermott
Contributor Author

Thanks for bearing with me @simonpcouch. I think these last few changes should do it.

I added the following note to the glance.anova() help documentation.

#' Note that the output of `glance.anova()` will vary depending on the initializing 
#' anova call. In some cases, it will just return an empty data frame. In other 
#' cases, `glance.anova()` may return columns that are also common to
#' `tidy.anova()`. This is partly to preserve backwards compatibility with early
#' versions of `broom`, but also because the underlying anova model yields 
#' components that could reasonably be interpreted as goodness-of-fit summaries
#' too.

Example:

devtools::load_all("~/Documents/Projects/broom")
#> ℹ Loading broom

a <- lm(mpg ~ wt + qsec + disp, mtcars)
b <- lm(mpg ~ wt + qsec, mtcars)

ab <- anova(a, b)

glance(ab)
#> # A tibble: 1 × 2
#>   deviance df.residual
#>      <dbl>       <dbl>
#> 1     195.          29

## Example where glance returns an empty DF
glance(anova(a))
#> # A tibble: 0 × 0

Created on 2022-06-13 by the reprex package (v2.0.1)

@simonpcouch
Collaborator

Awesome—I'm away from work for the week but will give this a more thorough look + merge if things look good next week. :)

@simonpcouch
Collaborator

This looks great! No edits from me—will just update NEWS.

Was a little bit nervous about the tidy.anova column repositioning and renaming of res.df, so I ran some revdepchecks:

We checked 199 reverse dependencies, comparing R CMD check results across CRAN and dev versions of this package.

 * We saw 0 new problems
 * We failed to check 0 packages

Woop woop! This may be breaking for some non-testable dependencies, but this feels like a change worth making.

@simonpcouch simonpcouch merged commit 358df2d into tidymodels:main Jun 21, 2022
@github-actions

github-actions bot commented Jul 6, 2022

This pull request has been automatically locked. If you believe the issue addressed here persists, please file a new PR (with a reprex: https://reprex.tidyverse.org) and link to this one.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 6, 2022
Successfully merging this pull request may close these issues.

Improve anova tidiers
2 participants