Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behaviour in augment() for person residuals. #1147

Closed
egosv opened this issue Mar 7, 2023 · 3 comments
Closed

Unexpected behaviour in augment() for person residuals. #1147

egosv opened this issue Mar 7, 2023 · 3 comments

Comments

@egosv
Copy link

egosv commented Mar 7, 2023

The problem

I was trying the augment(fit) function using type.residuals = "pearson" and found the output of column .std.resid to be different to that of rstandard(fit, type = "pearson").

After some digging, it seems the issue is introduced by function add_hat_sigma_cols

df$.std.resid[nonzero_idx] <- rstandard(x, infl = infl) %>% unname()
which essentially overwrites the previously (correctly) calculated values.

This issue is inconsequential for family = gaussian but has some implications for any other family member and that's probably why simple testing may have passed.

So, in essence, column .std.resid always contains standardized deviance residuals irrespective of the argument type.residuals, which I believe is not the expected behaviour.

Reproducible example

## an example with offsets from Venables & Ripley (2002, p.189)
utils::data(anorexia, package = "MASS")

anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                family = gaussian, data = anorexia)

rs1 = unlist(broom::augment(anorex.1, type.residuals = "pearson")[,c(".std.resid")])
rs2 = rstandard(anorex.1, type = "pearson")
d1 = rstandard(anorex.1, type = "deviance")

all.equal(rs1, rs2, check.attributes = FALSE)
#> [1] TRUE
all.equal(rs1, d1, check.attributes = FALSE)
#> [1] TRUE

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())

rs1 = unlist(broom::augment(glm.D93, type.residuals = "pearson")[,c(".std.resid")])
rs2 = rstandard(glm.D93, type = "pearson")
d1 = rstandard(glm.D93, type = "deviance")

all.equal(rs1, rs2, check.attributes = FALSE)
#> [1] "Mean relative difference: 0.03595586"
all.equal(rs1, d1, check.attributes = FALSE)
#> [1] TRUE

# A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
clotting <- data.frame(
  u = c(5,10,15,20,30,40,60,80,100),
  lot1 = c(118,58,42,35,27,25,21,19,18),
  lot2 = c(69,35,26,21,18,16,13,12,12))
fitGamma <- glm(lot1 ~ log(u), data = clotting, family = Gamma)

rs1 = unlist(broom::augment(fitGamma, type.residuals = "pearson")[,c(".std.resid")])
rs2 = rstandard(fitGamma, type = "pearson")
d1 = rstandard(fitGamma, type = "deviance")

all.equal(rs1, rs2, check.attributes = FALSE)
#> [1] "Mean relative difference: 0.01597699"
all.equal(rs1, d1, check.attributes = FALSE)
#> [1] TRUE

Created on 2023-03-07 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       macOS Monterey 12.6.3
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Toronto
#>  date     2023-03-07
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.1.0)
#>  backports     1.4.0   2021-11-23 [1] CRAN (R 4.1.2)
#>  broom         0.7.12  2022-01-28 [1] CRAN (R 4.1.1)
#>  cli           3.4.1   2022-09-23 [1] CRAN (R 4.1.1)
#>  DBI           1.1.3   2022-06-18 [1] CRAN (R 4.1.1)
#>  digest        0.6.28  2021-09-23 [1] CRAN (R 4.1.1)
#>  dplyr         1.0.10  2022-09-01 [1] CRAN (R 4.1.1)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi         0.5.0   2021-05-25 [1] CRAN (R 4.1.0)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.1.1)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.1.1)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.1.1)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.1.0)
#>  htmltools     0.5.3   2022-07-18 [1] CRAN (R 4.1.1)
#>  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.1)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.1.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.1.1)
#>  pillar        1.8.1   2022-08-19 [1] CRAN (R 4.1.1)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.1)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.1.1)
#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.1.1)
#>  rmarkdown     2.17    2022-10-07 [1] CRAN (R 4.1.1)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.1)
#>  stringi       1.7.5   2021-10-04 [1] CRAN (R 4.1.1)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.1)
#>  tibble        3.1.8   2022-07-22 [1] CRAN (R 4.1.1)
#>  tidyr         1.1.4   2021-09-27 [1] CRAN (R 4.1.1)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.1.1)
#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
#>  vctrs         0.5.0   2022-10-22 [1] CRAN (R 4.1.1)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.1.1)
#>  xfun          0.34    2022-10-18 [1] CRAN (R 4.1.1)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.1.0)
#> 
#> 
#> ──────────────────────────────────────────────────────────────────────────────
@simonpcouch
Copy link
Collaborator

Thanks for the detailed reprex! Agreed that this is an issue. I missed this in #965.

Rather than touching that helper, I may just overwrite its std.resid results in augment.glm()augment.rlm() is the other function that uses this helper and doesn't take a type.residuals argument:

augment.rlm <- function(x, data = model.frame(x), newdata = NULL,
se_fit = FALSE, ...) {
df <- augment_newdata(x, data, newdata, se_fit)
if (is.null(newdata)) {
tryCatch(
{
infl <- influence(x, do.coef = FALSE)
df <- add_hat_sigma_cols(df, x, infl)

@egosv
Copy link
Author

egosv commented Mar 8, 2023

Thanks, yes, that will likely fix this.

@github-actions
Copy link

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants