no .resid column with augment.lm and newdata #124

Closed
cderv opened this issue May 6, 2016 · 15 comments

@cderv

cderv commented May 6, 2016

The help page says that when the newdata argument is supplied, three columns are added to the dataset:

When newdata is supplied, augment.lm returns one row for each observation, with three columns added to the new data:

  • .fitted Fitted values of model
  • .se.fit Standard errors of fitted values
  • .resid Residuals of fitted values on the new data

However, only two are there: no .resid column is calculated.

library(dplyr)
library(broom)

train <- mtcars %>%
  sample_frac(0.7)
test <- setdiff(mtcars, train)


model <- lm(mpg ~ cyl + disp, data= train)

augment(model, newdata = test) %>% 
  names()
#>  [1] "mpg"     "cyl"     "disp"    "hp"      "drat"    "wt"      "qsec"   
#>  [8] "vs"      "am"      "gear"    "carb"    ".fitted" ".se.fit"

So either we add a .resid column, which implies some calculation, or at least the help page should be modified.

@yosuke-yasuda

I also encountered this.

@cderv
Author

cderv commented Jan 6, 2017

I could make a PR on this if it could help...

@simonthelwall

simonthelwall commented Jan 17, 2017

same issue here
broom_0.4.1
R version 3.3.1 (2016-06-21)

@topepo
Member

topepo commented Apr 27, 2017

Same thing here.

One issue is that newdata may not have the outcome variable in it. For lm (and most other single numeric outcome models), this can be used to determine whether the needed variable is present:

lm_response_var <- function(x) {
  y_index <- attr(x$terms, "response")
  if(y_index == 0) return(NA)
  var_list <- attr(x$terms, "variables")
  as.character(var_list[y_index + 1])
}

So inside augment.lm somewhere, there could be something like

# Both conditions must hold: the model has a response AND newdata contains it
if(!is.na(lm_response_var(x)) && lm_response_var(x) %in% colnames(newdata))
   res$.resid <- blah blah blah

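A minimal base-R sketch of the idea above, assuming the response is an untransformed column that may or may not be present in newdata. The helper name add_newdata_resid is hypothetical and is not part of broom; it only illustrates the check-then-subtract logic, not how augment.lm is actually implemented.

```r
# Helper from the comment above: name of the response term, or NA if none.
lm_response_var <- function(x) {
  y_index <- attr(x$terms, "response")
  if (y_index == 0) return(NA_character_)
  var_list <- attr(x$terms, "variables")
  as.character(var_list[y_index + 1])
}

# Hypothetical helper (not broom's implementation): add .fitted and, only
# when the response column exists in newdata, .resid = observed - fitted.
add_newdata_resid <- function(x, newdata) {
  res <- newdata
  res$.fitted <- unname(predict(x, newdata = newdata))
  y_name <- lm_response_var(x)
  if (!is.na(y_name) && y_name %in% colnames(newdata)) {
    res$.resid <- newdata[[y_name]] - res$.fitted
  }
  res
}

# Train/test split in base R (the seed and split size are arbitrary choices).
set.seed(1)
idx   <- sample(nrow(mtcars), size = 22)
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]
model <- lm(mpg ~ cyl + disp, data = train)

with_y    <- add_newdata_resid(model, test)                      # has mpg
without_y <- add_newdata_resid(model, test[, c("cyl", "disp")])  # no mpg

names(with_y)
names(without_y)
```

With the outcome present, the result gains both .fitted and .resid; without it, only .fitted is added, so the absence of the outcome does not cause an error.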
@alexpghayes
Collaborator

The goal is to have this behavior for all relevant augment methods, not just augment.lm, in 0.7.0. The first step will be adding to this augment test suite.

@IndrajeetPatil
Contributor

This is fixed in the development version.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(broom)

train <- mtcars %>%
  sample_frac(0.7)
test <- setdiff(mtcars, train)


model <- lm(mpg ~ cyl + disp, data= train)

augment(model, newdata = test) %>% 
  names()
#>  [1] "mpg"     "cyl"     "disp"    "hp"      "drat"    "wt"      "qsec"   
#>  [8] "vs"      "am"      "gear"    "carb"    ".fitted" ".resid"

Created on 2019-03-10 by the reprex package (v0.2.1.9000)

Session info
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                                             
#>  version  R Under development (unstable) (2019-03-02 r76189)
#>  os       Windows 10 x64                                    
#>  system   x86_64, mingw32                                   
#>  ui       RTerm                                             
#>  language (EN)                                              
#>  collate  English_United States.1252                        
#>  ctype    English_United States.1252                        
#>  tz       America/Chicago                                   
#>  date     2019-03-10                                        
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version    date       lib
#>  assertthat    0.2.0      2017-04-11 [1]
#>  backports     1.1.3      2018-12-14 [1]
#>  broom       * 0.5.1.9000 2019-03-10 [1]
#>  callr         3.1.1      2018-12-21 [1]
#>  cli           1.0.1.9000 2019-01-20 [1]
#>  crayon        1.3.4      2017-09-16 [1]
#>  desc          1.2.0      2019-01-21 [1]
#>  devtools      2.0.1.9000 2019-02-18 [1]
#>  digest        0.6.18     2018-10-10 [1]
#>  dplyr       * 0.8.0.9006 2019-03-07 [1]
#>  evaluate      0.13       2019-02-12 [1]
#>  fs            1.2.6      2018-08-23 [1]
#>  generics      0.0.2      2019-03-05 [1]
#>  glue          1.3.0      2018-07-17 [1]
#>  highr         0.7        2018-06-09 [1]
#>  htmltools     0.3.6      2017-04-28 [1]
#>  knitr         1.22       2019-03-08 [1]
#>  magrittr      1.5        2014-11-22 [1]
#>  memoise       1.1.0      2017-04-21 [1]
#>  pillar        1.3.1      2018-12-15 [1]
#>  pkgbuild      1.0.2      2018-10-16 [1]
#>  pkgconfig     2.0.2      2018-08-16 [1]
#>  pkgload       1.0.2      2018-10-29 [1]
#>  prettyunits   1.0.2      2015-07-13 [1]
#>  processx      3.2.1      2018-12-05 [1]
#>  ps            1.3.0      2018-12-21 [1]
#>  purrr         0.3.1      2019-03-03 [1]
#>  R6            2.4.0      2019-02-14 [1]
#>  Rcpp          1.0.0      2018-11-07 [1]
#>  remotes       2.0.2      2018-10-30 [1]
#>  rlang         0.3.1      2019-01-08 [1]
#>  rmarkdown     1.11.6     2019-02-14 [1]
#>  rprojroot     1.3-2      2018-01-03 [1]
#>  sessioninfo   1.1.1      2018-11-05 [1]
#>  stringi       1.3.1      2019-02-13 [1]
#>  stringr       1.4.0      2019-02-10 [1]
#>  testthat      2.0.1      2018-10-13 [1]
#>  tibble        2.0.1.9001 2019-03-07 [1]
#>  tidyr         0.8.3.9000 2019-03-07 [1]
#>  tidyselect    0.2.5      2018-10-11 [1]
#>  usethis       1.4.0.9000 2019-02-18 [1]
#>  vctrs         0.1.0.9002 2019-03-07 [1]
#>  withr         2.1.2      2018-03-15 [1]
#>  xfun          0.5        2019-02-20 [1]
#>  yaml          2.2.0      2018-07-25 [1]
#>  zeallot       0.1.0      2018-01-28 [1]
#>  source                            
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  local                             
#>  CRAN (R 3.6.0)                    
#>  Github (r-lib/cli@94e2fc5)        
#>  CRAN (R 3.5.1)                    
#>  Github (r-lib/desc@42b9578)       
#>  Github (r-lib/devtools@188a613)   
#>  CRAN (R 3.5.1)                    
#>  Github (tidyverse/dplyr@2ef1fd9)  
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.5.1)                    
#>  Github (r-lib/generics@c15ac43)   
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  Github (rstudio/rmarkdown@bbd0786)
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.5.1)                    
#>  Github (tidyverse/tibble@5fc065b) 
#>  Github (tidyverse/tidyr@7a51bfd)  
#>  CRAN (R 3.5.1)                    
#>  Github (r-lib/usethis@ed9ae17)    
#>  Github (r-lib/vctrs@6b8c98a)      
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.6.0)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.5.1)                    
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-devel/library

@alexpghayes
Collaborator

I want to keep this open as a reminder to clearly document when you get a .resid column in augment().

@mehrlander

mehrlander commented Mar 22, 2020

Just ran into the same problem, but resolved it by installing the GitHub development version.

simonpcouch removed this from the 0.7.0 milestone May 28, 2020

@Amogh-Joshi

Amogh-Joshi commented Sep 19, 2020

I would like to report a bug.
When you transform both variables (say, with a log transformation), augment does not return a .resid column. Please check:

library(alr4); library(broom)

modelUN <- lm(I(log(fertility)) ~ I(log(ppgdp)), data = UN11)

augment(modelUN)

I tried removing and installing the library again. Any ideas on how to show the column?

@topepo
Member

topepo commented Sep 20, 2020

@Amogh-Joshi Can you please start another issue and use the reprex::reprex() function so that it is reproducible?

@hardin47

hardin47 commented Feb 1, 2021

I find that when I transform the response variable, augment() doesn't keep .resid.

library(tidyverse)
library(broom)
house = read.table("http://www.rossmanchance.com/iscam2/data/housing.txt", 
                   header=TRUE, sep="\t")

lm(price ~  sqft, data=house)  %>% augment () %>%
  ggplot(aes(x = .fitted, y = .resid)) + 
  geom_point() + 
  geom_hline(yintercept=0) +
  ggtitle("Residual plot for price as a function of sqft")

lm(log(price) ~  sqft, data=house)  %>% augment () %>%
  ggplot(aes(x = .fitted, y = .resid))+ 
  geom_point() + 
  geom_hline(yintercept=0) +
  ggtitle("Residual plot for ln price as a function of sqft")
#> Error in FUN(X[[i]], ...): object '.resid' not found

Created on 2021-01-31 by the reprex package (v0.3.0)

@hardin47

hardin47 commented Feb 1, 2021

I think it might have something to do with the fact that the variable name used to be automatically changed to log.price, and now it is log(price), quoted with backticks.
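One way to probe this hypothesis using base R only. The sketch uses mtcars as a stand-in dataset (an assumption, since the housing data above has to be downloaded), and it does not confirm what broom does internally. It only shows that the response term of a transformed model is named log(mpg), which is not a column of the original data, and that fitted() and residuals() on the model object can serve as a workaround.

```r
# Same model with and without a transformed response (mtcars is a stand-in).
fit_raw <- lm(mpg ~ wt, data = mtcars)
fit_log <- lm(log(mpg) ~ wt, data = mtcars)

# The first column of the model frame is the response as the model sees it.
resp_raw <- names(model.frame(fit_raw))[1]
resp_log <- names(model.frame(fit_log))[1]

# The untransformed name matches a data column; the transformed one does not,
# so any lookup of the response by name in the original data will miss it.
resp_raw %in% names(mtcars)
resp_log %in% names(mtcars)

# Workaround sketch: take fitted values and residuals straight from the model.
# These are on the log scale, matching what the model was fit on.
diag_log <- data.frame(
  .fitted = unname(fitted(fit_log)),
  .resid  = unname(residuals(fit_log))
)
head(diag_log)
```

The diag_log data frame has the .fitted and .resid columns needed for the residual plot above, independent of how augment() names or locates the response.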

@simonpcouch
Collaborator

Related to #937. :-) A helpful diagnostic here.

I'm in the same spot that Alex noted on that issue—not in a place where I can work through a fix right now, but would be glad to review a PR!

@github-actions

github-actions bot commented Feb 2, 2022

This issue has been automatically closed due to inactivity.

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

github-actions bot locked and limited conversation to collaborators Feb 17, 2022