Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bind_rows() using tibbles with attributes loses attributes #3259

Closed
DavisVaughan opened this issue Dec 22, 2017 · 9 comments
Closed

bind_rows() using tibbles with attributes loses attributes #3259

DavisVaughan opened this issue Dec 22, 2017 · 9 comments
Labels
feature a feature request or enhancement vctrs ↗️

Comments

@DavisVaughan
Copy link
Member

I assume this is very similar to #2457.

Using bind_rows() on two tibbles where either one has extra attributes removes all extra attributes. Perhaps an approach similar to #1692 can be taken where the attributes of the first are kept? Ideally I would like bind_rows() to be generic but I've read all the problems associated with that.

My use case is in tibbletime. My print.tbl_time() method relies on one of the attributes. Because the tbl_time class is kept after using bind_rows(), but the attributes are lost, this causes problems. It causes other problems too, but this shows up immediately.

library(dplyr)

iris_tbl <- as_tibble(iris)

attr(iris_tbl, "my_attr") <- "important thing"

# The classes of the first object are kept due to #1692
class(iris_tbl) <- c("my_new_class", class(iris_tbl))

attributes(iris_tbl)
#> $names
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"     
#> 
#> $row.names
#>   ...truncated
#>
#> $class
#> [1] "my_new_class" "tbl_df"       "tbl"          "data.frame"  
#> 
#> $my_attr
#> [1] "important thing"

iris_doubled <- iris_tbl %>%
  bind_rows(iris_tbl)

# The attributes besides names / row.names / class are lost
iris_doubled %>%
  attributes()
#> $names
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"     
#> 
#> $row.names
#>   ...truncated
#>
#> 
#> $class
#> [1] "my_new_class" "tbl_df"       "tbl"          "data.frame"
Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.3 (2017-11-30)
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2017-12-22
#> Packages -----------------------------------------------------------------
#>  package    * version    date      
#>  assertthat   0.2.0      2017-04-11
#>  backports    1.1.0      2017-05-22
#>  base       * 3.4.3      2017-12-07
#>  bindr        0.1        2016-11-13
#>  bindrcpp     0.2        2017-06-17
#>  compiler     3.4.3      2017-12-07
#>  datasets   * 3.4.3      2017-12-07
#>  devtools     1.13.3     2017-08-02
#>  digest       0.6.12     2017-01-27
#>  dplyr      * 0.7.4      2017-09-28
#>  evaluate     0.10.1     2017-06-24
#>  glue         1.2.0      2017-10-29
#>  graphics   * 3.4.3      2017-12-07
#>  grDevices  * 3.4.3      2017-12-07
#>  htmltools    0.3.6      2017-04-28
#>  knitr        1.17       2017-08-10
#>  magrittr     1.5        2014-11-22
#>  memoise      1.1.0      2017-10-26
#>  methods    * 3.4.3      2017-12-07
#>  pillar       0.0.0.9000 2017-11-29
#>  pkgconfig    2.0.1      2017-03-21
#>  R6           2.2.2      2017-06-17
#>  Rcpp         0.12.14    2017-11-23
#>  rlang        0.1.4      2017-11-05
#>  rmarkdown    1.6.0.9004 2017-09-23
#>  rprojroot    1.2        2017-01-16
#>  stats      * 3.4.3      2017-12-07
#>  stringi      1.1.5      2017-04-07
#>  stringr      1.2.0      2017-02-18
#>  tibble       1.3.4.9003 2017-11-29
#>  tools        3.4.3      2017-12-07
#>  utils      * 3.4.3      2017-12-07
#>  withr        2.1.0.9000 2017-11-29
#>  yaml         2.1.14     2016-11-12
#>  source                                 
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  local                                  
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  local                                  
#>  local                                  
#>  CRAN (R 3.4.1)                         
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.2)                         
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.2)                         
#>  local                                  
#>  local                                  
#>  CRAN (R 3.4.0)                         
#>  cran (@1.17)                           
#>  CRAN (R 3.4.0)                         
#>  Github (hadley/memoise@d63ae9c)        
#>  local                                  
#>  Github (r-lib/pillar@5a082e1)          
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.3)                         
#>  CRAN (R 3.4.2)                         
#>  Github (DavisVaughan/rmarkdown@f85ba35)
#>  CRAN (R 3.4.0)                         
#>  local                                  
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  Github (tidyverse/tibble@60281b3)      
#>  local                                  
#>  local                                  
#>  Github (jimhester/withr@fe81c00)       
#>  CRAN (R 3.4.0)
@DavisVaughan
Copy link
Member Author

Thinking about what might be coming in the future, what if bind_rows() called sloop::reconstruct() using the new object and the first object before returning? This could retain class and attributes of the first object, and is generic, but there may be problems I'm not thinking of.

@krlmlr
Copy link
Member

krlmlr commented Dec 27, 2017

Thanks. Calling reconstruct() feels right, but we might want a "symmetric" version of reconstruct() to support use cases where we keep the attributes of the second (third, ...) argument.

@DavisVaughan
Copy link
Member Author

By "symmetric" is that meaning keeping all custom attributes for each argument? Or just specifically attributes of the second argument, and not the first?

If it's the former, I can think of edge cases where the first and second arguments have attributes of the same name but with different data. Perhaps in this case the first argument should have precedence.

I assume this should propagate to bind_cols() too. It currently keeps all attributes of just the first object (different than bind_rows()).

library(dplyr)

iris_tbl1 <- as_tibble(iris)
iris_tbl2 <- as_tibble(iris)

attr(iris_tbl1, "my_attr")  <- "important thing 1"
attr(iris_tbl2, "my_attr")  <- "important thing 2"
attr(iris_tbl2, "my_attr2") <- "attr unique to 2"

# bind_cols() keeps attributes of the first object ONLY
bind_cols(iris_tbl1, iris_tbl2) %>%
  attributes()
#> $row.names
#> ...truncated
#> 
#> $class
#> [1] "tbl_df"     "tbl"        "data.frame"
#> 
#> $my_attr
#> [1] "important thing 1"
#> 
#> $names
#>  [1] "Sepal.Length"  "Sepal.Width"   "Petal.Length"  "Petal.Width"  
#>  [5] "Species"       "Sepal.Length1" "Sepal.Width1"  "Petal.Length1"
#>  [9] "Petal.Width1"  "Species1"

# bind_rows() of course loses all custom attributes
bind_rows(iris_tbl1, iris_tbl2) %>%
  attributes()
#> $names
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"     
#> 
#> $row.names
#> ...truncated
#> 
#> $class
#> [1] "tbl_df"     "tbl"        "data.frame"

@krlmlr
Copy link
Member

krlmlr commented Jan 17, 2018

"Symmetric" means that the logic decides which attributes are inherited from which arguments, and that the attributes of bind_rows(a, b) and bind_rows(b, a) should be the same. Associativity would be great, too.

@gabefair

This comment has been minimized.

@krlmlr

This comment has been minimized.

@krlmlr
Copy link
Member

krlmlr commented Jan 13, 2019

Related to #3429?

@hadley
Copy link
Member

hadley commented May 27, 2019

Duplicate of #2457 and will be resolved by vctrs.

@hadley hadley closed this as completed May 27, 2019
@lock
Copy link

lock bot commented Nov 23, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Nov 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature a feature request or enhancement vctrs ↗️
Projects
None yet
Development

No branches or pull requests

4 participants