Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bind_rows() using tibbles with attributes loses attributes #3259

Closed
DavisVaughan opened this issue Dec 22, 2017 · 8 comments

Comments

@DavisVaughan
Copy link
Member

commented Dec 22, 2017

I assume this is very similar to #2457.

Using bind_rows() on two tibbles where either one has extra attributes removes all extra attributes. Perhaps an approach similar to #1692 can be taken where the attributes of the first are kept? Ideally I would like bind_rows() to be generic but I've read all the problems associated with that.

My use case is in tibbletime. My print.tbl_time() method relies on one of the attributes. Because the tbl_time class is kept after using bind_rows(), but the attributes are lost, this causes problems. It causes other problems too, but this shows up immediately.

library(dplyr)

iris_tbl <- as_tibble(iris)

attr(iris_tbl, "my_attr") <- "important thing"

# The classes of the first object are kept due to #1692
class(iris_tbl) <- c("my_new_class", class(iris_tbl))

attributes(iris_tbl)
#> $names
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"     
#> 
#> $row.names
#>   ...truncated
#>
#> $class
#> [1] "my_new_class" "tbl_df"       "tbl"          "data.frame"  
#> 
#> $my_attr
#> [1] "important thing"

iris_doubled <- iris_tbl %>%
  bind_rows(iris_tbl)

# The attributes besides names / row.names / class are lost
iris_doubled %>%
  attributes()
#> $names
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"     
#> 
#> $row.names
#>   ...truncated
#>
#> 
#> $class
#> [1] "my_new_class" "tbl_df"       "tbl"          "data.frame"
Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.3 (2017-11-30)
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2017-12-22
#> Packages -----------------------------------------------------------------
#>  package    * version    date      
#>  assertthat   0.2.0      2017-04-11
#>  backports    1.1.0      2017-05-22
#>  base       * 3.4.3      2017-12-07
#>  bindr        0.1        2016-11-13
#>  bindrcpp     0.2        2017-06-17
#>  compiler     3.4.3      2017-12-07
#>  datasets   * 3.4.3      2017-12-07
#>  devtools     1.13.3     2017-08-02
#>  digest       0.6.12     2017-01-27
#>  dplyr      * 0.7.4      2017-09-28
#>  evaluate     0.10.1     2017-06-24
#>  glue         1.2.0      2017-10-29
#>  graphics   * 3.4.3      2017-12-07
#>  grDevices  * 3.4.3      2017-12-07
#>  htmltools    0.3.6      2017-04-28
#>  knitr        1.17       2017-08-10
#>  magrittr     1.5        2014-11-22
#>  memoise      1.1.0      2017-10-26
#>  methods    * 3.4.3      2017-12-07
#>  pillar       0.0.0.9000 2017-11-29
#>  pkgconfig    2.0.1      2017-03-21
#>  R6           2.2.2      2017-06-17
#>  Rcpp         0.12.14    2017-11-23
#>  rlang        0.1.4      2017-11-05
#>  rmarkdown    1.6.0.9004 2017-09-23
#>  rprojroot    1.2        2017-01-16
#>  stats      * 3.4.3      2017-12-07
#>  stringi      1.1.5      2017-04-07
#>  stringr      1.2.0      2017-02-18
#>  tibble       1.3.4.9003 2017-11-29
#>  tools        3.4.3      2017-12-07
#>  utils      * 3.4.3      2017-12-07
#>  withr        2.1.0.9000 2017-11-29
#>  yaml         2.1.14     2016-11-12
#>  source                                 
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  local                                  
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  local                                  
#>  local                                  
#>  CRAN (R 3.4.1)                         
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.2)                         
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.2)                         
#>  local                                  
#>  local                                  
#>  CRAN (R 3.4.0)                         
#>  cran (@1.17)                           
#>  CRAN (R 3.4.0)                         
#>  Github (hadley/memoise@d63ae9c)        
#>  local                                  
#>  Github (r-lib/pillar@5a082e1)          
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.3)                         
#>  CRAN (R 3.4.2)                         
#>  Github (DavisVaughan/rmarkdown@f85ba35)
#>  CRAN (R 3.4.0)                         
#>  local                                  
#>  CRAN (R 3.4.0)                         
#>  CRAN (R 3.4.0)                         
#>  Github (tidyverse/tibble@60281b3)      
#>  local                                  
#>  local                                  
#>  Github (jimhester/withr@fe81c00)       
#>  CRAN (R 3.4.0)
@DavisVaughan

This comment has been minimized.

Copy link
Member Author

commented Dec 22, 2017

Thinking about what might be coming in the future, what if bind_rows() called sloop::reconstruct() using the new object and the first object before returning? This could retain class and attributes of the first object, and is generic, but there may be problems I'm not thinking of.

@krlmlr

This comment has been minimized.

Copy link
Member

commented Dec 27, 2017

Thanks. Calling reconstruct() feels right, but we might want a "symmetric" version of reconstruct() to support use cases where we keep the attributes of the second (third, ...) argument.

@DavisVaughan

This comment has been minimized.

Copy link
Member Author

commented Dec 27, 2017

By "symmetric" is that meaning keeping all custom attributes for each argument? Or just specifically attributes of the second argument, and not the first?

If it's the former, I can think of edge cases where the first and second arguments have attributes of the same name but with different data. Perhaps in this case the first argument should have precedence.

I assume this should propagate to bind_cols() too. It currently keeps all attributes of just the first object (different than bind_rows()).

library(dplyr)

iris_tbl1 <- as_tibble(iris)
iris_tbl2 <- as_tibble(iris)

attr(iris_tbl1, "my_attr")  <- "important thing 1"
attr(iris_tbl2, "my_attr")  <- "important thing 2"
attr(iris_tbl2, "my_attr2") <- "attr unique to 2"

# bind_cols() keeps attributes of the first object ONLY
bind_cols(iris_tbl1, iris_tbl2) %>%
  attributes()
#> $row.names
#> ...truncated
#> 
#> $class
#> [1] "tbl_df"     "tbl"        "data.frame"
#> 
#> $my_attr
#> [1] "important thing 1"
#> 
#> $names
#>  [1] "Sepal.Length"  "Sepal.Width"   "Petal.Length"  "Petal.Width"  
#>  [5] "Species"       "Sepal.Length1" "Sepal.Width1"  "Petal.Length1"
#>  [9] "Petal.Width1"  "Species1"

# bind_rows() of course loses all custom attributes
bind_rows(iris_tbl1, iris_tbl2) %>%
  attributes()
#> $names
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
#> [5] "Species"     
#> 
#> $row.names
#> ...truncated
#> 
#> $class
#> [1] "tbl_df"     "tbl"        "data.frame"
@krlmlr

This comment has been minimized.

Copy link
Member

commented Jan 17, 2018

"Symmetric" means that the logic decides which attributes are inherited from which arguments, and that the attributes of bind_rows(a, b) and bind_rows(b, a) should be the same. Associativity would be great, too.

@gabefair

This comment was marked as off-topic.

Copy link

commented Apr 16, 2018

So is the use of bind_rows() on tibbles not supported?

@krlmlr

This comment was marked as resolved.

Copy link
Member

commented Apr 16, 2018

This issue is about tibbles with extra attributes. bind_rows() on regular tibbles is supported.

@krlmlr

This comment has been minimized.

Copy link
Member

commented Jan 13, 2019

Related to #3429?

@hadley

This comment has been minimized.

Copy link
Member

commented May 27, 2019

Duplicate of #2457 and will be resolved by vctrs.

@hadley hadley closed this May 27, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.