gather causing segfault #553

Closed
benstory opened this issue Feb 12, 2019 · 4 comments

@benstory commented Feb 12, 2019

Fresh R install, fresh RStudio install, and fresh installation of tidyverse. The following command produces a segfault:

library(tidyr)
library(dplyr)

gather(tibble(iris))

 *** caught segfault ***
address 0x0, cause 'unknown'

Traceback:
 1: remove_rownames(x)
 2: FUN(X[[i]], ...)
 3: lapply(.x, .f, ...)
 4: map(x, strip_dim)
 5: lst_to_tibble(x, .rows, .name_repair, col_lengths(x))
 6: as_tibble.list(unclass(x), ..., .rows = .rows, .name_repair = .name_repair)
 7: as_tibble.data.frame(output)
 8: as_tibble(output)
 9: reconstruct_tibble(data, out, gather_vars)
10: gather.data.frame(tibble(iris))
11: gather(tibble(iris))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

On older versions (possibly of dplyr, tidyr, and/or R), the following error occurs instead when trying to convert iris into a tibble:

Error: Column iris must be a 1d atomic vector or a list



R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.3.0   stringr_1.4.0   dplyr_0.7.8     purrr_0.3.0     readr_1.3.1    
[6] tidyr_0.8.2     tibble_2.0.1    ggplot2_3.1.0   tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       cellranger_1.1.0 pillar_1.3.1     compiler_3.5.2   plyr_1.8.4      
 [6] bindr_0.1.1      tools_3.5.2      jsonlite_1.6     lubridate_1.7.4  nlme_3.1-137    
[11] gtable_0.2.0     lattice_0.20-38  pkgconfig_2.0.2  rlang_0.3.1      cli_1.0.1       
[16] rstudioapi_0.9.0 yaml_2.2.0       haven_2.0.0      bindrcpp_0.2.2   withr_2.1.2     
[21] xml2_1.2.0       httr_1.4.0       generics_0.0.2   hms_0.4.2        grid_3.5.2      
[26] tidyselect_0.2.5 glue_1.3.0       R6_2.3.0         fansi_0.4.0      readxl_1.2.0    
[31] modelr_0.1.3     magrittr_1.5     backports_1.1.3  scales_1.0.0     rvest_0.3.2     
[36] assertthat_0.2.0 colorspace_1.4-0 utf8_1.1.4       stringi_1.2.4    lazyeval_0.2.1  
[41] munsell_0.5.0    broom_0.5.1      crayon_1.3.4   

@benstory changed the title from "gather (newest version) causing segfault" to "gather causing segfault" on Feb 12, 2019

@benstory (Author) commented Feb 12, 2019

This also (understandably) causes RStudio to crash.

@yutannihilation (Member) commented Feb 13, 2019

Currently, gather() cannot handle 2d columns (e.g. data.frame or matrix columns); cf. #544.

tibble::tibble(iris)
#> # A tibble: 150 x 1
#>    iris$Sepal.Length $Sepal.Width $Petal.Length $Petal.Width $Species
#>                <dbl>        <dbl>         <dbl>        <dbl> <fct>   
#>  1               5.1          3.5           1.4          0.2 setosa  
#>  2               4.9          3             1.4          0.2 setosa  
#>  3               4.7          3.2           1.3          0.2 setosa  
#>  4               4.6          3.1           1.5          0.2 setosa  
#>  5               5            3.6           1.4          0.2 setosa  
#>  6               5.4          3.9           1.7          0.4 setosa  
#>  7               4.6          3.4           1.4          0.3 setosa  
#>  8               5            3.4           1.5          0.2 setosa  
#>  9               4.4          2.9           1.4          0.2 setosa  
#> 10               4.9          3.1           1.5          0.1 setosa  
#> # … with 140 more rows

Created on 2019-02-13 by the reprex package (v0.2.1.9000)
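
To make the 2d-column point concrete, here is a small sketch contrasting a packed data-frame column with an ordinary five-column tibble. It assumes the tibble package is installed; the variable names `packed` and `flat` are illustrative only. The argument is named (`iris = iris`) to make the packing explicit; with the tibble 2.0.1 shown in this report, `tibble(iris)` gives the same single-column result, as the output above shows. Whether the reporter actually meant `as_tibble(iris)` is an assumption.

```r
library(tibble)

# One column named "iris" that is itself a data frame (a "df-col").
packed <- tibble(iris = iris)

# Five ordinary 1d columns, one per variable of iris.
flat <- as_tibble(iris)

ncol(packed)                 # 1
is.data.frame(packed$iris)   # TRUE
ncol(flat)                   # 5
```

The `packed` form is the shape that gather() cannot currently handle; the `flat` form contains only 1d columns.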

I think it will take a while for tidyr to support data.frame columns. Should gather() raise an error for 2d columns until then? An error is better than a crash. (I expect spread() is not so difficult to tweak.)
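
As a rough illustration of the "error instead of crash" idea, a guard along these lines could run before any reshaping. This is only a sketch in base R: `check_no_2d_cols()` is a hypothetical helper, not a tidyr function, and the error wording is made up for the example.

```r
# Hypothetical helper (not part of tidyr): fail early if any column is 2d,
# i.e. a data frame or a matrix/array with more than one dimension.
check_no_2d_cols <- function(data) {
  is_2d <- vapply(
    data,
    function(col) is.data.frame(col) || length(dim(col)) > 1L,
    logical(1)
  )
  if (any(is_2d)) {
    stop(
      "Can't gather 2d columns (data frame or matrix): ",
      paste(names(data)[is_2d], collapse = ", "),
      call. = FALSE
    )
  }
  invisible(data)
}

# Usage: a plain data frame passes, a data frame with a df-col does not.
df <- data.frame(id = 1:3)
check_no_2d_cols(df)

df$nested <- data.frame(a = 1:3, b = 4:6)
msg <- tryCatch(check_no_2d_cols(df), error = function(e) conditionMessage(e))
msg
```

Checking up front keeps the failure in R code with a readable message, rather than letting a 2d column reach the code path that segfaults.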

@hadley (Member) commented Feb 13, 2019

@yutannihilation my goal is to have reshaping of df-cols working within the next month, but it'll be through a new verb, so gather() will still need a fix. I agree that a clear error message is the right approach here.

@yutannihilation (Member) commented Feb 13, 2019

@hadley Thanks for the plan, sounds great! Then an error message is good because it's a nice place to advertise the new verb :)
