gather causing segfault #553

Closed
benstory opened this issue Feb 12, 2019 · 4 comments

Comments

@benstory
commented Feb 12, 2019

Fresh R install, fresh RStudio install, and fresh installation of tidyverse. The following command produces a segfault:

library(tidyr)
library(dplyr)

gather(tibble(iris))
 *** caught segfault ***
address 0x0, cause 'unknown'

Traceback:
 1: remove_rownames(x)
 2: FUN(X[[i]], ...)
 3: lapply(.x, .f, ...)
 4: map(x, strip_dim)
 5: lst_to_tibble(x, .rows, .name_repair, col_lengths(x))
 6: as_tibble.list(unclass(x), ..., .rows = .rows, .name_repair = .name_repair)
 7: as_tibble.data.frame(output)
 8: as_tibble(output)
 9: reconstruct_tibble(data, out, gather_vars)
10: gather.data.frame(tibble(iris))
11: gather(tibble(iris))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

On older versions (possibly of dplyr, tidyr, and/or R), the following error occurs instead when trying to convert iris into a tibble:

Error: Column iris must be a 1d atomic vector or a list



R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.3.0   stringr_1.4.0   dplyr_0.7.8     purrr_0.3.0     readr_1.3.1    
[6] tidyr_0.8.2     tibble_2.0.1    ggplot2_3.1.0   tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       cellranger_1.1.0 pillar_1.3.1     compiler_3.5.2   plyr_1.8.4      
 [6] bindr_0.1.1      tools_3.5.2      jsonlite_1.6     lubridate_1.7.4  nlme_3.1-137    
[11] gtable_0.2.0     lattice_0.20-38  pkgconfig_2.0.2  rlang_0.3.1      cli_1.0.1       
[16] rstudioapi_0.9.0 yaml_2.2.0       haven_2.0.0      bindrcpp_0.2.2   withr_2.1.2     
[21] xml2_1.2.0       httr_1.4.0       generics_0.0.2   hms_0.4.2        grid_3.5.2      
[26] tidyselect_0.2.5 glue_1.3.0       R6_2.3.0         fansi_0.4.0      readxl_1.2.0    
[31] modelr_0.1.3     magrittr_1.5     backports_1.1.3  scales_1.0.0     rvest_0.3.2     
[36] assertthat_0.2.0 colorspace_1.4-0 utf8_1.1.4       stringi_1.2.4    lazyeval_0.2.1  
[41] munsell_0.5.0    broom_0.5.1      crayon_1.3.4   
@benstory benstory changed the title gather (newest version) causing segfault gather causing segfault Feb 12, 2019
@benstory

Author

commented Feb 12, 2019

This also (understandably) causes RStudio to crash.

@yutannihilation

Member

commented Feb 13, 2019

Currently, gather() cannot handle 2d columns (e.g. data.frame or matrix columns); cf. #544. Here, tibble(iris) produces a tibble with a single data.frame column:

tibble::tibble(iris)
#> # A tibble: 150 x 1
#>    iris$Sepal.Length $Sepal.Width $Petal.Length $Petal.Width $Species
#>                <dbl>        <dbl>         <dbl>        <dbl> <fct>   
#>  1               5.1          3.5           1.4          0.2 setosa  
#>  2               4.9          3             1.4          0.2 setosa  
#>  3               4.7          3.2           1.3          0.2 setosa  
#>  4               4.6          3.1           1.5          0.2 setosa  
#>  5               5            3.6           1.4          0.2 setosa  
#>  6               5.4          3.9           1.7          0.4 setosa  
#>  7               4.6          3.4           1.4          0.3 setosa  
#>  8               5            3.4           1.5          0.2 setosa  
#>  9               4.4          2.9           1.4          0.2 setosa  
#> 10               4.9          3.1           1.5          0.1 setosa  
#> # … with 140 more rows

Created on 2019-02-13 by the reprex package (v0.2.1.9000)
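
For contrast, here is a minimal sketch (not part of the original report) of how a data.frame column differs from a column-by-column conversion. The argument is named `iris = iris` on purpose: that is an assumption made so the example does not depend on how a given tibble version treats unnamed data frame arguments. The variable names `nested` and `flat` are illustrative only.

```r
library(tibble)

# A named data frame argument becomes ONE column that is itself a data frame
# (a "df-col"), which is the 2d shape gather() cannot handle here.
nested <- tibble(iris = iris)
ncol(nested)                 # number of top-level columns
is.data.frame(nested$iris)   # the single column is a data frame

# as_tibble() converts the data frame column by column instead,
# so every resulting column is an ordinary 1d vector.
flat <- as_tibble(iris)
ncol(flat)
vapply(flat, function(col) is.null(dim(col)), logical(1))
```

If the intent was simply to reshape iris, passing `as_tibble(iris)` (or iris itself) to gather() avoids the df-col entirely.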

I think it will take a while for tidyr to support data.frame columns. Should gather() raise an error for 2d columns until then? An error is better than a crash. (I expect spread() would not be so difficult to tweak.)
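
As a rough illustration of the kind of guard proposed above, the following base R sketch rejects 2d columns with an error. `check_no_2d_cols()` is a hypothetical helper invented for this example; it is not tidyr's API and not necessarily how the eventual fix was implemented.

```r
# Hypothetical helper: stop early if any column is 2d (data frame or matrix).
check_no_2d_cols <- function(data) {
  # A column counts as 2d when it carries a dim attribute of length 2.
  is_2d <- vapply(data, function(col) length(dim(col)) == 2L, logical(1))
  if (any(is_2d)) {
    stop(
      "Columns must be 1d vectors or lists; found 2d column(s): ",
      paste(names(data)[is_2d], collapse = ", "),
      call. = FALSE
    )
  }
  invisible(data)
}

ok <- data.frame(x = 1:3, y = letters[1:3])
bad <- data.frame(id = 1:3)
bad$m <- matrix(1:6, nrow = 3)   # attach a matrix column

check_no_2d_cols(ok)             # passes silently
try(check_no_2d_cols(bad))       # signals an error naming column "m"
```

Checking up front like this turns the crash into an ordinary R error that the caller can catch.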

@hadley

Member

commented Feb 13, 2019

@yutannihilation my goal is to have reshaping of df-cols working within the next month, but it'll be through a new verb, so gather() will still need a fix. I agree that a clear error message is the right approach here.

@yutannihilation

Member

commented Feb 13, 2019

@hadley Thanks for the plan, sounds great! Then an error message is good because it's a nice place to advertise the new verb :)

hadley added a commit that referenced this issue Feb 28, 2019
@hadley hadley added the wip label Feb 28, 2019
@hadley hadley closed this in #562 Mar 2, 2019
hadley added a commit that referenced this issue Mar 2, 2019
Fixes #553