dplyr::arrange causes R to crash #2308

Closed
BillVenables opened this issue Dec 11, 2016 · 9 comments

Comments

@BillVenables

BillVenables commented Dec 11, 2016

If you pass arrange() the name of a non-existent column to sort on, it can crash R itself.
Here is an example.

> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.5.0

loaded via a namespace (and not attached):
[1] drat_0.1.2     magrittr_1.5   R6_2.2.0       assertthat_0.1 DBI_0.5-1     
[6] tools_3.3.2    parallel_3.3.2 tibble_1.2     Rcpp_0.12.8   
> d <- data.frame(a = runif(10))
> arrange(d, a, Dud)
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
@r2evans

r2evans commented Dec 14, 2016

FYI, this does not happen for me:

library(dplyr)
# Attaching package: 'dplyr'
# The following objects are masked from 'package:stats':
#     filter, lag
# The following objects are masked from 'package:base':
#     intersect, setdiff, setequal, union
d <- data.frame(a = runif(10))
arrange(d, a, Dud)
# Error in arrange_impl(.data, dots) : binding not found: 'Dud'
sessionInfo()
# R version 3.3.1 (2016-06-21)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 14393)
# locale:
# [1] LC_COLLATE=English_United States.1252 
# [2] LC_CTYPE=English_United States.1252   
# [3] LC_MONETARY=English_United States.1252
# [4] LC_NUMERIC=C                          
# [5] LC_TIME=English_United States.1252    
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# other attached packages:
# [1] dplyr_0.5.0.9000 r2_0.4.23       
# loaded via a namespace (and not attached):
#  [1] Rcpp_0.12.8     digest_0.6.10   rprojroot_1.1   assertthat_0.1 
#  [5] R6_2.2.0        DBI_0.5-13      backports_1.0.4 magrittr_1.5   
#  [9] evaluate_0.10   stringi_1.1.2   lazyeval_0.2.0  rmarkdown_1.2  
# [13] tools_3.3.1     stringr_1.1.0   compiler_3.3.1  htmltools_0.3.5
# [17] knitr_1.15.1    tibble_1.2     

(I know the OP is on Linux, R 3.3.2, and dplyr 0.5.0; I just thought I'd add perspective.)

@krlmlr
Member

krlmlr commented Dec 15, 2016

Thanks. I couldn't replicate this on Linux either. But I see something strange happening in a vanilla R session with valgrind enabled:

==15136== Source and destination overlap in memcpy_chk(0x530d404, 0x530d400, 53)
==15136==    at 0x4C34467: __memcpy_chk (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15136==    by 0x4F28951: memmove (string3.h:59)
==15136==    by 0x4F28951: R_ConciseTraceback.constprop.8 (errors.c:1384)
==15136==    by 0x4F29666: verrorcall_dflt (errors.c:684)
==15136==    by 0x4F2D60A: errorcall_dflt.constprop.6 (errors.c:712)
==15136==    by 0x4F2D67E: do_dfltStop (errors.c:1734)
==15136==    by 0x4F39DC0: bcEval (eval.c:5658)
==15136==    by 0x4F4499F: Rf_eval (eval.c:616)
==15136==    by 0x4F4661C: Rf_applyClosure (eval.c:1135)
==15136==    by 0x4F44B2C: Rf_eval (eval.c:732)
==15136==    by 0x23A548D1: dplyr_arrange_impl (RcppExports.cpp:67)
==15136==    by 0x4F09733: do_dotcall (dotcode.c:1251)
==15136==    by 0x4F44F5E: Rf_eval (eval.c:713)
==15136== 

Could you please run:

R --vanilla --debugger=valgrind -e 'library(dplyr); d <- data.frame(a = runif(10)); arrange(d, a, Dud)'

@BillVenables
Author

BillVenables commented Dec 16, 2016 via email

@krlmlr
Member

krlmlr commented Dec 16, 2016

Thanks. Could you please try Rcpp from GitHub and/or Rcpp 0.12.7 from the archives?

@BillVenables
Author

BillVenables commented Dec 16, 2016 via email

@krlmlr
Member

krlmlr commented Dec 16, 2016

Thanks. I assume all apt packages are up to date on your system? The next thing to try would be to install dplyr in a fresh R package library.

If that fails, could you please post valgrind output for 0.12.8.2? I'm still pretty sure it's an Rcpp issue, but the valgrind output will be helpful anyway.

@kevinushey
Contributor

Note that you likely need to re-install both Rcpp and dplyr; i.e., you want to rebuild dplyr against the updated Rcpp sources.

@BillVenables
Author

BillVenables commented Dec 16, 2016 via email

@krlmlr
Member

krlmlr commented Dec 16, 2016

Thanks, Bill, for your patience, and Kevin, for the hint. I've raised R's lack of warnings here on R-devel; we'll see what comes of it.

@hadley hadley closed this as completed Jan 31, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Jun 8, 2018
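
For anyone landing here with the same crash, the advice in the thread (reinstall Rcpp, then rebuild dplyr against it) might look roughly like the sketch below. This is only an illustration: `rebuild_dplyr_against_rcpp` is a hypothetical helper name, not part of dplyr or Rcpp, and it assumes a configured CRAN mirror and a working compiler toolchain for the source build.

```r
# Hypothetical helper (not part of dplyr or Rcpp): reinstall Rcpp, then
# rebuild dplyr from source so its compiled code matches the new Rcpp.
# `install` defaults to utils::install.packages and can be swapped for a stub.
rebuild_dplyr_against_rcpp <- function(install = utils::install.packages) {
  install("Rcpp")                    # refresh Rcpp first
  install("dplyr", type = "source")  # then recompile dplyr against it
  invisible(c("Rcpp", "dplyr"))      # names of the packages, in install order
}

# Usage, in a fresh R session with no packages loaded:
# rebuild_dplyr_against_rcpp()
```

The order matters in this sketch because dplyr's compiled code is built against whichever Rcpp is installed at build time, so Rcpp goes first.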