segfault when using melt in a dplyr operation #357

@eipi10

I'm getting a segfault when I use melt at the end of a series of chained operations in dplyr.

Here's the code I'm running:

df %.%
    filter(!is.na(ftf.flg), !is.na(m.rmd), !is.na(C1Apass), !is.na(C1Apal),
           ftf.flg=="FTF") %.%
    group_by(ftf.flg, C1Apass, C1Apal, m.rmd) %.%
    summarise(num=length(!is.na(C1Agrd.pts)),
              avgGrd=round(mean(C1Agrd.pts, na.rm=TRUE),1),
              p25Grd=round(quantile(C1Agrd.pts, probs=0.25, na.rm=TRUE),1)) %.%
    mutate(pct=round(num/sum(num)*100,2)) %.%
    melt(id.var=1:4)   

The chain runs fine if I stop after the mutate step, but adding melt causes a segfault. It also runs fine, melt included, if I drop the line in the summarise call that begins p25Grd=round(quantile....

The segfault is reproducible no matter how small a subset of the data I use. Below are a small subset of the data for reproducing the error (the actual data set has tens of thousands of rows and dozens of variables), the segfault message, and the output of sessionInfo().

df = structure(list(term.desc = structure(c(18L, 15L, 17L, 16L, 15L, 
18L, 16L, 17L, 17L, 16L, 18L, 15L, 16L, 17L, 18L, 15L, 18L, 17L, 
15L, 16L, 18L, 16L, 17L, 15L, 18L, 15L, 16L, 17L, 16L, 15L, 18L, 
17L, 16L, 15L, 17L, 18L, 17L, 18L, 16L, 15L), .Label = c("Spring 2005", 
"Fall 2005", "Spring 2006", "Fall 2006", "Spring 2007", "Fall 2007", 
"Spring 2008", "Fall 2008", "Spring 2009", "Fall 2009", "Spring 2010", 
"Fall 2010", "Spring 2011", "Fall 2011", "Spring 2012", "Fall 2012", 
"Spring 2013", "Fall 2013", "Spring 2014"), class = c("ordered", 
"factor")), ftf.flg = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L), .Label = c("FTF", "TRF"), class = "factor"), m.rmd = c(NA, 
NA, NA, NA, "Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Remedial", "Remedial", "Remedial", "Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial", "Not Remedial", 
"Not Remedial", "Not Remedial", "Not Remedial"), C1Apass = c(0, 
0, 0, 0, 0, 0, 0, 0, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 
2, 2, 2, 2, NA, NA, NA, NA, 0, 0, 0, 0, NA, NA, NA, NA, 0, 0, 
0, 0), C1Apal = c(2, 2, 2, 2, 1, 1, 1, 1, NA, NA, NA, NA, 2, 
2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, NA, NA, 2, 2, 2, 2, 
NA, NA, NA, NA, 0, 0, 0, 0), C1Agrd.pts = c(3.3, 3.3, 3.3, 3.3, 
2, 2, 2, 2, NA, NA, NA, NA, 1.7, 1.7, 1.7, 1.7, 1.7, 1.7, 1.7, 
1.7, 0, 0, 0, 0, NA, NA, NA, NA, 4, 4, 4, 4, NA, NA, NA, NA, 
4, 4, 4, 4)), .Names = c("term.desc", "ftf.flg", "m.rmd", "C1Apass", 
"C1Apal", "C1Agrd.pts"), row.names = c(4866L, 4868L, 4870L, 4876L, 
7496L, 7500L, 7501L, 7503L, 12606L, 12609L, 12610L, 12612L, 15335L, 
15337L, 15342L, 15351L, 22897L, 22899L, 22900L, 22907L, 25027L, 
25032L, 25035L, 25038L, 28737L, 28738L, 28740L, 28744L, 29280L, 
29284L, 29290L, 29296L, 41366L, 41368L, 41371L, 41378L, 42468L, 
42472L, 42473L, 42475L), class = "data.frame")
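
One way to narrow down which step feeds melt the problematic input is to run the same operations with intermediate objects instead of the %.% chain. The following is only a sketch, assuming df is the subset defined above; the object names (kept, grouped, summarised, with_pct, plain, molten) are hypothetical, and the as.data.frame step is an untested idea for checking whether the crash depends on the dplyr-specific classes of the result, not a confirmed workaround.

```r
# Sketch only: the same steps as the chain above, written with intermediate
# objects so that each result can be inspected before melt() is called.
library(dplyr)
library(reshape2)

kept <- filter(df, !is.na(ftf.flg), !is.na(m.rmd), !is.na(C1Apass),
               !is.na(C1Apal), ftf.flg == "FTF")
grouped <- group_by(kept, ftf.flg, C1Apass, C1Apal, m.rmd)
summarised <- summarise(grouped,
                        num = length(!is.na(C1Agrd.pts)),
                        avgGrd = round(mean(C1Agrd.pts, na.rm = TRUE), 1),
                        p25Grd = round(quantile(C1Agrd.pts, probs = 0.25,
                                                na.rm = TRUE), 1))
with_pct <- mutate(summarised, pct = round(num / sum(num) * 100, 2))

# Inspect the column types and attributes that melt() will receive.
str(with_pct)

# Assumption (unverified): dropping the dplyr classes first may show whether
# the crash depends on them or on the column contents themselves.
plain <- as.data.frame(with_pct)
molten <- melt(plain, id.var = 1:4)
```

If str(with_pct) shows anything unusual about the p25Grd column compared with avgGrd, that would be worth including in the report, since removing that column is what makes the crash go away.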

Here's the information R displays when the segfault occurs:

 *** caught segfault ***
address 0x1200000b5, cause 'memory not mapped'

Traceback:
 1: unlist(unname(data[var$measure]))
 2: melt.data.frame(`__prev`, id.var = 1:4)
 3: melt(`__prev`, id.var = 1:4)
 4: eval(expr, envir, enclos)
 5: eval(new_call, e)
 6: chain_q(list(substitute(x), substitute(y)), env = parent.frame())
 7: df %.% filter(!is.na(ftf.flg), !is.na(m.rmd), !is.na(C1Apass),     !is.na(C1Apal), ftf.flg == "FTF") %.% group_by(ftf.flg, C1Apass,     C1Apal, m.rmd) %.% summarise(num = length(!is.na(C1Agrd.pts)),     avgGrd = round(mean(C1Agrd.pts, na.rm = TRUE), 1), p25Grd = round(quantile(C1Agrd.pts,         probs = 0.25, na.rm = TRUE), 1)) %.% mutate(pct = round(num/sum(num) *     100, 2)) %.% melt(id.var = 1:4)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Here's the Session Info:

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape2_1.2.2 dplyr_0.1.3   

loaded via a namespace (and not attached):
[1] assertthat_0.1 plyr_1.8.1     Rcpp_0.11.1    stringr_0.6.2  tools_3.0.2  
