Nov 18, 2013


Here is a toy example where acast crashes (segfault) because of split_indices. The memory footprint of the objects does not seem to be an issue for a computer with 16 GB of RAM.
A similar issue has been discussed here:

# This example is fine
library(reshape2)
indata <- data.frame(A=rep(1:10000,20), B=rep(1:100,200))
print(object.size(indata),units="Mb") # 1.5 Mb
outdata <- acast(indata, A ~ B)
print(object.size(outdata),units="Mb") # 4.4 Mb

# This one crashes
indata <- data.frame(A=rep(1:100000,20), B=rep(1:100,2000)) 
print(object.size(indata),units="Mb") # 15.3 Mb
outdata <- acast(indata, A ~ B) # <- crashes here !!
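
To back up the claim that memory should not be the problem, here is a rough estimate of how large the cast result would be. This is only a sketch: the helper name estimate_cast_mb is made up for illustration, and it assumes the result is a dense matrix of 4-byte integers (one cell per A/B combination), ignoring dimnames and other overhead.

```r
# Hypothetical helper (not part of reshape2): rough size in Mb of a dense
# n_rows x n_cols matrix, assuming 4 bytes per cell (integer storage).
estimate_cast_mb <- function(n_rows, n_cols, bytes_per_cell = 4) {
  # Use doubles so the cell count cannot overflow R's integer range.
  cells <- as.numeric(n_rows) * as.numeric(n_cols)
  cells * bytes_per_cell / 1024^2
}

small_mb <- estimate_cast_mb(10000, 100)   # the example that works
large_mb <- estimate_cast_mb(100000, 100)  # the example that crashes

print(round(c(small = small_mb, large = large_mb), 1))
```

Under these assumptions the crashing case needs only a few tens of Mb for the result, far below what a 16 GB machine can hold.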

The problem seems to be in this call

.Call("split_indices", group, as.integer(n))
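
For readers unfamiliar with that routine: as far as I understand it, split_indices takes a vector of group ids and the number of groups n, and returns, for each group, the positions that belong to it. The pure-R sketch below reproduces that behaviour under the assumption that group holds integers in 1..n. The function name split_indices_sketch and the range check are my own additions; I have not verified what the compiled routine does when a group id falls outside 1..n.

```r
# Pure-R sketch of the grouping step (an illustration, not the compiled code).
# Assumes `group` contains whole numbers between 1 and n.
split_indices_sketch <- function(group, n = max(group)) {
  # Guard added for this sketch: refuse ids outside 1..n.
  stopifnot(is.numeric(group), all(group >= 1), all(group <= n))
  # One list element per group id, holding the positions of that id.
  out <- split(seq_along(group), factor(group, levels = seq_len(n)))
  unname(out)
}

res <- split_indices_sketch(c(1L, 2L, 1L, 3L, 2L), 3L)
str(res)
```

In this small example, positions 1 and 3 land in the first group, positions 2 and 5 in the second, and position 4 in the third.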

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8      stringr_0.6.2 tools_3.0.1  


Solved by installing the latest version straight from GitHub.
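
For anyone hitting the same crash, a minimal sketch of how such an install might look. The assumptions here are mine: that the devtools package is available, and that the relevant repositories are "hadley/plyr" (where split_indices lives) and "hadley/reshape" (reshape2). The wrapper name install_dev_versions is hypothetical, and nothing is installed until it is called.

```r
# Hypothetical wrapper: install development versions from GitHub.
# The default repository names are assumptions, not taken from the report.
install_dev_versions <- function(repos = c("hadley/plyr", "hadley/reshape")) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("Please install the 'devtools' package first.")
  }
  for (repo in repos) {
    devtools::install_github(repo)  # needs network access
  }
  invisible(repos)
}

# Usage (run in a fresh R session, then restart R before retrying acast):
# install_dev_versions()
```

After installing, rerunning sessionInfo() should show whether the plyr and reshape2 versions actually changed.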

