I’ve found a bug in dcast in reshape2, where NA values are not handled correctly when drop=FALSE. Example:
library(reshape2)
d = data.frame(x = c(10, 10), y = c("A", NA), z=1:2)
# Uncomment to make y a character vector instead of a factor:
# d$y = as.character(d$y)
d2 = melt(d, measure.vars="z")
dcast(d, x+y~., fun=sum, value.var="z", drop=FALSE)
The actual error message depends on whether y is a factor or a character vector (uncomment the commented line to convert y to character). When it’s a factor vector, the error message says:
Error in split_indices(seq_along(.value), .group, .n) :
INTEGER() can only be applied to a 'integer', not a 'pairlist'
When it’s a character vector, it says:
Error: nrow(res$labels[]) == nrow(data) is not TRUE
Most surprisingly, if y is a factor vector and I run the dcast line repeatedly, R itself crashes. This is 100% reproducible: on Windows it crashes the third time the dcast line is run, and on Linux it crashes the first time. I can also reproduce it with the latest Git version.
--please do not edit the information below--
Maintainer: Hadley Wickham email@example.com
Built: R 2.15.1; ; 2012-06-23 16:36:47 UTC; windows
platform = i386-pc-mingw32
arch = i386
os = mingw32
system = i386, mingw32
major = 2
minor = 15.1
year = 2012
month = 06
day = 22
svn rev = 59607
language = R
version.string = R version 2.15.1 (2012-06-22)
nickname = Roasted Marshmallows
Windows XP (build 2600) Service Pack 3
.GlobalEnv, package:reshape2, package:stats, package:graphics,
package:grDevices, package:datasets, package:utils, package:methods,
With the dev version of plyr, this at least no longer crashes.
Better handling of .
Fixes #24. Fixes #6