Unexpected result with .SD #927

Closed
jrowen opened this issue Oct 31, 2014 · 1 comment

jrowen commented Oct 31, 2014

I expected the two approaches below to produce the same output: group by Grp, then compute A - B - C for each group. The second approach is probably preferable, but is the first a misuse or misunderstanding of .SD, or a bug?

> library(data.table)
data.table 1.9.4  For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.

> library(reshape2)

> tmp = data.table(Grp=LETTERS[1:10], A=1:10, B=11:20, C=21:30)

> tmp
    Grp  A  B  C
 1:   A  1 11 21
 2:   B  2 12 22
 3:   C  3 13 23
 4:   D  4 14 24
 5:   E  5 15 25
 6:   F  6 16 26
 7:   G  7 17 27
 8:   H  8 18 28
 9:   I  9 19 29
10:   J 10 20 30

> subtract = function(DT) {
+   for (nm in names(DT)[-1])
+     set(DT, j = nm, value = DT[[nm]] * -1)
+   DT[, base::sum(.SD)]
+ }

> tmp[, list(Value = subtract(.SD)), by = Grp]
    Grp Value
 1:   A   -31
 2:   B     2
 3:   C     3
 4:   D     4
 5:   E     5
 6:   F     6
 7:   G     7
 8:   H   NaN
 9:   I     9
10:   J    10

> subtract2 = function(x) {
+   ct = length(x)
+   x[2:ct] = x[2:ct] * -1
+   sum(x)
+ }

> tmp_ = melt(tmp, id.vars = "Grp", variable.factor = FALSE)

> tmp_[, list(Value = subtract2(value)), by = Grp]
    Grp Value
 1:   A   -31
 2:   B   -32
 3:   C   -33
 4:   D   -34
 5:   E   -35
 6:   F   -36
 7:   G   -37
 8:   H   -38
 9:   I   -39
10:   J   -40

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] reshape2_1.4     data.table_1.9.4

loaded via a namespace (and not attached):
[1] chron_2.3-45  plyr_1.8.1    Rcpp_0.11.3   stringr_0.6.2
[5] tools_3.1.1
@arunsrinivasan
Member

Updating .SD by reference with := or set() is not allowed yet. It looks like this case wasn't caught; it should raise an error. Will fix. Thank you.

For now, use subtract(copy(.SD)). Copying is of course an inefficient way to do it, but this will get better once shallow is exported.
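A minimal sketch of the suggested workaround, reusing `tmp` and `subtract` from the report. Two details are assumptions rather than part of the thread: the last line of `subtract` uses `sum(unlist(DT))` in place of the original `DT[, base::sum(.SD)]`, and the direct `A - B - C` form is added only as a cross-check. Behaviour may differ across data.table versions.

```r
library(data.table)

tmp <- data.table(Grp = LETTERS[1:10], A = 1:10, B = 11:20, C = 21:30)

# Negates every column except the first, by reference, then sums all values.
# Safe only when DT is a private copy, never .SD itself.
subtract <- function(DT) {
  for (nm in names(DT)[-1])
    set(DT, j = nm, value = DT[[nm]] * -1)
  sum(unlist(DT))  # assumption: stand-in for the original DT[, base::sum(.SD)]
}

# Workaround from the reply: pass a deep copy so set() never touches .SD
res_copy <- tmp[, list(Value = subtract(copy(.SD))), by = Grp]

# Cross-check without any by-reference update (assumes columns A, B, C)
res_direct <- tmp[, list(Value = A - B - C), by = Grp]

print(res_copy)
print(res_direct)
```

Both results should agree with the melt-based `subtract2` output in the report (-31 for group A through -40 for group J). The copy costs one allocation per group, which is the inefficiency the reply mentions.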

@arunsrinivasan arunsrinivasan added this to the v1.9.6 milestone Oct 31, 2014
@arunsrinivasan arunsrinivasan self-assigned this Oct 31, 2014