[R-Forge #5222] 'not found' when DT[, list(sum(non-.SD-col), lapply(.SD,mean)), by=..., .SDcols=...] #495

Closed
arunsrinivasan opened this issue Jun 8, 2014 · 12 comments
@arunsrinivasan (Member) commented Jun 8, 2014

Submitted by: Matt Weller; Assigned to: Nobody; R-Forge link

When using .SDcols (to apply a function to multiple columns), I cannot reference other columns of the original table (such as v1) using the following syntax:

dt = data.table(grp=c(2,3,3,1,1,2,3), v1=1:7, v2=7:1, v3=10:16)
dt.out = dt[, list(v1 = sum(v1), lapply(.SD, mean)), by = grp, .SDcols = v2:v3]
# Error in `[.data.table`(dt, , list(v1 = sum(v1), lapply(.SD, mean)), by = grp,  : 
#   object 'v1' not found

A similar error occurs when I use c instead of list; clearly, the column v1 cannot be accessed within the j expression.

As a workaround, I resorted to the following code, which includes column v1 in .SDcols even though I do not want it in the lapply portion, so I have to drop the extra column after the computation.

sd.cols = c("v1","v2", "v3")
dt.out = dt[, c(sum.v1 = sum(v1), lapply(.SD,mean)), by = grp, .SDcols = sd.cols]

According to eddi on Stack Overflow, this is a bug, and he has asked me to report it. I cannot provide much more detail, as I am not sure exactly which part he considers a bug; the accepted answer by Arun and the ensuing discussion show where the problem lies.

Here is the relevant SO post.

@arunsrinivasan (Member, Author) commented Jan 4, 2015

@MichaelChirico (Member) commented Jul 11, 2015

A bit late, but adding this question of mine to the pile.

@jangorecki (Member) commented Jul 11, 2015

I didn't even think of it as a bug. I usually add the additional required fields to .SDcols and then, in j, use .SD[, !"total", with=FALSE] to exclude the unwanted column.
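
For readers following along, here is a minimal sketch of that workaround applied to the example table from the original report. It is an illustration rather than code posted in the thread: the column v1 plays the role of "total", and the name out_sd is made up for this example.

```r
library(data.table)

# Example table from the original report
dt <- data.table(grp = c(2, 3, 3, 1, 1, 2, 3), v1 = 1:7, v2 = 7:1, v3 = 10:16)

# Workaround sketch: list v1 in .SDcols so that it is visible in j,
# then exclude it from .SD before applying mean to the remaining columns.
out_sd <- dt[,
  c(list(v1 = sum(v1)), lapply(.SD[, !"v1", with = FALSE], mean)),
  by = grp,
  .SDcols = c("v1", "v2", "v3")
]
print(out_sd)
```

The cost of this approach is that .SD is subset once per group, which may matter on tables with many groups.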

@MichaelChirico (Member) commented Jul 11, 2015

That's another good workaround. I wonder how its performance compares with using dt$total. And yes, this sort of straddles the line between FR and bug, IMO.
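
One possible reading of the dt$total alternative, sketched on the example table from the original report: index the full column by the group's row numbers via .I, so that the column does not need to appear in .SDcols. This is an assumption about what was meant, not code from the thread; v1 stands in for "total", and out_i is a made-up name.

```r
library(data.table)

# Example table from the original report
dt <- data.table(grp = c(2, 3, 3, 1, 1, 2, 3), v1 = 1:7, v2 = 7:1, v3 = 10:16)

# Alternative sketch: keep .SDcols limited to v2 and v3, and reach v1 through
# the table itself, using .I (the row numbers of the current group).
out_i <- dt[,
  c(list(v1 = sum(dt$v1[.I])), lapply(.SD, mean)),
  by = grp,
  .SDcols = c("v2", "v3")
]
print(out_i)
```

This avoids subsetting .SD inside j, but it ties the expression to the variable name dt, so it would need adjusting if the table were renamed or filtered in i.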

@DavidArenburg (Member) commented Aug 19, 2015

Bumping this up again. It looks like this could be a very important fix. This question seems to be related and could potentially be solved via DT[, (deltaColsNewNames) := lapply(.SD, normalDelta, price), .SDcols = deltaColsNames]

@franknarf1 (Contributor) commented Sep 10, 2015

Here's another simple case where this would be useful: http://stackoverflow.com/a/32498711/1191259

@rentrop commented Oct 5, 2015

@franknarf1 (Contributor) commented Oct 9, 2015

Another to update when fixed: http://stackoverflow.com/q/32915770/1191259

@arunsrinivasan (Member, Author) commented Mar 7, 2016

Yay! We can now do this:

require(data.table)
dt = data.table(grp=c(2,3,3,1,1,2,3), v1=1:7, v2=7:1, v3=10:16)
dt.out = dt[, c(v1 = sum(v1),  lapply(.SD,mean)), by = grp, .SDcols = v2:v3]
#    grp v1  v2   v3
# 1:   2  7 4.5 12.5
# 2:   3 12 4.0 13.0
# 3:   1  9 3.5 13.5

@arunsrinivasan (Member, Author) commented Mar 8, 2016

Updated all SO posts linked here. Thanks to all.

@DavidArenburg (Member) commented Mar 8, 2016

Thanks, @arunsrinivasan. I had been waiting for this fix for a couple of years.

@rentrop commented Mar 8, 2016

Awesome! Thank you.