names(.SD) sometimes incorrect #1965

Closed
MichaelChirico opened this issue Dec 19, 2016 · 5 comments · Fixed by #3771

MichaelChirico (Member) commented Dec 19, 2016

Related to #495. names(.SD) incorrectly (unexpectedly?) includes every variable referenced in j, even one excluded by .SDcols:

library(data.table)
DT = data.table(
  CRIM  = c(0.00632, 0.02731, 0.02729, 0.03237, 0.06905),
  INDUS = c(2.31, 7.07, 7.07, 2.18, 2.18),
  NOX   = c(0.538, 0.469, 0.469, 0.458, 0.458),
  MEDV  = c(24, 21.6, 34.7, 33.4, 36.2)
)
DT
#       CRIM INDUS   NOX MEDV
# 1: 0.00632  2.31 0.538 24.0
# 2: 0.02731  7.07 0.469 21.6
# 3: 0.02729  7.07 0.469 34.7
# 4: 0.03237  2.18 0.458 33.4
# 5: 0.06905  2.18 0.458 36.2

I expected the following to reproduce the problem, but it doesn't (i.e., this works as expected):

DT[ , {
  print(names(.SD))
  c(list(MEDV), lapply(.SD, `+`, 1))},
  .SDcols = !"MEDV"]
# [1] "CRIM"  "INDUS" "NOX"  
# [...]

So I'm including the more complicated example which brought the problem out:

DT[ , paste0(
  MEDV, " | ", sapply(transpose(lapply(
    names(.SD), function(jj)
      paste0(jj, ":", get(jj)))),
    paste, collapse = " ")),
  .SDcols = !"MEDV"]
# [1] "24 | CRIM:0.00632 INDUS:2.31 NOX:0.538 MEDV:24"    
# [2] "21.6 | CRIM:0.02731 INDUS:7.07 NOX:0.469 MEDV:21.6"
# [3] "34.7 | CRIM:0.02729 INDUS:7.07 NOX:0.469 MEDV:34.7"
# [4] "33.4 | CRIM:0.03237 INDUS:2.18 NOX:0.458 MEDV:33.4"
# [5] "36.2 | CRIM:0.06905 INDUS:2.18 NOX:0.458 MEDV:36.2"

The trailing MEDV:... entry shouldn't be there. If we drop the outer paste0 (and with it the reference to MEDV), it isn't:

DT[ , sapply(transpose(lapply(
    names(.SD), function(jj)
      paste0(jj, ":", get(jj)))),
    paste, collapse = " "),
  .SDcols = !"MEDV"]
# [1] "CRIM:0.00632 INDUS:2.31 NOX:0.538"
# [2] "CRIM:0.02731 INDUS:7.07 NOX:0.469"
# [3] "CRIM:0.02729 INDUS:7.07 NOX:0.469"
# [4] "CRIM:0.03237 INDUS:2.18 NOX:0.458"
# [5] "CRIM:0.06905 INDUS:2.18 NOX:0.458"

Maybe somehow names(.SD) is interacting with get?

Even the old trick of replacing MEDV with DT$MEDV doesn't work (nor copying it)!

I had to define mv in the main environment and replace MEDV with mv.
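
For reference, a minimal sketch of that last workaround, rebuilding DT so the snippet runs on its own. The name mv comes from the description above; the rest is the failing call with only MEDV swapped out, and whether the stray entry shows up at all will depend on the data.table version installed:

```r
library(data.table)

DT = data.table(
  CRIM  = c(0.00632, 0.02731, 0.02729, 0.03237, 0.06905),
  INDUS = c(2.31, 7.07, 7.07, 2.18, 2.18),
  NOX   = c(0.538, 0.469, 0.469, 0.458, 0.458),
  MEDV  = c(24, 21.6, 34.7, 33.4, 36.2)
)

# Copy the column into a plain variable outside of DT, so that j no longer
# refers to the MEDV column by name.
mv = DT$MEDV

res = DT[ , paste0(
  mv, " | ", sapply(transpose(lapply(
    names(.SD), function(jj)
      paste0(jj, ":", get(jj)))),
    paste, collapse = " ")),
  .SDcols = !"MEDV"]
res
```

With MEDV no longer named in j, the result should contain one string per row and no "MEDV:" entry.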

franknarf1 (Contributor) commented Dec 19, 2016

Maybe related to #1744

MichaelChirico (Member, Author) commented Dec 19, 2016

@franknarf1 in fact I guess it's the same. I'll leave this open just in case...

jangorecki (Member) commented Mar 17, 2019

paste is not the problem here; the problem is that MEDV was referenced in j, which somehow forced it to be included in names(.SD).

MichaelChirico (Member, Author) commented Aug 17, 2019

@jangorecki I have a fix but it's not quite so simple as having MEDV in j:

DT[ , {MEDV; names(.SD)}, .SDcols = !'MEDV']
# [1] "CRIM"  "INDUS" "NOX"

It's get, so in fact this is identical to #1744:

DT[ , {MEDV; get('CRIM'); names(.SD)}, .SDcols = !'MEDV']
# [1] "CRIM"  "INDUS" "NOX"   "MEDV"
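
Since get() is the trigger, one way to sidestep the problem on affected versions may be to keep get() out of j altogether and read the values from .SD directly. A minimal sketch, assuming the same DT as in the original report; the Map()-based rewrite is only an illustration of the idea, not the fix itself:

```r
library(data.table)

DT = data.table(
  CRIM  = c(0.00632, 0.02731, 0.02729, 0.03237, 0.06905),
  INDUS = c(2.31, 7.07, 7.07, 2.18, 2.18),
  NOX   = c(0.538, 0.469, 0.469, 0.458, 0.458),
  MEDV  = c(24, 21.6, 34.7, 33.4, 36.2)
)

res = DT[ , {
  # Pair each .SD column with its name; no get() appears anywhere in j.
  labelled = Map(function(nm, col) paste0(nm, ":", col), names(.SD), .SD)
  # Paste the labelled columns together row by row (unname() so the column
  # names are not passed to paste() as argument names), then prefix MEDV.
  paste0(MEDV, " | ", do.call(paste, unname(labelled)))
}, .SDcols = !"MEDV"]
res
```

This mirrors the first example in the report, where MEDV is used in j alongside .SD without get() and names(.SD) comes out as expected.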
