setDT and setkeyv inside function #5618
Comments
Please let me know if I can add more information or context to make this question more understandable.
One of the comments in an answer to your linked stack overflow question summarizes the big picture in broad strokes:
If you don't want a |
Thanks for the comment. Indeed, the side effect was really surprising and I wanted to understand how that mechanically happens. After |
This assumption actually contradicts itself. When you make a shallow copy, you create a new list of pointers to the same data:
so if you change
> foo <- function(dt) {
print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
setDT(dt)
print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
setkeyv(dt, "a")
print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
return(dt[])
}
> x <- data.frame(a = c(3,2,1))
> address(x)
[1] "0x11f8acbc0"
> address(x$a)
[1] "0x111a15798"
> foo(x)
[1] "dt: 0x11f8acbc0 dt$a: 0x111a15798"
[1] "dt: 0x10ee0a600 dt$a: 0x111a15798"
[1] "dt: 0x10ee0a600 dt$a: 0x111a15798"
a
1: 1
2: 2
3: 3
> x
a
1: 1
2: 2
3: 3
> address(x)
[1] "0x11f8acbc0"
> address(x$a)
[1] "0x111a15798"
If you want a deep copy, you should use
> foo <- function(df) {
+ print(paste("df:", address(df), " df$a:", address(df$a)))
+ dt <- as.data.table(df)
+ print(paste("df:", address(df), " df$a:", address(df$a)))
+ print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
+ setkeyv(dt, "a")
+ print(paste("df:", address(df), " df$a:", address(df$a)))
+ print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
+ return(dt[])
+ }
> x <- data.frame(a = c(3,2,1))
> foo(x)
[1] "df: 0x10b7e2b70 df$a: 0x1111fcb78"
[1] "df: 0x10b7e2b70 df$a: 0x1111fcb78"
[1] "dt: 0x11194ce00 dt$a: 0x10b5524a8"
[1] "df: 0x10b7e2b70 df$a: 0x1111fcb78"
[1] "dt: 0x11194ce00 dt$a: 0x10b5524a8"
a
1: 1
2: 2
3: 3
> x
a
1 3
2 2
3 1
If you want consistent reference semantics, you should call
> bar <- function(dt) {
+ print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
+ setkeyv(dt, "a")
+ print(paste("dt:", address(dt), " dt$a:", address(dt$a)))
+ dt[, b := 7]
+ print(paste("dt:", address(dt), " dt$a:", address(dt$a), " dt$b:", address(dt$b)))
+ return(dt[])
+ }
> y <- data.frame(a = c(7,6,5))
> address(y)
[1] "0x1101fa8c8"
> address(y$a)
[1] "0x13eb354d8"
> setDT(y)
> address(y)
[1] "0x111a0be00"
> address(y$a)
[1] "0x13eb354d8"
> bar(y)
[1] "dt: 0x111a0be00 dt$a: 0x13eb354d8"
[1] "dt: 0x111a0be00 dt$a: 0x13eb354d8"
[1] "dt: 0x111a0be00 dt$a: 0x13eb354d8 dt$b: 0x111998598"
a b
1: 5 7
2: 6 7
3: 7 7
> y
a b
1: 5 7
2: 6 7
3: 7 7
> address(y)
[1] "0x111a0be00"
> address(y$a)
[1] "0x13eb354d8"
> address(y$b)
[1] "0x111998598"
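As a possible variant of the deep-copy approach above, here is a minimal sketch that copies explicitly with data.table's copy() before converting. Note that copy() is not used anywhere in this thread, and key_on_copy is a hypothetical helper name chosen only for illustration.

```r
library(data.table)

# key_on_copy is a hypothetical helper name used only for this illustration.
key_on_copy <- function(df) {
  dt <- copy(df)     # deep copy: new column vectors, detached from the caller's object
  setDT(dt)          # convert the local copy by reference
  setkeyv(dt, "a")   # sort the copy's columns in place
  dt[]
}

x <- data.frame(a = c(3, 2, 1))
res <- key_on_copy(x)

print(res$a)   # values from the keyed copy
print(x$a)     # the caller's column
```

Because the sort happens on the copied columns, the caller's x should keep its original row order and remain a plain data.frame, at the cost of duplicating the data once.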
Hi all! This is likely related to #4783 and #4816.
Consider this example:
The above is expected. From what I understood, setDT will make a shallow copy of x and then reference another object, hence the change in address from 0x106cbd5c8 to 0x115082a00. However, what is unexpected to me is that the order of the original variable, x, also changed:
Is this expected behavior? I thought that the setDT inside foo would make a shallow copy of the passed object and would then reference that copy instead of the original, so that setkeyv would only arrange the data inside the function, but not the data outside the function, just like in the example in this #SO, where the data in the original variable is not modified.
Question: Why does setkeyv reorder x, even though we are calling setDT inside foo?
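One way to probe the shallow-copy assumption in this question is to compare column addresses before and after setDT. The following is a rough sketch only; the variable names addr_before and addr_after are made up for illustration, and it relies on address() from data.table, as in the transcripts elsewhere in this thread.

```r
library(data.table)

x <- data.frame(a = c(3, 2, 1))

addr_before <- address(x$a)   # address of the column vector before conversion
setDT(x)                      # convert by reference
addr_after <- address(x$a)    # address of the column vector after conversion

# If the two addresses match, the column data was not copied, so any
# by-reference reordering (e.g. setkeyv) acts on the same underlying vector.
print(identical(addr_before, addr_after))
```

If the addresses match, only the outer list of column pointers was replaced, which would be consistent with setkeyv reordering data that is still visible through the original variable.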