Commit
Darren Wong committed Aug 4, 2023 (1 parent c034d07, commit d57b0a9)
Showing 7 changed files with 130 additions and 0 deletions.
data-science-programming/r/advanced-r/2 Names and Values/2.3 Copy-on-modify.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,132 @@ | ||
#textbook_advanced-r #r | ||
|
||
The following code creates a vector and binds `x` and `y` names to that object. It then modifies `y`. | ||
|
||
```r | ||
x <- c(1, 2, 3) | ||
y <- x | ||
|
||
y[[3]] <- 4 | ||
x | ||
#> [1] 1 2 3 | ||
``` | ||
|
||
Modifying `y` didn't change `x`, even though both names were bound to the same object. So what happened to the shared binding? | ||
|
||
- The value associated with `y` changed, but the original object did not. R created a new object (a copy of the original with one value changed) and rebound `y` to it. The copy was made only when `y` was modified, hence the name *copy-on-modify*. | ||
- R objects are generally *immutable* (with a few [[important exceptions, including rules that lead to modify-in-place behaviour]]). | ||
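
The two points above can be checked by comparing object addresses before and after the modification. This is a minimal sketch that assumes the `lobstr` package is installed; `same_before` and `same_after` are illustrative names, not part of the original notes.

```r
# Assumes lobstr is installed: install.packages("lobstr")
x <- c(1, 2, 3)
y <- x

# Before the modification, both names are bound to the same object.
same_before <- lobstr::obj_addr(x) == lobstr::obj_addr(y)

y[[3]] <- 4

# After the modification, y is bound to a new object; x is untouched.
same_after <- lobstr::obj_addr(x) == lobstr::obj_addr(y)

same_before
#> [1] TRUE
same_after
#> [1] FALSE
```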
|
||
## 2.3.1 `tracemem()` | ||
|
||
A useful tool for seeing when an object gets copied is `tracemem()`. Calling it on an object returns the object's current address and marks the object for tracing. From then on, whenever the object is copied, R prints a message showing the old address, the new address, and the sequence of calls that led to the copy. | ||
|
||
```r | ||
x <- c(1, 2, 3) | ||
cat(tracemem(x), '\n') | ||
#> <0x7feb7bfeaf98> | ||
|
||
y <- x | ||
y[[3]] <- 4L | ||
#> tracemem[0x7feb7bfeaf98 -> 0x7feb7bfb13f8]: | ||
|
||
# Turn memory tracing off | ||
untracemem(x) | ||
``` | ||
|
||
Note that if we modify `y` again, there won't be another copy. The new object that `y` is bound to will just get updated instead. This is because there is only one name bound to this object, so R applies [[modify-in-place optimisation]]. | ||
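
One way to check this claim is to compare the address of `y` before and after a second modification. This is a sketch, assuming `lobstr` is installed and the code runs in a plain R session (RStudio's environment pane can hold extra references to objects and force additional copies); `addr_first` and `addr_second` are illustrative names.

```r
# Assumes lobstr is installed.
x <- c(1, 2, 3)
y <- x

y[[3]] <- 4                          # first modification: y gets its own copy
addr_first <- lobstr::obj_addr(y)

y[[3]] <- 5                          # second modification: only y is bound to this object
addr_second <- lobstr::obj_addr(y)

# TRUE means the object was modified in place rather than copied again.
addr_first == addr_second
```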
|
||
## 2.3.2 Function calls | ||
|
||
The same rules for copying also apply to function calls. The following code defines a function `f()` that returns its argument unchanged, then assigns the result of `f(x)` to `z`. | ||
|
||
```r | ||
f <- function(a) { | ||
a | ||
} | ||
|
||
x <- c(1, 2, 3) | ||
cat(tracemem(x), "\n") | ||
#> <0x7feb7e83c858> | ||
|
||
z <- f(x) | ||
# No copy here | ||
``` | ||
|
||
While `f()` is running, `a` inside the function points to the same object as `x` does outside it. Since nothing modifies that object, the behaviour is the same as at the start of this page: once `f()` returns, `z` points to the same object as `x`. | ||
|
||
If `f()` modified `a` and returned it, R would create a modified copy inside the function and `z` would be bound to that new object, leaving `x` unchanged. | ||
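
The two cases can be put side by side. This is a sketch assuming `lobstr` is installed; `g()` is a hypothetical function added here for contrast and does not appear in the original notes.

```r
# Assumes lobstr is installed. g() is a hypothetical function added for contrast.
f <- function(a) {
  a                 # returns its argument unchanged
}
g <- function(a) {
  a[[1]] <- 10      # modifying a triggers a copy inside the function
  a
}

x <- c(1, 2, 3)
z1 <- f(x)
z2 <- g(x)

lobstr::obj_addr(z1) == lobstr::obj_addr(x)
#> [1] TRUE
lobstr::obj_addr(z2) == lobstr::obj_addr(x)
#> [1] FALSE
x
#> [1] 1 2 3
```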
|
||
## 2.3.3 Lists | ||
|
||
It's not just names (i.e. variables) that point to values; elements of lists do as well. | ||
|
||
```r | ||
l1 <- list(1, 2, 3) | ||
l2 <- l1 | ||
``` | ||
|
||
Instead of storing the values (1, 2, 3) directly, the list stores *references* to them. The assignment `l2 <- l1` copies nothing; both names point to the same list. When R does copy a list (on modification), it makes a *shallow* copy: the list object and its bindings are copied, but the values pointed to by the bindings are not. | ||
|
||
![[Pasted image 20230804115514.png]] | ||
|
||
When we modify a list, only the binding for the modified element changes; the unmodified elements are still shared with the original list. | ||
|
||
```r | ||
l2[[3]] <- 4 | ||
``` | ||
|
||
![[Pasted image 20230804115506.png]] | ||
|
||
To see the values shared across lists, use `lobstr::ref()`, which prints the memory address of each object along with a local ID so you can easily cross-reference shared components. | ||
|
||
```r | ||
lobstr::ref(l1, l2) | ||
#> █ [1:0x7feb7bc31858] <list> | ||
#> ├─[2:0x7feb7914d270] <dbl> | ||
#> ├─[3:0x7feb7914d238] <dbl> | ||
#> └─[4:0x7feb7914d200] <dbl> | ||
#> | ||
#> █ [5:0x7feb69e430d8] <list> | ||
#> ├─[2:0x7feb7914d270] | ||
#> ├─[3:0x7feb7914d238] | ||
#> └─[6:0x7feb7ef98500] <dbl> | ||
``` | ||
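
The same sharing can be checked programmatically rather than by reading the tree. This is a sketch assuming `lobstr` is installed; it uses `lobstr::obj_addrs()`, which returns the address of each element, and the names `lists_differ` and `elements_shared` are illustrative.

```r
# Assumes lobstr is installed.
l1 <- list(1, 2, 3)
l2 <- l1
l2[[3]] <- 4

# The two lists are now different objects...
lists_differ <- lobstr::obj_addr(l1) != lobstr::obj_addr(l2)

# ...but only the third element points to a new value.
elements_shared <- lobstr::obj_addrs(l1) == lobstr::obj_addrs(l2)

lists_differ
#> [1] TRUE
elements_shared
#> [1]  TRUE  TRUE FALSE
```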
|
||
## 2.3.4 Data-frames | ||
|
||
Data frames are just lists of vectors, so the concepts in [[#2.3.3 Lists]] carry over directly. | ||
|
||
When you modify a column, only that column needs to be modified; the others will still point to their original references. | ||
|
||
```r | ||
d1 <- data.frame(x = c(1, 5, 6), y = c(2, 4, 3)) | ||
d2 <- d1 | ||
d2[, 2] <- d2[, 2] * 2 | ||
``` | ||
|
||
![[Pasted image 20230804115842.png]] | ||
|
||
But if you modify a row, you change a value in every column, so every column must be copied. | ||
|
||
```r | ||
d3 <- d1 | ||
d3[1, ] <- d3[1, ] * 3 | ||
``` | ||
|
||
![[Pasted image 20230804115924.png]] | ||
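
The difference between the column and row cases can be checked by comparing column addresses against `d1`. This is a sketch assuming `lobstr` is installed; `cols_shared_d2` and `cols_shared_d3` are illustrative names.

```r
# Assumes lobstr is installed.
d1 <- data.frame(x = c(1, 5, 6), y = c(2, 4, 3))

d2 <- d1
d2[, 2] <- d2[, 2] * 2        # modify one column

d3 <- d1
d3[1, ] <- d3[1, ] * 3        # modify one row

# Per-column comparison of addresses against d1 (order: x, y).
cols_shared_d2 <- lobstr::obj_addrs(d1) == lobstr::obj_addrs(d2)
cols_shared_d3 <- lobstr::obj_addrs(d1) == lobstr::obj_addrs(d3)

cols_shared_d2
cols_shared_d3
```

If the rules above hold, `cols_shared_d2` should be `TRUE` for `x` and `FALSE` for `y`, while `cols_shared_d3` should be `FALSE` for both columns.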
|
||
## 2.3.5 Character vectors | ||
|
||
The final place where R uses references is in character vectors. R keeps a *global string pool*, and each element of a character vector is a pointer to a unique string in that pool, so repeated strings are stored only once. You can display the reference held by each element using `lobstr::ref(., character = TRUE)`. | ||
|
||
```r | ||
x <- c("a", "a", "abc", "d") | ||
lobstr::ref(x, character = TRUE) | ||
#> █ [1:0x7feb79c75b78] <chr> | ||
#> ├─[2:0x7fec002d8fc8] <string: "a"> | ||
#> ├─[2:0x7fec002d8fc8] | ||
#> ├─[3:0x7febca7dda88] <string: "abc"> | ||
#> └─[4:0x7fec003c6e70] <string: "d"> | ||
``` | ||
|
||
![[Pasted image 20230804120121.png]] |
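
The pool is shared across vectors, not just within one. This is a sketch assuming `lobstr` is installed; `y` and `addrs` are illustrative names added here.

```r
# Assumes lobstr is installed.
x <- c("a", "a", "abc", "d")
y <- "a"                     # a separate vector, created independently of x

addrs <- lobstr::obj_addrs(x)

# Both "a" elements of x point to the same pool entry.
addrs[1] == addrs[2]
#> [1] TRUE
addrs[1] == addrs[3]
#> [1] FALSE

# A different vector containing "a" points to that same pool entry.
lobstr::obj_addrs(y) == addrs[1]
#> [1] TRUE
```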
Empty file.
Binary file added
BIN
+31.5 KB
...ing/r/advanced-r/2 Names and Values/attachments/Pasted image 20230804115506.png
Binary file added
BIN
+17.8 KB
...ing/r/advanced-r/2 Names and Values/attachments/Pasted image 20230804115514.png
Binary file added
BIN
+41 KB
...ing/r/advanced-r/2 Names and Values/attachments/Pasted image 20230804115842.png
Binary file added
BIN
+40 KB
...ing/r/advanced-r/2 Names and Values/attachments/Pasted image 20230804115924.png
Binary file added
BIN
+35.8 KB
...ing/r/advanced-r/2 Names and Values/attachments/Pasted image 20230804120121.png