Skip to content

Commit

Permalink
Vault backup: 2023-08-04 12:27:50
Browse files Browse the repository at this point in the history
  • Loading branch information
Darren Wong committed Aug 4, 2023
1 parent c034d07 commit d57b0a9
Show file tree
Hide file tree
Showing 7 changed files with 130 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -1,2 +1,132 @@
#textbook_advanced-r #r

The following code creates a vector and binds `x` and `y` names to that object. It then modifies `y`.

```r
x <- c(1, 2, 3)
y <- x

y[[3]] <- 4
x
#> [1] 1 2 3
```

Modifying `y` didn't change `x`, so what happened to the shared object address?

- The value associated with `y` changed. R created a new object to associate with `y` - a copy of `x` with one value changed. i.e. The copy only occurred when `y` was modified (*copy-on-modify*).
- R objects are generally *immutable* (with a few [[important exceptions, including rules that lead to modify-in-place behaviour]]).

## 2.3.1 `tracemem()`

A useful tool for seeing how an object gets copied is by running `tracemem()` on the object. This will print a message telling you when the object was copied, its new address, and the sequence of calls that led to the copy.

```r
x <- c(1, 2, 3)
cat(tracemem(x), '\n')
#> <0x7feb7bfeaf98>

y <- x
y[[3]] <- 4L
#> tracemem[0x7feb7bfeaf98 -> 0x7feb7bfb13f8]:

# Turn memory tracing off
untracemem(x)
```

Note that if we modify `y` again, there won't be another copy. The new object that `y` is bound to will just get updated instead. This is because there is only one name bound to this object, so R applies [[modify-in-place optimisation]].

## 2.3.2 Function calls

The same rules for copying also apply to function calls. The following code creates a function that calls the object passed as an argument and assigns it to the variable name we provide using arrow assignment.

```r
f <- function(a) {
a
}

x <- c(1, 2, 3)
cat(tracemem(x), "\n")
#> <0x7feb7e83c858>

z <- f(x)
# No copy here
```

While `f()` is running, the `a` points to the same value as the `x` does outside the function. Since it points to the same object with no modifications, the behaviour is the same as at the start of this page - i.e. `z` now points to the same value as `x` does.

If `f()` did modify `x`, then R would create a new modified copy and bind that to `z`

## 2.3.3 Lists

It's not just names that point to variables, elements of lists do as well.

```r
l1 <- list(1, 2, 3)
l2 <- l1
```

Instead of storing the values (1, 2, 3), the list stores *references* to them. When we copy the list, we're making a *shallow* copy; i.e. the list object and its bindings are copied, but the values pointed to by the bindings are not.

![[Pasted image 20230804115514.png]]

When we modify a list, only the modified items have their bindings changed

```r
l2[[3]] <- 4
```

![[Pasted image 20230804115506.png]]

To see the values shared across lists, use `lobstr::ref()` which prints the memory address of each object along with a local ID so you can easily cross-reference shared components.

```r
> lobstr::ref(l1, l2)
#> █ [1:0x7feb7bc31858] <list>
#> ├─[2:0x7feb7914d270] <dbl>
#> ├─[3:0x7feb7914d238] <dbl>
#> └─[4:0x7feb7914d200] <dbl>
#>
#> █ [5:0x7feb69e430d8] <list>
#> ├─[2:0x7feb7914d270]
#> ├─[3:0x7feb7914d238]
#> └─[6:0x7feb7ef98500] <dbl>
```

## 2.3.4 Data-frames

Data frames are just lists of vectors, so the concepts in [[#2.3.3 Lists]] are easily extensible to this case.

When you modify a column, only that column needs to be modified; the others will still point to their original references.

```r
d1 <- data.frame(x = c(1, 5, 6), y = c(2, 4, 3))
d2 <- d1
d2[, 2] <- d2[, 2] * 2
```

![[Pasted image 20230804115842.png]]

But if you modify a row, you're changing values across every column, so all columns will be copied

```r
d3 <- d1
d3[1, ] <- d3[1, ] * 3
```

![[Pasted image 20230804115924.png]]

## 2.3.5 Character vectors

The final place where R uses references is in character vectors. How character vectors actually work is that there is a *global string pool*, and each element of a character vector points to a unique string in the pool. You can request the reference of each string in the pool using `lobstr::ref(., character = TRUE)`.

```r
x <- c("a", "a", "abc", "d")
lobstr::ref(x, character = TRUE)
#> █ [1:0x7feb79c75b78] <chr>
#> ├─[2:0x7fec002d8fc8] <string: "a">
#> ├─[2:0x7fec002d8fc8]
#> ├─[3:0x7febca7dda88] <string: "abc">
#> └─[4:0x7fec003c6e70] <string: "d">
```

![[Pasted image 20230804120121.png]]
Empty file.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit d57b0a9

Please sign in to comment.