```{r, include = FALSE}
source("common.R")
```
# (PART) Techniques {-}
\stepcounter{chapter}
# Measuring performance
<!-- 23 -->
\stepcounter{section}
## Profiling
<!-- 23.2 -->
__[Q1]{.Q}__: Profile the following function with `torture = TRUE`. What is surprising? Read the source code of `rm()` to figure out what's going on.
```{r}
f <- function(n = 1e5) {
  x <- rep(1, n)
  rm(x)
}
```
__[A]{.solved}__: We expect `f()` to create a vector (`x`) of length `n`, which is then removed, so that `f()` just returns `NULL`. When we profile this function as is, it executes too quickly for the sampling profiler to produce meaningful results.
```{r, error = TRUE}
profvis::profvis(f())
```
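As a side note, when a function finishes too quickly to be picked up by the sampling profiler, a common workaround is to repeat the call until it becomes visible, for example (a sketch, not evaluated here):
```{r, eval = FALSE}
# Repeat f() so that the sampling profiler records enough samples.
profvis::profvis({
  for (i in 1:1000) f()
})
```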
Setting `torture = TRUE` triggers garbage collection after every memory allocation call, which may be useful for more exact memory profiling.
```{r, eval = FALSE}
profvis::profvis(f(), torture = TRUE)
```
Surprisingly, profiling `f()` like this takes a very long time. What could be the reason?
We follow the hint in the question and inspect the source code of `rm()`:
```{r, eval = FALSE}
function (..., list = character(), pos = -1,
          envir = as.environment(pos),
          inherits = FALSE)
{
  dots <- match.call(expand.dots = FALSE)$...
  if (
    length(dots) && !all(
      vapply(dots, function(x) is.symbol(x) ||
               is.character(x), NA, USE.NAMES = FALSE)
    )
  )
    stop("... must contain names or character strings")
  names <- vapply(dots, as.character, "")
  if (length(names) == 0L)
    names <- character()
  list <- .Primitive("c")(list, names)
  .Internal(remove(list, envir, inherits))
}
```
`rm()` does a surprising amount of work to get the name of the object to delete because it relies on non-standard evaluation.
We can make the job of `rm()` considerably simpler by using the `list` argument:
```{r, eval = FALSE}
f2 <- function(n = 1e5) {
  x <- rep(1, n)
  rm(list = "x")
}
profvis::profvis(f2(), torture = TRUE)
```
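As an aside, the cost of the non-standard-evaluation path can be measured directly. The following sketch compares the two calling styles with `bench::mark()`; each expression recreates `x` first so that `rm()` always has something to remove:
```{r}
bench::mark(
  nse  = {x <- 1; rm(x)},
  list = {x <- 1; rm(list = "x")}
)
```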
Unfortunately, profiling `f2()` with `torture = TRUE` still takes far too long, and we remain stuck in the profiling step.
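To see why this is so slow, note that with GC torture a full garbage collection runs at every single allocation, so even trivially small allocations become expensive. The same effect can be reproduced without `profvis` via base R's `gctorture()` (presumably what `torture = TRUE` builds on; a sketch, not evaluated here because it can run for a very long time):
```{r, eval = FALSE}
gctorture(TRUE)   # force a full garbage collection at every allocation
system.time({
  x <- rep(1, 1e5)
  rm(list = "x")
})
gctorture(FALSE)  # back to normal
```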
Anecdotally, one of the authors once managed to complete the profiling under an older version of R, but the output did not seem very meaningful.
In conclusion, this question appears to be unanswerable for us, even for Hadley.
## Microbenchmarking
<!-- 23.3 -->
__[Q1]{.Q}__: Instead of using `bench::mark()`, you could use the built-in function `system.time()`. But `system.time()` is much less precise, so you'll need to repeat each operation many times with a loop, and then divide to find the average time of each operation, as in the code below.
```{r, eval = FALSE}
n <- 1e6
system.time(for (i in 1:n) sqrt(x)) / n
system.time(for (i in 1:n) x ^ 0.5) / n
```
How do the estimates from `system.time()` compare to those from `bench::mark()`? Why are they different?
__[A]{.solved}__: We first microbenchmark these two expressions using `bench::mark()` [@bench] and observe that the mean is not reported (as it is generally more affected by outliers).
```{r}
n <- 1e6
x <- runif(100)
bench_df <- bench::mark(
  sqrt(x),
  x ^ 0.5,
  iterations = n
)
bench_df
```
We need to access the raw data so that we can compare the results of both benchmarking approaches.
```{r}
t1_bench <- mean(unlist(bench_df[1, "time"]))
t2_bench <- mean(unlist(bench_df[2, "time"]))
t1_systime <- system.time(for (i in 1:n) sqrt(x)) / n
t2_systime <- system.time(for (i in 1:n) x ^ 0.5) / n
```
We see that both approaches get the order of magnitude right. We assume that the `bench::mark()` results may be a little more accurate, because of its high-precision timer. There may also be overhead introduced by the `for` loop in the `system.time()` approach.
```{r}
# Compare the results
t1_systime["elapsed"]
t1_bench
t2_systime["elapsed"]
t2_bench
```
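To get a rough idea of how much of the `system.time()` estimates is pure loop overhead, we can time a loop with an empty body (a quick sketch):
```{r}
# Per-iteration cost of the bare for loop itself.
t0_systime <- system.time(for (i in 1:n) NULL) / n
t0_systime["elapsed"]
```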
Side note: take a look at `?proc.time` if you want to learn about the differences between "user", "system", and "elapsed" time.
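For example, sleeping consumes wall-clock time but almost no CPU time, so "elapsed" is close to one second while "user" and "system" stay near zero:
```{r}
system.time(Sys.sleep(1))
```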
__[Q2]{.Q}__: Here are two other ways to compute the square root of a vector. Which do you think will be fastest? Which will be slowest? Use microbenchmarking to test your answers.
```{r, eval = FALSE}
x ^ (1 / 2)
exp(log(x) / 2)
```
__[A]{.solved}__: To compare these approaches, we'll `bench::mark()` them and sort the result by the median execution time.
```{r}
x <- runif(100)
bm <- bench::mark(
  sqrt(x),
  x ^ 0.5,
  x ^ (1 / 2),
  exp(log(x) / 2)
)
bm[order(bm$median), ]
```
As one might expect, the idiomatic primitive function `sqrt()` is the fastest. The approach `exp(log(x) / 2)`, which builds on two other primitive functions, comes second, though it is already considerably slower. The remaining calculations are slower still: `x ^ 0.5` is faster than `x ^ (1 / 2)`, because `0.5` requires less computation than `(1 / 2)`.
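As a quick sanity check (a sketch), most of the gap between `x ^ 0.5` and `x ^ (1 / 2)` should disappear once the exponent is computed ahead of time:
```{r}
p <- 1 / 2  # the division now happens once, not on every evaluation
bm2 <- bench::mark(
  x ^ 0.5,
  x ^ (1 / 2),
  x ^ p
)
bm2[order(bm2$median), ]
```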