Rperf.Rmd

---
title: "R in Grenoble #38"
output:
  xaringan::moon_reader:
    seal: false
    lib_dir: libs
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
---

```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE, width = 70)
knitr::opts_chunk$set(fig.align = 'center', dev = "svg", out.width = "70%",
                      echo = FALSE, comment = "", fig.width = 5, global.par = TRUE)
ICON_R_PROJECT <- icons::fontawesome$brands$`r-project`
ICON_TRI_EXCL <- icons::fontawesome$solid$`exclamation-triangle`
```

class: title-slide center middle inverse

<br>

# Performance of `r icons::icon_style(fill = "white", ICON_R_PROJECT)` code

<br>

## `r icons::icon_style(fill = "white", ICON_R_PROJECT)` in Grenoble

<br>

### Florian Privé

<br>

---

class: center middle inverse

# Part 1: Memory management in `r icons::icon_style(fill = "white", ICON_R_PROJECT)`

<br>

#### Read more with [this chapter of Advanced R](https://adv-r.hadley.nz/names-values.html)

---

### Understanding binding basics

```{r, echo = TRUE}
x <- c(1, 2, 3)
```

* It's creating an object, a vector of values, `c(1, 2, 3)`.
* And it's binding that object to a name, `x`.

```{r, out.width="32%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/bd90c87ac98708b1731c92900f2f53ec6a71edaf/ce375/diagrams/name-value/binding-1.png")
```

--

<br>

```{r, echo = TRUE}
y <- x
```
```{r, out.width="30%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/bdc72c04d3135f19fb3ab13731129eb84c9170af/f0ab9/diagrams/name-value/binding-2.png")
```

---

### Copy-on-modify

```{r, echo=TRUE}
x <- c(1, 2, 3)
y <- x
y[3] <- 4
```

--

```{r, echo=TRUE}
x
```

--

```{r, out.width="28%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/ef9f480effa2f1d0e401d1f94218d0cf118433c0/b56e9/diagrams/name-value/binding-3.png")
```

---

### Copy-on-modify: what about inside functions?

```{r, echo=TRUE}
f <- function(a) {
  a
}
x <- c(1, 2, 3)
z <- f(x)
```

--

```{r, out.width="30%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/e8718027aabaed377da311f45b45a179588e4dcf/6bf90/diagrams/name-value/binding-f2.png")
```

--

```{r, echo=TRUE}
f2 <- function(a) {
  a[1] <- 10
  a
}
z2 <- f2(x)
```

--

```{r}
cbind(x, z2)
```

---

### Lists

It's not just names (i.e. variables) that point to values; elements of lists do too. 

```{r, echo=TRUE}
l1 <- list(1, 2, 3)
```

```{r, out.width="32%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/dae84980f1586fc4ef47091c91f51a5737b38135/a1403/diagrams/name-value/list.png")
```

--

```{r, echo=TRUE}
l2 <- l1
```

```{r, out.width="30%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/52bc0e3da3382cba957a9d83397b6c9200906ce2/c72aa/diagrams/name-value/l-modify-1.png")
```

---

### Copy-on-modify for lists?

```{r, echo=TRUE}
l2[[3]] <- 4
```

--

```{r, out.width="38%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/b844bb5a3443e1344299627f5760e2ae3a9885b5/e1c76/diagrams/name-value/l-modify-2.png")
```

---

### Data frames

<br>

**Data frames are lists of vectors.**

<br>

```{r, echo=TRUE}
d1 <- data.frame(x = c(1, 5, 6), y = c(2, 4, 3))
```
```{r, out.width="28%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/80d8995999aa240ff4bc91bb6aba2c7bf72afc24/95ee6/diagrams/name-value/dataframe.png")
```

---

```{r, echo=TRUE}
d2 <- d1
d2[, 2] <- d2[, 2] * 2  # modify one column
```

--

```{r, out.width="30%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/c19fd7e31bf34ceff73d0fac6e3ea22b09429e4a/23d8d/diagrams/name-value/d-modify-c.png")
```

--

```{r, echo=TRUE}
d3 <- d1
d3[1, ] <- d3[1, ] * 3  # modify one row
```

--

```{r, out.width="45%"}
knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/36df61f54d1ac62e066fb814cb7ba38ea6047a74/facf8/diagrams/name-value/d-modify-r.png")
```

---

class: center middle inverse

# Part 2: Why loops are slow in R?

---

### Never grow a vector

<br>

Example computing the cumulative sums of a vector:

<br>

```{r, echo=TRUE, eval=FALSE}
x <- rnorm(10e3)

current_sum <- 0
res <- c()

for (x_i in x) {
  current_sum <- current_sum + x_i
  res <- c(res, current_sum)
}
```

<br>

Why is this code bad?

---

```{r, out.width = "68%"}
knitr::include_graphics("https://privefl.github.io/blog/images/stairs.jpg")
```

---

### Much faster code by pre-allocating

<br>

If you know the size of the result in advance, you should pre-allocate it.

<br>

```{r, echo=TRUE, eval=FALSE}
current_sum <- 0
res2 <- double(length(x))  # same as rep(0, length(x))

for (i in seq_along(x)) {
  current_sum <- current_sum + x[i]
  res2[i] <- current_sum
}
```

---

### Much faster code by using a list

<br>

It is okay to grow a list.

<br>

```{r, echo=TRUE, eval=FALSE}
current_sum <- 0
res3 <- list()

for (i in seq_along(x)) {
  current_sum <- current_sum + x[i]
  res3[[i]] <- current_sum
}

unlist(res3)
```

---

### Much faster code by growing a vector (the right way)

<br>

Since `r ICON_R_PROJECT` v3.4, you can efficiently grow a vector (not with `c()` though).

<br>

```{r, echo=TRUE, eval=FALSE}
current_sum <- 0
res4 <- c()

for (i in seq_along(x)) {
  current_sum <- current_sum + x[i]
  res4[i] <- current_sum
}
```

---

### Much faster code by using existing efficient functions

<br>

Some base `r ICON_R_PROJECT` functions are code in C or Fortan, and are very fast. Use them! 

<br>

```{r, echo=TRUE, eval=FALSE}
res5 <- cumsum(x)
```

--

<br>

#### Other examples

- Using `rowMeans(x)` is much faster than `apply(x, 1, mean)`. 

- If you want more efficient functions that apply to rows and columns of matrices, you can use [package {matrixStats}](https://github.com/HenrikBengtsson/matrixStats). 

- When reading large text files, rather than `read.table()`, prefer using `data.table::fread()` (or `bigreadr::fread2()`).

- Generally, packages that uses C/Rcpp are efficient.

---

### Loops vs sapply() vs vectorization

<br>

Three ways to compute the sum of two vectors (element-wise):

<br>

```{r, echo=TRUE}
add_loop_prealloc <- function(x, y) {
  res <- double(length(x))
  for (i in seq_along(x)) {
    res[i] <- x[i] + y[i]
  }
  res
}

add_sapply <- function(x, y) {
  sapply(seq_along(x), function(i) x[i] + y[i])
}

add_vectorized <- `+`
```

---

### Benchmark

```{r, echo=TRUE, eval=FALSE}
N <- 10e3; x <- runif(N); y <- rnorm(N)

microbenchmark::microbenchmark(
        LOOP = add_loop_prealloc(x, y),
      SAPPLY = add_sapply(x, y),
  VECTORIZED = add_vectorized(x, y)
)
```

--

```{r, echo=FALSE, cache=TRUE}
N <- 10e3; x <- runif(N); y <- rnorm(N)

microbenchmark::microbenchmark(
        LOOP = add_loop_prealloc(x, y),
      SAPPLY = add_sapply(x, y),
  VECTORIZED = add_vectorized(x, y)
)
```

<br>

Loops are actually faster than `sapply()` because they can benefit from just-in-time compilation (JIT, since `r ICON_R_PROJECT` v3.4).

Vectorization is by far the best.

---

### Why vectorizing?

<br>

I call *vectorized* a function that takes vectors as arguments and operate on each element of these vectors in another (compiled) language (such as C, C++ and Fortran).

--

As an interpreted language, for each iteration `res[i] <- x[i] + y[i]`, `r ICON_R_PROJECT` has to ask:

- what is the type of `x[i]` and `y[i]`?

- can I add these two types? 

- what is the type of `x[i] + y[i]` then?

- can I store this result in `res` or do I need to convert it?

These questions must be answered for each iteration, which takes time.    
Some of this is alleviated by JIT compilation.

--

On the contrary, for vectorized functions, these questions must be answered only once, which saves a lot of time. Read more with [Noam Ross’s blog post on vectorization](http://www.noamross.net/blog/2014/4/16/vectorization-in-r--why.html).

---

### Example of vectorization
Suppose we wish to estimate the integral $\int_0^1 x^2 dx$ using a Monte-Carlo method. Essentially, we throw darts at the curve and count the number of darts that fall below the curve (as in the following figure).

```{r monte-carlo, echo=FALSE, out.width="88%"}
knitr::include_graphics("https://bookdown.org/csgillespie/efficientR/_main_files/figure-html/3-1-1.png")
```

---

### How to vectorize this code?

<br>

Naive R code implementing this Monte-Carlo algorithm:

<br>

```{r, echo=TRUE}
monte_carlo <- function(nb_samp) {
  
  hits <- 0
  for (i in seq_len(nb_samp)) {
    x <- runif(1)
    y <- runif(1)
    if (y < x^2) hits <- hits + 1
  }
  
  hits / nb_samp
}
```

```{r, echo=TRUE}
monte_carlo(1e4)
```

---

### A better solution

```{r, echo=TRUE}
monte_carlo2 <- function(nb_samp) {
  
  all_x <- runif(nb_samp)
  all_y <- runif(nb_samp)
  
  hits <- 0
  for (i in seq_len(nb_samp)) {
    x <- all_x[i]
    y <- all_y[i]
    if (y < x^2) hits <- hits + 1
  }
  
  hits / nb_samp
}
```

---

### A better solution

```{r, echo=TRUE}
monte_carlo3 <- function(nb_samp) {
  
  all_x <- runif(nb_samp)
  all_y <- runif(nb_samp)
  test <- all_y < all_x^2
  
  hits <- 0
  for (i in seq_len(nb_samp)) {
    if (test[i]) hits <- hits + 1
  }
  
  hits / nb_samp
}
```

---

### An even better solution

```{r, echo=TRUE}
monte_carlo4 <- function(nb_samp) {
  
  all_x <- runif(nb_samp)
  all_y <- runif(nb_samp)
  test <- all_y < all_x^2
  
  mean(test)
}
```

--

```{r, echo=TRUE}
monte_carlo5 <- function(nb_samp) {
  mean(runif(nb_samp) < runif(nb_samp)^2)
}
```

--

```{r, echo=TRUE, cache=TRUE}
c(monte_carlo (1e6),
  monte_carlo2(1e6),
  monte_carlo3(1e6),
  monte_carlo4(1e6),
  monte_carlo5(1e6))
```

---

### Benchmark

```{r, echo=TRUE, cache=TRUE}
microbenchmark::microbenchmark(
  monte_carlo (1e4),
  monte_carlo2(1e4),
  monte_carlo3(1e4),
  monte_carlo4(1e4),
  monte_carlo5(1e4)
)
```

---

class: center middle inverse

# Part 3: Other strategies
# to make your code faster

---

### Identify *where* your code is slow

<br>

> "Programmers waste enormous amounts of time thinking about, or worrying
> about, the speed of noncritical parts of their programs, and these attempts 
> at efficiency actually have a strong negative impact when debugging and
> maintenance are considered."
>
> -- Donald Knuth.

<br>

--

Trying to optimize each and every part of your code    
$\Longrightarrow$ time lost + code too complex

R is great at prototyping quickly; always start with that!        
If performance matters, then profile your code to see which part of your code is taking too much time and optimize only this part!

Learn more on how to profile your code in RStudio in [this article](https://support.rstudio.com/hc/en-us/articles/218221837-Profiling-R-code-with-the-RStudio-IDE).

---

### Let us profile the code we used before

<br>

```{r, echo=TRUE, eval=FALSE}
x <- rnorm(50e3)

current_sum <- 0
res <- c()

for (x_i in x) {
  current_sum <- current_sum + x_i
  res <- c(res, current_sum)
}
```

<br>

In RStudio, 'Profile' panel $\rightarrow$ Profile Selected Line(s).

---

### Rcpp: making R functions from C++ code

Rcpp lives between R and C++, so that you can get

- the performance of C++,

- the convenience of R.

--

Typical bottlenecks that C++ can address include:

- Recursive functions, or problems which involve calling functions **millions of times**. 
The overhead of calling a function in C++ is much lower than that in R.

- Loops that **can’t be easily vectorized** because subsequent iterations depend on previous ones.

- Problems that require **advanced data structures** and algorithms that R doesn’t provide. Through the standard template library (STL), C++ has efficient implementations of many important data structures, from ordered maps to double-ended queues. See [this chapter](https://adv-r.hadley.nz/rcpp.html#stl).

<br>

To learn more, have a look at [my presentation on Rcpp](https://privefl.github.io/R-presentation/Rcpp.html).

---

### Parallelization

<br>

For parallelizing R code, I basically always use `foreach` and recommend to do so. See [my guide to parallelism in R with `foreach`](https://privefl.github.io/blog/a-guide-to-parallelism-in-r/).

<br>

Just remember to **optimize your code before** trying to parallelize it.

---

class: inverse, center, middle

# Thanks!

<br>

Presentation available at [bit.ly/RUGgre38](https://bit.ly/RUGgre38)

<br>

`r icons::icon_style(fill = "white", icons::fontawesome$brands$twitter)` `r icons::icon_style(fill = "white", icons::fontawesome$brands$github)` privefl

.footnote[Slides created via the R package [**xaringan**](https://github.com/yihui/xaringan)]