In [1]:
library(tidyverse)

-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 3.2.1     v purrr   0.3.3
v tibble  3.0.1     v dplyr   0.8.3
v tidyr   1.0.0     v stringr 1.4.0
v readr   1.3.1     v forcats 0.4.0
"package 'stringr' was built under R version 3.6.3"
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()


In [2]:
df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)

In [3]:
output <- vector("double", ncol(df))  # 1. output
for (i in seq_along(df)) {            # 2. sequence
  output[[i]] <- median(df[[i]])      # 3. body
}
output

In [4]:
seq_along(df)
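The loop above iterates over `seq_along(df)` rather than `1:length(df)`. The short sketch below shows why that is the safer habit when the input might be empty. The variable names (`empty`, `safe_idx`, `risky_idx`) are illustrative and do not come from the notebook.

```r
# A zero-length vector, standing in for an input that happens to be empty
empty <- vector("double", 0)

# seq_along() gives a zero-length sequence, so a loop over it never runs
safe_idx <- seq_along(empty)

# 1:length() counts down from 1 to 0, so a loop over it would run twice
# with indices that do not exist in the input
risky_idx <- 1:length(empty)

length(safe_idx)
risky_idx
```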

In [5]:
# The map functions
# map() makes a list.
# map_lgl() makes a logical vector.
# map_int() makes an integer vector.
# map_dbl() makes a double vector.
# map_chr() makes a character vector.
# Each function takes a vector as input, applies a function to each piece, and then returns a new vector that’s the same length (and has the same names) as the input.
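To make the list of variants concrete, here is a minimal sketch that calls each one on the same small input. The list `x` and the helper functions passed to each call are made up for illustration; only the `map*()` functions themselves come from purrr.

```r
library(purrr)

# Hypothetical input: a named list of two integer vectors
x <- list(a = 1:3, b = 4:6)

as_list <- map(x, sum)                                        # list
is_big  <- map_lgl(x, function(v) sum(v) > 10)                # logical vector
n_items <- map_int(x, length)                                 # integer vector
avgs    <- map_dbl(x, mean)                                   # double vector
labels  <- map_chr(x, function(v) paste(v, collapse = "-"))   # character vector

# Every result has one element per element of x and keeps the names a and b
str(list(as_list, is_big, n_items, avgs, labels))
```

The suffix is a promise about the return type: if the function does not return a single value of that type for every element, the call fails instead of quietly returning something else.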

In [6]:
# Once you master these functions, you’ll find it takes much less time to solve iteration problems. 
# But you should never feel bad about using a for loop instead of a map function.

### Some people will tell you to avoid for loops because they are slow. They’re wrong, or at least out of date: for loops have not been slow in R for many years.

In [7]:
# The chief benefit of using functions like map() is not speed but clarity: they make your code easier to write and to read.

In [8]:
df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)

In [9]:
# examples
map_dbl(df, mean)
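As a rough check on the clarity argument, the sketch below computes column medians twice, once with the explicit for loop from the earlier cell and once with `map_dbl()`. The fixed seed and the name `df_demo` are assumptions added here so the comparison is reproducible and does not overwrite the notebook's `df`.

```r
library(tibble)
library(purrr)

set.seed(1)  # assumption: fixed seed, not part of the original notebook
df_demo <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)

# For loop version: allocate the output, loop over columns, fill in the body
loop_out <- vector("double", ncol(df_demo))
for (i in seq_along(df_demo)) {
  loop_out[[i]] <- median(df_demo[[i]])
}

# map version: the bookkeeping disappears and column names are kept
map_out <- map_dbl(df_demo, median)

all.equal(loop_out, unname(map_out))
names(map_out)
```

Both versions do the same work; the map version simply puts the operation being performed (`median`) in the foreground.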

In [13]:
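# map_*() uses ... ("dot dot dot") to pass additional arguments along to the function on every call.
# Here trim = 0.5 is forwarded to mean(); trimming half of the values from each end leaves the median.
map_dbl(df, mean, trim = 0.5)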
# The map functions also preserve names:
z <- list(x = 1:3, y = 4:5)
map_int(z, length)

In [14]:
models <- mtcars %>% 
  split(.$cyl) %>% 
  map(function(df) lm(mpg ~ wt, data = df))

In [21]:
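# The same models, using purrr's one-sided formula shortcut for an anonymous function.
# Inside the formula, . refers to the current list element.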
models <- mtcars %>% 
  split(.$cyl) %>% 
  map(~lm(mpg ~ wt, data = .))

In [23]:
models %>% 
  map(summary) %>% 
  map_dbl(~.$r.squared)
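Extracting a named component is common enough that purrr also accepts a string in place of the function. The sketch below rebuilds the per-cylinder models without the pipe and pulls out R-squared both ways. The intermediate names (`by_cyl`, `fits`, `r2_formula`, `r2_string`) are illustrative only.

```r
library(purrr)

# One linear model per cylinder group, as in the cells above
by_cyl <- split(mtcars, mtcars$cyl)
fits <- map(by_cyl, function(d) lm(mpg ~ wt, data = d))
summaries <- map(fits, summary)

# Formula shortcut: . is the current summary object
r2_formula <- map_dbl(summaries, ~ .$r.squared)

# String shortcut: extract the component named "r.squared" from each element
r2_string <- map_dbl(summaries, "r.squared")

all.equal(r2_formula, r2_string)
```

The result is a named double vector with one entry per cylinder group (`4`, `6`, `8`).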

In [24]:
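# Relation to base R
# lapply() is basically identical to map(); the difference is that map() is consistent with the other purrr functions and accepts the formula and string shortcuts.
# Base sapply() is a wrapper around lapply() that automatically simplifies the output, so the type of its result depends on the data.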
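The automatic simplification in `sapply()` is convenient interactively but can surprise you inside functions. The sketch below is one way to see that, using a made-up helper `keep_big()` and made-up inputs: the same call returns a vector for one input and a list for another, while `map_dbl()` refuses the second case outright.

```r
library(purrr)

# Hypothetical helper: keep only the values greater than 2
keep_big <- function(v) v[v > 2]

same_len <- list(c(1, 3), c(4, 0))      # each element yields exactly one value
diff_len <- list(c(1, 3, 5), c(4, 0))   # elements yield two values and one value

res_vec  <- sapply(same_len, keep_big)  # simplified to a double vector
res_list <- sapply(diff_len, keep_big)  # cannot simplify, so a list comes back

# map_dbl() requires one double per element, so it fails on diff_len;
# tryCatch() turns that failure into a marker string for this demonstration
strict <- tryCatch(
  map_dbl(diff_len, keep_big),
  error = function(e) "error"
)

str(res_vec)
str(res_list)
strict
```

In base R, `vapply()` offers the same kind of type guarantee as the typed `map_*()` variants.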

In [25]:
# When two inputs vary together, use map2(), which iterates over two vectors in parallel:

In [28]:
sigma <- list(1, 5, 10)
mu <- list(5, 10, -3)
map2(mu, sigma, rnorm, n = 5) %>% str()
# Note that the arguments that vary for each call (mu and sigma) come before the function;
# arguments that are the same for every call (n = 5) come after.

List of 3
 $ : num [1:5] 5.86 5.89 4.79 4.75 3
 $ : num [1:5] 16.04 9.01 12.47 8.81 6.43
 $ : num [1:5] 3.72 -8.12 -10.18 -7.98 -14.09
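Like `map()`, `map2()` is essentially a wrapper around a for loop. The function below, `simple_map2()`, is a hypothetical, simplified stand-in written for illustration. It is not purrr's actual implementation, and it assumes both inputs have the same length, whereas the real `map2()` also recycles length-one inputs.

```r
# Simplified sketch of what map2() does: walk two inputs in step and
# call f on each pair, forwarding any constant arguments through ...
simple_map2 <- function(x, y, f, ...) {
  stopifnot(length(x) == length(y))  # assumption: no recycling in this sketch
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    out[[i]] <- f(x[[i]], y[[i]], ...)
  }
  out
}

mu <- list(5, 10, -3)
sigma <- list(1, 5, 10)

# Equivalent in spirit to map2(mu, sigma, rnorm, n = 5):
# each call is rnorm(n = 5, mean = mu[[i]], sd = sigma[[i]])
draws <- simple_map2(mu, sigma, rnorm, n = 5)
str(draws)

# A deterministic check of the pairing
sums <- simple_map2(list(1, 2), list(3, 4), `+`)
```

Because `n = 5` is passed by name, the two varying values fill the remaining positional arguments of `rnorm()`, namely `mean` and `sd`.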
