# Section 04: The apply family
### `01-Use lapply with a built-in R function`

- Have a look at the `strsplit()` call, which splits the strings in `pioneers` on the `:` sign. The result, `split_math`, is a list of 4 character vectors: in each vector, the first element is the name and the second element is the birth year.
- Use `lapply()` to convert the character vectors in `split_math` to lowercase letters: apply `tolower()` on each of the elements in `split_math`. Assign the result, which is a list, to a new variable `split_low`.
- Finally, inspect the contents of `split_low` with `str()`.

In [1]:
# The vector pioneers has already been created for you
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")

# Split names from birth year
split_math <- strsplit(pioneers, split = ":")

# Convert to lowercase strings: split_low
split_low <- lapply(split_math, tolower)

# Take a look at the structure of split_low
str(split_low)

List of 4
 $ : chr [1:2] "gauss" "1777"
 $ : chr [1:2] "bayes" "1702"
 $ : chr [1:2] "pascal" "1623"
 $ : chr [1:2] "pearson" "1857"


### `02-Use lapply with your own function`
- Apply `select_first()` over the elements of `split_low` with `lapply()` and assign the result to a new variable `names`.
- Next, write a function `select_second()` that does the exact same thing for the second element of an input vector.
- Finally, apply the `select_second()` function over `split_low` and assign the output to the variable `years`.


In [2]:
# Code from previous exercise:
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split <- strsplit(pioneers, split = ":")
split_low <- lapply(split, tolower)

# Write function select_first()
select_first <- function(x) {
  x[1]
}

# Apply select_first() over split_low: names
names <- lapply(split_low, select_first)

# Write function select_second()
select_second <- function(x) {
  x[2]
}

# Apply select_second() over split_low: years
years <- lapply(split_low, select_second)

### `03-lapply and anonymous functions`

- Transform the first call of `lapply()` such that it uses an anonymous function that does the same thing.
- In a similar fashion, convert the second call of `lapply()` to use an anonymous version of the `select_second()` function.
- Remove the definitions of both `select_first()` and `select_second()`, as they are no longer needed.

In [31]:
# split_low has been created for you
split_low

names <- lapply(split_low, function(x) {x[1]})
years <- lapply(split_low, function(x) {x[2]})

### `04-Use lapply with additional arguments`
- Use `lapply()` twice to call `select_el()` over all elements in `split_low`: once with the `index` equal to 1 and a second time with the index equal to 2. Assign the result to `names` and `years`, respectively.

In [1]:
# Definition of split_low
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split <- strsplit(pioneers, split = ":")
split_low <- lapply(split, tolower)

# Generic select function
select_el <- function(x, index) {
  x[index]
}

# Use lapply() twice on split_low: names and years
names <- lapply(split_low, select_el, index = 1)
years <- lapply(split_low, select_el, index = 2)
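
Any arguments placed after the function name are forwarded by `lapply()` to every call. As a minimal sketch of the same mechanism with a built-in function, the subsetting operator `[` can stand in for `select_el()`. The variable names `names_builtin` and `years_builtin` are illustrative and not part of the exercise.

```r
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split_low <- lapply(strsplit(pioneers, split = ":"), tolower)

select_el <- function(x, index) {
  x[index]
}

# `[` is an ordinary function, so lapply() can forward the index to it
names_builtin <- lapply(split_low, `[`, 1)
years_builtin <- lapply(split_low, `[`, 2)

# Both approaches produce the same lists
identical(names_builtin, lapply(split_low, select_el, index = 1))
identical(years_builtin, lapply(split_low, select_el, index = 2))
```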

### `05-Apply functions that return NULL`


In [2]:
lapply(split_low, function(x) {
  if (nchar(x[1]) > 5) {
    return(NULL)
  } else {
    return(x[2])
  }
})
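
As a sketch of what this call produces, assuming `split_low` is defined as in the previous exercises: `lapply()` returns one list element per input, so a `NULL` return value shows up as a `NULL` entry instead of being dropped. The name `result` is illustrative.

```r
pioneers <- c("GAUSS:1777", "BAYES:1702", "PASCAL:1623", "PEARSON:1857")
split_low <- lapply(strsplit(pioneers, split = ":"), tolower)

result <- lapply(split_low, function(x) {
  if (nchar(x[1]) > 5) {
    return(NULL)
  } else {
    return(x[2])
  }
})

# Still one element per pioneer, even where the function returned NULL
length(result)

# "gauss" has 5 characters, so the year is kept
result[[1]]

# "pascal" has 6 characters, so the entry is NULL
is.null(result[[3]])
```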

### `06-How to use sapply`

- Use `lapply()` to calculate the minimum (built-in function `min()`) of the temperature measurements for every day.
- Do the same thing but this time with `sapply()`. See how the output differs.
- Use `lapply()` to compute the maximum (`max()`) temperature for each day.
- Again, use `sapply()` to solve the same question and see how `lapply()` and `sapply()` differ.

In [5]:
t1 <- c(3, 7, 9, 6, -1)
t2 <- c(6, 9, 12, 13, 5)
t3 <- c(4, 8, 3, -1, -3)
t4 <- c(1, 4, 7, 2, -2)
t5 <- c(5, 7, 9, 4, 2)
t6 <- c(-3, 5, 8, 9, 4)
t7 <- c(3, 6, 9, 4, 1)

temp <- list(t1, t2, t3, t4, t5, t6, t7)

- Note: `lapply()` always returns a list, whereas `sapply()` simplifies the result to a vector (or matrix) when it can.

In [11]:
# temp has already been defined in the workspace

# Use lapply() to find each day's minimum temperature
lapply(temp, min)

# Use sapply() to find each day's minimum temperature
sapply(temp, min)

# Use lapply() to find each day's maximum temperature
lapply(temp, max)

# Use sapply() to find each day's maximum temperature
sapply(temp, max)
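
To make the difference concrete, here is a small sketch that compares the two return types directly. `temp` is redefined so the block runs on its own; `min_list` and `min_vec` are illustrative names.

```r
temp <- list(
  c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3),
  c(1, 4, 7, 2, -2), c(5, 7, 9, 4, 2), c(-3, 5, 8, 9, 4),
  c(3, 6, 9, 4, 1)
)

min_list <- lapply(temp, min)
min_vec <- sapply(temp, min)

class(min_list)  # "list"
class(min_vec)   # "numeric"

# Flattening the list gives the same values that sapply() returns
identical(unlist(min_list), min_vec)
```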

### `07-sapply with your own function`

- Finish the definition of `extremes_avg()`: it takes a vector of temperatures and calculates the average of the minimum and maximum temperatures of the vector.
- Next, use this function inside `sapply()` to apply it over the vectors inside `temp`.
- Use the same function over `temp` with `lapply()` and see how the outputs differ.


In [12]:
# temp is already defined in the workspace

# Finish function definition of extremes_avg
extremes_avg <- function(x) {
  ( min(x) + max(x) ) / 2
}

# Apply extremes_avg() over temp using sapply()
sapply(temp, extremes_avg)

# Apply extremes_avg() over temp using lapply()
lapply(temp, extremes_avg)

### `08-sapply with function returning vector`

- Finish the definition of the `extremes()` function. It takes a vector of numerical values and returns a vector containing the minimum and maximum values of a given vector, with the names "min" and "max", respectively.
- Apply this function over the list `temp` using `sapply()`.
- Finally, apply this function over the list `temp` using `lapply()` as well.

In [15]:
# temp is already available in the workspace

# Create a function that returns min and max of a vector: extremes
extremes <- function(x) {
  c(min = min(x), max = max(x))
}

# Apply extremes() over temp with sapply()
sapply(temp, extremes)

# Apply extremes() over temp with lapply()
lapply(temp, extremes)

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| min | -1 | 5 | -3 | -2 | 2 | -3 | 1 |
| max | 9 | 13 | 8 | 7 | 9 | 9 | 9 |
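
When the function returns a named vector of fixed length, `sapply()` arranges the results as a matrix with one column per list element. The sketch below inspects that matrix; `temp` is redefined so the block runs on its own, and `ext` is an illustrative name.

```r
temp <- list(
  c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3),
  c(1, 4, 7, 2, -2), c(5, 7, 9, 4, 2), c(-3, 5, 8, 9, 4),
  c(3, 6, 9, 4, 1)
)

extremes <- function(x) {
  c(min = min(x), max = max(x))
}

ext <- sapply(temp, extremes)

# 2 rows (min and max), 7 columns (one per day)
dim(ext)
rownames(ext)

# Maximum temperature of the second day
ext["max", 2]
```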


### `09-sapply can't simplify, now what?`

- Apply `below_zero()` over `temp` using `sapply()` and store the result in `freezing_s`.
- Apply `below_zero()` over `temp` using `lapply()`. Save the resulting list in a variable `freezing_l`.
- Compare `freezing_s` to `freezing_l` using the `identical()` function.


In [16]:
# temp is already prepared for you in the workspace

# Definition of below_zero()
below_zero <- function(x) {
  return(x[x < 0])
}

# Apply below_zero over temp using sapply(): freezing_s
freezing_s <- sapply(temp, below_zero)

# Apply below_zero over temp using lapply(): freezing_l
freezing_l <- lapply(temp, below_zero)

# Are freezing_s and freezing_l identical?
identical(freezing_s, freezing_l)
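
`sapply()` can only simplify when every call returns a result of the same length. The following sketch shows why that fails here: the number of sub-zero readings differs from day to day, so `sapply()` falls back to returning a list. `temp` is redefined so the block runs on its own.

```r
temp <- list(
  c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3),
  c(1, 4, 7, 2, -2), c(5, 7, 9, 4, 2), c(-3, 5, 8, 9, 4),
  c(3, 6, 9, 4, 1)
)

below_zero <- function(x) {
  x[x < 0]
}

freezing_s <- sapply(temp, below_zero)
freezing_l <- lapply(temp, below_zero)

# Number of sub-zero readings per day: the lengths are not all equal
lengths(freezing_s)

# No simplification was possible, so the result is still a list
is.list(freezing_s)
identical(freezing_s, freezing_l)
```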

### `10-sapply with functions that return NULL`
- Apply `print_info()` over the contents of `temp` with `sapply()`.
- Repeat this process with `lapply()`. Do you notice the difference?

In [20]:
# temp is already available in the workspace

# Definition of print_info()
print_info <- function(x) {
  cat("The average temperature is", mean(x), "\n")
}

# Apply print_info() over temp using sapply()
sapply(temp, print_info)

# Apply print_info() over temp using lapply()
lapply(temp, print_info)


The average temperature is 4.8 
The average temperature is 9 
The average temperature is 2.2 
The average temperature is 2.4 
The average temperature is 5.4 
The average temperature is 4.6 
The average temperature is 4.6 


The average temperature is 4.8 
The average temperature is 9 
The average temperature is 2.2 
The average temperature is 2.4 
The average temperature is 5.4 
The average temperature is 4.6 
The average temperature is 4.6 
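
The printed lines above are a side effect of `cat()`; the value that `print_info()` returns is `NULL`. As a sketch, storing the results shows that `sapply()` has nothing to simplify and hands back the same list of `NULL`s as `lapply()`. `temp` is redefined so the block runs on its own; `res_s` and `res_l` are illustrative names.

```r
temp <- list(
  c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3),
  c(1, 4, 7, 2, -2), c(5, 7, 9, 4, 2), c(-3, 5, 8, 9, 4),
  c(3, 6, 9, 4, 1)
)

print_info <- function(x) {
  cat("The average temperature is", mean(x), "\n")
}

# Assigning the results keeps the returned lists from being auto-printed
res_s <- sapply(temp, print_info)
res_l <- lapply(temp, print_info)

# Both are lists of seven NULLs
is.list(res_s)
is.null(res_s[[1]])
identical(res_s, res_l)
```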


### `11-Use vapply`
- Apply the function `basics()` over the list of temperatures, `temp`, using `vapply()`. This time, you can use `numeric(3)` to specify the `FUN.VALUE` argument.

In [21]:
# temp is already available in the workspace

# Definition of basics()
basics <- function(x) {
  c(min = min(x), mean = mean(x), max = max(x))
}

# Apply basics() over temp using vapply()
vapply(temp, basics, numeric(3))

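| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| min | -1.0 | 5 | -3.0 | -2.0 | 2.0 | -3.0 | 1.0 |
| mean | 4.8 | 9 | 2.2 | 2.4 | 5.4 | 4.6 | 4.6 |
| max | 9.0 | 13 | 8.0 | 7.0 | 9.0 | 9.0 | 9.0 |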


### `12-Use vapply (2)`
- Inspect the pre-loaded code and try to run it. If you haven't changed anything, an error should pop up. That's because `vapply()` still expects `basics()` to return a vector of length 3. The error message gives you an indication of what's wrong.
- Try to fix the error by editing the `vapply()` command.

In [22]:
# temp is already available in the workspace

# Definition of the basics() function
basics <- function(x) {
  c(min = min(x), mean = mean(x), median = median(x), max = max(x))
}

# Fix the error:
vapply(temp, basics, numeric(4))

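| | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| min | -1.0 | 5 | -3.0 | -2.0 | 2.0 | -3.0 | 1.0 |
| mean | 4.8 | 9 | 2.2 | 2.4 | 5.4 | 4.6 | 4.6 |
| median | 6.0 | 9 | 3.0 | 2.0 | 5.0 | 5.0 | 4.0 |
| max | 9.0 | 13 | 8.0 | 7.0 | 9.0 | 9.0 | 9.0 |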
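
To see the failure without stopping the session, the mismatched call can be wrapped in `tryCatch()`. This is only a sketch: `temp` is redefined so the block runs on its own, and `msg` and `fixed` are illustrative names.

```r
temp <- list(
  c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3),
  c(1, 4, 7, 2, -2), c(5, 7, 9, 4, 2), c(-3, 5, 8, 9, 4),
  c(3, 6, 9, 4, 1)
)

basics <- function(x) {
  c(min = min(x), mean = mean(x), median = median(x), max = max(x))
}

# numeric(3) no longer matches the length-4 result, so vapply() errors;
# tryCatch() turns the error into its message text
msg <- tryCatch(
  vapply(temp, basics, numeric(3)),
  error = function(e) conditionMessage(e)
)
msg

# With the matching template the call succeeds and returns a 4 x 7 matrix
fixed <- vapply(temp, basics, numeric(4))
dim(fixed)
```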


### `13-From sapply to vapply`

Convert all the `sapply()` expressions to their `vapply()` counterparts. Their results should be exactly the same; you're only adding robustness. You'll need the templates `numeric(1)` and `logical(1)`.

In [24]:
# temp is already defined in the workspace

# Convert to vapply() expression
vapply(temp, max, numeric(1))

# Convert to vapply() expression
vapply(temp, function(x, y) { mean(x) > y }, y = 5, logical(1))
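
The original `sapply()` expressions are not shown here, so the sketch below assumes they were the direct counterparts of the calls above and checks that both versions agree. `temp` is redefined so the block runs on its own; the helper `above()` is a hypothetical named version of the anonymous function.

```r
temp <- list(
  c(3, 7, 9, 6, -1), c(6, 9, 12, 13, 5), c(4, 8, 3, -1, -3),
  c(1, 4, 7, 2, -2), c(5, 7, 9, 4, 2), c(-3, 5, 8, 9, 4),
  c(3, 6, 9, 4, 1)
)

# Hypothetical named helper: is the mean of x above the threshold y?
above <- function(x, y) {
  mean(x) > y
}

# Assumed sapply() originals next to their vapply() counterparts
max_s <- sapply(temp, max)
max_v <- vapply(temp, max, numeric(1))

above_s <- sapply(temp, above, y = 5)
above_v <- vapply(temp, above, logical(1), y = 5)

identical(max_s, max_v)
identical(above_s, above_v)
```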

### `The End`