Name:

Student ID: 

# Worksheet B3: Nesting, List Columns, and `purrr` (16 points)
**Fall 2025**


There are 16 questions, each worth 1 point. 


From this topic, students are expected to be able to:

- Use the `map` family of functions from the purrr package to iteratively apply a function.
- Create and operate on list columns in a tibble using `nest()`, `unnest()`, and the `map` family of functions.
- Define functions on-the-fly within a `map` function using shortcuts.
- Apply list columns to cases in data analysis: columns of models, columns of nested lists (JSON-style data), and operating on entire groups within a tibble.

Load the worksheet requirements:

In [None]:
suppressPackageStartupMessages(library(testthat))
suppressPackageStartupMessages(library(digest))
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(palmerpenguins))
suppressPackageStartupMessages(library(glue))
suppressPackageStartupMessages(library(gapminder))
suppressPackageStartupMessages(library(broom))
suppressPackageStartupMessages(library(repurrrsive))

The following code chunk has been unlocked to give you the flexibility to start this document with some of your own code. Remember, it's bad manners to keep a call to `install.packages()` in your source code, so if you ever need to run such a call here, don't forget to delete it afterwards.

In [None]:
# An unlocked code chunk.


# Part 1: Exploring `purrr` Fundamentals


The `purrr` package is part of the `tidyverse`.

Apply a function to each element in a list/vector with `map`.

General usage: `purrr::map(VECTOR_OR_LIST, YOUR_FUNCTION)`

Note:

- `map` always returns a list.
- `YOUR_FUNCTION` can return anything!

There are many variations of `map_*`, which you can find in this [cheatsheet](https://github.com/rstudio/cheatsheets/blob/master/purrr.pdf).
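As a small illustration of the general usage pattern above, here is a sketch that applies `nchar()` to a made-up character vector (the name `words` is ours, chosen only for this example). It contrasts `map()`, which returns a list, with `map_int()`, one of the `map_*` variations.

```r
library(purrr)

# A made-up character vector, used only for illustration
words <- c("a", "bb", "ccc")

# map() always returns a list, with one element per input element
lengths_list <- map(words, nchar)

# map_int() returns an integer vector instead
# (it errors if any result is not a single integer)
lengths_int <- map_int(words, nchar)

str(lengths_list)
lengths_int
```

Picking the most specific `map_*` variant documents the type you expect back and fails early if a result has a different type.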

For the next few tasks, you will be converting for-loop(s) to vectorized expressions that reproduce the output (numbers should be the same, the format can be different).

## QUESTION 1

{points: 1}


Without relying on R's built-in vectorization (that is, without calling `sqrt(x)` directly), take the square root of each element of the following vector:

In [None]:
x <- 1:10

The result should be a list of the calculations. Store your answer in `answer1`.

```r
answer1 <- map(FILL_THIS_IN, FILL_THIS_IN)
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
answer1

In [None]:
library(digest)
stopifnot("type of answer1 is not list"= setequal(digest(paste(toString(class(answer1)), "e8e68")), "302b38335cbe54a6393db9763130b7e6"))
stopifnot("length of answer1 is not correct"= setequal(digest(paste(toString(length(answer1)), "e8e68")), "6d9f5a67a6c6b27f19b2a64b4932925a"))
stopifnot("values of answer1 names are not correct"= setequal(digest(paste(toString(sort(c(names(answer1)))), "e8e68")), "930bdb47de00be1e31d6243f87c51b06"))
stopifnot("order of elements of answer1 is not correct"= setequal(digest(paste(toString(answer1), "e8e68")), "354c3ca562580b223349727dcbebb217"))

print('Success!')

## QUESTION 2 

{points: 1}


In Question 1, we used the generic `map` function, and got a list. Let's use a more specific `map_*` function this time.

Again without using vectorization, square each component of `x`. The result should be a numeric vector. Store your answer in `answer2`:

```r
answer2 <- map_dbl(FILL_THIS_IN, FILL_THIS_IN)
```

_Hint:_ The last `FILL_THIS_IN` corresponds to an anonymous function!
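If anonymous functions are new to you, the sketch below shows three equivalent ways to write one inside a `map_*` call. The input vector and the "divide by 2" operation are made up for illustration and are not the answer to this question.

```r
library(purrr)

# Three equivalent ways to halve each element of a made-up vector
halves_1 <- map_dbl(c(2, 4, 6), function(v) v / 2)  # ordinary anonymous function
halves_2 <- map_dbl(c(2, 4, 6), \(v) v / 2)         # base R shorthand (needs R >= 4.1)
halves_3 <- map_dbl(c(2, 4, 6), ~ .x / 2)           # purrr formula shortcut; .x is the current element

halves_1
```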

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer2)

In [None]:
library(digest)
stopifnot("type of answer2 is not numeric"= setequal(digest(paste(toString(class(answer2)), "1c1a8")), "d71412e05e24332596cc591b22c8a0e3"))
stopifnot("value of answer2 is not correct (rounded to 2 decimal places)"= setequal(digest(paste(toString(round(answer2, 2)), "1c1a8")), "a9236d4df4d673160b85bee34baaed61"))
stopifnot("length of answer2 is not correct"= setequal(digest(paste(toString(length(answer2)), "1c1a8")), "6e3afaeffc96949462a4937a3746957d"))
stopifnot("values of answer2 are not correct"= setequal(digest(paste(toString(sort(round(answer2, 2))), "1c1a8")), "a9236d4df4d673160b85bee34baaed61"))

print('Success!')

We've now used both the generic `map` and the more specific `map_dbl`, so you can see how they differ and why one can be a better fit than the other for a given purpose. Now it's your turn to choose! Remember to use the [cheatsheet](https://github.com/rstudio/cheatsheets/blob/master/purrr.pdf) if you need it!

## QUESTION 3

{points: 1}


Below is sample code that computes the mean of every column in the `mtcars` dataset. Use the appropriate `purrr` function to vectorize this task.

In [None]:
mtcars_means <- numeric()
for (c in seq_along(mtcars)){
  mtcars_means[[c]] <- mean(mtcars[[c]])
}
mtcars_means

Store your answer in `answer3`; as above, your answer should be a vector. _Hint_: remember that a tibble / data frame is just a list, where each entry is a column (a vector).

```r
answer3 <- FILL_THIS_IN(datasets::mtcars, FILL_THIS_IN)
```
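To make the hint about data frames concrete, here is a short sketch using the built-in `iris` data (chosen only for illustration). Because a data frame is a list of columns, a `map_*` function visits one column at a time and keeps the column names.

```r
library(purrr)

is.list(iris)   # TRUE: a data frame is a list whose elements are columns
length(iris)    # the number of columns, not the number of rows

# Apply class() to each column; the result is named after the columns
col_classes <- map_chr(iris, class)
col_classes
```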

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer3)

In [None]:
library(digest)
stopifnot("type of answer3 is not numeric"= setequal(digest(paste(toString(class(answer3)), "a22de")), "d5baa49d105e133b02797d0995291f75"))
stopifnot("value of answer3 is not correct (rounded to 2 decimal places)"= setequal(digest(paste(toString(round(answer3, 2)), "a22de")), "a88eddb0705c304d3fc768d339bb4a40"))
stopifnot("length of answer3 is not correct"= setequal(digest(paste(toString(length(answer3)), "a22de")), "93364ff92443b7362d8bec3343246e6b"))
stopifnot("values of answer3 are not correct"= setequal(digest(paste(toString(sort(round(answer3, 2))), "a22de")), "fdb874ead188b87e5a0197f43d4f8a0b"))

print('Success!')

## QUESTION 4

{points: 1}


Below is sample code that computes the number of unique values in each column of `mtcars` as a named vector, using a for loop. Use the appropriate `purrr` function to vectorize this task.

In [None]:
mtcars_unique <- numeric()
for (c in seq_along(datasets::mtcars)){
  mtcars_unique[[c]] <- length(unique(datasets::mtcars[[c]]))
}
names(mtcars_unique) <- names(datasets::mtcars)
mtcars_unique

Store your answer in `answer4`:

```r
answer4 <- datasets::mtcars %>% 
    FILL_THIS_IN(FILL_THIS_IN) %>% 
    FILL_THIS_IN(FILL_THIS_IN)
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer4)

In [None]:
library(digest)
stopifnot("type of answer4 is not numeric"= setequal(digest(paste(toString(class(answer4)), "604b5")), "75dc2536842deb5b78c409e5926e32a7"))
stopifnot("value of answer4 is not correct (rounded to 2 decimal places)"= setequal(digest(paste(toString(round(answer4, 2)), "604b5")), "8d82833db264c8c68a5ca0fd27d243f6"))
stopifnot("length of answer4 is not correct"= setequal(digest(paste(toString(length(answer4)), "604b5")), "2b96a626d378caa9d0f7b8dd5f5d58f0"))
stopifnot("values of answer4 are not correct"= setequal(digest(paste(toString(sort(round(answer4, 2))), "604b5")), "e764cb129241c889021e9397d839b91b"))

print('Success!')

## QUESTION 5

{points: 1}

Below is sample code that divides the values in each column of the `mtcars` dataset by the maximum in that column. 

In [None]:
for (i in seq_along(mtcars)){
  mtcars[[i]] <- mtcars[[i]] / max(mtcars[[i]], na.rm = TRUE)
}
head(mtcars)

Find a way to do this without a loop. Store your answer in `answer5`:

```r
answer5 <- datasets::mtcars %>%
  mutate(FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN))
```

_Hint_: The last `FILL_THIS_IN` corresponds to an anonymous function.

_Food for thought_: Here is some sample code that divides the values in each column of the `mtcars` dataset by the maximum in that column using `purrr` instead of a loop or using `mutate`: `map(mtcars, function(x) x/max(x))`. How and why is the output of this code different from `answer5`?

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
head(answer5)

In [None]:
library(digest)
stopifnot("answer5 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer5)), "5fdac")), "b138db9e9fb17ac221e14cc4d3e4dd2b"))
stopifnot("dimensions of answer5 are not correct"= setequal(digest(paste(toString(dim(answer5)), "5fdac")), "63f2618f8414aecf23f332b84f71b150"))
stopifnot("column names of answer5 are not correct"= setequal(digest(paste(toString(sort(colnames(answer5))), "5fdac")), "3f329c375e864091d94b3ab305b605f3"))
stopifnot("types of columns in answer5 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer5, class)))), "5fdac")), "f6fbc266551a70ea9109a453f56251f6"))
stopifnot("values in one or more numerical columns in answer5 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer5, is.numeric))) sort(round(sapply(answer5[, sapply(answer5, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "5fdac")), "67858f09e0d1170e3269fc55ee367bf5"))
stopifnot("values in one or more character columns in answer5 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer5, is.character))) sum(sapply(answer5[sapply(answer5, is.character)], function(x) length(unique(x)))) else 0), "5fdac")), "f177b27b1423c5c4440f0a91ebf11d5f"))
stopifnot("values in one or more factor columns in answer5 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer5, is.factor))) sum(sapply(answer5[, sapply(answer5, is.factor)], function(col) length(unique(col)))) else 0), "5fdac")), "f177b27b1423c5c4440f0a91ebf11d5f"))

print('Success!')

## QUESTION 6: `map2`

{points: 1}


Let's use `purrr` to calculate some probabilities from negative binomial distributions. One parameterization of the negative binomial distribution has a mean parameter and a dispersion parameter - this parameterization is often used to model highly variable count-valued data. 

We would like to calculate the probability of observing a count of 0 and the probability of observing a count of 1 for a bunch of negative binomial distributions. Here's a function to do this for a single negative binomial distribution: 

In [None]:
prob_01_neg_bin <- function(mean, disp) { 
  dnbinom(x=c(0, 1), mu=mean, size=disp) 
}

prob_01_neg_bin(1, 1) # calculates probability of observing 0 count and probability of observing 1 count

The `prob_01_neg_bin()` function has two arguments corresponding to the two parameters of the negative binomial distribution, so to calculate probabilities for a bunch of negative binomial distributions, we'd need a `purrr` function to plug in the two parameters. Here are the parameters we would like to plug in:   

In [None]:
mean_params <- c(1, 5, 10)
disp_params <- c(0.1, 1, 5)

Store your answer in `answer6`. It should be a list of probabilities.

```r
answer6 <- map2(FILL_THIS_IN, FILL_THIS_IN, FILL_THIS_IN)
```
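As a rough illustration of how the `map2` family walks two inputs in parallel, the sketch below raises each element of one made-up vector to the power given by the matching element of another. The names `base_vals` and `exponents` are ours and have nothing to do with the negative binomial parameters.

```r
library(purrr)

# Two made-up vectors of equal length
base_vals <- c(2, 3, 4)
exponents <- c(1, 2, 3)

# In the formula shortcut, .x is the element from the first input
# and .y is the matching element from the second input
powers <- map2_dbl(base_vals, exponents, ~ .x^.y)
powers
```

Like `map`, plain `map2` returns a list, which is the right choice when each call returns more than one value.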

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer6)

In [None]:
library(digest)
stopifnot("type of answer6 is not list"= setequal(digest(paste(toString(class(answer6)), "6b5fb")), "6c2d958b04498e99fc24d54e38df0f00"))
stopifnot("length of answer6 is not correct"= setequal(digest(paste(toString(length(answer6)), "6b5fb")), "b65a0989e0fff0fb3fc05fc85230c4cf"))
stopifnot("values of answer6 names are not correct"= setequal(digest(paste(toString(sort(c(names(answer6)))), "6b5fb")), "979ea974d13f5ac962e4d63b8e85ff30"))
stopifnot("order of elements of answer6 is not correct"= setequal(digest(paste(toString(answer6), "6b5fb")), "a47fb1822e6d28f69c123c143e84028f"))

print('Success!')

## QUESTION 7

{points: 1}


Introducing the Big Bang `!!!` and `rlang::exec()`.

We'll work with the `palmerpenguins::penguins` tibble.

Let's use the `recode()` function from the `dplyr` package to rename the levels of the `island` factor variable. The straightforward way is to spell out each replacement:

In [None]:
penguins %>% mutate(island = 
   recode(island, Biscoe = "B", Dream = "D", Torgersen = "T"))

But what if you wanted to specify the levels to be renamed and their new names via a single object, such as a named vector (or a list)?

In [None]:
island_level_key <- c(Biscoe = "B", Dream = "D", Torgersen = "T")

Passing the named vector itself via `penguins %>% mutate(island = recode(island, island_level_key))` throws an error, because `dplyr::recode()` expects each replacement as a separate named argument rather than bundled into one object. What's the alternative?

Your task: use the big bang operator (`!!!`) in front of the vector argument to get the desired result. This effectively takes the elements of a list or vector and splices them in as individual arguments to a function.

```r
answer7 <- penguins %>% mutate(island = recode(island, FILL_THIS_IN))
```

__FYI__: Conveniently, `dplyr::recode()` recognizes the big bang operator. If you have a function that doesn't recognize it (like `sum()`), use `rlang::exec(function, !!!list_of_arguments)` instead.
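Here is a minimal sketch of the `rlang::exec()` route mentioned in the FYI, using made-up argument lists (`number_args` is a name we chose for illustration).

```r
library(rlang)

# A made-up list of arguments, used only for illustration
number_args <- list(1, 2, 3)

# sum() does not understand `!!!`, so let exec() splice the list into the call;
# this is equivalent to sum(1, 2, 3)
total <- exec(sum, !!!number_args)
total

# Named list elements become named arguments:
# this is equivalent to paste("a", "b", sep = "-")
joined <- exec(paste, !!!list("a", "b", sep = "-"))
joined
```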

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer7)

In [None]:
library(digest)
stopifnot("answer7 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer7)), "5f6c2")), "b39c77a998021d8db7fb364d95551061"))
stopifnot("dimensions of answer7 are not correct"= setequal(digest(paste(toString(dim(answer7)), "5f6c2")), "6416b57caa3680d2cecf64e1bceff981"))
stopifnot("column names of answer7 are not correct"= setequal(digest(paste(toString(sort(colnames(answer7))), "5f6c2")), "62f1dbc919a235ece6536c7535c82f1b"))
stopifnot("types of columns in answer7 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer7, class)))), "5f6c2")), "4facb82c0e1e55a769694c970c4e9b91"))
stopifnot("values in one or more numerical columns in answer7 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer7, is.numeric))) sort(round(sapply(answer7[, sapply(answer7, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "5f6c2")), "2a8bb0e08933e827db9ab07a20105567"))
stopifnot("values in one or more character columns in answer7 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer7, is.character))) sum(sapply(answer7[sapply(answer7, is.character)], function(x) length(unique(x)))) else 0), "5f6c2")), "7a0a9ee4dc11dd2381143aac16ef84dd"))
stopifnot("values in one or more factor columns in answer7 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer7, is.factor))) sum(sapply(answer7[, sapply(answer7, is.factor)], function(col) length(unique(col)))) else 0), "5f6c2")), "83743bdfcd168d5191e61660307af13b"))

print('Success!')

# Part 2: Nesting and List Columns

_One_ of the ways a list-column can be made is by using `nest()`.
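Before working with `gapminder`, it may help to see `nest()` on a tiny tibble. The sketch below uses a made-up tibble called `toy` (our name, for illustration only), bundles two of its columns into a list-column, and then reverses the operation with `unnest()`.

```r
library(tibble)
library(tidyr)

# A made-up tibble, used only for illustration
toy <- tibble(
  group = c("a", "a", "b"),
  x = 1:3,
  y = c(2.5, 3.5, 4.5)
)

# Bundle x and y into a list-column called `data`;
# the result has one row per distinct value of `group`
nested <- nest(toy, data = c(x, y))
nested

# unnest() expands the list-column back into regular rows and columns
unnested <- unnest(nested, data)
unnested
```

Each entry of `nested$data` is itself a tibble, which is what makes it possible to later apply a function to a whole group at once.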

## QUESTION 8

{points: 1}


Create a tibble that bundles everything in `gapminder` except for `country` and `continent` into a list-column. Name your list column `other` (without using `rename()` or `mutate()`). Store your answer in `answer8`.

```r
answer8 <- gapminder %>%
   nest(FILL_THIS_IN = FILL_THIS_IN)
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
head(answer8, n = 3)

In [None]:
library(digest)
stopifnot("answer8 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer8)), "b9b72")), "0d58b201418ea7ec6b7bd8a3d88cd382"))
stopifnot("dimensions of answer8 are not correct"= setequal(digest(paste(toString(dim(answer8)), "b9b72")), "489670ae32b2356ae7ff709e5c89f6aa"))
stopifnot("column names of answer8 are not correct"= setequal(digest(paste(toString(sort(colnames(answer8))), "b9b72")), "f745727f8ef4f917410c1b93f1847125"))
stopifnot("types of columns in answer8 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer8, class)))), "b9b72")), "d3b88846b0c3653b2b15b108dfbaea70"))
stopifnot("values in one or more numerical columns in answer8 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer8, is.numeric))) sort(round(sapply(answer8[, sapply(answer8, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "b9b72")), "bd39c8036a5e2e3fd16fd332486decfb"))
stopifnot("values in one or more character columns in answer8 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer8, is.character))) sum(sapply(answer8[sapply(answer8, is.character)], function(x) length(unique(x)))) else 0), "b9b72")), "bd39c8036a5e2e3fd16fd332486decfb"))
stopifnot("values in one or more factor columns in answer8 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer8, is.factor))) sum(sapply(answer8[, sapply(answer8, is.factor)], function(col) length(unique(col)))) else 0), "b9b72")), "8823a034b870354ae6ca03d7afab3086"))

print('Success!')

## QUESTION 9

{points: 1}


_Reproducibly_ sample 5 countries in the `gapminder` tibble at random. Set the seed to 123, and store your answer in `answer9`.

```r
FILL_THIS_IN(123)
answer9 <- gapminder %>%
    nest(FILL_THIS_IN = FILL_THIS_IN) %>% 
    slice_sample(n=5) %>% 
    unnest(FILL_THIS_IN)
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
head(answer9)

In [None]:
library(digest)
stopifnot("answer9 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer9)), "599b7")), "48c0a34216d17e6057d16828c2683fd3"))
stopifnot("dimensions of answer9 are not correct"= setequal(digest(paste(toString(dim(answer9)), "599b7")), "7e5ef29e4b2110d9bfd6c770c912d4a9"))
stopifnot("column names of answer9 are not correct"= setequal(digest(paste(toString(sort(colnames(answer9))), "599b7")), "bed94b3894b648557ecedd631d4dafd7"))
stopifnot("types of columns in answer9 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer9, class)))), "599b7")), "1b23763f8ebc29e55dac9460b67e7dcb"))
stopifnot("values in one or more numerical columns in answer9 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer9, is.numeric))) sort(round(sapply(answer9[, sapply(answer9, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "599b7")), "4e34cb96d19a25212dc85a84417109fe"))
stopifnot("values in one or more character columns in answer9 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer9, is.character))) sum(sapply(answer9[sapply(answer9, is.character)], function(x) length(unique(x)))) else 0), "599b7")), "062be694296c582dfa7dc052d658785a"))
stopifnot("values in one or more factor columns in answer9 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer9, is.factor))) sum(sapply(answer9[, sapply(answer9, is.factor)], function(col) length(unique(col)))) else 0), "599b7")), "4ac8eeae0c116537e6df975c907f1f84"))

print('Success!')

## QUESTION 10

{points: 1}


For each `gapminder` continent, fit a linear model that regresses `lifeExp` on `log(gdpPercap)`, and store the fitted models in a new column. Store your answer in `answer10`:

```r
answer10 <- gapminder %>% 
  select(continent, gdpPercap, lifeExp) %>% 
  nest(data = c(FILL_THIS_IN, FILL_THIS_IN)) %>% 
  mutate(model = FILL_THIS_IN(data, ~ lm(FILL_THIS_IN ~ FILL_THIS_IN, data = FILL_THIS_IN)))
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer10)

In [None]:
library(digest)
stopifnot("answer10 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer10)), "7f1ee")), "9b17cc48215922e7ce47ee8edf7b57ee"))
stopifnot("dimensions of answer10 are not correct"= setequal(digest(paste(toString(dim(answer10)), "7f1ee")), "677b28431407039b1bccc870cd6ccbd8"))
stopifnot("column names of answer10 are not correct"= setequal(digest(paste(toString(sort(colnames(answer10))), "7f1ee")), "aa869a55bd83acdf3868e6dd20a2bbca"))
stopifnot("types of columns in answer10 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer10, class)))), "7f1ee")), "b12c3abc92b6d0acff458e69628c16a6"))
stopifnot("values in one or more numerical columns in answer10 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer10, is.numeric))) sort(round(sapply(answer10[, sapply(answer10, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "7f1ee")), "581f8324613cf725720df97b30a2350a"))
stopifnot("values in one or more character columns in answer10 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer10, is.character))) sum(sapply(answer10[sapply(answer10, is.character)], function(x) length(unique(x)))) else 0), "7f1ee")), "581f8324613cf725720df97b30a2350a"))
stopifnot("values in one or more factor columns in answer10 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer10, is.factor))) sum(sapply(answer10[, sapply(answer10, is.factor)], function(col) length(unique(col)))) else 0), "7f1ee")), "1922c5af8711f19a557c13e7887936f6"))

print('Success!')

## QUESTION 11

{points: 1}


Using your models from Question 10, make predictions using `augment()` from the `broom` package, and then `unnest`. Store your answer in `answer11`:

```r
answer11 <- answer10 %>% 
  mutate(continent, yhat = map(FILL_THIS_IN, FILL_THIS_IN), .keep = "none") %>% 
  unnest(FILL_THIS_IN)
```
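If you have not used `augment()` before, the sketch below shows what it returns for a single model. The model here (`mpg` on `wt` from the built-in `mtcars` data) is made up for illustration and is unrelated to the `gapminder` models.

```r
library(broom)

# Fit one illustrative linear model on a built-in dataset
fit <- lm(mpg ~ wt, data = datasets::mtcars)

# augment() returns a tibble with one row per observation, containing the
# model variables plus columns such as .fitted (predictions) and .resid (residuals)
fitted_tbl <- augment(fit)
head(fitted_tbl, n = 3)
```

Because the result is a tibble, mapping `augment()` over a column of models produces a list-column of tibbles that can then be unnested.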

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer11)

In [None]:
library(digest)
stopifnot("answer11 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer11)), "b1656")), "42799cbfc1005613567643475d326d57"))
stopifnot("dimensions of answer11 are not correct"= setequal(digest(paste(toString(dim(answer11)), "b1656")), "e8ab4506966d8aa8d5c6969b1b4be504"))
stopifnot("column names of answer11 are not correct"= setequal(digest(paste(toString(sort(colnames(answer11))), "b1656")), "aecf523d94ad853496e068f2e890bae5"))
stopifnot("types of columns in answer11 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer11, class)))), "b1656")), "657c3f656cbd50824a79be55b5f2a2ab"))
stopifnot("values in one or more numerical columns in answer11 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer11, is.numeric))) sort(round(sapply(answer11[, sapply(answer11, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "b1656")), "764f2ce8eb1df2449bdeccfb7f30cb32"))
stopifnot("values in one or more character columns in answer11 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer11, is.character))) sum(sapply(answer11[sapply(answer11, is.character)], function(x) length(unique(x)))) else 0), "b1656")), "010da7c27c1c5d77b8b2ba0135016795"))
stopifnot("values in one or more factor columns in answer11 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer11, is.factor))) sum(sapply(answer11[, sapply(answer11, is.factor)], function(col) length(unique(col)))) else 0), "b1656")), "45cb8b49ff2019614de5f62e7e59d191"))

print('Success!')

## QUESTION 12: `pmap`

{points: 1}


Using the `palmerpenguins::penguins` tibble, calculate a 95% confidence interval based on the one-sample t-test for the mean body mass in grams of each penguin species. Here is code to do this for the Adelie penguins: 

In [None]:
# Note: n() counts all rows, including any with a missing body_mass_g
get_adelie_stats <- penguins %>% 
  filter(species == "Adelie") %>% 
  summarise(mean = mean(body_mass_g, na.rm = TRUE), 
            sd = sd(body_mass_g, na.rm = TRUE), 
            n = n())

# Given a sample mean, sample standard deviation, and sample size, 
# calculate a 95% confidence interval for the population mean:
# mean +/- (t quantile) * (standard error), where standard error = sd / sqrt(n)
make_95p_conf_int_t <- function(mean, sd, n) { 
  margin <- qt(0.975, df = n - 1) * sd / sqrt(n)
  c(lower_95p_ci = mean - margin, upper_95p_ci = mean + margin)  
}

make_95p_conf_int_t(get_adelie_stats$mean, get_adelie_stats$sd, get_adelie_stats$n)

To do this for all three species of penguins, we will need a `purrr` function to plug the three parameters of `make_95p_conf_int_t()` into. 
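As a rough sketch of how the `pmap` family handles three inputs at once, the example below uses a made-up function `combine()` and a made-up list `params` (both names are ours, for illustration only). When the list is named, its elements are matched to the function's arguments by name.

```r
library(purrr)

# A made-up named list of three parallel inputs
params <- list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# A made-up function whose argument names match the list names
combine <- function(a, b, c) a + b * c

# pmap_dbl() calls combine(a = 1, b = 4, c = 7), then combine(a = 2, b = 5, c = 8), ...
results <- pmap_dbl(params, combine)
results
```

Since a data frame is a list of columns, a data frame whose column names match the function's argument names can be passed in the same way.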

Starter code:

```r
answer12  <- penguins %>% 
  group_by(species) %>% 
  summarise(mean = mean(body_mass_g, na.rm = TRUE),
            sd  = sd(body_mass_g, na.rm = TRUE), 
            n = n()) %>% 
  mutate(ci = pmap(FILL_THIS_IN, FILL_THIS_IN))
```

Your answer should be a tibble with five columns named `species`, `mean`, `sd`, `n`, and `ci`. 

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer12)

In [None]:
library(digest)
stopifnot("answer12 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer12)), "1c223")), "86d198252f75b8982df5ed641b59bec0"))
stopifnot("dimensions of answer12 are not correct"= setequal(digest(paste(toString(dim(answer12)), "1c223")), "aa9de852789b2abec1efa4f2019e976a"))
stopifnot("column names of answer12 are not correct"= setequal(digest(paste(toString(sort(colnames(answer12))), "1c223")), "4317a2a9127e000f298cdd68ebbf0593"))
stopifnot("types of columns in answer12 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer12, class)))), "1c223")), "290e1b9bbfc2878d704ffb6354eb1913"))
stopifnot("values in one or more numerical columns in answer12 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer12, is.numeric))) sort(round(sapply(answer12[, sapply(answer12, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "1c223")), "7b4db1c8f268b6399d60802c8c6cb305"))
stopifnot("values in one or more character columns in answer12 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer12, is.character))) sum(sapply(answer12[sapply(answer12, is.character)], function(x) length(unique(x)))) else 0), "1c223")), "ddf6c55a5b4cbcf483f4627444cad21d"))
stopifnot("values in one or more factor columns in answer12 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer12, is.factor))) sum(sapply(answer12[, sapply(answer12, is.factor)], function(col) length(unique(col)))) else 0), "1c223")), "9c4f8fd655fb80934ca32ca185f8fffe"))

print('Success!')

## QUESTION 13: `unnest()`

{points: 1}


`unnest()` need not always be paired with `nest()`. Make a tibble `answer13` that is a longer version of `answer12` where the column `ci` is of double type. 
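To illustrate the first sentence, here is a sketch that unnests a list-column built by hand rather than by `nest()`. The tibble `readings` is made up for this example.

```r
library(tibble)
library(tidyr)

# A made-up tibble whose list-column holds numeric vectors of different lengths
readings <- tibble(
  id = c("p", "q"),
  value = list(c(1.5, 2.5), c(3, 4, 5))
)

# unnest() gives each vector element its own row and repeats `id` as needed;
# `value` becomes an ordinary double column
long <- unnest(readings, value)
long
```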

Starter code:

```r
answer13 <- answer12 %>% unnest(FILL_THIS_IN)
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer13)

In [None]:
library(digest)
stopifnot("answer13 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer13)), "87c96")), "ad72bc4fa83d1ce9e6516e67f6f71e79"))
stopifnot("dimensions of answer13 are not correct"= setequal(digest(paste(toString(dim(answer13)), "87c96")), "a359feb4008b2a82d0654958fcb5e1ff"))
stopifnot("column names of answer13 are not correct"= setequal(digest(paste(toString(sort(colnames(answer13))), "87c96")), "d4a0318aad0b7b5ea302d3baeaf6f3f0"))
stopifnot("types of columns in answer13 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer13, class)))), "87c96")), "a2a39b55d7ebcc156530904d7cbcd821"))
stopifnot("values in one or more numerical columns in answer13 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer13, is.numeric))) sort(round(sapply(answer13[, sapply(answer13, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "87c96")), "69e7e5a04bf64ae67b634f0c17f80b79"))
stopifnot("values in one or more character columns in answer13 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer13, is.character))) sum(sapply(answer13[sapply(answer13, is.character)], function(x) length(unique(x)))) else 0), "87c96")), "9f69111fd5a130f61f4ee5dfc6540689"))
stopifnot("values in one or more factor columns in answer13 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer13, is.factor))) sum(sapply(answer13[, sapply(answer13, is.factor)], function(col) length(unique(col)))) else 0), "87c96")), "6e5c65b7e07466a2a094049a3f4f3e4b"))

print('Success!')

## QUESTION 14

{points: 1}


Output a list of gapminder tibbles, one for each continent. Do not include the `continent` column in the divided tibbles -- the name of each list entry should be the continent name. 

_Hint_: Check out the `enframe()` and `deframe()` functions. 
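As a quick orientation for the hint, the sketch below shows `enframe()` and `deframe()` on a made-up named numeric vector (`named_vec` is our name, for illustration only). The two functions are inverses of each other.

```r
library(tibble)

# A made-up named vector
named_vec <- c(a = 1, b = 2)

# enframe() turns a named vector into a two-column tibble (`name` and `value`)
as_tbl <- enframe(named_vec)
as_tbl

# deframe() goes the other way: the first column supplies the names,
# the second column supplies the values
back <- deframe(as_tbl)
back
```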

Starter code:

```r
answer14 <- gapminder %>% 
    nest(FILL_THIS_IN) %>% 
    FILL_THIS_IN()
```


In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer14)

In [None]:
library(digest)
stopifnot("type of answer14 is not list"= setequal(digest(paste(toString(class(answer14)), "de4e")), "9ed367519f53843c1cdfe34a4253b5b8"))
stopifnot("length of answer14 is not correct"= setequal(digest(paste(toString(length(answer14)), "de4e")), "8a0e33a92d7fdf5e8c7e2a4f07e9cc60"))
stopifnot("values of answer14 names are not correct"= setequal(digest(paste(toString(sort(c(names(answer14)))), "de4e")), "082005a2d5d104975ae338ea0c5fd414"))
stopifnot("order of elements of answer14 is not correct"= setequal(digest(paste(toString(answer14), "de4e")), "06a1942655a9c28e513f20e84ee5a4eb"))

print('Success!')

## QUESTION 15

{points: 1}


Sometimes the vector/list we're iterating over has names, and it's useful to use those names. To access these names, use the `imap` family.
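Here is a minimal sketch of the `imap` family on a made-up named vector (`scores` is our name, for illustration only). In the formula shortcut, `.x` is the element and `.y` is its name.

```r
library(purrr)

# A made-up named vector
scores <- c(alice = 90, bob = 85)

# .x is the value and .y is the corresponding name
labels <- imap_chr(scores, ~ paste0(.y, ": ", .x))
labels
```

The family also includes `iwalk()`, which is meant for functions called for their side effects rather than their return values.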

For the list of tibbles made in the above question, save each one to a file using the appropriate `purrr` function, using the list names as the file names.

Starter code:

```r
answer15 <- FILL_THIS_IN(answer14, ~ write_csv(FILL_THIS_IN, glue::glue(FILL_THIS_IN, ".csv")))
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
dir()

In [None]:
library(digest)
stopifnot("type of (test_that(\"Question 15\", {expect_true(\"Africa.csv\" %in% dir())}) == TRUE) is not logical"= setequal(digest(paste(toString(class((test_that("Question 15", {expect_true("Africa.csv" %in% dir())}) == TRUE))), "b77bf")), "712a549f79be2e43a4d4016acb7c2f49"))
stopifnot("logical value of (test_that(\"Question 15\", {expect_true(\"Africa.csv\" %in% dir())}) == TRUE) is not correct"= setequal(digest(paste(toString((test_that("Question 15", {expect_true("Africa.csv" %in% dir())}) == TRUE)), "b77bf")), "88647aaa2cfa568eba49a0c0c1abd8c9"))

stopifnot("type of (test_that(\"Question 15\", {expect_true(\"Oceania.csv\" %in% dir())}) == TRUE) is not logical"= setequal(digest(paste(toString(class((test_that("Question 15", {expect_true("Oceania.csv" %in% dir())}) == TRUE))), "b77c0")), "c6b75d8799cbe154e114041aa8e8464d"))
stopifnot("logical value of (test_that(\"Question 15\", {expect_true(\"Oceania.csv\" %in% dir())}) == TRUE) is not correct"= setequal(digest(paste(toString((test_that("Question 15", {expect_true("Oceania.csv" %in% dir())}) == TRUE)), "b77c0")), "40c639a1a5a1fc04bab04a7bb8c5b4bf"))

print('Success!')

# Part 3: Recursive Lists

We won't focus much on recursive lists in this course, but here is a little taste of it.

## QUESTION 16: `unnest_wider()` and `unnest_longer()`

{points: 1}


Explore the `repurrrsive::got_chars` nested list. It contains information about Game of Thrones characters.

In [None]:
str(got_chars, list.len = 4)

Put the list in a tibble:

In [None]:
got_chars_tbl <- tibble(character = repurrrsive::got_chars)
print(got_chars_tbl, n = 5)
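As a rough reminder of what the two functions in this question's title do, the sketch below applies each to a tiny made-up tibble (`people` and `scores` are our names, for illustration only).

```r
library(tibble)
library(tidyr)

# Each entry of `person` is a named list, like one record of JSON-style data
people <- tibble(person = list(
  list(name = "Ada", born = 1815),
  list(name = "Alan", born = 1912)
))

# unnest_wider() turns each named component into its own column
wide <- unnest_wider(people, person)
wide

# Each entry of `score` is an unnamed vector with a varying number of values
scores <- tibble(id = 1:2, score = list(c(90, 85), 70))

# unnest_longer() turns each element into its own row
long <- unnest_longer(scores, score)
long
```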

Would widening the list column work best, or lengthening? Do it.

```r
answer16 <- FILL_THIS_IN(got_chars_tbl, character)
```

In [None]:
# your code here
fail() # No Answer - remove if you provide an answer
print(answer16, n = 5)

In [None]:
library(digest)
stopifnot("answer16 should be a data frame"= setequal(digest(paste(toString('data.frame' %in% class(answer16)), "932c4")), "beed9247bace4e0d2f54dda84048d1c5"))
stopifnot("dimensions of answer16 are not correct"= setequal(digest(paste(toString(dim(answer16)), "932c4")), "01107d73b7437fcd2642c0e822e33f44"))
stopifnot("column names of answer16 are not correct"= setequal(digest(paste(toString(sort(colnames(answer16))), "932c4")), "d6603b5915dc048f4ef487b2ff3efa25"))
stopifnot("types of columns in answer16 are not correct"= setequal(digest(paste(toString(sort(unlist(sapply(answer16, class)))), "932c4")), "0d490e191b6c4530d8f709f9ca5a5db3"))
stopifnot("values in one or more numerical columns in answer16 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer16, is.numeric))) sort(round(sapply(answer16[, sapply(answer16, is.numeric)], sum, na.rm = TRUE), 2)) else 0), "932c4")), "ad1980c0006cd1d68e90952977be52b4"))
stopifnot("values in one or more character columns in answer16 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer16, is.character))) sum(sapply(answer16[sapply(answer16, is.character)], function(x) length(unique(x)))) else 0), "932c4")), "c665790db2076c499c3df2e6c25b6a0c"))
stopifnot("values in one or more factor columns in answer16 are not correct"= setequal(digest(paste(toString(if (any(sapply(answer16, is.factor))) sum(sapply(answer16[, sapply(answer16, is.factor)], function(col) length(unique(col)))) else 0), "932c4")), "8671c94ed835509da2697454c129fa5d"))

print('Success!')

Final worksheet completed! 🤩

Restart, run all, save, and submit to Canvas!