Run the following cell to load the required packages.

In [None]:
library('purrr')
library('tidyverse')
library('moderndive')
library('infer')

Use `sample_n` to create one sample of 100 balls from the `bowl` data set. Make sure you are sampling **without** replacement. Call your sample `one_sample_bowl`.
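
One possible way to draw the sample is sketched below. It assumes the `bowl` data set from `moderndive` with a `color` column; the seed value is arbitrary and only there to make the draw reproducible.

```r
library(dplyr)
library(moderndive)  # provides the `bowl` data set

set.seed(730)  # arbitrary seed (an assumption), only for reproducibility

# sample_n() samples without replacement by default;
# replace = FALSE just makes that choice explicit
one_sample_bowl <- bowl |>
    sample_n(size = 100, replace = FALSE)

# Proportion of red balls in this single sample
one_sample_bowl |>
    summarize(prop_red = mean(color == "red"))
```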

With bootstrap resampling we can calculate a confidence interval for the proportion of red balls even though we only have one sample.

To find a confidence interval,

1. use `rep_sample_n` with `replace = TRUE` to compute 1000 bootstrap replicates from your `one_sample_bowl` table. Each bootstrap replicate should be the same size as your original sample (100 balls).
2. `group_by` the replicate column, and,
3. find the proportion of red balls in each bootstrap replicate group using `summarize`.
4. Last, find the confidence interval bounds using this code snippet:

```r
summarize(
    lower_ci = quantile(prop_red, 0.025), 
    upper_ci = quantile(prop_red, 0.975)
)
```

Is the true value (i.e. the actual proportion of red balls in `bowl`) inside your confidence interval?
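
The four steps above could be chained together roughly as follows. This is a sketch, not the only valid solution: the intermediate names `bootstrap_props`, `bowl_ci`, and `true_prop` are illustrative, and the seed is arbitrary.

```r
library(dplyr)
library(infer)       # rep_sample_n()
library(moderndive)  # bowl

set.seed(730)  # arbitrary seed (an assumption), only for reproducibility
one_sample_bowl <- sample_n(bowl, size = 100, replace = FALSE)

# Steps 1-3: 1,000 bootstrap replicates of size 100, then the
# proportion of red balls within each replicate
bootstrap_props <- one_sample_bowl |>
    rep_sample_n(size = 100, replace = TRUE, reps = 1000) |>
    group_by(replicate) |>
    summarize(prop_red = mean(color == "red"))

# Step 4: percentile-based 95% confidence interval
bowl_ci <- bootstrap_props |>
    summarize(
        lower_ci = quantile(prop_red, 0.025),
        upper_ci = quantile(prop_red, 0.975)
    )

# The "true value" is the proportion of red balls in the full bowl
true_prop <- mean(bowl$color == "red")

bowl_ci |>
    mutate(contains_true = lower_ci <= true_prop & upper_ci >= true_prop)
```

Because the original sample is random, the interval will differ from run to run, and it will occasionally miss the true value.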

**Let's do the same thing as above for 1,000 iterations...**

Execute the following code. This runs the *sample* plus *bootstrap resample* steps 1,000 times and captures the output in a dataframe called `bowl_ci_data`. Each row in `bowl_ci_data` holds a confidence interval created from a sample of size 100 and 1,000 bootstrap resamples.

In [None]:
# Draw 1,000 independent samples of 100 balls each (without replacement)
bowl_samples <- rep_sample_n(bowl, replace = FALSE, size = 100, reps = 1000) |>
    rename(original_sample = replicate)

# For each original sample: bootstrap 1,000 times, then take the
# 2.5% and 97.5% quantiles of the bootstrap proportions
bowl_ci_data <- bowl_samples |>
    group_by(original_sample) |>
    group_split() |>
    map_dfr(
        ~rep_sample_n(.x, replace = TRUE, size = 100, reps = 1000) |>
            rename(bs_sample = replicate) |>
            group_by(original_sample, bs_sample) |>
            # proportion of red balls in each bootstrap replicate
            summarize(prop_red = sum(color == 'red') / n(), .groups = 'drop') |>
            group_by(original_sample) |>
            summarize(
                lower_ci = quantile(prop_red, 0.025),
                upper_ci = quantile(prop_red, 0.975)
            )
    )

Use `geom_segment()` to plot 100 confidence intervals. Add a vertical line showing the position of the "true value."

**HINT:**
- you can use `sample_n` to select only 100 CIs from `bowl_ci_data`, and,
- you can use `mutate(y_pos = row_number())` to get a column to map to `y` in your chart.

**[Here is an example chart](https://raw.githubusercontent.com/UNC-DATA-730/lecture-notebooks/main/07-bootstrap-resampling/in-class-exercises/ci_plot_example.png)**
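
A sketch of one way to build such a chart is shown below. To keep it fast and self-contained, it does not use `bowl_ci_data`. Instead it builds a reduced stand-in called `ci_demo` (a hypothetical name) with base R `sample()` rather than `rep_sample_n`, using 100 samples and 500 bootstrap replicates each. The helper `one_ci`, the seed, and the axis labels are likewise illustrative assumptions.

```r
library(dplyr)
library(purrr)
library(ggplot2)
library(moderndive)  # bowl

set.seed(730)  # arbitrary seed (an assumption), only for reproducibility

# Hypothetical helper: one sample of 100 balls -> one bootstrap percentile CI
one_ci <- function(i) {
    is_red <- sample(bowl$color == "red", size = 100, replace = FALSE)
    boot_props <- replicate(500, mean(sample(is_red, size = 100, replace = TRUE)))
    tibble(
        original_sample = i,
        lower_ci = quantile(boot_props, 0.025, names = FALSE),
        upper_ci = quantile(boot_props, 0.975, names = FALSE)
    )
}

# Reduced stand-in for 100 rows of bowl_ci_data
ci_demo <- map_dfr(1:100, one_ci)

ci_plot_data <- ci_demo |>
    mutate(
        y_pos = row_number(),  # one horizontal line per interval
        inside_ci = lower_ci <= 0.375 & upper_ci >= 0.375
    )

ci_plot <- ggplot(ci_plot_data) +
    geom_segment(aes(x = lower_ci, xend = upper_ci,
                     y = y_pos, yend = y_pos, color = inside_ci)) +
    geom_vline(xintercept = 0.375, linetype = "dashed") +  # the "true value"
    labs(x = "Proportion of red balls", y = "Confidence interval",
         color = "Contains true value")

ci_plot
```

With the real data, `bowl_ci_data |> sample_n(100)` would take the place of `ci_demo`.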

`mutate(inside_ci = lower_ci <= 0.375 & upper_ci >= 0.375)` flags whether each confidence interval in `bowl_ci_data` contains the "true value" (0.375). Combine this `mutate` operation with `summarize` to calculate the proportion of confidence intervals that contain the true value.

Does this value make sense? Consider that we are calculating 95% confidence intervals in this exercise...
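
The coverage calculation might look like the sketch below. As an assumption made only for speed and self-containment, it rebuilds a reduced stand-in for `bowl_ci_data` with base R `sample()` (200 samples, 500 bootstrap replicates each) rather than `rep_sample_n`. The names `one_ci`, `ci_demo`, `coverage`, and `prop_inside` are hypothetical.

```r
library(dplyr)
library(purrr)
library(moderndive)  # bowl

set.seed(730)  # arbitrary seed (an assumption), only for reproducibility
true_prop <- mean(bowl$color == "red")

# Hypothetical helper: one sample of 100 balls -> one bootstrap percentile CI
one_ci <- function(i) {
    is_red <- sample(bowl$color == "red", size = 100, replace = FALSE)
    boot_props <- replicate(500, mean(sample(is_red, size = 100, replace = TRUE)))
    tibble(
        original_sample = i,
        lower_ci = quantile(boot_props, 0.025, names = FALSE),
        upper_ci = quantile(boot_props, 0.975, names = FALSE)
    )
}

# Reduced stand-in for bowl_ci_data
ci_demo <- map_dfr(1:200, one_ci)

# Flag each interval, then average the flags to get the coverage proportion
coverage <- ci_demo |>
    mutate(inside_ci = lower_ci <= true_prop & upper_ci >= true_prop) |>
    summarize(prop_inside = mean(inside_ci))

coverage
```

With the real data, the same `mutate` plus `summarize` pair would be applied directly to `bowl_ci_data`. For 95% intervals one would expect a proportion in the neighbourhood of 0.95, though it varies from run to run.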