# Worksheet 12 - The Bootstrap

We will cover 2 things today:

1. [Estimation when you only have 1 sample (like in real life!)](#Estimation-with-only-one-sample)
2. [Getting Python + Jupyter working for you outside of our course](#Getting-Python-+-Jupyter-working-for-you-outside-of-our-course)

### Lecture and Tutorial Learning Goals:

After completing this week's lecture and tutorial work, you will be able to:
- Explain why we don't have a sampling distribution in practice/real life.
- Define bootstrapping.
- Use Python to create a bootstrap distribution to approximate a sampling distribution.
- Contrast bootstrap and sampling distributions.

This worksheet covers parts of [Chapter 10](https://python.datasciencebook.ca/inference) of the online textbook. You should read this chapter before attempting this assignment. Any place you see `___`, you must fill in the function, variable, or data to complete the code. Replace `raise NotImplementedError` with your completed code and answers, then run the cell.

In [None]:
### Run this cell before continuing.
import altair as alt
import numpy as np
import pandas as pd

# Simplify working with large datasets in Altair
alt.data_transformers.disable_max_rows()

**Question 1.1** True/False:
<br> {points: 1}

In real life, we typically take many samples from the population and create a sampling distribution when we perform estimation. True or false?

*Assign your answer to an object called `answer1_1`. Your answer should be a boolean. i.e. `True` or `False`.*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer1_1)).encode("utf-8")+b"8da4bbc0859f604a").hexdigest() == "b76620c1c8bc935d3179b39f2a5204ded4b903a1", "type of answer1_1 is not bool. answer1_1 should be a bool"
assert sha1(str(answer1_1).encode("utf-8")+b"8da4bbc0859f604a").hexdigest() == "31eef0f177c268972d150b081401556810440f74", "boolean value of answer1_1 is not correct"

print('Success!')

**Question 1.2** Ordering
<br> {points: 1}

Correctly re-order the steps for creating a bootstrap sample from those listed below. 

1. record the observation's value
2. repeat the above the same number of times as there are observations in the original sample 
3. return the observation to the original sample
4. randomly draw an observation from the original sample (which was drawn from the population)

Create your answer by reordering values below in the `answer1_2` list with the correct order for the steps above for creating a bootstrap sample.

In [None]:
answer1_2 = [1, 2, 3, 4]  # reorder the values!
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type("".join(map(str, answer1_2)))).encode("utf-8")+b"d1d0f0ca1c34363f").hexdigest() == "09e7deb757bb1d7e956b40e620828fe40b5fb353", "type of \"\".join(map(str, answer1_2)) is not str. \"\".join(map(str, answer1_2)) should be an str"
assert sha1(str(len("".join(map(str, answer1_2)))).encode("utf-8")+b"d1d0f0ca1c34363f").hexdigest() == "238d23ad17748b8d1d630bb0f0c84478506ad7f0", "length of \"\".join(map(str, answer1_2)) is not correct"
assert sha1(str("".join(map(str, answer1_2)).lower()).encode("utf-8")+b"d1d0f0ca1c34363f").hexdigest() == "644d626173633edddc1653f2bc4f82c861daf9fa", "value of \"\".join(map(str, answer1_2)) is not correct"
assert sha1(str("".join(map(str, answer1_2))).encode("utf-8")+b"d1d0f0ca1c34363f").hexdigest() == "644d626173633edddc1653f2bc4f82c861daf9fa", "correct string value of \"\".join(map(str, answer1_2)) but incorrect case of letters"

print('Success!')

**Question 1.3** Multiple choice
<br> {points: 1}

From the list below, choose the correct description of a bootstrap distribution for a point estimate:

A. a list of point estimates calculated from many samples drawn with replacement from the population

B. a list of point estimates calculated from many samples drawn without replacement from the population

C. a list of point estimates calculated from bootstrap samples drawn with replacement from a single sample (that was drawn from the population)

D. a list of point estimates calculated from bootstrap samples drawn without replacement from a single sample (that was drawn from the population)

*Assign your answer to an object called `answer1_3`. Your answer should be an uppercase letter surrounded by quotes (e.g. `"F"`).*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer1_3)).encode("utf-8")+b"696d3cd8aef8f352").hexdigest() == "d69c63d5d41a39ff40ec930ed727a4773c34f5de", "type of answer1_3 is not str. answer1_3 should be an str"
assert sha1(str(len(answer1_3)).encode("utf-8")+b"696d3cd8aef8f352").hexdigest() == "8c36003872fa22cdf617118c1869e86ae49512c1", "length of answer1_3 is not correct"
assert sha1(str(answer1_3.lower()).encode("utf-8")+b"696d3cd8aef8f352").hexdigest() == "e053209ed89dba4b8bc001d6f1b6bbf2ce407ad1", "value of answer1_3 is not correct"
assert sha1(str(answer1_3).encode("utf-8")+b"696d3cd8aef8f352").hexdigest() == "d76047d9b3801e57b310a1e67c3dd86277d15926", "correct string value of answer1_3 but incorrect case of letters"

print('Success!')

**Question 1.4** Multiple choice
<br> {points: 1}

From the list below, choose the correct explanation of why, when performing estimation, we want to report a plausible **range** for the true population quantity we are trying to estimate along with the point estimate:

A. The point estimate is our best guess at the true population quantity we are trying to estimate

B. The point estimate will often not be the exact value of the true population quantity we are trying to estimate

C. The value of a point estimate from one sample might very well be different than the value of a point estimate from another sample.

D. B & C

F. A & C

E. None of the above

*Assign your answer to an object called `answer1_4`. Your answer should be an uppercase letter surrounded by quotes (e.g. `"F"`).*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer1_4)).encode("utf-8")+b"79b871a543ff7413").hexdigest() == "adcc024cdc4142ddf63d2662c2b29d5caa7755ac", "type of answer1_4 is not str. answer1_4 should be an str"
assert sha1(str(len(answer1_4)).encode("utf-8")+b"79b871a543ff7413").hexdigest() == "f4149939b26bc1109fd9f5d005dc6da0c16f0e3e", "length of answer1_4 is not correct"
assert sha1(str(answer1_4.lower()).encode("utf-8")+b"79b871a543ff7413").hexdigest() == "1918537ccd231d61285f0c0c1c944121c35b7b6c", "value of answer1_4 is not correct"
assert sha1(str(answer1_4).encode("utf-8")+b"79b871a543ff7413").hexdigest() == "1e09197c80d6953c73f8a886348773d07fea357c", "correct string value of answer1_4 but incorrect case of letters"

print('Success!')

###  Continuing with our virtual population of Canadian seniors from last worksheet

Here we re-create the virtual population (ages of all Canadian seniors) we used in the last worksheet. It was bounded by realistic values ($\geq$ 65 and $\leq$ 118):

In [None]:
# Run this cell to simulate a large finite population
# Don't change the seed!
np.random.seed(4321)

can_seniors = pd.DataFrame({
    'age': np.random.exponential(1 / 0.1, 2000000) ** 2 + 65,
}).query(
    "65 <= age <= 118"
)
can_seniors

Let's remind ourselves of what this population looks like:

In [None]:
# Run this cell
pop_dist = alt.Chart(can_seniors, title='Population distribution').mark_bar().encode(
    x=alt.X("age")
        .title("Age (years)")
        .bin(maxbins=30),
    y="count()"
)
pop_dist

### Estimate the mean age of Canadian Seniors

Let's say we are interested in estimating the mean age of Canadian seniors. Given that we have the population (we created it), we could just calculate the mean age from this population data. However, in real life, we usually only have one small-ish sample from the population. Also, from our experimentation with sampling distributions, we know that different random samples will give us different point estimates. We also know from these experiments that the point estimates from different random samples will mostly be close to the true population quantity we are trying to estimate, and how close depends on the sample size.
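As a reminder of that last point, here is a minimal sketch on a made-up toy population (the names `toy_population` and `sample_means` are hypothetical and are not part of this worksheet). It draws many samples at two different sample sizes and compares how much the sample means vary:

```python
import numpy as np

# Hypothetical toy population (NOT the worksheet's can_seniors data)
rng = np.random.default_rng(0)
toy_population = rng.exponential(scale=10, size=100_000) + 65


def sample_means(population, sample_size, n_samples, rng):
    """Draw n_samples random samples (without replacement) and return each sample's mean."""
    return np.array([
        rng.choice(population, size=sample_size, replace=False).mean()
        for _ in range(n_samples)
    ])


means_small = sample_means(toy_population, 10, 500, rng)
means_large = sample_means(toy_population, 100, 500, rng)

print("population mean:", toy_population.mean())
print("spread (std) of sample means, n=10: ", means_small.std())
print("spread (std) of sample means, n=100:", means_large.std())
```

In this sketch the sample means from the larger samples should cluster more tightly around the population mean than those from the smaller samples.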

What about in real life though, when we only have one sample? Can we say how close? Or at least give some plausible range of where we would expect the population quantity we are trying to estimate to fall? Yes! We can do this using a method called bootstrapping! Let's explore how to create a bootstrap distribution from a single sample using Python, and then we will discuss how the bootstrap distribution relates to the sampling distribution and what it can tell us about the true population quantity we are trying to estimate.

Let's draw a single sample of size 40 from the population and visualize it:

In [None]:
# Run this cell
one_sample = can_seniors.sample(40, random_state=12345)
one_sample

In [None]:
# Run this cell
one_sample_dist = alt.Chart(one_sample, title="Distribution of one sample").mark_bar().encode(
    x=alt.X("age")
        .title("Age (years)")
        .bin(maxbins=30),
    y="count()"
)
one_sample_dist

**Question 1.5** 
<br> {points: 1}

Calculate the mean age (our point estimate of interest) from the random sample you just took (`one_sample`). Assign the result to a variable called `one_sample_estimates`.

In [None]:
# your code here
raise NotImplementedError
one_sample_estimates

In [None]:
from hashlib import sha1
assert sha1(str(type(one_sample_estimates.shape[0])).encode("utf-8")+b"5a8bfbf75ef41be0").hexdigest() == "256fbed4833c4eee2a87636e236be8a1a587130d", "type of one_sample_estimates.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(one_sample_estimates.shape[0]).encode("utf-8")+b"5a8bfbf75ef41be0").hexdigest() == "dcb6fc42a583d727cdc927e0e72c368ed13c435c", "value of one_sample_estimates.shape[0] is not correct"

assert sha1(str(type(round(one_sample_estimates, 2))).encode("utf-8")+b"aa4592981272dd3e").hexdigest() == "779b44a8bb321f66fa2ad9582fe1889cd5e3e7ca", "type of round(one_sample_estimates, 2) is not correct"
assert sha1(str(round(one_sample_estimates, 2)).encode("utf-8")+b"aa4592981272dd3e").hexdigest() == "2d7e692b12287413d27a5810bef5907cb3ac163d", "value of round(one_sample_estimates, 2) is not correct"

print('Success!')

**Question 1.6** 
<br> {points: 1}

To generate a single bootstrap sample in Python, we can use the `sample` method with `frac=1` to indicate that the bootstrap sample size is the same as the original sample. In contrast to when we created a sampling distribution from a population, we will set `replace=True` to ensure we don't end up with the exact same sample each time when performing the bootstrap.

Use `sample` to take a single bootstrap sample from the sample you drew from the population. Use 4321 as the `random_state` and name this bootstrap sample `boot1`.
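To see what `replace` changes, here is a small sketch on a made-up five-row data frame (`toy_sample` and its values are hypothetical, chosen only for illustration). With `frac=1` and no replacement, every row comes back exactly once; with replacement, rows may repeat or be left out:

```python
import pandas as pd

# Hypothetical toy sample of five made-up ages (not the worksheet data)
toy_sample = pd.DataFrame({"age": [66.0, 70.5, 81.2, 90.0, 72.3]})

# Without replacement, frac=1 returns every row exactly once (just shuffled)
shuffled = toy_sample.sample(frac=1, replace=False, random_state=1)

# With replacement, some rows can appear more than once and others not at all
resampled = toy_sample.sample(frac=1, replace=True, random_state=1)

# The shuffled frame holds exactly the same values as the original
print(sorted(shuffled["age"]) == sorted(toy_sample["age"]))
print(resampled)
```

Both results have the same number of rows as `toy_sample`; only the resampled one can differ in which values it contains.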

In [None]:
# ___ = one_sample.___(frac=___, replace=___, random_state=4321)

# your code here
raise NotImplementedError
boot1

In [None]:
from hashlib import sha1
assert sha1(str(type("".join(boot1.columns))).encode("utf-8")+b"baced830d0eb0145").hexdigest() == "06e06d8967e9cb183c071aac11539da1ba85c40a", "type of \"\".join(boot1.columns) is not str. \"\".join(boot1.columns) should be an str"
assert sha1(str(len("".join(boot1.columns))).encode("utf-8")+b"baced830d0eb0145").hexdigest() == "797272f83add551f7a69f8a4d396b94641cc8216", "length of \"\".join(boot1.columns) is not correct"
assert sha1(str("".join(boot1.columns).lower()).encode("utf-8")+b"baced830d0eb0145").hexdigest() == "65ff91f48f8fe3dfaba0bd1cadf4f772c21a10b3", "value of \"\".join(boot1.columns) is not correct"
assert sha1(str("".join(boot1.columns)).encode("utf-8")+b"baced830d0eb0145").hexdigest() == "65ff91f48f8fe3dfaba0bd1cadf4f772c21a10b3", "correct string value of \"\".join(boot1.columns) but incorrect case of letters"

assert sha1(str(type(boot1.shape[0])).encode("utf-8")+b"c353a3e60c1b902f").hexdigest() == "e36a6a96b65ca68beffbba92b246509d53d6f3fe", "type of boot1.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot1.shape[0]).encode("utf-8")+b"c353a3e60c1b902f").hexdigest() == "804c9f32f182a37fe8def5c75c664fa4c5bab046", "value of boot1.shape[0] is not correct"

assert sha1(str(type(round(sum(boot1.age), 2))).encode("utf-8")+b"c7271b9193fb13c6").hexdigest() == "0b32ed5ec57a9d1fc80bfb84945a21c3016b9201", "type of round(sum(boot1.age), 2) is not float. Please make sure it is float and not np.float64, etc. You can cast your value into a float using float()"
assert sha1(str(round(round(sum(boot1.age), 2), 2)).encode("utf-8")+b"c7271b9193fb13c6").hexdigest() == "d8cb3b7632db9d1235c4d95280cdd4607115e416", "value of round(sum(boot1.age), 2) is not correct (rounded to 2 decimal places)"

assert sha1(str(type(sum(range(boot1.shape[0])))).encode("utf-8")+b"43bd071a6a5ee204").hexdigest() == "e5b85d9434cbafb29cfa79327dc75bfa1874c141", "type of sum(range(boot1.shape[0])) is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(sum(range(boot1.shape[0]))).encode("utf-8")+b"43bd071a6a5ee204").hexdigest() == "4a484ec0109380ae7a86670a7ac891d015a195d3", "value of sum(range(boot1.shape[0])) is not correct"

print('Success!')

**Question 1.7** Multiple choice
<br> {points: 1}

Why do we change `replace` to `True`?

A. Taking a bootstrap sample involves drawing observations from the original population without replacement

B. Taking a bootstrap sample involves drawing observations from the original population with replacement

C. Taking a bootstrap sample involves drawing observations from the original sample without replacement

D. Taking a bootstrap sample involves drawing observations from the original sample with replacement

*Assign your answer to an object called `answer1_7`. Your answer should be an uppercase letter surrounded by quotes (e.g. `"F"`).*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer1_7)).encode("utf-8")+b"e4c9d91535aa4410").hexdigest() == "96de88c83150d4408080a7c64dd8cf2b562ee6b8", "type of answer1_7 is not str. answer1_7 should be an str"
assert sha1(str(len(answer1_7)).encode("utf-8")+b"e4c9d91535aa4410").hexdigest() == "ed9d66f2fde029578a25a39c8318befc731e2a40", "length of answer1_7 is not correct"
assert sha1(str(answer1_7.lower()).encode("utf-8")+b"e4c9d91535aa4410").hexdigest() == "14371fd21f08b8b7a53fc79f10719dad2045246b", "value of answer1_7 is not correct"
assert sha1(str(answer1_7).encode("utf-8")+b"e4c9d91535aa4410").hexdigest() == "a8c78f5a68c55ca6b6945502492bbcc93a069e7b", "correct string value of answer1_7 but incorrect case of letters"

print('Success!')

**Question 1.8** 
<br> {points: 1}

Visualize the distribution of the bootstrap sample you just took (`boot1`). Set `maxbins=30`, name the plot `boot1_dist`, and give the plot and the x-axis a descriptive title.

In [None]:
# your code here
raise NotImplementedError
boot1_dist

In [None]:
from hashlib import sha1
assert sha1(str(type(boot1_dist.encoding.x['shorthand'])).encode("utf-8")+b"ffcf0644d4cac0b7").hexdigest() == "2a91f8ee4262bb3bb21ae560fcf4a0b111b902a2", "type of boot1_dist.encoding.x['shorthand'] is not str. boot1_dist.encoding.x['shorthand'] should be an str"
assert sha1(str(len(boot1_dist.encoding.x['shorthand'])).encode("utf-8")+b"ffcf0644d4cac0b7").hexdigest() == "32162f2cb4f7d448cdb73a39462b87b48d2339c3", "length of boot1_dist.encoding.x['shorthand'] is not correct"
assert sha1(str(boot1_dist.encoding.x['shorthand'].lower()).encode("utf-8")+b"ffcf0644d4cac0b7").hexdigest() == "ca74598a9232f8e5070ec1a7495efbe79de6bfb0", "value of boot1_dist.encoding.x['shorthand'] is not correct"
assert sha1(str(boot1_dist.encoding.x['shorthand']).encode("utf-8")+b"ffcf0644d4cac0b7").hexdigest() == "ca74598a9232f8e5070ec1a7495efbe79de6bfb0", "correct string value of boot1_dist.encoding.x['shorthand'] but incorrect case of letters"

assert sha1(str(type(boot1_dist.encoding.y['shorthand'])).encode("utf-8")+b"638d94695be1919a").hexdigest() == "3240649eeb9808ed7bb97e8442ae351df2c8cb25", "type of boot1_dist.encoding.y['shorthand'] is not str. boot1_dist.encoding.y['shorthand'] should be an str"
assert sha1(str(len(boot1_dist.encoding.y['shorthand'])).encode("utf-8")+b"638d94695be1919a").hexdigest() == "d7bdb292156397d131a86ae5eab4bc223020e5ef", "length of boot1_dist.encoding.y['shorthand'] is not correct"
assert sha1(str(boot1_dist.encoding.y['shorthand'].lower()).encode("utf-8")+b"638d94695be1919a").hexdigest() == "162963803f5f38c479d7577f35666552d1dbb3f4", "value of boot1_dist.encoding.y['shorthand'] is not correct"
assert sha1(str(boot1_dist.encoding.y['shorthand']).encode("utf-8")+b"638d94695be1919a").hexdigest() == "162963803f5f38c479d7577f35666552d1dbb3f4", "correct string value of boot1_dist.encoding.y['shorthand'] but incorrect case of letters"

assert sha1(str(type(boot1_dist.mark)).encode("utf-8")+b"d57e07d32077ca4c").hexdigest() == "e83e1a5ce622f159643709f11c04541f950fbfda", "type of boot1_dist.mark is not str. boot1_dist.mark should be an str"
assert sha1(str(len(boot1_dist.mark)).encode("utf-8")+b"d57e07d32077ca4c").hexdigest() == "5b0b208b6c910f142c432c417fbc7f80b88360fe", "length of boot1_dist.mark is not correct"
assert sha1(str(boot1_dist.mark.lower()).encode("utf-8")+b"d57e07d32077ca4c").hexdigest() == "3cf0be5388af4375859773f005103cc1b2830cf0", "value of boot1_dist.mark is not correct"
assert sha1(str(boot1_dist.mark).encode("utf-8")+b"d57e07d32077ca4c").hexdigest() == "3cf0be5388af4375859773f005103cc1b2830cf0", "correct string value of boot1_dist.mark but incorrect case of letters"

assert sha1(str(type(boot1_dist.data.shape[0])).encode("utf-8")+b"65c18456b1808eaa").hexdigest() == "9b391bb28cec73e6b5625fca3b4eff46e097b18f", "type of boot1_dist.data.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot1_dist.data.shape[0]).encode("utf-8")+b"65c18456b1808eaa").hexdigest() == "affbbde9733a3f171f1f513755a626c658478cba", "value of boot1_dist.data.shape[0] is not correct"

assert sha1(str(type(round(boot1_dist.data.sum(), 2))).encode("utf-8")+b"b92c15ad17e02c21").hexdigest() == "343af00aacb8fca94df22ce3b7560d4b4687f0b1", "type of round(boot1_dist.data.sum(), 2) is not correct"
assert sha1(str(round(boot1_dist.data.sum(), 2)).encode("utf-8")+b"b92c15ad17e02c21").hexdigest() == "f98becc5bd1a63538f0caf207b99a60147f1250c", "value of round(boot1_dist.data.sum(), 2) is not correct"

assert sha1(str(type(isinstance(boot1_dist.encoding.x['title'], str))).encode("utf-8")+b"5616060c4d5b8f76").hexdigest() == "09a86aa9e356a55ecd3f848b696ff18f6d7214c5", "type of isinstance(boot1_dist.encoding.x['title'], str) is not bool. isinstance(boot1_dist.encoding.x['title'], str) should be a bool"
assert sha1(str(isinstance(boot1_dist.encoding.x['title'], str)).encode("utf-8")+b"5616060c4d5b8f76").hexdigest() == "66bc343574c1686fc4b97878cea58f0b729d477f", "boolean value of isinstance(boot1_dist.encoding.x['title'], str) is not correct"

assert sha1(str(type(isinstance(boot1_dist.encoding.y['title'], str))).encode("utf-8")+b"6e6a5b3c1de74405").hexdigest() == "cdc0da723eb71c0aeb3d4862a1cbb59e877c269c", "type of isinstance(boot1_dist.encoding.y['title'], str) is not bool. isinstance(boot1_dist.encoding.y['title'], str) should be a bool"
assert sha1(str(isinstance(boot1_dist.encoding.y['title'], str)).encode("utf-8")+b"6e6a5b3c1de74405").hexdigest() == "df12da08aff52f88798a634d3bc3b84032b2e318", "boolean value of isinstance(boot1_dist.encoding.y['title'], str) is not correct"

assert sha1(str(type(boot1_dist.title is not None)).encode("utf-8")+b"97c80974ddd0a031").hexdigest() == "3f3b85b832740760b62eae5b2f4918799ff1a265", "type of boot1_dist.title is not None is not bool. boot1_dist.title is not None should be a bool"
assert sha1(str(boot1_dist.title is not None).encode("utf-8")+b"97c80974ddd0a031").hexdigest() == "46cfdcf11414ef300ee3650e11c9c1f95bc44aa3", "boolean value of boot1_dist.title is not None is not correct"

print('Success!')

Let's now compare our bootstrap sample to the original random sample that we drew from the population:

In [None]:
# Run this code cell
one_sample_dist & boot1_dist

Earlier we calculated the mean of our original sample to be about 79.6 years. What is the mean of our bootstrap sample?

In [None]:
# Run this cell
boot1.mean()

We see that the original sample distribution and the bootstrap sample distribution have a similar shape but are not identical. They also have different means. The differences in how often each value appears in the bootstrap sample (and the difference in the mean) come from sampling from the original sample with replacement. Why sample with replacement? If we didn't, we would end up with the original sample again. What we are trying to do with bootstrapping is to mimic drawing another sample from the population, without actually doing that.

Why are we doing this? As mentioned earlier, in real life we typically only have one sample, and thus we cannot create a sampling distribution that tells us how we might expect our point estimate to behave if we took another sample. What we can do instead is use our sample as an estimate of our population, and sample from that with replacement (i.e., bootstrapping) many times to create many bootstrap samples. We can then calculate a point estimate for each bootstrap sample, create a bootstrap distribution of our point estimates, and use this as a proxy for a sampling distribution. Finally, we can use this bootstrap distribution of our point estimates to suggest how we might expect our point estimate to behave if we took another sample.
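The idea can be sketched with plain NumPy on made-up data. This is only an illustration under stated assumptions: `toy_sample`, `n_boot`, and `boot_means` are hypothetical names, the ages are simulated, and the questions below use `pandas` rather than this approach.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical single sample of 40 made-up ages (not the worksheet's one_sample)
toy_sample = rng.uniform(65, 100, size=40)

# Resample the sample itself, with replacement, many times,
# and record the mean of each bootstrap sample
n_boot = 1000
boot_means = np.array([
    rng.choice(toy_sample, size=toy_sample.size, replace=True).mean()
    for _ in range(n_boot)
])

print("sample mean:", toy_sample.mean())
print("mean of bootstrap means:", boot_means.mean())
print("spread (std) of bootstrap means:", boot_means.std())
```

The bootstrap means are centred near the mean of the one sample we have (not necessarily the population mean), while their spread gives a sense of how much a point estimate might vary from sample to sample.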

**Question 1.9** 
<br> {points: 1}

What do 6 different bootstrap samples look like? Use the `sample` method to create a single data frame with 6 bootstrap samples drawn from the original sample we drew from the population, `one_sample`. Assign a new column called `replicate` to mark the sample number `(from 0 to 5)`. Name the data frame `boot6`.

Set the seed as `1234`.

In [None]:
np.random.seed(1234)  # DO NOT CHANGE!
# your code here
raise NotImplementedError
boot6

In [None]:
from hashlib import sha1
assert sha1(str(type("".join(boot6.columns))).encode("utf-8")+b"941f9e4fa84c020e").hexdigest() == "53df5ce8b4c00b72b7d583d318fd538111ae9631", "type of \"\".join(boot6.columns) is not str. \"\".join(boot6.columns) should be an str"
assert sha1(str(len("".join(boot6.columns))).encode("utf-8")+b"941f9e4fa84c020e").hexdigest() == "f7389b568c9e860ffd96927e768be8d7e186ad8e", "length of \"\".join(boot6.columns) is not correct"
assert sha1(str("".join(boot6.columns).lower()).encode("utf-8")+b"941f9e4fa84c020e").hexdigest() == "c6fd06f360ab300782c5619b3d2f0c83bdf164aa", "value of \"\".join(boot6.columns) is not correct"
assert sha1(str("".join(boot6.columns)).encode("utf-8")+b"941f9e4fa84c020e").hexdigest() == "c6fd06f360ab300782c5619b3d2f0c83bdf164aa", "correct string value of \"\".join(boot6.columns) but incorrect case of letters"

assert sha1(str(type(boot6.shape[0])).encode("utf-8")+b"780d823fce2b7da7").hexdigest() == "51605d12b87d25e22153d723c00a9d3c18c15588", "type of boot6.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot6.shape[0]).encode("utf-8")+b"780d823fce2b7da7").hexdigest() == "2e09203e98ed8d6097b3e6afb72b2df319beba4e", "value of boot6.shape[0] is not correct"

assert sha1(str(type(round(sum(boot6.age), 2))).encode("utf-8")+b"d6757d2ded553b7f").hexdigest() == "8757ab32c3c9ff2b60b1aced1ddac1e2b8b7835e", "type of round(sum(boot6.age), 2) is not float. Please make sure it is float and not np.float64, etc. You can cast your value into a float using float()"
assert sha1(str(round(round(sum(boot6.age), 2), 2)).encode("utf-8")+b"d6757d2ded553b7f").hexdigest() == "68f7a7955f13fc24158346f272468aeba5b440af", "value of round(sum(boot6.age), 2) is not correct (rounded to 2 decimal places)"

assert sha1(str(type(sum(boot6.replicate.unique()))).encode("utf-8")+b"5d02f6bcd0f693f0").hexdigest() == "d6e67da831cece8c871df89e1a3fe27c080138bf", "type of sum(boot6.replicate.unique()) is not correct"
assert sha1(str(sum(boot6.replicate.unique())).encode("utf-8")+b"5d02f6bcd0f693f0").hexdigest() == "d6b71232e41e23119479fe5d8fb416be94fe7e7a", "value of sum(boot6.replicate.unique()) is not correct"

print('Success!')

**Question 2.0** 
<br> {points: 1}

Now visualize the six bootstrap sample distributions from `boot6` by faceting on the `replicate` column. To make the distributions easier to compare, lay the plots out in a single column and set each plot's height to 100. Name the plot object `boot6_dist` and give the plot and the x-axis a descriptive title.

In [None]:
# ___ = alt.Chart(___, title=___).___().___(
#     x=alt.X(___)
#         .title(___)
#         .___(maxbins=30),
#     y=___
# ).___(
#     ___=100
# ).facet(
#     ___,
#     ___=1
# )

# your code here
raise NotImplementedError
boot6_dist

In [None]:
from hashlib import sha1
assert sha1(str(type(boot6_dist.facet['shorthand'])).encode("utf-8")+b"d1c2fe63e79c708c").hexdigest() == "554afdb3b28d2d426afc03af592f332e0a3deb2f", "type of boot6_dist.facet['shorthand'] is not str. boot6_dist.facet['shorthand'] should be an str"
assert sha1(str(len(boot6_dist.facet['shorthand'])).encode("utf-8")+b"d1c2fe63e79c708c").hexdigest() == "32dec15683cd0ff9d0fdb7cbda6a737ff606201a", "length of boot6_dist.facet['shorthand'] is not correct"
assert sha1(str(boot6_dist.facet['shorthand'].lower()).encode("utf-8")+b"d1c2fe63e79c708c").hexdigest() == "b1aba475fec8a1ac975a96de874e14c4916ba0f4", "value of boot6_dist.facet['shorthand'] is not correct"
assert sha1(str(boot6_dist.facet['shorthand']).encode("utf-8")+b"d1c2fe63e79c708c").hexdigest() == "b1aba475fec8a1ac975a96de874e14c4916ba0f4", "correct string value of boot6_dist.facet['shorthand'] but incorrect case of letters"

assert sha1(str(type(boot6_dist.data.shape[0])).encode("utf-8")+b"edf1ae5de24351a3").hexdigest() == "33cb8db0de5524c4b458c6439f099850ff6bfdb9", "type of boot6_dist.data.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot6_dist.data.shape[0]).encode("utf-8")+b"edf1ae5de24351a3").hexdigest() == "485960a1686769baa489c434b35cf6683a205f29", "value of boot6_dist.data.shape[0] is not correct"

assert sha1(str(type(boot6_dist.title is not None)).encode("utf-8")+b"7160d84fd646426e").hexdigest() == "d0b299ae2a4e86999fd21fffdfd41f9f2d392d5f", "type of boot6_dist.title is not None is not bool. boot6_dist.title is not None should be a bool"
assert sha1(str(boot6_dist.title is not None).encode("utf-8")+b"7160d84fd646426e").hexdigest() == "2aef56d483ae116657f6163a1730d66e4c1f084e", "boolean value of boot6_dist.title is not None is not correct"

print('Success!')

**Question 2.1** 
<br> {points: 1}

Calculate the mean of these 6 bootstrap samples using `groupby` and `mean` and save the result into a column called `mean_age`. Use `reset_index` so that the resulting data frame has two columns: `replicate` and `mean_age`. Name the data frame `boot6_means`.
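As a reminder of the general pattern, here is a sketch of a grouped mean on a made-up data frame. The names `toy`, `group`, `height`, and `mean_height` are hypothetical, and using `rename` to name the result is one option among several:

```python
import pandas as pd

# Hypothetical toy data: three groups with two made-up measurements each
toy = pd.DataFrame({
    "group": [0, 0, 1, 1, 2, 2],
    "height": [150.0, 160.0, 170.0, 180.0, 165.0, 175.0],
})

toy_means = (
    toy.groupby("group")["height"]  # one group per distinct value of `group`
    .mean()                         # mean of `height` within each group
    .rename("mean_height")          # name the resulting Series
    .reset_index()                  # turn the group labels back into a column
)
print(toy_means)
```

The result has one row per group and two columns, `group` and `mean_height`.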

In [None]:
# your code here
raise NotImplementedError
boot6_means

In [None]:
from hashlib import sha1
assert sha1(str(type(boot6_means.shape[0])).encode("utf-8")+b"da6b67a5dc64285f").hexdigest() == "517b2813255e2e9e6eb4681f24218eef4d042686", "type of boot6_means.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot6_means.shape[0]).encode("utf-8")+b"da6b67a5dc64285f").hexdigest() == "6fa0df9151383a1100d8bfc3f5ea7e0748cf9eb9", "value of boot6_means.shape[0] is not correct"

assert sha1(str(type(boot6_means.shape[1])).encode("utf-8")+b"a3dff58db9c9c0e6").hexdigest() == "74eeb061932ac0545a9771b84d08f089fb9863b6", "type of boot6_means.shape[1] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot6_means.shape[1]).encode("utf-8")+b"a3dff58db9c9c0e6").hexdigest() == "ae4610041f4439a0f81cc4701fca6a3b941356a0", "value of boot6_means.shape[1] is not correct"

assert sha1(str(type("".join(boot6_means.columns))).encode("utf-8")+b"30ef07ddac45c626").hexdigest() == "f1493e7841f82032c7d85097df8d32b88fe55fde", "type of \"\".join(boot6_means.columns) is not str. \"\".join(boot6_means.columns) should be an str"
assert sha1(str(len("".join(boot6_means.columns))).encode("utf-8")+b"30ef07ddac45c626").hexdigest() == "43bf199914c67e360bd7ecf69d2c1004a4594c5e", "length of \"\".join(boot6_means.columns) is not correct"
assert sha1(str("".join(boot6_means.columns).lower()).encode("utf-8")+b"30ef07ddac45c626").hexdigest() == "de471b8b41dfebd8970cced11a31122044db83ef", "value of \"\".join(boot6_means.columns) is not correct"
assert sha1(str("".join(boot6_means.columns)).encode("utf-8")+b"30ef07ddac45c626").hexdigest() == "de471b8b41dfebd8970cced11a31122044db83ef", "correct string value of \"\".join(boot6_means.columns) but incorrect case of letters"

assert sha1(str(type(round(boot6_means["mean_age"][0], 2))).encode("utf-8")+b"f313f2d35d51f4b0").hexdigest() == "428d1612fe134ab110e709e9198f8fc7d76abfed", "type of round(boot6_means[\"mean_age\"][0], 2) is not correct"
assert sha1(str(round(boot6_means["mean_age"][0], 2)).encode("utf-8")+b"f313f2d35d51f4b0").hexdigest() == "3b9953cb5ba4be1675c83a7c50a0ad1555318ff5", "value of round(boot6_means[\"mean_age\"][0], 2) is not correct"

print('Success!')

**Question 2.2** 
<br> {points: 1}

Let's now take 1000 bootstrap samples from the original sample we drew from the population (`one_sample`). As previously, assign a new column called `replicate` to mark the sample number `(from 0 to 999)`. Name the data frame `boot1000`.

Set the seed as `1234`.

In [None]:
np.random.seed(1234)  # DO NOT CHANGE!
# your code here
raise NotImplementedError
boot1000

In [None]:
from hashlib import sha1
assert sha1(str(type("".join(boot1000.columns))).encode("utf-8")+b"fab82ccb08ca682f").hexdigest() == "bdbcca31139addc3224b375019f0a8608f6239d9", "type of \"\".join(boot1000.columns) is not str. \"\".join(boot1000.columns) should be an str"
assert sha1(str(len("".join(boot1000.columns))).encode("utf-8")+b"fab82ccb08ca682f").hexdigest() == "50d28b3e4b41d77cd3442c96f074e84e61951977", "length of \"\".join(boot1000.columns) is not correct"
assert sha1(str("".join(boot1000.columns).lower()).encode("utf-8")+b"fab82ccb08ca682f").hexdigest() == "156ccedf4fcba6c6b340a709289e746a3bcf0196", "value of \"\".join(boot1000.columns) is not correct"
assert sha1(str("".join(boot1000.columns)).encode("utf-8")+b"fab82ccb08ca682f").hexdigest() == "156ccedf4fcba6c6b340a709289e746a3bcf0196", "correct string value of \"\".join(boot1000.columns) but incorrect case of letters"

assert sha1(str(type(boot1000.shape[0])).encode("utf-8")+b"d85870fec100fde1").hexdigest() == "01fc8671b6c1f67acfa96c7a719ef2c9d94fa019", "type of boot1000.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot1000.shape[0]).encode("utf-8")+b"d85870fec100fde1").hexdigest() == "3db4bb3fa25e1e6f46e493de68c1680b704477e7", "value of boot1000.shape[0] is not correct"

assert sha1(str(type(round(sum(boot1000.age), 2))).encode("utf-8")+b"ed7f3bebacd8daab").hexdigest() == "b268b07f763d1daac43e0530ea3937e5f0538772", "type of round(sum(boot1000.age), 2) is not float. Please make sure it is float and not np.float64, etc. You can cast your value into a float using float()"
assert sha1(str(round(round(sum(boot1000.age), 2), 2)).encode("utf-8")+b"ed7f3bebacd8daab").hexdigest() == "822716009bb248f11d5017f59fd077e73fbfeced", "value of round(sum(boot1000.age), 2) is not correct (rounded to 2 decimal places)"

assert sha1(str(type(sum([x for x in range(boot1000.shape[0])]))).encode("utf-8")+b"ca1e2b4af0f838cd").hexdigest() == "73222ccdbf9f74501ab6f670ca43d0629f9fda59", "type of sum([x for x in range(boot1000.shape[0])]) is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(sum([x for x in range(boot1000.shape[0])])).encode("utf-8")+b"ca1e2b4af0f838cd").hexdigest() == "63d9b6968f52f6abf489b1e3d9d8a8d2e8c21880", "value of sum([x for x in range(boot1000.shape[0])]) is not correct"

print('Success!')

**Question 2.3** 
<br> {points: 1}
Calculate the mean age of each of these 1000 bootstrap samples using `groupby` and `mean`, and save the result in a column called `mean_age`. Use `reset_index` so that the resulting data frame has two columns: `replicate` and `mean_age`. Name the data frame `boot1000_means`.

In [None]:
# your code here
raise NotImplementedError
boot1000_means

In [None]:
from hashlib import sha1
assert sha1(str(type(boot1000_means.shape[0])).encode("utf-8")+b"14406c237a8080ef").hexdigest() == "277b26acc62bc184a61c9563f7d04191bc358384", "type of boot1000_means.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot1000_means.shape[0]).encode("utf-8")+b"14406c237a8080ef").hexdigest() == "b94c84f9d48abf9215ea4f19ef73e42f5ec7986c", "value of boot1000_means.shape[0] is not correct"

assert sha1(str(type(boot1000_means.shape[1])).encode("utf-8")+b"ee16ef4810ccd00e").hexdigest() == "e4fcd95927b06502c835fc867424e7d6fe4ac9cd", "type of boot1000_means.shape[1] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot1000_means.shape[1]).encode("utf-8")+b"ee16ef4810ccd00e").hexdigest() == "795391dd59904cbdb2083f4da2409c5e32a34053", "value of boot1000_means.shape[1] is not correct"

assert sha1(str(type("".join(boot1000_means.columns))).encode("utf-8")+b"f82706726418d8f3").hexdigest() == "2d84da79bffb66ab5169a0169f107eaeebc9725f", "type of \"\".join(boot1000_means.columns) is not str. \"\".join(boot1000_means.columns) should be an str"
assert sha1(str(len("".join(boot1000_means.columns))).encode("utf-8")+b"f82706726418d8f3").hexdigest() == "c52f7f550e9820b2be9800aef198e63fb172107b", "length of \"\".join(boot1000_means.columns) is not correct"
assert sha1(str("".join(boot1000_means.columns).lower()).encode("utf-8")+b"f82706726418d8f3").hexdigest() == "aa53ea505136cf70d768e577d83347dc2530e7da", "value of \"\".join(boot1000_means.columns) is not correct"
assert sha1(str("".join(boot1000_means.columns)).encode("utf-8")+b"f82706726418d8f3").hexdigest() == "aa53ea505136cf70d768e577d83347dc2530e7da", "correct string value of \"\".join(boot1000_means.columns) but incorrect case of letters"

assert sha1(str(type(round(boot1000_means["mean_age"][0], 2))).encode("utf-8")+b"1d9accfce20d4ac7").hexdigest() == "79fab0be5873c6a8d760c1ff9be9c783818722cf", "type of round(boot1000_means[\"mean_age\"][0], 2) is not correct"
assert sha1(str(round(boot1000_means["mean_age"][0], 2)).encode("utf-8")+b"1d9accfce20d4ac7").hexdigest() == "1d416a84d3c4b5a0db51a53753178d2fdcbcfd92", "value of round(boot1000_means[\"mean_age\"][0], 2) is not correct"

print('Success!')

**Question 2.4** 
<br> {points: 1}

Visualize the distribution of the bootstrap sample point estimates (`boot1000_means`) you just calculated by plotting a histogram with `maxbins=30`. Name the plot `boot_est_dist` and give the plot and the x-axis a descriptive title.

In [None]:
# your code here
raise NotImplementedError
boot_est_dist

In [None]:
from hashlib import sha1
assert sha1(str(type(boot_est_dist.encoding.x.field)).encode("utf-8")+b"22048e0dbe3d3f50").hexdigest() == "755c27b781a23c17f44e3c72ecaeb8242ffe79e9", "type of boot_est_dist.encoding.x.field is not correct"
assert sha1(str(boot_est_dist.encoding.x.field).encode("utf-8")+b"22048e0dbe3d3f50").hexdigest() == "05309ec03d2008ceb920cd852e2146dfe9fc8065", "value of boot_est_dist.encoding.x.field is not correct"

assert sha1(str(type(boot_est_dist.mark)).encode("utf-8")+b"ed58c5b3e239c248").hexdigest() == "3bb96e7f5c0ca83dbf4cff6818b79e77067eacbd", "type of boot_est_dist.mark is not str. boot_est_dist.mark should be an str"
assert sha1(str(len(boot_est_dist.mark)).encode("utf-8")+b"ed58c5b3e239c248").hexdigest() == "22655a894f282f13145724a9cdb5a8db862d2334", "length of boot_est_dist.mark is not correct"
assert sha1(str(boot_est_dist.mark.lower()).encode("utf-8")+b"ed58c5b3e239c248").hexdigest() == "6cc2b7dc8341f270154bac9b7cb3651274bc208b", "value of boot_est_dist.mark is not correct"
assert sha1(str(boot_est_dist.mark).encode("utf-8")+b"ed58c5b3e239c248").hexdigest() == "6cc2b7dc8341f270154bac9b7cb3651274bc208b", "correct string value of boot_est_dist.mark but incorrect case of letters"

assert sha1(str(type(boot_est_dist.data.shape[0])).encode("utf-8")+b"c37e1be3479089dc").hexdigest() == "eb0676b4ed465a79ea9024d6b07c94febfeba79c", "type of boot_est_dist.data.shape[0] is not int. Please make sure it is int and not np.int64, etc. You can cast your value into an int using int()"
assert sha1(str(boot_est_dist.data.shape[0]).encode("utf-8")+b"c37e1be3479089dc").hexdigest() == "7115d517a3c7b6085636007ebac4a1c88c9c8180", "value of boot_est_dist.data.shape[0] is not correct"

assert sha1(str(type(round(sum(boot_est_dist.data.sum()), 2))).encode("utf-8")+b"bfab34be361ac256").hexdigest() == "0ba6110bd4a5eca1ce460343e3753d2f946bc5dc", "type of round(sum(boot_est_dist.data.sum()), 2) is not float. Please make sure it is float and not np.float64, etc. You can cast your value into a float using float()"
assert sha1(str(round(round(sum(boot_est_dist.data.sum()), 2), 2)).encode("utf-8")+b"bfab34be361ac256").hexdigest() == "356c52883daaf216e8eb1fbeb71deb19f8a6ad6f", "value of round(sum(boot_est_dist.data.sum()), 2) is not correct (rounded to 2 decimal places)"

assert sha1(str(type(boot_est_dist.encoding.x.field != boot_est_dist.encoding.x.title)).encode("utf-8")+b"64a097ac8fab43d2").hexdigest() == "32618dae6126eca1fe456371056d9570d4efe86a", "type of boot_est_dist.encoding.x.field != boot_est_dist.encoding.x.title is not bool. boot_est_dist.encoding.x.field != boot_est_dist.encoding.x.title should be a bool"
assert sha1(str(boot_est_dist.encoding.x.field != boot_est_dist.encoding.x.title).encode("utf-8")+b"64a097ac8fab43d2").hexdigest() == "d683a3af45f4c410adc3b11cb175b027bf165d5e", "boolean value of boot_est_dist.encoding.x.field != boot_est_dist.encoding.x.title is not correct"

assert sha1(str(type(boot_est_dist.title is not None)).encode("utf-8")+b"c4d1ad0e3b54845f").hexdigest() == "b7c593e5fa0b6bc3bbf7dc3f8c611edc5f40fdb9", "type of boot_est_dist.title is not None is not bool. boot_est_dist.title is not None should be a bool"
assert sha1(str(boot_est_dist.title is not None).encode("utf-8")+b"c4d1ad0e3b54845f").hexdigest() == "758095916899fdf1775ed4a898e11776d185644d", "boolean value of boot_est_dist.title is not None is not correct"

print('Success!')

How does the bootstrap distribution above compare to the sampling distribution? Let's visualize them side by side:

In [None]:
# Run this cell

# Create sampling distribution from the population
np.random.seed(4321)
samples = pd.concat([
    can_seniors.sample(40).assign(replicate=n)
    for n in range(1000)
])

sample_estimates = (
    samples.groupby("replicate")
    .mean()
    .reset_index()
    .rename(columns={"age": "mean_age"})
)

In [None]:
# Visualize the sampling distribution
sampling_dist = alt.Chart(
    sample_estimates,
    title=[
        "Sampling distribution",
        f'mean = {sample_estimates["mean_age"].mean().round(1)}'
    ]
).mark_bar().encode(
    x=alt.X("mean_age")
        .title("Sample mean age (years)")
        .bin(maxbins=30, extent=(72, 94)),
    y="count()"
).properties(
    height=150
)

# Plot both distributions
boot_est_dist.encoding.x['bin']['extent'] = (72, 94)
sampling_dist & boot_est_dist.properties(
    height=150,
    title=[
        "Bootstrap distribution",
        f'mean = {boot1000_means["mean_age"].mean().round(1)}'
    ]
)

Reminder: the true population quantity we are trying to estimate, the population mean, is about 79 years. We know this because we created this population and calculated this value. In real life we wouldn't know this value.

**Question 2.5** True/False
<br> {points: 1}

The mean of the bootstrap distribution is the same value as the mean of the sampling distribution of the sample means. True or false?

*Assign your answer to an object called `answer2_5`. Your answer should be a boolean. i.e. `True` or `False`.*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer2_5)).encode("utf-8")+b"416ba23a942572e2").hexdigest() == "f3cb4e785b52da86f3deac20e896447180701c49", "type of answer2_5 is not bool. answer2_5 should be a bool"
assert sha1(str(answer2_5).encode("utf-8")+b"416ba23a942572e2").hexdigest() == "bd4fa09eb4438887270d18dc5ad3cbef71b198e3", "boolean value of answer2_5 is not correct"

print('Success!')

**Question 2.6** True/False
<br> {points: 1}

The mean of the bootstrap distribution is not the same value as the mean of the sampling distribution because the bootstrap distribution was created from samples drawn from a single sample, whereas the sampling distribution was created from samples drawn from the population. True or false?

*Assign your answer to an object called `answer2_6`. Your answer should be a boolean. i.e. `True` or `False`.*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer2_6)).encode("utf-8")+b"7a8e1bca45b2049b").hexdigest() == "9d339753b5ac9e7c30edebf0fd609ad060dc15a9", "type of answer2_6 is not bool. answer2_6 should be a bool"
assert sha1(str(answer2_6).encode("utf-8")+b"7a8e1bca45b2049b").hexdigest() == "46a61683b1d7d3e1811a19a4aa6f123ee0237d71", "boolean value of answer2_6 is not correct"

print('Success!')

**Question 2.7** True/False
<br> {points: 1}

The shape and spread (i.e. width) of the distribution of the bootstrap sample means are a poor approximation of the shape and spread of the sampling distribution of the sample means. True or false?

*Assign your answer to an object called `answer2_7`. Your answer should be a boolean. i.e. `True` or `False`.*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer2_7)).encode("utf-8")+b"c9e7c687b8d4b2a4").hexdigest() == "6f46310549b19d7e2e1a887c8f272abd90cdb0b7", "type of answer2_7 is not bool. answer2_7 should be a bool"
assert sha1(str(answer2_7).encode("utf-8")+b"c9e7c687b8d4b2a4").hexdigest() == "1d3295a05057824c61183b2ea28d83393c8ffcdf", "boolean value of answer2_7 is not correct"

print('Success!')

**Question 2.8** True/False
<br> {points: 1}

In real life, where we only have one sample and cannot create a sampling distribution, the distribution of the bootstrap sample estimates (here means) can suggest how we might expect our point estimate to behave if we took another sample. True or false?

*Assign your answer to an object called `answer2_8`. Your answer should be a boolean. i.e. `True` or `False`.*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer2_8)).encode("utf-8")+b"c99b7194247c469e").hexdigest() == "3667b761c05968723112cc1bd4e7731159531948", "type of answer2_8 is not bool. answer2_8 should be a bool"
assert sha1(str(answer2_8).encode("utf-8")+b"c99b7194247c469e").hexdigest() == "1d9983b1f3d2b8506f2d05ae998165297f74b3fc", "boolean value of answer2_8 is not correct"

print('Success!')

### Using the bootstrap distribution to calculate a plausible range for point estimates

Once we have created a bootstrap distribution, we can use it to suggest a plausible range where we might expect the true population quantity to lie. A commonly used plausible range is formally called a confidence interval. Confidence intervals can be set at different levels; a commonly used level is 95%. When we report a point estimate with a 95% confidence interval as the plausible range, formally we are saying that if we repeated this process of building confidence intervals many more times with new samples, we'd expect about 95% of them to contain the value of the population quantity.
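
To make the "repeat the process many times" interpretation concrete, here is a minimal simulation sketch. It does not use the worksheet's `can_seniors` data: the population below is made up (an assumption for illustration), and the names `population`, `n_samples`, and `n_boot` are hypothetical. For each of many samples, it builds a bootstrap percentile interval and records whether that interval contains the population mean.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical population (an assumption, NOT the worksheet's can_seniors data)
population = rng.exponential(scale=14, size=100_000) + 65
true_mean = population.mean()

n_samples = 200    # number of independent samples drawn from the population
n_boot = 500       # bootstrap replicates per sample
sample_size = 40

covered = 0
for _ in range(n_samples):
    # One "real life" sample from the population
    sample = rng.choice(population, size=sample_size, replace=False)
    # Resample with replacement: each row is one bootstrap sample
    boot = rng.choice(sample, size=(n_boot, sample_size), replace=True)
    boot_means = boot.mean(axis=1)
    # 95% percentile interval from the bootstrap distribution
    lower, upper = np.quantile(boot_means, [0.025, 0.975])
    covered += int(lower <= true_mean <= upper)

coverage = covered / n_samples
print(f"Proportion of intervals containing the population mean: {coverage:.2f}")
```

The printed proportion should be in the neighbourhood of 0.95, although with small samples it can fall somewhat short. Some individual intervals miss the population mean entirely, which is exactly what a 95% level allows for.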

> How do you choose a level for a confidence interval? You have to consider the downstream application of your estimation and what the cost/consequence of an incorrect estimate would be. The higher the cost/consequence, the higher a confidence level you would want to use. You will learn more about this in later Statistics courses.

To calculate an approximate 95% confidence interval using bootstrapping, we essentially order the values in our bootstrap distribution and then take the value at the 2.5th percentile as the lower bound of the plausible range, and the 97.5th percentile as the upper bound of the plausible range. 
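
As a small illustration of the percentile idea, the sketch below uses made-up values (the `toy_means` series is hypothetical, not the `boot1000_means` from this worksheet). It shows that quantiles given as proportions and percentiles given on a 0 to 100 scale pick out the same bounds, and that about 95% of the values lie between them.

```python
import numpy as np
import pandas as pd

# Hypothetical bootstrap means (made-up values for illustration only)
rng = np.random.default_rng(1)
toy_means = pd.Series(rng.normal(loc=80, scale=2, size=1000))

# Quantiles take proportions between 0 and 1 ...
q_lower, q_upper = toy_means.quantile([0.025, 0.975])

# ... while percentiles describe the same positions on a 0-100 scale
p_lower, p_upper = np.percentile(toy_means, [2.5, 97.5])

# Proportion of values that fall between the lower and upper bounds
inside = toy_means.between(q_lower, q_upper).mean()

print(q_lower, q_upper)
print(p_lower, p_upper)
print(inside)
```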

In [None]:
# Run this cell
# A quantile is a percentile expressed as a proportion (e.g., the 0.025 quantile is the 2.5th percentile)
boot1000_means["mean_age"].quantile([0.025, 0.975])

Thus, to finish our estimation of the population quantity, we would report the point estimate along with the lower and upper bounds of our confidence interval. We would say something like this:

Our sample mean age for Canadian seniors was measured to be 83.7 years, and we’re 95% "confident" that the true population mean for Canadian seniors is between 78.8 and 89.2 years.

Here our 95% confidence interval does contain the true population mean for Canadian seniors, 79 years - pretty neat! However, in real life we would never be able to know this because we only have observations from a single sample, not the whole population.

**Question 2.9** True/False
<br> {points: 1}

For any sample we take, if we use bootstrapping to calculate the 95% confidence intervals, the true population quantity we are trying to estimate would always fall within the lower and upper bounds of the confidence interval. True or false?

*Assign your answer to an object called `answer2_9`. Your answer should be a boolean. i.e. `True` or `False`.*

In [None]:
# your code here
raise NotImplementedError

In [None]:
from hashlib import sha1
assert sha1(str(type(answer2_9)).encode("utf-8")+b"d8cb590d181270bf").hexdigest() == "51a50aacd6479a8fc3411a3a1752c141362eeb8c", "type of answer2_9 is not bool. answer2_9 should be a bool"
assert sha1(str(answer2_9).encode("utf-8")+b"d8cb590d181270bf").hexdigest() == "a52d19b6c70dd180dab23a73a91be741928e071c", "boolean value of answer2_9 is not correct"

print('Success!')