In [None]:
# Run cells by clicking on them and hitting CTRL + ENTER on your keyboard
from IPython.display import YouTubeVideo
from datascience import *
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
%matplotlib inline

# Module 5.1 Part 2: Interpreting Confidence Intervals

In this short lecture guide, you'll learn how to properly interpret the confidence intervals generated by the
Bootstrap method.

3 videos make up this notebook, for a total run time of 18:53.

1. [Applying the Bootstrap](#section1) *1 video, total runtime 11:04*
2. [Interpreting Confidence Intervals](#section2) *1 video, total runtime 5:53*
3. [Confidence Intervals and Hypothesis Tests](#section3) *1 video, total runtime 1:56*
4. [Check for Understanding](#section4)

Textbook readings:
- [Chapter 13.3: Confidence Intervals](https://www.inferentialthinking.com/chapters/13/3/Confidence_Intervals.html)
- [Chapter 13.4: Using Confidence Intervals](https://www.inferentialthinking.com/chapters/13/4/Using_Confidence_Intervals.html)

<a id='section1'></a>
## 1. Applying the Bootstrap

In the next video, we'll review when and how to use the Bootstrap method to generate a confidence interval for a
population parameter. Follow along with Professor DeNero's example using the data loaded in the code cell below
the recording.
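
As a rough preview of the procedure, here is a minimal sketch of a percentile bootstrap interval for a population mean. It is not the code from the video: the function name `bootstrap_mean_interval`, its parameters, and the toy data are assumptions made for illustration, and it uses NumPy directly instead of the `datascience` table methods. Note that `np.percentile` interpolates between values, so its endpoints can differ slightly from those of the `percentile` function in the `datascience` library.

```python
import numpy as np

def bootstrap_mean_interval(sample, level=95, repetitions=2000, seed=0):
    """Approximate confidence interval for a population mean (percentile bootstrap).

    Hypothetical helper for illustration; not part of the datascience library.
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    means = np.empty(repetitions)
    for i in range(repetitions):
        # Resample WITH replacement, using the same size as the original sample
        resample = rng.choice(sample, size=n, replace=True)
        means[i] = resample.mean()
    # Keep the middle `level` percent of the bootstrapped means
    tail = (100 - level) / 2
    left, right = np.percentile(means, [tail, 100 - tail])
    return left, right

# Made-up numbers standing in for a random sample (NOT the births data)
toy_rng = np.random.default_rng(42)
toy_sample = toy_rng.normal(loc=120, scale=18, size=500)

left, right = bootstrap_mean_interval(toy_sample)
print('Approximate 95% confidence interval:', left, 'to', right)
```

The two key choices are resampling with replacement and drawing resamples of the same size as the original sample, so that each bootstrapped mean varies about as much as a mean computed from a fresh sample would.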

In [None]:
YouTubeVideo('fx8R_TjjWyU')

In [None]:
births = Table.read_table('https://www.inferentialthinking.com/data/baby.csv')
babies = births.select('Birth Weight', 'Gestational Days')
babies.show(5)

In [None]:
# follow along here!
...

<a id='section2'></a>
## 2. Interpreting Confidence Intervals

In the next lecture, you'll learn about the scenarios in which the Bootstrap method is (in)appropriate to use.
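
One inappropriate scenario can be seen in a small simulation, sketched here under stated assumptions: the population (the integers 1 through 1000), the sample size, and all variable names are hypothetical. When the parameter is a maximum, no resample can contain a value larger than the largest value in the original sample, so the bootstrap interval can never extend past the sample maximum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population whose true maximum is 1000
population = np.arange(1, 1001)
sample = rng.choice(population, size=100, replace=False)

# Bootstrap the sample maximum
maxes = np.array([
    rng.choice(sample, size=len(sample), replace=True).max()
    for _ in range(2000)
])
left, right = np.percentile(maxes, [2.5, 97.5])

print('Population maximum:', population.max())
print('Sample maximum:', sample.max())
print('Bootstrap interval for the maximum:', left, 'to', right)
```

Unless the sample happens to include the largest value in the population, the whole interval sits below the true maximum, no matter how many times we resample.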

In [None]:
YouTubeVideo('eTxVMADJfmQ')

<a id='section3'></a>
## 3. Confidence Intervals and Hypothesis Tests

The next lecture video introduces the relationship between hypothesis testing and confidence intervals.

In [None]:
YouTubeVideo('94_M7SwFGTc')

<a id='section4'></a>
## 4. Check for Understanding

**A. In which of the following scenarios will the Bootstrap provide an unreliable confidence interval?**

<ol>
    <li>When you're attempting to estimate a median.</li>
    <li>When the probability distribution of your statistic is not approximately bell shaped.</li>
    <li>When the parameter you're attempting to estimate is affected by outliers, and there
        are likely many outliers in the population.</li>
    <li>When the original sample size is relatively large.</li>
    <li>When the parameter you're trying to estimate is a maximum.</li>
</ol>


<details>
    <summary>Solution</summary>
    The Bootstrap will provide unreliable confidence intervals in scenarios 2, 3, and 5.
</details>
<br>

**B. Consider problem D from Module 5.1 Part 1's Check for Understanding. Say that the approximate 90% confidence
interval based on the Bootstrap procedure is [120.25, 124.78]. Which of the following statements correctly interprets
this confidence interval?**

<ol>
    <li>There is a 90% probability that the birth weights of babies born to non-smoking mothers
        are between 120.25 oz and 124.78 oz.</li>
    <li>About 90% of babies born to non-smoking mothers have a birth weight between 120.25 oz and 124.78 oz.</li>
    <li>We are 90% confident that the mean birth weight among the population of babies born to non-smoking mothers
        is in the range of 120.25 oz and 124.78 oz.</li>
    <li>If we were to repeatedly sample from the population of non-smoking mothers and compute the 90% confidence
        interval using the Bootstrap technique for each of the new samples, we would expect that the true average
        birth weight would be contained in about 90% of these intervals.</li>
</ol>


<details>
    <summary>Solution</summary>
    Statement 4 is the only correct interpretation of the confidence interval. If you found this question difficult, we recommend
    re-watching lecture 22.2.
</details>
<br>

**C. Suppose you wanted to perform a hypothesis test regarding the true average birth weight of babies born to non-smoking mothers.
Based on previous research, you believe that this population parameter is equal to 120 oz. However, a recent scientific report
on the subject has stated that the actual parameter is likely larger than 120 oz.**

**What are the null and alternative hypotheses? Using a significance cutoff of 10% and the confidence interval given in B,
what do you conclude?**

<details>
    <summary>Solution</summary>
    The null hypothesis is that the population average birth weight of babies born to non-smoking mothers is 120 oz. The alternative hypothesis is that this parameter is in fact larger than 120 oz.
    <br><br>
    Because the value specified by the null hypothesis (120 oz) does not fall within the approximate 90% confidence interval
    given in B, we can conclude that the data are inconsistent with the null hypothesis. That is, there is evidence to suggest
    that the average birth weight of babies born to non-smoking mothers is over 120 oz.
</details>