<h2> Jupyter Homework 6: Exploring confidence intervals </h2>


<img src="images/CIs.png" style="float: right; width: 50%">


This week, we're going to experiment a bit with confidence intervals and generating them from data. One of the most subtle things about confidence intervals is that they do *not* represent the probability that a parameter $\mu$ is in a particular interval $(\ell, u)$: once the interval has been computed, the parameter either is in it or it isn't. What *is* true is that if we generate a large number of confidence intervals at level $CL$, then roughly a fraction $CL$ of them should contain the parameter.






Let's demonstrate this with our standard normal. We'll do the following as a single trial:
* Generate $50$ normally distributed numbers from an $N(0, 1)$ distribution using `np.random.normal()`.
* Compute the sample mean $\overline{X}$
* Construct a $90\%$ confidence interval $(\overline{X} \pm 1.645 / \sqrt{50})$.
* Check if the true mean $\mu = 0$ is in the confidence interval.
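
The half-width in the third step follows from the sampling distribution of the mean. As a short sketch, assuming the known $\sigma = 1$ used in this example and writing $z_{\alpha/2}$ for the upper $\alpha/2$ critical value of the standard normal:

$$\overline{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad\Longrightarrow\quad P\left(\left|\overline{X} - \mu\right| < z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha.$$

With $\alpha = 0.10$, $z_{0.05} \approx 1.645$, $\sigma = 1$ and $n = 50$, the half-width is $1.645/\sqrt{50} \approx 0.233$.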

We'll then carry out $10{,}000$ trials of this and see how close the observed coverage comes to $90\%$:


In [None]:
import numpy as np
import scipy
from scipy.stats import norm

Let's first write a function that builds a confidence interval from a sample size $n$ and a $z$-score. The function returns $1$ if the true mean is in the interval and $0$ if not.

In [None]:
def is_in_CI(n,z):
    # Set the true mean to 0 and the standard deviation to 1
    mu = 0
    sigma = 1
    # Generate n random data points from N(mu, sigma^2)
    data = np.random.normal(mu, sigma, size=n)
    # Calculate the sample mean
    xbar = np.mean(data)
    # Calculate the margin of error: z times the standard error sigma/sqrt(n)
    SE = z*sigma/np.sqrt(n)
    # Check if the true mean is in the confidence interval
    if xbar - SE < mu < xbar + SE:
        return 1
    else:
        return 0

**Example Usage**:

In [None]:
n = 50
CL = 0.9
alpha = 1-CL
z = norm.isf(alpha/2)
print("Our z-score: ",z)

# Run this 10K times and count the successes
count = 0
for _ in range(10000):
    count += is_in_CI(n,z)

print(count)
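
The loop above can also be written without an explicit Python loop. The following is a minimal sketch, not part of the assignment: the function name `coverage_count` and its `trials` and `seed` parameters are hypothetical, and it assumes the same setup as above (true mean $0$, known $\sigma = 1$). It draws all trials at once as rows of one array and counts how many intervals cover the true mean.

```python
import numpy as np
from scipy.stats import norm

def coverage_count(n, z, trials=10_000, seed=0):
    """Count how many of `trials` z-intervals contain the true mean (hypothetical helper)."""
    mu, sigma = 0.0, 1.0                       # same true mean and sigma as above
    rng = np.random.default_rng(seed)          # seeded generator, for reproducibility
    data = rng.normal(mu, sigma, size=(trials, n))  # one row per trial
    xbar = data.mean(axis=1)                   # sample mean of each trial
    half_width = z * sigma / np.sqrt(n)        # same half-width for every trial
    covered = np.abs(xbar - mu) < half_width   # True where the interval contains mu
    return int(covered.sum())

z90 = norm.isf(0.10 / 2)                       # z-score for a 90% interval
count_vec = coverage_count(50, z90)
print(count_vec)
```

The result should land near $9{,}000$, just like the loop version; the exact number depends on the seed.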


<h3> Questions </h3>

#### Question 1: 

* Modify the cell below to construct a $95\%$ confidence interval, this time taking only $n=8$ samples each time from the normal distribution. Your code should again print out a count just like above; this time it should be close to $9{,}500$.

In [None]:
# Put your answer to question 1 here


n = ...
CL = ...
alpha = ...
z = ...
print("Our z-score: ",z)

# Run this 10K times and count the successes
count1 = 0
for _ in range(10000):
    count1 += is_in_CI(n,z)

print("The true mean was in the interval",count1,"times")

#### Question 2: 
* Keeping the same $n=8$ and $CL = 95\%$, replace the known $\sigma$ in $SE = z \cdot \sigma/\sqrt{n}$ with the **sample standard deviation** $s_n$, so that $SE = z \cdot s_n/\sqrt{n}$. You can calculate the sample standard deviation $s_n$ using `np.std(data, ddof=1)`; the default `np.std(data)` divides by $n$ rather than $n-1$.
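
One detail worth checking when computing a standard deviation in NumPy: `np.std` divides by $n$ by default, while the sample standard deviation usually divides by $n-1$, which is what `ddof=1` selects. A small illustration on made-up numbers (the array below is purely illustrative, not homework data):

```python
import numpy as np

# Eight made-up values with mean 5 and squared deviations summing to 32
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_std = np.std(data)             # divides by n:     sqrt(32 / 8)
sample_std = np.std(data, ddof=1)  # divides by n - 1: sqrt(32 / 7)
print(pop_std, sample_std)
```

For small $n$ such as $8$ the two values differ noticeably, so the choice affects the width of the interval.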


In [None]:
# Put your answer to question 2 here



def is_in_CI_with_sample_std(n,z):
    # Set the true mean  to 0
    mu = ...
    # Generate n random data points
    data = ...
    # Calculate the sample mean,sample std, and SE
    xbar = ...
    std_dev = ...
    SE = ...
    # Check if 0 is in the confidence interval
    if ...:
        return ...
    else:
        return ...

n = ...
CL = ...
alpha = ...
z = ...
print("Our z-score: ",z)

# Run this 10K times and count the successes
count2 = 0
for _ in range(10000):
    count2 += is_in_CI_with_sample_std(n,z)

print("The true mean was in the interval",count2,"times")

*  Estimate the corresponding confidence level; is it higher or lower than $95\%$? Does this match your expectation?

In [None]:
# Put your response in the string

Response = "..."
print(Response)

#### Question 3:
* Adapting your code from the previous part, estimate a value of $t$ so that $(\overline{X} \pm t \cdot s_{8} / \sqrt{8})$ is a $95\%$ confidence interval for the mean.


*Note*: To verify your answer, you can look up $t_{7, 0.025}$ in Table B.2 in the textbook, or run `scipy.stats.t.isf(alpha/2,df = n-1)`.
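
As a sketch of the verification step mentioned in the note, the snippet below looks up the normal and Student-$t$ critical values side by side for $n = 8$ and $CL = 95\%$. The variable names `z_crit` and `t_crit` are chosen here for illustration; `stats.t` is accessed through the module so it does not clash with a variable named `t`.

```python
from scipy import stats

n = 8
alpha = 1 - 0.95

z_crit = stats.norm.isf(alpha / 2)         # upper alpha/2 point of N(0, 1)
t_crit = stats.t.isf(alpha / 2, df=n - 1)  # upper alpha/2 point of t with n - 1 degrees of freedom
print(z_crit, t_crit)
```

The $t$ critical value is larger than the $z$ value, which widens the interval to account for estimating the standard deviation from only $8$ points.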


In [None]:
# Put your answer to question 3 here


...


t = ...
print(t)

## -----------------------------------------------------------------------

**Submission** : Export as a `.ipynb` file and upload it onto *Gradescope* under **Jupyter Homework 6 -- in-class**. You should be able to upload one submission per group.

## -----------------------------------------------------------------------