# Confidence Intervals

So far, we’ve inspected the null distribution and calculated the minimum and maximum values. Although the number of purchases in each simulated sample ranged roughly from 25 to 75 by random chance, closer inspection of the distribution showed that values near those extremes occurred very rarely.

By reporting an interval covering 95% of the values instead of the full range, we can say something like: “we are 95% confident that, if each visitor has a 10% chance of making a purchase, a random sample of 500 visitors will make between 37 and 63 purchases.” We can use the `np.percentile()` function to calculate this 95% interval as follows:

```python
np.percentile(outcomes, [2.5, 97.5])
# output: [37. 63.]
```

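We calculated the 2.5th and 97.5th percentiles so that about 5% of the data falls outside them (2.5% below the 2.5th percentile and 2.5% above the 97.5th percentile). This leaves us with a range covering the middle 95% of the data. Because purchase counts are whole numbers, many simulated values tie with the endpoints, so the share inside the interval can be slightly more than 95%.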
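To check this reasoning, the sketch below counts how many simulated values land inside the percentile interval. It is a minimal illustration, not the lesson's own code: it assumes a seeded generator (`rng`) and uses `rng.binomial` as a shortcut for counting the `'y'` results from `np.random.choice`, and the name `share_inside` is made up for this example.

```python
import numpy as np

# Assumption: a seeded generator so the sketch is repeatable.
rng = np.random.default_rng(0)

# Assumption: binomial draws stand in for counting 'y' values in 10,000
# simulated samples of 500 visitors, each with a 10% purchase chance.
outcomes = rng.binomial(n=500, p=0.1, size=10000)

# 2.5th and 97.5th percentiles bound the middle 95% of the values.
lower, upper = np.percentile(outcomes, [2.5, 97.5])

# Share of simulated samples inside the interval, endpoints included.
share_inside = np.mean((outcomes >= lower) & (outcomes <= upper))

print("interval:", lower, "to", upper)
print("share of values inside:", share_inside)
```

The printed share should be at least 0.95, with the small excess coming from values that tie with the endpoints.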

If our observed statistic falls outside this interval, then a result like ours would be unlikely if the null hypothesis were true, which counts as evidence against the null hypothesis. In this example, because 41 falls within the 95% interval (37 to 63), it is still reasonably likely that we would observe a purchase rate this low by random chance, even if the null hypothesis is true.
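The comparison described above can be sketched in a few lines. This is an illustrative example under stated assumptions: the seeded generator, the `rng.binomial` shortcut for the simulation, and the names `observed` and `inside_interval` are not part of the lesson's code.

```python
import numpy as np

# Assumption: seeded binomial draws stand in for the simulated null
# distribution (10,000 samples of 500 visitors, 10% purchase chance).
rng = np.random.default_rng(0)
outcomes = rng.binomial(n=500, p=0.1, size=10000)

observed = 41  # observed number of purchases from the lesson

# Bounds of the interval covering the middle 95% of simulated values.
lower, upper = np.percentile(outcomes, [2.5, 97.5])

# True when the observed value lies within the interval, endpoints included.
inside_interval = lower <= observed <= upper

print("95% interval:", lower, "to", upper)
print("observed value inside interval:", inside_interval)
```

If `inside_interval` is `True`, the observed count is consistent with what random chance alone produces under the null hypothesis.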


## Instructions

1. The code to generate `null_outcomes` has been provided for you. Calculate an interval covering the middle 90% of the values in `null_outcomes`. Save the output in a variable named `null_90CI` and print it out. Is the observed value of 41 purchases inside or outside this interval?

    <details>
        <summary>Stuck? Get a hint</summary>
    
    For a 90% interval, we need to calculate the 5th and 95th percentiles.
    </details>


```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

null_outcomes = []

# simulate 10,000 samples of 500 visitors, each with a 10% chance of purchasing
for i in range(10000):
    simulated_monthly_visitors = np.random.choice(['y', 'n'], size=500, p=[0.1, 0.9])
    num_purchased = np.sum(simulated_monthly_visitors == 'y')
    null_outcomes.append(num_purchased)

# calculate the 90% interval here:
```


### Solution

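```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

null_outcomes = []

# simulate 10,000 samples of 500 visitors, each with a 10% chance of purchasing
for i in range(10000):
    simulated_monthly_visitors = np.random.choice(['y', 'n'], size=500, p=[0.1, 0.9])
    num_purchased = np.sum(simulated_monthly_visitors == 'y')
    null_outcomes.append(num_purchased)

# calculate the 90% interval here:
null_90CI = np.percentile(null_outcomes, [5, 95])
print(null_90CI)
# output: [39. 61.]
```

The observed value of 41 purchases falls inside this interval. Because the simulation is random, the endpoints may shift slightly from run to run.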