# Confidence Intervals
## From Sampling Distributions to CIs
> We can use sampling distributions and bootstrapping to understand the range of values a statistic can plausibly take.
In the real world, we usually don't know the value of a parameter, because we rarely have data on the entire population. Instead, we can build confidence intervals (CIs) to infer a range of plausible values for the parameter of interest.

For instance, here I am using bootstrapping to build a 95% CI for the difference in mean height between coffee drinkers and non-drinkers.

In [7]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
sns.set()

%matplotlib inline

df = pd.read_csv('data/coffee_dataset.csv')
sample_data = df.sample(200)

In [8]:
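diff = []
for _ in range(10000):
    # Resample the 200 rows with replacement to form one bootstrap sample
    bootsample = sample_data.sample(200, replace=True)
    # Mean height within each group for this bootstrap sample
    mean_t = bootsample.query('drinks_coffee == True')['height'].mean()
    mean_f = bootsample.query('drinks_coffee == False')['height'].mean()
    diff.append(mean_t - mean_f)

# The 2.5th and 97.5th percentiles bound the 95% confidence interval
np.percentile(diff, 2.5), np.percentile(diff, 97.5)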

(0.5178840323124586, 2.4609562477726725)

> This has many real-world applications, such as comparing patient outcomes between two groups (drug vs. placebo) and A/B testing, a method of comparing two versions of a webpage or app against each other to determine which one performs better.

## Statistical vs. Practical Significance
Comparing the means of two groups with a confidence interval, rather than relying on point estimates alone, is important.

> - **Statistical significance:** evidence from hypothesis tests and CIs in favor of the alternative hypothesis (H1).
> - **Practical significance:** making decisions based on the real-world application and practicality, rather than relying purely on the numbers.

## Other Language Associated with Confidence Intervals

Assuming everything else in your analysis is held constant:

> - Increasing your sample size will decrease the width of your confidence interval.
> - Increasing your confidence level (say, from 95% to 99%) will increase the width of your confidence interval.
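
Both effects can be checked with a small bootstrap experiment. The sketch below is only an illustration: it uses synthetic, normally distributed values as a stand-in for a `height` column (not the coffee dataset), and the helper names `bootstrap_means` and `ci_width` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumption: synthetic heights standing in for real data (not the coffee dataset).
population = rng.normal(loc=67.0, scale=3.0, size=5000)


def bootstrap_means(sample, n_boot=2000):
    """Means of n_boot bootstrap resamples (drawn with replacement) of `sample`."""
    n = len(sample)
    # Each row holds the indices of one bootstrap resample.
    idx = rng.integers(0, n, size=(n_boot, n))
    return sample[idx].mean(axis=1)


def ci_width(boot_means, confidence):
    """Width of the percentile interval at the given confidence level."""
    tail = (1 - confidence) / 2 * 100
    lower, upper = np.percentile(boot_means, [tail, 100 - tail])
    return upper - lower


small = rng.choice(population, size=50, replace=False)
large = rng.choice(population, size=800, replace=False)

boot_small = bootstrap_means(small)
boot_large = bootstrap_means(large)

width_small_95 = ci_width(boot_small, 0.95)
width_large_95 = ci_width(boot_large, 0.95)
width_small_99 = ci_width(boot_small, 0.99)

print("n=50,  95% width:", width_small_95)
print("n=800, 95% width:", width_large_95)
print("n=50,  99% width:", width_small_99)
```

With these settings, the larger sample should give the narrower 95% interval, and moving from 95% to 99% on the same sample should widen it.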

You saw that you can compute:

> - The confidence interval width: the difference between the upper and lower bounds of your confidence interval.
> - The margin of error: half the confidence interval width, and the value that you add to and subtract from your sample estimate to obtain the bounds of your confidence interval.
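
As a quick worked example, here is the arithmetic applied to the bootstrap interval reported earlier in these notes. One caveat: for a percentile bootstrap interval, the midpoint is only approximately the sample estimate, so `center` below is an assumption made for illustration.

```python
# Bounds of the bootstrap CI for the difference in mean heights (from the output above).
lower, upper = 0.5178840323124586, 2.4609562477726725

# Width: distance between the upper and lower bounds.
width = upper - lower

# Margin of error: half the width.
margin_of_error = width / 2

# Assumption: treat the midpoint of the interval as the estimate.
center = lower + margin_of_error

print("width:", width)
print("margin of error:", margin_of_error)
print("interval:", (center - margin_of_error, center + margin_of_error))
```

The width comes out to roughly 1.94 and the margin of error to roughly 0.97.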

## Correct Interpretations of Confidence Intervals
> A confidence interval is aimed at a parameter, that is, a single numeric value that describes our population (e.g. standard deviation, mean, difference in means). It **does not** allow us to say anything specific about an individual in the population. Instead, it takes an **aggregate** approach to looking at our data.
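
To make this concrete, the sketch below builds a 95% bootstrap CI for a mean and then counts how many individual observations fall inside it. The data are synthetic (an assumed normal distribution standing in for heights), so the exact numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: synthetic heights, not real data.
heights = rng.normal(loc=67.0, scale=3.0, size=200)

# Percentile bootstrap CI for the mean height.
n = len(heights)
idx = rng.integers(0, n, size=(5000, n))  # one row of resampled indices per replicate
boot_means = heights[idx].mean(axis=1)
lower, upper = np.percentile(boot_means, [2.5, 97.5])

# Share of individual heights that happen to fall inside the CI for the mean.
inside = np.mean((heights >= lower) & (heights <= upper))

print("95% CI for the mean:", (lower, upper))
print("share of individuals inside the CI:", inside)
```

Only a small share of individuals should land inside the interval, because the interval describes where the population mean plausibly lies, not where individual values lie.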