# How are Confidence Intervals Calculated?

Our equation for calculating confidence intervals is as follows:

$$Best\ Estimate \pm Margin\ of\ Error$$

Where the *Best Estimate* is the **observed (sample) proportion or mean** and the *Margin of Error* is the **t-multiplier times the standard error**.

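The t-multiplier is determined by the degrees of freedom and the desired confidence level. For large samples (a common rule of thumb is more than 30 observations) and a confidence level of 95%, the t-multiplier is approximately 1.96, the corresponding value from the standard normal distribution.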
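As a rough sketch of where these multipliers come from, the snippet below looks them up numerically. The use of `scipy.stats` and the helper name `t_multiplier` are assumptions made for illustration; neither appears in the original notebook.

```python
from scipy import stats


def t_multiplier(confidence, n):
    """Two-sided t-multiplier for a one-sample interval (hypothetical helper)."""
    df = n - 1                     # degrees of freedom for a single sample
    tail = (1 - confidence) / 2    # probability left in each tail
    return stats.t.ppf(1 - tail, df)


print(t_multiplier(0.95, 25))      # small sample: noticeably larger than 1.96
print(t_multiplier(0.95, 1000))    # large sample: close to the normal value
print(stats.norm.ppf(0.975))       # standard normal multiplier for 95%
```

The smaller the sample, the larger the multiplier, which is why small samples produce wider intervals at the same confidence level.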

The equation to create a 95% confidence interval can also be shown as:

$$Population\ Proportion\ or\ Mean\ \pm (t\text{-}multiplier \times Standard\ Error)$$

Lastly, the Standard Error is calculated differently for a population proportion and for a mean:

$$Standard\ Error \ for\ Population\ Proportion = \sqrt{\frac{Population\ Proportion \times (1 - Population\ Proportion)}{Number\ Of\ Observations}}$$

$$Standard\ Error \ for\ Mean = \frac{Standard\ Deviation}{\sqrt{Number\ Of\ Observations}}$$
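The two standard error formulas translate directly into code. This is a minimal sketch; the function names `se_proportion` and `se_mean` and the input values are hypothetical and chosen only for illustration.

```python
import math


def se_proportion(p_hat, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def se_mean(sd, n):
    """Standard error of a mean: standard deviation / sqrt(n)."""
    return sd / math.sqrt(n)


# Illustrative inputs only, not taken from the data sets used in this notebook.
print(se_proportion(0.5, 100))
print(se_mean(10.0, 25))
```

Both quantities shrink as the number of observations grows, so larger samples give narrower confidence intervals.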

# Estimating 'Population Proportion' with Confidence

### RESEARCH QUESTION - I:

Let's look at the following example:<br>
**What proportion of parents report that they use a car seat for all travel with their toddler?**<br>

**GOAL:** Construct a 95% confidence interval for the population proportion of parents reporting they use a car seat for all travel with their toddler.<br>

* A sample of 659 parents with a toddler was taken and asked if they used a car seat for all travel with their toddler.
* 560 parents responded 'YES' to this question.

#### Calculating Confidence Interval Manually

In [4]:
import numpy as np

In [8]:
t_multiplier = 1.96  # multiplier for a 95% confidence interval
n = 659              # number of parents sampled
p = 560/659          # observed proportion, approximately 0.85

In [10]:
# Calculating Standard Error
se = np.sqrt((p*(1-p))/n)
se

0.013918213333087017

In [11]:
# Calculating Lower and Upper Confidence Interval Bound
lcb = p - t_multiplier*se
ucb = p + t_multiplier*se
(lcb, ucb)

(0.8224926842647214, 0.8770520805304226)

#### Calculating Confidence Interval using Library

In [12]:
import statsmodels.api as sm

In [13]:
sm.stats.proportion_confint(n * p, n)

(0.8224931855355763, 0.8770515792595678)
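The manual and library results differ slightly from the sixth decimal place onward. A plausible explanation, sketched below, is that the manual calculation rounds the multiplier to 1.96, while `proportion_confint` with its default settings (`alpha=0.05`, `method='normal'`) uses the unrounded normal quantile. The snippet uses only the standard library and is an illustration of that assumption, not the library's actual source code.

```python
import math
from statistics import NormalDist

n = 659
p = 560 / 659

# Unrounded 97.5th percentile of the standard normal distribution
z = NormalDist().inv_cdf(0.975)
se = math.sqrt(p * (1 - p) / n)

lcb = p - z * se
ucb = p + z * se
print(z)
print((lcb, ucb))
```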

# Estimating 'Population Mean' with Confidence

### RESEARCH QUESTION - II:

Let's look at the following example:<br>
**What is the average cartwheel distance (in inches) for adults?**<br>

**GOAL:** Construct a 95% confidence interval for the mean cartwheel distance for the population of all such adults.<br>

* Here n = 25
* $\bar{x} = 82.48$

In [14]:
import pandas as pd

In [15]:
df = pd.read_csv('CartWheelData.csv')

In [16]:
df.head()

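|   | ID | Age | Gender | GenderGroup | Glasses | GlassesGroup | Height | Wingspan | CWDistance | Complete | CompleteGroup | Score |
|---|----|-----|--------|-------------|---------|--------------|--------|----------|------------|----------|---------------|-------|
| 0 | 1 | 56 | F | 1 | Y | 1 | 62.0 | 61.0 | 79 | Y | 1 | 7 |
| 1 | 2 | 26 | F | 1 | Y | 1 | 62.0 | 60.0 | 70 | Y | 1 | 8 |
| 2 | 3 | 33 | F | 1 | Y | 1 | 66.0 | 64.0 | 85 | Y | 1 | 7 |
| 3 | 4 | 39 | F | 1 | N | 0 | 64.0 | 63.0 | 87 | Y | 1 | 10 |
| 4 | 5 | 27 | M | 2 | N | 0 | 73.0 | 75.0 | 72 | N | 0 | 4 |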


In [17]:
df.shape  # the data set has 25 rows

(25, 12)

In [21]:
mean = df.loc[:,'CWDistance'].mean()
sd = df.loc[:,'CWDistance'].std()
n = len(df)

print("n:",n)
print("Mean:",mean)
print("Standard Deviation:",sd)

n: 25
Mean: 82.48
Standard Deviation: 15.058552387264855


#### Calculating Confidence Interval Manually

In [22]:
# For n = 25 (24 degrees of freedom) and a 95% confidence level, the t-distribution gives t_multiplier = 2.064

In [23]:
t_multiplier=2.064

se = sd/np.sqrt(n)

se

3.0117104774529713

In [25]:
lcb = mean - t_multiplier * se
ucb = mean + t_multiplier * se
(lcb, ucb)

(76.26382957453707, 88.69617042546294)

#### Calculating Confidence Interval using Library

In [None]:
import statsmodels.api as sm

In [26]:
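# Note: zconfint_mean() uses the normal (z) distribution, so this interval is slightly
# narrower than the manual t-based one; tconfint_mean() is the t-based counterpart.
sm.stats.DescrStatsW(df["CWDistance"]).zconfint_mean()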

(76.57715593233026, 88.38284406766975)
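The library interval is narrower than the manual one because `zconfint_mean()` is based on the normal distribution, while the manual calculation used a t-multiplier. The sketch below recomputes both intervals from the summary statistics printed earlier, using only the standard library; the variable names are illustrative. With only 25 observations, the t-based interval is usually the more appropriate choice, and statsmodels offers `tconfint_mean()` on the same object for that purpose.

```python
import math
from statistics import NormalDist

# Summary statistics printed earlier in this notebook
mean = 82.48
sd = 15.058552387264855
n = 25

se = sd / math.sqrt(n)

z = NormalDist().inv_cdf(0.975)   # normal multiplier, about 1.96
t_multiplier = 2.064              # t-multiplier for 24 degrees of freedom, as used above

z_interval = (mean - z * se, mean + z * se)
t_interval = (mean - t_multiplier * se, mean + t_multiplier * se)

print("z-based:", z_interval)
print("t-based:", t_interval)
```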