# Confidence Intervals and Statistical Inference 

## Confidence Intervals 

Recall from a statistics course that, for a normal population with unknown mean and variance, the confidence interval for the mean is the sample mean plus and minus the critical value times the standard error.
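Written out, the verbal formula above looks as follows. The symbols are introduced here for illustration and are not taken from the text: $\bar{y}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $c_{\alpha/2}$ the $(1 - \alpha/2)$ quantile of the $t$ distribution with $n - 1$ degrees of freedom.

$$
\bar{y} \pm c_{\alpha/2} \cdot \operatorname{se}(\bar{y}), \qquad \operatorname{se}(\bar{y}) = \frac{s}{\sqrt{n}}
$$

For a 95% interval, $\alpha = 0.05$, so the critical value is the 97.5th percentile of $t_{n-1}$.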

To see how to compute each of these ingredients manually, let's consider the example given in the text. We are given the scrap rates for two years, and we want to construct a 95% confidence interval for the change between the two years.

In this example, we assume the change in scrap rates has a normal distribution. 

In [1]:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

In [8]:
# enter raw observations 
sr87 = np.array([10, 1, 6, .45, 1.25, 1.3, 1.06, 3, 8.18, 1.67,
                 .98, 1, .45, 5.03, 8, 9, 18, .28, 7, 3.97])

sr88 = np.array([3, 1, 5, .5, 1.54, 1.5, .8, 2, .67, 1.17, .51,
                 .5, .61, 6.7, 4, 7, 19, .2, 5, 3.83])

In [9]:
# calculate change in rates between years 
change = sr88 - sr87    # vectorized 

In [13]:
# start computing ingredients to a CI formula 
change_avg = np.mean(change)    # y-bar
print(f'Sample average: {change_avg}\n')

Sample average: -1.1544999999999999



In [16]:
n = len(change)
change_std = np.std(change, ddof=1)    # ddof=1 gives the sample standard deviation
change_se = change_std / np.sqrt(n)
print(f'Standard error: {change_se}\n')

Standard error: 0.5367992249386514



In [17]:
crit_val = stats.t.ppf(.975, n-1)
print(f'Critical value: {crit_val}\n')

Critical value: 2.093024054408263



In [20]:
# produce confidence interval 
lower = change_avg - (crit_val * change_se)
upper = change_avg + (crit_val * change_se)
print(f'95% Confidence Interval: ({np.round(lower, 4)}, {np.round(upper, 4)})\n')

95% Confidence Interval: (-2.278, -0.031)
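As a cross-check on the manual calculation, SciPy can produce the same interval in one call. The sketch below is self-contained and re-enters the data. The names `ci_lower` and `ci_upper` are introduced here for illustration. The confidence level is passed positionally because its keyword name differs across SciPy versions.

```python
import numpy as np
import scipy.stats as stats

# same raw observations as in the manual calculation
sr87 = np.array([10, 1, 6, .45, 1.25, 1.3, 1.06, 3, 8.18, 1.67,
                 .98, 1, .45, 5.03, 8, 9, 18, .28, 7, 3.97])
sr88 = np.array([3, 1, 5, .5, 1.54, 1.5, .8, 2, .67, 1.17, .51,
                 .5, .61, 6.7, 4, 7, 19, .2, 5, 3.83])
change = sr88 - sr87

n = len(change)
change_avg = np.mean(change)
change_se = stats.sem(change)    # standard error of the mean (ddof=1 by default)

# t-based interval centered at the sample mean, scaled by the standard error
ci_lower, ci_upper = stats.t.interval(0.95, n - 1, loc=change_avg, scale=change_se)
print(f'95% Confidence Interval: ({ci_lower:.4f}, {ci_upper:.4f})')
```

The bounds should agree with the manually computed interval up to rounding.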



Notice that the confidence interval does not contain the value 0. We therefore conclude, with "95% confidence", that the average change in scrap rates in the population is not zero. In other words, there is a difference in the scrap rates between the two years.
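The same conclusion can be read off a two-sided one-sample t-test of the hypothesis that the mean change is zero: a 95% interval that excludes 0 corresponds to a p-value below 0.05. The following is a minimal sketch using `scipy.stats.ttest_1samp`. The names `t_stat` and `p_value` are introduced here for illustration, and the data are re-entered so the cell runs on its own.

```python
import numpy as np
import scipy.stats as stats

sr87 = np.array([10, 1, 6, .45, 1.25, 1.3, 1.06, 3, 8.18, 1.67,
                 .98, 1, .45, 5.03, 8, 9, 18, .28, 7, 3.97])
sr88 = np.array([3, 1, 5, .5, 1.54, 1.5, .8, 2, .67, 1.17, .51,
                 .5, .61, 6.7, 4, 7, 19, .2, 5, 3.83])
change = sr88 - sr87

# two-sided test of H0: population mean change equals 0
t_stat, p_value = stats.ttest_1samp(change, popmean=0)
print(f't statistic: {t_stat:.4f}')
print(f'p-value: {p_value:.4f}')
print(f'Reject H0 at the 5% level: {p_value < 0.05}')
```

The t statistic here is the sample average divided by the standard error, so its absolute value exceeds the critical value exactly when the interval excludes 0.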

Also note, from an econometric standpoint, this conclusion is likely flawed. There are many other potential factors that we have not taken into account. We only accounted for one measure. However, this was simply an illustrative example. 