# Estimating Population Means ($\sigma$ Unknown)

## Objectives
- Construct confidence intervals for population means in populations where the standard deviation is not known.
- Identify and interpret the margin of error for a confidence interval.

## Confidence Intervals using Student's $t$-Distribution

As mentioned before, we rarely know the population standard deviation $\sigma$ when constructing a confidence interval for the population mean $\mu$. When we approximate $\sigma$ with the sample standard deviation $s$, we must use a $t$-distribution with $n - 1$ degrees of freedom (where $n$ is the sample size) instead of a normal distribution.
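To see what switching distributions does in practice, the short sketch below compares the critical value from the standard normal distribution with $t$ critical values for a few degrees of freedom. The particular degrees of freedom are arbitrary choices made for illustration.

```r
# Critical values for a 95% confidence level (area 0.025 in each tail).
z_crit = qnorm(p = 1 - 0.025)            # standard normal critical value

# Illustrative degrees of freedom (an assumption, not from the text).
dfs = c(4, 9, 29, 99)
t_crit = qt(p = 1 - 0.025, df = dfs)     # qt is vectorized over df

# Each t critical value is larger than z, and the gap shrinks as df grows.
data.frame(df = dfs, t_crit = round(t_crit, 4), z_crit = round(z_crit, 4))
```

Because the $t$ critical values are larger, intervals built with $s$ are somewhat wider than intervals built with a known $\sigma$, especially for small samples.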

With the exception of this change, the process of constructing a confidence interval for the population mean is largely the same:

1. Find the sample mean $\bar{x}$ and the sample standard deviation $s$.
2. Find $t_{\alpha/2}$, the $t$-score with area $\alpha/2$ to its right and $n-1$ degrees of freedom.
3. Calculate the margin of error using the formula $E = t_{\alpha/2} \dfrac{s}{\sqrt{n}}$.
4. Construct the confidence interval $(\bar{x} - E, \bar{x} + E)$.
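
The four steps above can be collected into one small R function. This is only a sketch: the name `t_conf_int` and its argument `conf_level` are hypothetical, and the data in the usage line is made up for illustration.

```r
# Hypothetical helper (not from the text): t-based confidence interval
# for a population mean when sigma is unknown.
t_conf_int = function(x, conf_level = 0.95) {
  n = length(x)

  # Step 1: sample mean and sample standard deviation
  xbar = mean(x)
  s = sd(x)

  # Step 2: t-score with area alpha/2 to its right and n - 1 degrees of freedom
  alpha = 1 - conf_level
  t_crit = qt(p = 1 - alpha/2, df = n - 1)

  # Step 3: margin of error
  E = t_crit * s / sqrt(n)

  # Step 4: confidence interval
  c(lower = xbar - E, upper = xbar + E)
}

# Usage with made-up data
t_conf_int(c(12, 15, 11, 14, 13), conf_level = 0.95)
```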

***

### Example 5.4.1
Forty-five newborn elephants are sampled and found to have the following weights, in pounds:

333, 248, 303, 248, 153, 168, 280, 256, 195, 234, 366, 250, 325, 266, 164, 253, 262, 343, 244, 425, 345, 343, 277, 215, 226, 254, 289, 296, 268, 195, 268, 202, 249, 256, 284, 257, 205, 215, 251, 257, 144, 323, 238, 257, 218

Construct a $95\%$ confidence interval for the mean weight of a newborn elephant.

#### Solution
Note that we are *not* told what the population standard deviation $\sigma$ is. That means we will need to approximate it using the sample standard deviation $s$, and we'll need to use a $t$-distribution.

We are given that

\begin{align*}
n &= 45 \\
CL &= 0.95
\end{align*}

##### Step 1: Find the Sample Mean $\bar{x}$ and the Sample Standard Deviation $s$.

In [1]:
x = c(333, 248, 303, 248, 153, 168, 280, 256, 195, 234, 366, 250, 325, 266, 164, 253, 262, 343, 244, 425, 345, 343, 277, 215, 226, 254, 289, 296, 268, 195, 268, 202, 249, 256, 284, 257, 205, 215, 251, 257, 144, 323, 238, 257, 218)
n = length(x)

xbar = sum(x)/n
xbar

Then the sample mean is $\bar{x} = 258.84444$.

To find the sample standard deviation, recall that we use the formula

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}. $$

Let's translate this formula to R.

In [2]:
s = sqrt(sum( (x - xbar)^2 )/(n-1))
s

The sample standard deviation is $s = 57.38425$.
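
As a check on the hand-translated formulas, R's built-in `mean()` and `sd()` functions should give the same values, since `sd()` also divides by $n - 1$. The sketch below redefines `x` so that it runs on its own.

```r
x = c(333, 248, 303, 248, 153, 168, 280, 256, 195, 234, 366, 250, 325, 266, 164,
      253, 262, 343, 244, 425, 345, 343, 277, 215, 226, 254, 289, 296, 268, 195,
      268, 202, 249, 256, 284, 257, 205, 215, 251, 257, 144, 323, 238, 257, 218)
n = length(x)

mean(x)   # built-in sample mean, same as sum(x)/n
sd(x)     # built-in sample standard deviation (n - 1 in the denominator)

# Compare the built-in result with the formula written out by hand
s_manual = sqrt(sum((x - mean(x))^2) / (n - 1))
all.equal(sd(x), s_manual)
```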

##### Step 2: Find $t_{\alpha/2}$.
First, note that the number of degrees of freedom for the $t$-distribution is

$$ df = n-1 = 45 - 1 = 44. $$

Next, since $\text{CL} = 0.95$, the area outside of the confidence interval is

$$ \alpha = 1 - \text{CL} = 1 - 0.95 = 0.05. $$

So $\alpha/2 = 0.05/2 = 0.025$. We want to find $t_{\alpha/2} = t_{0.025}$, the $t$-score with an area of $0.025$ to its right.

In [20]:
qt(p = 1 - 0.025, df = 44)

So $t_{0.025} = 2.01537$.
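
Because `qt()` takes the area to the *left* by default, we pass `1 - 0.025` to get the score with area $0.025$ to its right. The sketch below shows two equivalent ways to reach the same value and uses `pt()` to confirm the right-tail area. The variable names are chosen only for illustration.

```r
# Area to the left is 1 - 0.025
t_upper = qt(p = 1 - 0.025, df = 44)

# Ask for the right-tail area directly
t_alt = qt(p = 0.025, df = 44, lower.tail = FALSE)

# Use symmetry of the t-distribution about 0
t_sym = -qt(p = 0.025, df = 44)

c(t_upper, t_alt, t_sym)

# Area to the right of the critical value
right_area = pt(q = t_upper, df = 44, lower.tail = FALSE)
right_area
```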

##### Step 3: Calculate the Margin of Error.
The margin of error is

$$ E = t_{0.025}\frac{s}{\sqrt{n}} = 2.01537\left(\frac{57.38425}{\sqrt{45}}\right) = 17.24014. $$

##### Step 4: Construct the Confidence Interval.
The confidence interval is

$$(\bar{x} - E, \bar{x} + E) = (258.84444 - 17.24014, 258.84444 + 17.24014) = (241.60431, 276.08458).$$

We are $95\%$ confident that the average weight of a newborn elephant is between $241.60431$ pounds and $276.08458$ pounds.
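
As a cross-check, R's built-in `t.test()` function reports a one-sample $t$ confidence interval, which should agree with the step-by-step result above up to rounding. The sketch redefines `x` so that it can be run by itself.

```r
x = c(333, 248, 303, 248, 153, 168, 280, 256, 195, 234, 366, 250, 325, 266, 164,
      253, 262, 343, 244, 425, 345, 343, 277, 215, 226, 254, 289, 296, 268, 195,
      268, 202, 249, 256, 284, 257, 205, 215, 251, 257, 144, 323, 238, 257, 218)

# conf.int holds the lower and upper endpoints of the interval
ci = t.test(x, conf.level = 0.95)$conf.int
ci
```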

***


### Example 5.4.2
A Menifee High School math teacher, Mr. DeLeon, wants to know the average GPA of students at the high school. He randomly asks $30$ students what their GPA is, and obtains the following data:

3.55, 3.51, 3.27, 4.30, 3.17, 3.61, 3.24, 3.74, 3.40, 3.91, 3.00, 1.88, 2.54, 3.15, 4.35, 2.62, 4.01, 3.69, 3.82, 3.18, 2.60, 3.49, 3.05, 2.91, 3.28, 2.97, 3.09, 3.49, 3.49, 3.05

Construct a $98\%$ confidence interval for the mean GPA.

#### Solution
We are not told the population standard deviation $\sigma$, so we will need to approximate it using the sample standard deviation $s$ and use a $t$-distribution to find the margin of error.

We are told that

\begin{align*}
n &= 30 \\
CL &= 0.98
\end{align*}

##### Step 1: Find the Sample Mean $\bar{x}$ and the Sample Standard Deviation $s$.


In [1]:
x = c(3.55, 3.51, 3.27, 4.30, 3.17, 3.61, 3.24, 3.74, 3.40, 3.91, 3.00, 1.88, 2.54, 3.15, 4.35, 2.62, 4.01, 3.69, 3.82, 3.18, 2.60, 3.49, 3.05, 2.91, 3.28, 2.97, 3.09, 3.49, 3.49, 3.05)
n = length(x)

xbar = sum(x)/n
xbar

So the sample mean is $\bar{x} = 3.312$.

In [2]:
s = sqrt(sum( (x - xbar)^2 )/(n-1))
s

The sample standard deviation is $s = 0.5264$.

##### Step 2: Find $t_{\alpha/2}$.
First, note that the number of degrees of freedom for our $t$-distribution is

$$ df = n-1 = 30-1 = 29. $$

Next, since the area inside the confidence interval is $CL = 0.98$, the area outside the confidence interval is

$$ \alpha = 1 - CL = 1 - 0.98 = 0.02. $$

So the area remaining in each tail of the $t$-distribution is $\alpha/2 = 0.02/2 = 0.01$. We want to find $t_{\alpha/2} = t_{0.01}$, the $t$-score with an area of $0.01$ to its right.

In [24]:
qt(p = 1 - 0.01, df = 29)

Then $t_{0.01} = 2.4620$.

##### Step 3: Calculate the Margin of Error.
The margin of error is

$$ E = t_{0.01}\frac{s}{\sqrt{n}} = 2.4620\left(\frac{0.5264}{\sqrt{30}}\right) = 0.2366. $$

##### Step 4: Construct the Confidence Interval.
The confidence interval is

$$(\bar{x} - E, \bar{x} + E) = (3.312 - 0.2366, 3.312 + 0.2366) = (3.0754, 3.5486).$$

We are $98\%$ confident that the average GPA of students at Menifee High School is between $3.0754$ and $3.5486$.
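
Steps 2 through 4 can also be carried out in R starting from the summary values alone. The sketch below plugs in the rounded values of $n$, $\bar{x}$, and $s$ found above, so its output may differ slightly in the last decimal place from a calculation on the raw data.

```r
# Summary values from this example (rounded as in the text)
n = 30
xbar = 3.312
s = 0.5264
CL = 0.98

# Step 2: t-score with area alpha/2 to its right and n - 1 degrees of freedom
alpha = 1 - CL
t_crit = qt(p = 1 - alpha/2, df = n - 1)

# Step 3: margin of error
E = t_crit * s / sqrt(n)

# Step 4: confidence interval
c(lower = xbar - E, upper = xbar + E)
```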
