# Hypothesis Testing: Two Population Means

## Objectives

- Conduct a hypothesis test comparing the means of two populations.

## Hypothesis Test with Two Population Means
In previous sections, we learned how to test the mean of a population in relation to a particular value. But sometimes, we want to compare the means of two different populations. For example, if we have one population with mean $\mu_1$ and another population with mean $\mu_2$, we might want to test if $\mu_1$ is smaller than $\mu_2$. The null and alternative hypotheses for this hypothesis test would be

\begin{align*}
H_0:\ \mu_1 \geq \mu_2, \\
H_a:\ \mu_1 < \mu_2.
\end{align*}

There is another way to write these hypotheses that will be more convenient for our purposes. If we subtract $\mu_2$ from both sides of the inequalities, our hypotheses become

\begin{align*}
H_0:\ \mu_1 - \mu_2 \geq 0, \\
H_a:\ \mu_1 - \mu_2 < 0.
\end{align*}

This second way to write these hypotheses is equivalent to the first way, but it allows us to focus on the difference in the population means rather than comparing them directly. This is useful because, to find the $p$-value for a hypothesis test like this, we are going to use the distribution of the random variable $D = \bar{X}_1 - \bar{X}_2$.

Let $n_1$ be the size of the sample drawn from population $1$, and let $n_2$ be the size of the sample drawn from population $2$. Recall from the central limit theorem that $\bar{X}_1$ and $\bar{X}_2$ are approximately normally distributed as long as $n_1, n_2 \geq 30$, and exactly normally distributed if the underlying populations are normally distributed. It turns out that, when the two samples are drawn independently, the random variable $D = \bar{X}_1 - \bar{X}_2$ is also normally distributed with mean $\mu_1 - \mu_2$ and standard error $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$, where $\sigma_1$ and $\sigma_2$ are the standard deviations of the respective populations. As with the central limit theorem, this result requires that either $n_1, n_2 \geq 30$ or the underlying populations are normally distributed.
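The claim about the distribution of $D$ can be checked informally with a small simulation. The sketch below is only an illustration: the population parameters, sample sizes, and number of repetitions are hypothetical values chosen for the demonstration, not values from this section.

```r
# Hypothetical populations, chosen only for illustration
set.seed(1)
mu1 <- 10; sigma1 <- 2; n1 <- 40
mu2 <- 12; sigma2 <- 3; n2 <- 50

# Repeatedly draw two independent samples and record the difference in sample means
d <- replicate(10000, mean(rnorm(n1, mu1, sigma1)) - mean(rnorm(n2, mu2, sigma2)))

# Standard error predicted by the formula above
theoretical_se <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)

# The simulated mean and standard deviation of D should be close to
# mu1 - mu2 and the theoretical standard error, respectively
c(mean_d = mean(d), sd_d = sd(d), theoretical_se = theoretical_se)
```

A histogram of `d` (for example, `hist(d)`) should also look roughly bell-shaped.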

Of course, we almost never know $\sigma_1$ and $\sigma_2$ in practice. Instead, we approximate the population standard deviations with the sample standard deviations from the two samples, $s_1$ and $s_2$. In this case, the mean of the sampling distribution is

$$ \mu_1 - \mu_2 $$

and the standard error is approximated by

$$ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}. $$

Because we are approximating the population standard deviations with sample standard deviations, we need to use a $t$-distribution to find the $p$-value instead of the standard normal distribution. The $t$-score is given by the formula

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, $$

and we take the degrees of freedom $df$ to be the smaller of the values $n_1 - 1$ and $n_2 - 1$, which is a conservative choice.
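The formulas above can be collected into a small helper. This is a minimal sketch: the function name `two_mean_t`, its argument names, and the summary statistics passed to it are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```r
# Hypothetical helper: t-score and degrees of freedom from summary statistics
two_mean_t <- function(xbar1, xbar2, s1, s2, n1, n2, null_diff = 0) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)          # approximate standard error
  t  <- ((xbar1 - xbar2) - null_diff) / se   # t-score under the null hypothesis
  df <- min(n1 - 1, n2 - 1)                  # smaller of n1 - 1 and n2 - 1
  list(se = se, t = t, df = df)
}

# Made-up summary statistics: the standard error is sqrt(36/36 + 64/64) = sqrt(2)
res <- two_mean_t(xbar1 = 50, xbar2 = 47, s1 = 6, s2 = 8, n1 = 36, n2 = 64)
res
```

The `null_diff` argument is the value of $\mu_1 - \mu_2$ assumed under the null hypothesis, which is $0$ in every example in this section.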


The fundamental steps for conducting a hypothesis test remain the same:

1. State the null and alternative hypotheses.
2. Assuming the null hypothesis is true, identify the sampling distribution.
3. Find the $p$-value.
4. Draw a conclusion.
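Steps 3 and 4 depend on the direction of the alternative hypothesis. As a rough sketch, the helper below maps a $t$-score to a $p$-value for a left-tailed, right-tailed, or two-tailed test and compares it with $\alpha$. The function name `p_value_t`, its arguments, and the numbers in the usage lines are hypothetical; `pt()` and its `lower.tail` argument are part of base R.

```r
# Hypothetical helper: p-value for a t-score, given the form of H_a
p_value_t <- function(t, df, alternative = c("less", "greater", "two.sided")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pt(t, df),                      # H_a uses "<": left tail
    greater   = pt(t, df, lower.tail = FALSE),  # H_a uses ">": right tail
    two.sided = 2 * pt(-abs(t), df)             # H_a uses "not equal": both tails
  )
}

# Made-up t-score and degrees of freedom, for illustration only
alpha <- 0.05
p <- p_value_t(t = -2.5, df = 20, alternative = "less")
if (p < alpha) "reject the null hypothesis" else "do not reject the null hypothesis"
```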

***


### Example 9.1.1
A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples $11$ graduates, with an average of four math classes and a standard deviation of $1.5$ math classes. College B samples nine graduates. Their average is $3.5$ math classes with a standard deviation of one math class. The community group believes that a student who graduates from college A has taken more math classes, on the average. Both populations have a normal distribution. Test at a $1\%$ significance level.

#### Solution
First, gather the given information:

```{list-table} Statistics from the samples taken from college A and college B.
:header-rows: 1
:stub-columns: 1
* -
  - College A
  - College B
* - Sample Size
  - $n_A = 11$
  - $n_B = 9$
* - Sample Mean
  - $\bar{x}_A = 4$
  - $\bar{x}_B = 3.5$
* - Sample Standard Deviation
  - $s_A = 1.5$
  - $s_B = 1$
```

##### Step 1: State the null and alternative hypotheses.
The community group believes that a student who graduates from college A has taken more math classes, on the average, than a student who graduates from college B. Mathematically, this means the community group believes $\mu_A > \mu_B$, where $\mu_A$ is the average number of math classes a graduate from college A takes, and $\mu_B$ is the average number of math classes a graduate from college B takes. But to perform a hypothesis test, we need to test the difference in the population means, so we rewrite $\mu_A > \mu_B$ as $\mu_A - \mu_B > 0$. Thus,

\begin{align*}
H_0:&\ \mu_A - \mu_B \leq 0 \\
H_a:&\ \mu_A - \mu_B > 0
\end{align*}

##### Step 2: Assuming the null hypothesis is true, identify the sampling distribution.
If the null hypothesis is true, then the mean of the sampling distribution is

$$ \mu_D = \mu_A - \mu_B = 0, $$

and the standard error is

$$ \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}} = \sqrt{\frac{1.5^2}{11} + \frac{1^2}{9}} = 0.5618. $$

Since $n_A - 1 > n_B - 1$, we will use a $t$-distribution with $df = n_B - 1 = 9 - 1 = 8$ degrees of freedom to find the $p$-value.

##### Step 3: Find the $p$-value.
The point estimate of $\mu_A - \mu_B$ is 

$$ \bar{x}_A - \bar{x}_B = 4 - 3.5 = 0.5. $$

The test statistic is given by the $t$-score

$$ t = \frac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}} = \frac{0.5 - 0}{0.5618} = 0.8900. $$
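The arithmetic for the standard error and the $t$-score can be reproduced in R from the summary statistics in the table. This is only a sketch; the variable names are chosen here for readability.

```r
# Summary statistics from the table above
n_A <- 11; xbar_A <- 4;   s_A <- 1.5
n_B <- 9;  xbar_B <- 3.5; s_B <- 1

se      <- sqrt(s_A^2 / n_A + s_B^2 / n_B)   # standard error
t_score <- ((xbar_A - xbar_B) - 0) / se      # difference is 0 under the null hypothesis
df      <- min(n_A - 1, n_B - 1)             # smaller of n_A - 1 and n_B - 1

round(c(se = se, t = t_score, df = df), 4)
```

At full precision the $t$-score comes out to about $0.8899$. The value $0.8900$ in the text results from dividing by the rounded standard error $0.5618$, and the difference is too small to affect the conclusion.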

Since the alternative hypothesis $H_a$ uses a "greater-than" symbol, we will perform a right-tailed test. The $p$-value is $P(t \geq 0.8900)$.

In [2]:
1 - pt(q = 0.8900, df = 8)

So the $p$-value is $P(t \geq 0.8900) = 0.1997$. In other words, if the null hypothesis *is* true, there is a $19.97\%$ chance that the mean number of math classes taken by the students randomly sampled from college A is at least $0.5$ classes more than the mean number of math classes taken by the students randomly sampled from college B.

##### Step 4: Make a conclusion about the null hypothesis.
We are testing the hypothesis at the $1\%$ significance level, meaning that $\alpha = 0.01$. Since

$$p\text{-value} = 0.1997 \geq 0.01 = \alpha,$$

we do not reject the null hypothesis.

The evidence is insufficient to conclude that the average number of math classes taken by graduates from college A is greater than the average number of math classes taken by graduates from college B.

***


### Example 9.1.2
A professor at a large community college wanted to determine whether there is a difference in the means of final exam scores between students who took his statistics course online and the students who took his face-to-face statistics class. The randomly selected final exam scores for the two courses are listed below. Test whether or not the mean exam scores are different at the $5\%$ significance level.

**Sample 1: Thirty final exam scores from the online class:**

67.6, 70.66, 94.1, 41.2, 38.22, 88.2, 85.3, 61.8, 64.7, 55.9, 88.2, 55.9, 82.4, 70.6, 88.2, 91.2, 58.8, 97.1, 73.5, 91.2, 85.3, 94.1, 73.5, 61.8, 64.7, 82.4, 79.4, 64.7, 35.5, 79.4

**Sample 2: Thirty final exam scores from the face-to-face class:**

77.9, 95.3, 81.2, 74.1, 98.8, 88.2, 84.9, 92.9, 87.1, 88.2, 69.4, 57.6, 69.4, 67.1, 97.6, 85.9, 88.2, 91.8, 78.8, 71.8, 98.8, 61.2, 92.9, 90.6, 97.6, 100, 95.3, 83.5, 92.9, 89.4

#### Solution
##### Step 1: State the null and alternative hypotheses.
We want to test whether or not the average final exam score for the online class is different from the average final exam score for the face-to-face class. This means we want to know if $\mu_1 = \mu_2$ or if $\mu_1 \neq \mu_2$, where $\mu_1$ is the average final exam score for the online class and $\mu_2$ is the average final exam score for the face-to-face class. But to perform a hypothesis test, we need to express these as the difference between population means. Rewriting $\mu_1 = \mu_2$ as $\mu_1 - \mu_2 = 0$ and $\mu_1 \neq \mu_2$ as $\mu_1 - \mu_2 \neq 0$ gives our hypotheses:

\begin{align*}
H_0:&\ \mu_1 - \mu_2 = 0 \\
H_a:&\ \mu_1 - \mu_2 \neq 0
\end{align*}

##### Step 2: Assuming the null hypothesis is true, identify the sampling distribution.

Assuming the null hypothesis is true, the mean of the distribution is

$$ \mu_1 - \mu_2 = 0. $$

The standard error is given by the formula

$$ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, $$

which means we need to first calculate $s_1$ and $s_2$.

**Finding $s_1$:**

To find $s_1$, the sample standard deviation for the online class, we first must find $\bar{x}_1$.

In [5]:
x1 = c(67.6, 70.66, 94.1, 41.2, 38.22, 88.2, 85.3, 61.8, 64.7, 55.9, 88.2, 55.9, 82.4, 70.6, 88.2, 91.2, 58.8, 97.1, 73.5, 91.2, 85.3, 94.1, 73.5, 61.8, 64.7, 82.4, 79.4, 64.7, 35.5, 79.4)
n1 = length(x1)

xbar1 = sum(x1)/n1
xbar1

So the mean for the sample from the online class is $\bar{x}_1 = 72.8527$. Next, we calculate the sample standard deviation.

In [6]:
s1 = sqrt(sum( (x1 - xbar1)^2 )/(n1 - 1))
s1

The sample standard deviation for the online class is $s_1 = 16.9171$.

**Finding $s_2$:**

To find $s_2$, the sample standard deviation for the face-to-face class, we first must find $\bar{x}_2$.

In [7]:
x2 = c(77.9, 95.3, 81.2, 74.1, 98.8, 88.2, 84.9, 92.9, 87.1, 88.2, 69.4, 57.6, 69.4, 67.1, 97.6, 85.9, 88.2, 91.8, 78.8, 71.8, 98.8, 61.2, 92.9, 90.6, 97.6, 100, 95.3, 83.5, 92.9, 89.4)
n2 = length(x2)

xbar2 = sum(x2)/n2
xbar2

So the mean for the sample from the face-to-face class is $\bar{x}_2 = 84.9467$. Next, calculate the sample standard deviation.

In [8]:
s2 = sqrt(sum( (x2 - xbar2)^2 )/(n2 - 1))
s2

The sample standard deviation for the face-to-face class is $s_2 = 11.7129$.
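As a quick check on the hand-built formulas, base R's `mean()` and `sd()` functions should return the same values, since `sd()` also divides by $n - 1$. The sketch below repeats the data so that it can be run on its own.

```r
# Final exam scores from the two samples
x1 <- c(67.6, 70.66, 94.1, 41.2, 38.22, 88.2, 85.3, 61.8, 64.7, 55.9,
        88.2, 55.9, 82.4, 70.6, 88.2, 91.2, 58.8, 97.1, 73.5, 91.2,
        85.3, 94.1, 73.5, 61.8, 64.7, 82.4, 79.4, 64.7, 35.5, 79.4)
x2 <- c(77.9, 95.3, 81.2, 74.1, 98.8, 88.2, 84.9, 92.9, 87.1, 88.2,
        69.4, 57.6, 69.4, 67.1, 97.6, 85.9, 88.2, 91.8, 78.8, 71.8,
        98.8, 61.2, 92.9, 90.6, 97.6, 100, 95.3, 83.5, 92.9, 89.4)

# Built-in sample mean and sample standard deviation for each class
round(c(xbar1 = mean(x1), s1 = sd(x1)), 4)
round(c(xbar2 = mean(x2), s2 = sd(x2)), 4)
```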

Now we can find the standard error:

$$ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{16.9171^2}{30} + \frac{11.7129^2}{30}} = 3.7567. $$

To find the $p$-value, we will use a $t$-distribution with $df = n_1 - 1 = 30 - 1 = 29$ degrees of freedom. (Since $n_1 - 1 = n_2 - 1$, it doesn't matter which we use to calculate the degrees of freedom.)

##### Step 3: Find the $p$-value.
The point estimate of $\mu_1 - \mu_2$ is

$$ \bar{x}_1 - \bar{x}_2 = 72.8527 - 84.9467 = -12.094. $$

The test statistic is the $t$-score

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{-12.094 - 0}{3.7567} = -3.2193. $$
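The standard error and $t$-score can also be computed in R rather than by hand. The sketch below starts from the rounded summary statistics reported in this example, so its results should agree with the text to about four decimal places. The variable names are chosen here for illustration.

```r
# Rounded summary statistics reported in this example
xbar1 <- 72.8527; s1 <- 16.9171; n1 <- 30   # online class
xbar2 <- 84.9467; s2 <- 11.7129; n2 <- 30   # face-to-face class

se      <- sqrt(s1^2 / n1 + s2^2 / n2)      # standard error
t_score <- ((xbar1 - xbar2) - 0) / se       # difference is 0 under the null hypothesis

round(c(se = se, t = t_score), 4)
```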

Since the alternative hypothesis $H_a$ uses a "not-equal-to" symbol, we will perform a two-tailed test. That means that *half* the $p$-value is $P(t \leq -3.2193)$. Let's calculate using R.

In [9]:
pt(q = -3.2193, df = 29)

So $P(t \leq -3.2193) = 0.0016$, meaning the $p$-value is

$$ p\text{-value} = 2(0.0016) = 0.0032. $$

##### Step 4: Make a conclusion about the null hypothesis.
The level of significance for this test is $5\%$, so $\alpha = 0.05$. Since

$$ p\text{-value} = 0.0032 < 0.05 = \alpha, $$

we reject the null hypothesis. If the null hypothesis were true, the chance of getting a point estimate at least as extreme as the one we observed would be so small that we think it is more likely that the null hypothesis is not true.

We conclude that the average final exam score for the online statistics class does not match the average final exam score for the face-to-face statistics class.
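As a cross-check, base R's `t.test()` function runs a two-sample test directly from the raw data. Note one difference in method: with `var.equal = FALSE` it uses the Welch approximation for the degrees of freedom rather than the smaller of $n_1 - 1$ and $n_2 - 1$. The $t$-score should match the one computed above, but the degrees of freedom and $p$-value will differ somewhat from the values in this example.

```r
# Final exam scores from the two samples
x1 <- c(67.6, 70.66, 94.1, 41.2, 38.22, 88.2, 85.3, 61.8, 64.7, 55.9,
        88.2, 55.9, 82.4, 70.6, 88.2, 91.2, 58.8, 97.1, 73.5, 91.2,
        85.3, 94.1, 73.5, 61.8, 64.7, 82.4, 79.4, 64.7, 35.5, 79.4)
x2 <- c(77.9, 95.3, 81.2, 74.1, 98.8, 88.2, 84.9, 92.9, 87.1, 88.2,
        69.4, 57.6, 69.4, 67.1, 97.6, 85.9, 88.2, 91.8, 78.8, 71.8,
        98.8, 61.2, 92.9, 90.6, 97.6, 100, 95.3, 83.5, 92.9, 89.4)

# Two-tailed test of H_0: mu_1 - mu_2 = 0 without assuming equal variances
welch <- t.test(x1, x2, alternative = "two.sided", mu = 0, var.equal = FALSE)

welch$statistic   # t-score, computed with the same formula as in the text
welch$parameter   # Welch degrees of freedom, not min(n1 - 1, n2 - 1)
welch$p.value     # two-tailed p-value based on the Welch degrees of freedom
```

Because the Welch degrees of freedom are larger than $29$ here, the approach used in this example is the more conservative of the two. Both lead to rejecting the null hypothesis at the $5\%$ significance level.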
