# Goodness-of-Fit Test

## Objectives
- Use a goodness-of-fit hypothesis test to test if observed values fit an expected discrete distribution.

## Goodness-of-Fit Test
In this type of hypothesis test, we determine whether the data *"fit"* a particular discrete distribution or not. For example, we may suspect the unknown data fit a binomial distribution. We use a goodness-of-fit test to determine if there is a fit or not.

The test statistic for a goodness-of-fit test is:

$$ \chi^2 = \sum \frac{(O - E)^2}{E}, $$

where

- $O$ = observed values (sample data)
- $E$ = expected values (from theory)

The observed values are the data values from the sample. The expected values are the values we would expect to get if the null hypothesis were true. When the null hypothesis is true and the expected values are large enough, the sampling distribution of the test statistic is approximately a $\chi^2$-distribution. The number of degrees of freedom of the distribution is $df = k - 1$, where $k$ is the number of different categories in the hypothesized discrete distribution.
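
As a minimal sketch of the formula above, the statistic and its degrees of freedom can be computed in a few lines of R. The function name `gof_statistic` and the counts passed to it are illustrative assumptions and do not come from any example in this section.

```r
# Hypothetical helper: chi-square goodness-of-fit statistic and degrees of freedom.
gof_statistic = function(O, E) {
  stopifnot(length(O) == length(E), all(E > 0))
  list(
    chisq = sum((O - E)^2 / E),  # sum over all k categories
    df    = length(O) - 1        # df = k - 1
  )
}

# Made-up counts for three categories (illustration only).
res = gof_statistic(O = c(18, 22, 20), E = c(20, 20, 20))
res$chisq  # (4 + 4 + 0) / 20 = 0.4
res$df     # 3 - 1 = 2
```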

Generally, the hypotheses of a goodness-of-fit test are

\begin{align*}
H_0:&\ \text{The actual population fits the expected distribution.} \\
H_a:&\ \text{The actual population does not fit the expected distribution.}
\end{align*}

These hypotheses may be written as sentences and should be expressed in the context of the particular problem.

A goodness-of-fit test is almost always right-tailed. The further apart observed values and expected values are from each other, the further out in the right tail the test statistic will be. (A left-tailed goodness-of-fit test would test if the observed values fit the expected values too well, which is not generally something we are concerned with.)

For a goodness-of-fit test to be valid, the expected value for each category needs to be at least five.
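
This condition is easy to check before running the test. The sketch below assumes the expected values are stored in a numeric vector; the helper name `expected_counts_ok` and the counts are made up for illustration.

```r
# Hypothetical helper: TRUE when every expected value is at least min_count.
expected_counts_ok = function(E, min_count = 5) {
  all(E >= min_count)
}

expected_counts_ok(c(20, 20, 20))  # TRUE: every category expects 20
expected_counts_ok(c(45, 12, 3))   # FALSE: the last category expects only 3
```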

As with any hypothesis test, the fundamental steps for performing a goodness-of-fit hypothesis test are:

1. State the null and alternative hypotheses.
2. Assuming the null hypothesis is true, identify the sampling distribution.
3. Find the $p$-value.
4. Draw a conclusion.
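
The four steps above can be sketched in R as follows. The data, the hypothesized proportions `p0`, and the significance level are made up for illustration; only `pchisq` is a built-in R function.

```r
# Made-up data (illustration only): 60 observations in three categories,
# tested against hypothesized proportions 1/2, 1/4, 1/4.
O = c(36, 14, 10)
p0 = c(0.5, 0.25, 0.25)
alpha = 0.05

# Steps 1-2: H0 says the population follows p0. Under H0 the statistic
# follows a chi-square distribution with k - 1 degrees of freedom.
E = sum(O) * p0        # expected values: 30, 15, 15
df = length(O) - 1     # 3 - 1 = 2

# Step 3: test statistic and right-tailed p-value.
chisq = sum((O - E)^2 / E)
p_value = pchisq(chisq, df = df, lower.tail = FALSE)

# Step 4: reject H0 only when the p-value is below alpha.
reject = p_value < alpha
reject
```

With these made-up counts the statistic is $44/15 \approx 2.93$, the $p$-value is well above $0.05$, and the null hypothesis is not rejected.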

***


### Example 7.3.1
Absenteeism of college students from math classes is a major concern to math instructors because missing class appears to increase the drop rate. Suppose that a study was done to determine if the actual student absenteeism rate follows faculty perception. The faculty expected that any randomly chosen group of $100$ students would miss class according to {numref}`goodness-of-fit-1-expected`.

```{list-table} Expected student absenteeism frequency in math classes.
:header-rows: 1
:name: goodness-of-fit-1-expected
* - Number of Absences per Term
  - Expected Number of Students
* - $0$-$2$
  - $50$
* - $3$-$5$
  - $30$
* - $6$-$8$
  - $12$
* - $9$+
  - $8$
```

A random survey of $100$ students across all mathematics courses was then done to determine the actual number of absences in a course. {numref}`goodness-of-fit-1-observed` displays the results of the survey.

```{list-table} Observed student absenteeism frequency in math classes.
:header-rows: 1
:name: goodness-of-fit-1-observed
* - Number of Absences per Term
  - Observed Number of Students
* - $0$-$2$
  - $35$
* - $3$-$5$
  - $40$
* - $6$-$8$
  - $20$
* - $9$+
  - $5$
```

Perform a goodness-of-fit test at a $1\%$ level of significance to determine whether or not student absenteeism fits faculty perception.

#### Solution
##### Step 1: State the null and alternative hypotheses.
The null and alternative hypotheses are

\begin{align*}
H_0:&\ \text{Student absenteeism fits faculty perception} \\
H_a:&\ \text{Student absenteeism does not fit faculty perception}
\end{align*}

##### Step 2: Assuming the null hypothesis is true, identify the sampling distribution.
This is a goodness-of-fit test. We are testing to see if student absenteeism is distributed in the same way that faculty expect it to be distributed; that is, we are testing whether or not faculty perception of student absenteeism is a *good fit* for actual student absenteeism. Since this is a goodness-of-fit test, the sampling distribution is a $\chi^2$-distribution. The categories of the expected distribution are $0$-$2$, $3$-$5$, $6$-$8$, and $9$+, for a total of $k = 4$ categories. Therefore, the sampling distribution has $df = k - 1 = 4 - 1 = 3$ degrees of freedom.

##### Step 3: Find the $p$-value.
To find the $p$-value, we first must calculate the test statistic

$$ \chi^2 = \sum \frac{(O - E)^2}{E}. $$

The observed values $O$ are the actual number of students observed in each category. The expected values $E$ are the numbers of students the faculty expect to be in each category. These values are found in the two tables above. We will use R to calculate the test statistic.

In [1]:
O = c(35, 40, 20, 5)
E = c(50, 30, 12, 8)

chisq = sum( (O - E)^2 / E )
chisq

The test statistic is $\chi^2 = 14.2917$. Since goodness-of-fit tests are almost always right-tailed tests, we will perform a right-tailed test to find the $p$-value. That means the $p$-value is $P(\chi^2 \geq 14.2917)$.

In [1]:
1 - pchisq(q = 14.2917, df = 3)

So $P(\chi^2 \geq 14.2917) = 0.0025$. That is, if student absenteeism actually does fit faculty perception, then there is only a $0.25\%$ chance that a random sample of $100$ students would deviate from the expected distribution at least as far as our sample did.
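
As an alternative sketch, the same right-tail probability can be requested directly from `pchisq` through its `lower.tail` argument, which avoids the subtraction from $1$. The variable name `p_value` is an illustrative choice.

```r
# Right-tail probability P(chi^2 >= 14.2917) with 3 degrees of freedom.
p_value = pchisq(q = 14.2917, df = 3, lower.tail = FALSE)
round(p_value, 4)
```

Both forms give the same answer here; `lower.tail = FALSE` tends to be more accurate when the $p$-value is extremely small.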

##### Step 4: Draw a conclusion.
We are conducting this hypothesis test at the $1\%$ level of significance, so $\alpha = 0.01$. Since

$$ p\text{-value} = 0.0025 < 0.01 = \alpha, $$

we reject the null hypothesis.

We conclude that the actual distribution of student absenteeism does not fit faculty perception.
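
The hand calculation can be cross-checked with R's built-in `chisq.test` function. This is a sketch rather than part of the original solution; note that `chisq.test` takes the hypothesized proportions through its `p` argument, so the expected values are divided by their total first.

```r
O = c(35, 40, 20, 5)   # observed counts from the survey
E = c(50, 30, 12, 8)   # expected counts from faculty perception

# chisq.test() expects hypothesized proportions, not expected counts.
result = chisq.test(x = O, p = E / sum(E))

result$statistic  # chi-square test statistic
result$parameter  # degrees of freedom
result$p.value    # right-tailed p-value
```

The reported statistic, degrees of freedom, and $p$-value should agree with the values $14.2917$, $3$, and $0.0025$ found in Step 3.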

***


### Example 7.3.2
Suppose you roll a six-sided die $80$ times, with the following results:

```{list-table} Observed outcomes from rolling a six-sided die $80$ times.
:header-rows: 1
:name: goodness-of-fit-2-observed
* - Face Value
  - Number of Rolls
* - 1
  - 16
* - 2
  - 21
* - 3
  - 14
* - 4
  - 9
* - 5
  - 7
* - 6
  - 13
```

Use a goodness-of-fit test with a $5\%$ level of significance to determine whether or not the die is fair.

#### Solution
##### Step 1: State the null and alternative hypotheses.
The null hypothesis is that the true distribution of rolls of your die matches the distribution of rolls of a fair die. This could be said more succinctly as:

\begin{align*}
H_0:&\ \text{The die is fair.} \\
H_a:&\ \text{The die is not fair.}
\end{align*}

##### Step 2: Assuming the null hypothesis is true, identify the sampling distribution.
Since we want to see if the distribution of rolls of a fair die is a *good fit* for the distribution of rolls of your die, we will use a $\chi^2$-distribution to test the hypothesis. Since there are $k = 6$ categories or outcomes, the sampling distribution has $df = k - 1 = 6 - 1 = 5$ degrees of freedom.

##### Step 3: Find the $p$-value.
To find the $p$-value, we first must calculate the test statistic

$$ \chi^2 = \sum \frac{(O - E)^2}{E}. $$

Note that the table above gives the observed values. If we roll a fair die $80$ times, we *expect* to roll each number $\frac{1}{6} \cdot 80 \approx 13.3333$ times:

```{list-table} Expected outcomes from rolling a six-sided die $80$ times.
:header-rows: 1
:name: goodness-of-fit-2-expected
* - Face Value
  - Number of Rolls
* - 1
  - 13.3333
* - 2
  - 13.3333
* - 3
  - 13.3333
* - 4
  - 13.3333
* - 5
  - 13.3333
* - 6
  - 13.3333
```
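
The expected values in the table can also be computed from the hypothesized probabilities instead of being typed in by hand. In this sketch the variable names `n` and `p0` are illustrative choices.

```r
n = 80               # total number of rolls
p0 = rep(1 / 6, 6)   # a fair die gives each face probability 1/6
E = n * p0           # expected number of rolls for each face
round(E, 4)
```

Computing `E` this way keeps the full precision of $80/6$ rather than the rounded value $13.3333$.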

Let's use R to calculate the test statistic $\chi^2$:

In [1]:
O = c(16, 21, 14, 9, 7, 13)
E = rep(80 / 6, 6)

chisq = sum( (O - E)^2/E )
chisq

The test statistic is $\chi^2 = 9.400$.

Since a goodness-of-fit test is almost always right-tailed, the $p$-value we want to find is equal to $P(\chi^2 \geq 9.400)$.

In [2]:
1 - pchisq(q = 9.400, df = 5)

So the $p$-value is $P(\chi^2 \geq 9.400) = 0.0941$. That is, assuming the die is fair, there is a $9.41\%$ chance that $80$ rolls would deviate from the expected distribution at least as far as the rolls we observed.

##### Step 4: Draw a conclusion.
Since the level of significance for this hypothesis test is $5\%$, we have $\alpha = 0.05$. Because

$$ p\text{-value} = 0.0941 \geq 0.05 = \alpha, $$

we do not reject the null hypothesis.

There is not enough evidence to conclude that your die is not fair.
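
As a cross-check, and not part of the original solution, the same test can be run with R's built-in `chisq.test` function. When no `p` argument is supplied, `chisq.test` tests the observed counts against equal proportions, which is exactly the fair-die hypothesis.

```r
O = c(16, 21, 14, 9, 7, 13)   # observed number of rolls for faces 1 to 6

# With no p argument, chisq.test() assumes all categories are equally likely.
result = chisq.test(O)

result$statistic        # chi-square test statistic
result$p.value          # right-tailed p-value
result$p.value < 0.05   # FALSE here, so the null hypothesis is not rejected
```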