<a href="https://colab.research.google.com/github/gibsonea/Biostats/blob/main/Labs/12_Hypothesis_Single_Population.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <a name="23intro">4.1: Parametric Hypothesis Tests</a>

---

Hypothesis tests and confidence intervals are two examples of
statistical inference. There is an unknown population parameter, and we
would like to estimate or assess claims about the parameter. We collect
sample data, and:

-   With **confidence intervals (CI)**, we **estimate** the value of a population parameter, building in a **margin of error** that is derived from a sampling distribution.
-   With **hypothesis tests**, we use a null distribution to measure whether a sample test statistic sufficiently contradicts the null claim. Then we **assess the competing claims** in $H_0$ and $H_1$.

We can apply [Monte Carlo resampling (bootstrapping) methods](https://githubtocolab.com/CU-Denver-MathStats-OER/Statistical-Theory/blob/main/Chap5/17-Bootstrap-Confidence-Int.ipynb) and/or
[parametric methods](https://githubtocolab.com/CU-Denver-MathStats-OER/Statistical-Theory/blob/main/Chap5/19-Parametric-CI-Means.ipynb) (using the Central
Limit Theorem) to model a sampling distribution that is the foundation
for constructing a confidence interval. Similarly, for hypothesis tests
we have both resampling and parametric methods for measuring the
significance of test statistics.

-   We explored [permutation distributions](https://githubtocolab.com/CU-Denver-MathStats-OER/Statistical-Theory/blob/main/Chap6/22-Permutation-Tests.ipynb) as one resampling method for calculating p-values.
-   We can use the Central Limit Theorem (CLT) to model null distributions and calculate p-values.
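To make the second bullet concrete, here is a minimal simulation sketch of what the CLT predicts for sample means. The exponential population (mean 1, standard deviation 1), the sample size of 40, and the 5,000 repetitions are all assumptions chosen for illustration; none of them come from the data used later in this lab.

```r
# Illustrative simulation with an assumed skewed population (exponential, rate 1).
set.seed(1)                     # arbitrary seed so the run is reproducible
n <- 40                         # assumed sample size
reps <- 5000                    # assumed number of simulated samples

# Draw many samples and store the mean of each one
sample.means <- replicate(reps, mean(rexp(n, rate = 1)))

clt.mean <- 1                   # population mean of an exponential with rate 1
clt.se <- 1 / sqrt(n)           # sigma / sqrt(n), with sigma = 1

# Compare the simulated center and spread to the CLT values
c(simulated.mean = mean(sample.means), clt.mean = clt.mean)
c(simulated.se = sd(sample.means), clt.se = clt.se)
```

Even though the population is skewed, the simulated sample means should be centered near the population mean with a spread close to $\frac{\sigma}{\sqrt{n}}$.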

# <a name="23what-sig">What is Significant Enough?</a>

---

The general process for performing a hypothesis test is informally:

1.  State the null and alternative hypotheses in terms of population parameter(s).
2.  Collect data from a sample and calculate a test statistic.
3.  Calculate the p-value to measure the significance of the test statistic.
4.  Make a conclusion (if possible).
5.  Clearly summarize the results for a general audience.

We have discussed [Steps 1 and 2](https://githubtocolab.com/CU-Denver-MathStats-OER/Statistical-Theory/blob/main/Chap6/21-Intro-Hypothesis-Tests.ipynb) and
used [resampling methods](https://githubtocolab.com/CU-Denver-MathStats-OER/Statistical-Theory/blob/main/Chap6/22-Permutation-Tests.ipynb) as one method to
calculate p-values in Step 3. Refer to [Appendix A](#23appenda)
for a summary of the steps outlined above. Before investigating
parametric methods for computing p-values, let's discuss Steps 4 and 5:

> How do we use p-values to decide whether the evidence is significant
> enough to reject $H_0$ and accept the claim we hoped to prove in
> $H_1$?

-   The **smaller the p-value**, the **stronger the evidence** contradicting $H_0$.
-   How small does the p-value need to be in order to claim the evidence is strong enough to reject $H_0$?

## <a name="23sig-level">The Significance Level</a>

---

The <font color="dodgerblue">**significance level**</font> of a
test, denoted ${\color{dodgerblue}{\alpha}}$, is the value **we choose**
that is used to **determine whether the p-value is small enough** to
claim the result is statistically significant and reject $H_0$.

- <font color="mediumseagreen">If p-value $\leq \alpha$, we reject $H_0$.</font>
- <font color="tomato">If p-value $> \alpha$, we do not reject $H_0$.</font>
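The decision rule above can be sketched in a few lines of R. The p-value of 0.03 is a hypothetical number chosen for illustration, and the vector of significance levels is simply the set of common choices; neither comes from a real sample.

```r
# Hypothetical p-value chosen for illustration only.
p.value <- 0.03

# Candidate significance levels, each assumed to be chosen before the analysis
alpha <- c(0.10, 0.05, 0.01)

# Apply the rule: reject H0 when p-value <= alpha
decision <- ifelse(p.value <= alpha, "reject H0", "do not reject H0")

data.frame(alpha, decision)
```

The same p-value can lead to different decisions depending on the significance level, which is why $\alpha$ should be fixed before looking at the data.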

Generally speaking, $H_0$ is a claim we currently accept as true. $H_1$
is some new and interesting result that, if true, would contradict the
currently accepted belief in $H_0$. We typically require compelling
evidence, beyond a “reasonable doubt”, to reject the currently accepted
claim in $H_0$ in favor of a new and competing claim in $H_1$.

-  The default significance level is typically 5% (or $\alpha = 0.05$).
-  Some other (less) common significance levels are $\alpha = 0.1$, $0.01$ or $0.001$.
-  The smaller we set $\alpha$, the more certainty we require to reject $H_0$.

<br>

<font color="mediumseagreen">*Note: The significance level is not something we compute. We choose the significance level for the test, and the significance level should be determined prior to our analysis. Do not first calculate the p-value, and then retroactively choose the significance level to ensure the result is significant.*</font>

## <a name="23sum-results">Summarizing the Results</a>

---

There are two possible results with hypothesis tests:

-   If  <font color="mediumseagreen">p-value $\leq \alpha$</font>, the test is  <font color="mediumseagreen">statistically significant</font>.

    -   There is strong enough evidence to reject $H_0$.
    -   And thus, we accept the competing claim in $H_1$.

-   If <font color="tomato">p-value $> \alpha$</font>, the test is <font color="tomato">not statistically significant</font> and there is not sufficient evidence to reject $H_0$:

    -   We fail to reject $H_0$ (which is different from accepting $H_0$).
    -   The test is inconclusive regarding the claims in $H_0$ and $H_1$.

In the end, we want to be sure we communicate the results **clearly**,
in **proper context**, to a more **general audience** that may not have
an advanced background in statistics and mathematics.

## <a name="23q1">Question 1</a>

---


<figure>
<img
src="https://upload.wikimedia.org/wikipedia/commons/4/4f/Hurricane-en.svg"
alt="Tropical Cyclone Structure" width = "60%"/>
<figcaption aria-hidden="true">
Credit: Kelvinsong, <a
href="https://creativecommons.org/licenses/by/3.0">CC BY 3.0</a>, via
Wikimedia Commons
</figcaption>
</figure>


Pressure is a common measurement used to characterize the strength of a
storm. The lower the storm pressure, the higher the wind speeds, and the
more dangerous the storm. Let $\mu$ denote the mean pressure (in
millibars) of all storms in the North Atlantic. Suppose we set up the
following hypotheses to test claims about the value of $\mu$.

-   $H_0$: $\mu = 950$. The mean pressure of all storms in the North Atlantic is 950 millibars.
-   $H_1$: $\mu \ne 950$. The mean pressure of all storms in the North Atlantic is not 950 millibars.

We collect a random sample of $n$ storm pressure observations. The sample mean is larger than 950 millibars, and the resulting p-value is $0.012$.

### <a name="23q1a1a">Question 1a</a>

---

Summarize the results if we perform the hypothesis test using a 5%
significance level. Be sure to explain in the context of the example
using terminology a more general audience would understand.

#### <a name="23sol1a">Solution to Question 1a</a>

---

<br>  
<br> <br>

### <a name="23q1b">Question 1b</a>

---

Summarize the results if we perform the hypothesis test using a 10%
significance level. Be sure to explain in the context of the example
using terminology a more general audience would understand.

#### <a name="23sol1b">Solution to Question 1b</a>

---

<br>  
<br> <br>

### <a name="23q1c">Question 1c</a>

---

Summarize the results if we perform the hypothesis test using a 1%
significance level. Be sure to explain in the context of the example
using terminology a more general audience would understand.

#### <a name="23sol1c">Solution to Question 1c</a>

---

<br>  
<br> <br>

### <a name="23q1d">Question 1d</a>

---

Suppose instead we want to show the mean storm pressure is **greater than** 950 millibars.

-   $H_0$: $\mu = 950$. The mean pressure of all storms in the North Atlantic is 950 millibars.
-   $H_1$: $\mu > 950$. The mean pressure of all storms in the North Atlantic is greater than 950 millibars.

Assume we still have the same sample of size $n$ with the same sample mean (which is greater than 950 millibars) as in [Question 1](#23q1). Recall this sample has a p-value equal to $0.012$ for a two-tailed test. Using the same sample, we now test the one-tailed hypotheses instead.

-   What would be the p-value for this same sample if we use the hypotheses for a one-tailed test stated above?
-   Summarize the result of the one-tailed test in practical terms if we use a significance level of 5%.
-   Summarize the result of the one-tailed test in practical terms if we use a significance level of 1%.

#### <a name="23sol1d">Solution to Question 1d</a>

---

<br>  
<br> <br>

# <a name="23test-known">Test for a Single Mean: Known $\sigma^2$</a>

---

## <a name="23q3">Question 3</a>

---

The mean height of all adult males in the United Kingdom is claimed<sup>2</sup> to be $68.5$ inches ($5$ foot $8.5$ inches or $173.9$ cm) with a standard deviation of $2.5$ inches (or $6.35$ cm). A physician suspects males in her town seem to be taller than average when compared to the population of all adult males in the UK. She collects data from a random sample of $n=25$ adult males from the town and calculates the mean
height of the sample is $69.25$ inches.

<br>

<font size=2>2. [“Height, Weight, and Body Mass of the British Population Since 1820”](https://www.nber.org/system/files/working_papers/h0108/h0108.pdf) by Roderick Floud, National Bureau of Economic Research, October 1998.</font>

### <a name="23q3a">Question 3a</a>

---

Set up the null and alternative hypotheses (both in words and using
appropriate notation) to test the physician's claim that adult males in
the town are taller than the national average height for all adult males
in the UK.

#### <a name="23sol3a">Solution to Question 3a</a>

---

-   $H_0$: ??

-   $H_1$: ??

<br>  
<br>

### <a name="23q3b">Question 3b</a>

---

Compute the test statistic.

#### <a name="23sol3b">Solution to Question 3b</a>

---

<br>  
<br>  
<br>

### <a name="23q3c3c">Question 3c</a>

---

What is a reasonable null distribution to use to perform this test?
Standardize the test statistic (give the $z$-score) from [Question
3b](#23q3b). Interpret the meaning of the standardized test statistic.

#### <a name="23sol3c">Solution to Question 3c</a>

---

<br>  
<br>  
<br>

### <a name="23q3d">Question 3d</a>

---

Compute the p-value and interpret the meaning in practical terms.

#### <a name="23sol3d">Solution to Question 3d</a>

---

<br>  
<br>  
<br>

### <a name="23q3e">Question 3e</a>

---

Shade area(s) under the graph of a null distribution corresponding to
the p-value. Either make an informal sketch on paper or see [Appendix
B](#23appendb-known) to plot in R.

#### <a name="23sol3e">Solution to Question 3e</a>

---

<br>  
<br>  
<br>

### <a name="23q3f">Question 3f</a>

---

-   If a 5% significance level is chosen, summarize the result in practical terms.
-   If a 10% significance level is chosen, summarize the result in practical terms.

#### <a name="23sol3f">Solution to Question 3f</a>

---

<br>  
<br>  
<br>

## <a name="23p-known">p-value for a Single Mean: Known $\sigma^2$</a>

---

Suppose a random sample of size $n$ is picked from a population with known
population variance $\sigma^2$ but unknown mean $\mu$. If we are doing a
hypothesis test on a single mean with null claim
${\color{tomato}{H_0: \mu = \mu_0}}$, then as long as the population is
symmetric or the sample size is large enough $(n \geq 30)$, we can
use the Central Limit Theorem for means to:

-   Model the **null distribution** with the sampling distribution ${\color{dodgerblue}{\overline{X} \sim N \left( {\color{tomato}{\mu_0}}, \frac{\sigma}{\sqrt{n}} \right)}}$.
-   Calculate the **standardized test statistic** which is the $z$-score of the sample mean:

$${\color{dodgerblue}{z = \frac{\mbox{sample stat}- {\color{tomato}{\mbox{null claim}}}}{\mbox{SE}(\overline{X})} = \dfrac{\bar{x} - {\color{tomato}{\mu_0}}}{\frac{\sigma}{\sqrt{n}}}}}.$$
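As a minimal sketch of how the standardized test statistic leads to a p-value, the code below applies the formula above and then uses `pnorm()` to find tail areas under the standard normal distribution. The values of `mu0`, `sigma`, `n`, and `xbar` are hypothetical numbers chosen for illustration, not values from any question in this lab.

```r
# All numbers below are hypothetical and chosen for illustration.
mu0 <- 100      # value claimed in H0
sigma <- 15     # known population standard deviation
n <- 36         # sample size
xbar <- 104     # observed sample mean

se <- sigma / sqrt(n)      # standard error of the sample mean
z <- (xbar - mu0) / se     # standardized test statistic (z-score)

# p-values from the standard normal null distribution
p.right <- pnorm(z, lower.tail = FALSE)   # if H1 is mu > mu0
p.left <- pnorm(z)                        # if H1 is mu < mu0
p.two <- 2 * pnorm(-abs(z))               # if H1 is mu != mu0

c(z = z, p.right = p.right, p.left = p.left, p.two = p.two)
```

Only one of the three p-values applies to a given test; which one depends on the inequality stated in the alternative hypothesis.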

# <a name="23test-unknown">Test for a Single Mean: Unknown $\sigma^2$</a>

---

As with confidence intervals, when estimating an unknown population
mean, we often do not know the population variance. Nevertheless, we can
still conduct a hypothesis test on a single mean, but there will be some
additional uncertainty due to our need to estimate $\sigma^2$. Below we work through an example using the `storms` data frame
in the `dplyr` package to devise a method for computing p-values under
these circumstances.

## <a name="23sample-press">Picking a Random Sample of Storm Pressures</a>

---

The `storms` data set is from the [NOAA Hurricane Best Track
Data](https://www.nhc.noaa.gov/data/#hurdat). We will perform a
hypothesis test to test claims about the mean storm pressure, so we will
need to analyze the variable `pressure`.

-   Run the code cell below to load the `dplyr` package.

In [None]:
library(dplyr)  # load dplyr package

-   Run the code cell below to pick a random sample of $n=32$ storm pressures from `storms`.

In [None]:
my.sample <- sample(storms$pressure, size=32, replace=FALSE)
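Because `sample()` draws at random, each run of the cell above produces a different sample, so results in the questions that follow will vary from run to run. If reproducible results are wanted, one option is to call `set.seed()` immediately before sampling. The sketch below illustrates the idea with a hypothetical vector `population` standing in for the pressure data; the seed value 2023 is arbitrary.

```r
# Hypothetical population of pressure values, used only to illustrate set.seed().
population <- seq(900, 1020, by = 1)

set.seed(2023)   # the seed value itself is arbitrary
sample.one <- sample(population, size = 32, replace = FALSE)

set.seed(2023)   # resetting the same seed reproduces the same draw
sample.two <- sample(population, size = 32, replace = FALSE)

# TRUE when both draws contain the same values in the same order
identical(sample.one, sample.two)
```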

## <a name="23q4">Question 4</a>

---

It is claimed<sup>3</sup> that the average pressure of all North Atlantic storms is 950 millibars. You believe this claim is inaccurate and would like to show the average pressure of all storms is not 950 millibars.

<br>

<font size=2>3. [The University of Arizona Hydrology and Atmospheric Sciences](http://www.atmo.arizona.edu/students/courselinks/fall10/atmo336/lectures/sec2/hurricanes.html),
accessed July 13, 2023.</font>

### <a name="23q4a">Question 4a</a>

---

Set up the null and alternative hypotheses both in words and using
appropriate notation.

#### <a name="23sol4a">Solution to Question 4a</a>

---

-   $H_0$: ??

-   $H_1$: ??

<br>  
<br>

### <a name="23q4b">Question 4b</a>

---

Compute the test statistic.

#### <a name="23sol4b">Solution to Question 4b</a>

---

<br>  
<br>  
<br>

### <a name="23q4c">Question 4c</a>

---

What is a reasonable standardized null distribution to use to perform
this test? Standardize the test statistic from [Question 4b](#23q4b) and
interpret its meaning.

#### <a name="23sol4c">Solution to Question 4c</a>

---

<br>  
<br>  
<br>

### <a name="23q4d">Question 4d</a>

---

Compute the p-value and interpret the meaning in practical terms.

#### <a name="23sol4d">Solution to Question 4d</a>

---

<br>  
<br>  
<br>

### <a name="23q4e">Question 4e</a>

---

Shade area(s) under the graph of a null distribution corresponding to
the p-value in [Question 4d](#23q4d). Either make an informal sketch on paper
or see [Appendix B](#23appendb-unknown) to plot in R.

#### <a name="23sol4e">Solution to Question 4e</a>

---

<br>  
<br>  
<br>

### <a name="23q4f">Question 4f</a>

---

If a 5% significance level is chosen, summarize the result in practical
terms.

#### <a name="23sol4f">Solution to Question 4f</a>

---

<br>  
<br>  
<br>

## <a name="23p-unknown">p-value for a Single Mean: Unknown $\sigma^2$</a>

---

Suppose a random sample of size $n$ is picked from a population with
unknown population mean and variance. If we are doing a hypothesis test
on a single mean with null claim $H_0: \mu = \mu_0$, then as long as the
population is symmetric or the sample size is large enough
$(n \geq 30)$:

-   The **standardized test statistic** is called the  <font color="dodgerblue">**t-test statistic**</font>:

$${\large \boxed{{\color{dodgerblue}{{\color{tomato}{t}} = \frac{\mbox{sample stat}-\mbox{null claim}}{\mbox{SE}(\overline{X})} = \dfrac{\bar{x} - \mu_0}{\frac{{\color{tomato}{s}}}{\sqrt{n}}}}}}}.$$

-   The **null distribution** is the distribution of t-test statistics that we model using a  <font color="dodgerblue">**$t$-distribution**</font> with  <font color="dodgerblue">**$n-1$ degrees of freedom**</font>.

In R, we can use the command
`t.test(x, mu = [null], alt = [direction])`.

-   Sample data is stored in the vector `x`.
-   Set the option `mu` equal to the value, $\mu_0$, claimed in $H_0$.
-   Set the option `alt` based on the inequality used in $H_1$.
    -   Use `"greater"` for right-tailed test.
    -   Use `"less"` for left-tailed test.
    -   Use `"two.sided"` for a two-tailed test.
    -   If you do not indicate any `alt` option, the default is a two-tailed test.
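To see how the formula for the t-test statistic connects to `t.test()`, the sketch below computes the statistic and a two-tailed p-value by hand and compares them to the function's output. The data vector `x` and the null value `mu0` are hypothetical, chosen only for illustration. The sketch spells out the full argument name `alternative`, which `alt` abbreviates through R's partial argument matching.

```r
# Hypothetical sample of 8 measurements, chosen for illustration only.
x <- c(12.1, 9.8, 11.4, 10.9, 12.6, 10.2, 11.8, 11.1)
mu0 <- 10   # hypothetical value claimed in H0

n <- length(x)
t.manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # t-test statistic
p.manual <- 2 * pt(-abs(t.manual), df = n - 1)    # two-tailed p-value

# Same test using the built-in function
result <- t.test(x, mu = mu0, alternative = "two.sided")

c(t.manual = t.manual, t.from.t.test = unname(result$statistic))
c(p.manual = p.manual, p.from.t.test = result$p.value)
```

The hand calculation and `t.test()` should agree, since both use the sample standard deviation `sd(x)` in the standard error and a $t$-distribution with $n-1$ degrees of freedom.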

## <a name="23q5">Question 5</a>

---

Check your results for the hypothesis test in [Question 4](#23q4) using the
`t.test()` function.

### <a name="23sol5">Solution to Question 5</a>

---

Fill in the options for the `t.test()` function in the code cell below.

In [None]:
t.test(??)

<br>  
<br>

## <a name="23q6">Question 6</a>

---

The output of `t.test()` gives both a p-value and a 95% confidence
interval (by default). Let's interpret the confidence interval and see
if we obtain a result that is consistent with our summary in [Question
4f](#23q4f).

### <a name="23q6a">Question 6a</a>

---

Based on the output of your code in [Question 5](#23q5), what is a 95%
confidence interval for the mean pressure of all North Atlantic storms?

#### <a name="23sol6a">Solution to Question 6a</a>

---

A 95% confidence interval for the mean pressure of all storms is from ??
millibars to ?? millibars.

<br> <br>

### <a name="23q6b">Question 6b</a>

---

Based on the 95% confidence interval in [Question 6a](#23q6a), is 950 millibars
(the null claim for $\mu$ in $H_0$) a plausible estimate for $\mu$? Is
this consistent with your answer in [Question 4f](#23q4f)? Explain why or why
not.

#### <a name="23sol6b">Solution to Question 6b</a>

---

<br> <br> <br>

# <a name="23ci-test">Connection Between CI's and Two-Tailed Tests</a>

---

If we are performing a two-tailed hypothesis test using a significance
level $\alpha=0.05$, then we can reject the null hypothesis if either:

-   The $p$-value of our sample is less than or equal to $\alpha=0.05$, or
-   The value $\mu_0$ claimed in $H_0$ is NOT inside a 95% confidence interval.

The two statements above are equivalent to one another. We do not need to check both!

We can adjust the statements above for a hypothesis test performed at
other significance levels. For example, if we are conducting a
two-tailed test at a 1% significance level, we can use a 99% confidence
interval instead of calculating a p-value.
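The equivalence can be checked numerically. The sketch below is a minimal illustration that runs a two-tailed `t.test()` for several null values and records the decision reached by each approach. The data vector `x`, the list of null values, and the helper name `check.agreement` are all hypothetical and used only for illustration.

```r
# Hypothetical sample, chosen for illustration only.
x <- c(12.1, 9.8, 11.4, 10.9, 12.6, 10.2, 11.8, 11.1)
alpha <- 0.05

# Hypothetical helper: compare the p-value decision to the CI decision
check.agreement <- function(mu0) {
  result <- t.test(x, mu = mu0, alternative = "two.sided",
                   conf.level = 1 - alpha)
  ci <- result$conf.int
  reject.by.p <- result$p.value <= alpha        # decision from the p-value
  reject.by.ci <- mu0 < ci[1] || mu0 > ci[2]    # decision from the interval
  c(mu0 = mu0, reject.by.p = reject.by.p, reject.by.ci = reject.by.ci)
}

# One row per null value; 1 means reject H0 and 0 means do not reject H0
agreement <- t(sapply(c(10, 10.5, 11, 11.5, 12, 12.5), check.agreement))
agreement
```

In each row the two decision columns should match, which is why checking only one of the two conditions is enough.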

<br>

#### Be Careful with One-Tailed Tests

---

A 95% confidence interval is built from the middle 95% of a sampling
distribution by excluding the most extreme values in both tails, each
with area $\frac{\alpha}{2}$. If we are performing a one-tailed test,
then we focus on the area in only one tail of the null distribution and
check whether that area is less than or equal to $\alpha$. Thus,  <font color="tomato">**we cannot
interpret confidence intervals in the same fashion for one-tailed
tests.**</font>
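R reflects this difference: when `t.test()` is run with a one-tailed alternative, the interval it reports is one-sided rather than the usual two-sided interval. The sketch below illustrates this with a hypothetical data vector `x` and a hypothetical null value of 10.

```r
# Hypothetical sample, chosen for illustration only.
x <- c(12.1, 9.8, 11.4, 10.9, 12.6, 10.2, 11.8, 11.1)

two.sided <- t.test(x, mu = 10, alternative = "two.sided")
right.tailed <- t.test(x, mu = 10, alternative = "greater")

two.sided$conf.int      # both endpoints are finite
right.tailed$conf.int   # lower bound only; the upper endpoint is Inf
```

The interval from the right-tailed call puts all of $\alpha$ in one tail, so it should not be read as the middle 95% interval used for two-tailed tests.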

# <a name="23appenda">Appendix A: Summary of Hypothesis Tests</a>

---

1.  State the  <font color="dodgerblue">**hypotheses**</font> and identify (from the alternative claim in $H_1$) if it is a one-tailed or two-tailed test.

    -   $H_0$ is the “boring” claim. Express using an equal sign $=$.
    -   $H_1$ is the claim we want to show is likely true. Use inequality sign ($>$, $<$, or $\ne$).
    -   State both $H_0$ and $H_1$ in terms of population parameters such as $\mu$ and $p$.

2.  Compute the  <font color="dodgerblue">**test statistic**</font>.

    -   If the observed sample contradicts the null claim, the result is significant.
    -   A standardized test statistic measures how many SE's the observed stat is from the null claim.
    -   A standardized test statistic with a large absolute value is supporting evidence to reject $H_0$.

3.  Using the null distribution, compute the  <font color="dodgerblue">**p-value**</font>. The p-value is the probability of getting a sample with a test statistic as or more extreme than the observed sample assuming $H_0$ is true.

    -   The p-value is the area in one or both tails beyond the test statistic.
    -   The p-value is a probability, so we have $0 < \mbox{p-value} < 1$.
    -   The smaller the p-value, the stronger the evidence to reject $H_0$.

4.  Based on the  <font color="dodgerblue">**significance level**</font>, $\alpha$, make a decision to reject or not reject the null hypothesis.

    -   If p-value $\leq \alpha$, we reject $H_0$.
    -   If p-value $> \alpha$, we do not reject $H_0$.

5. <font color="dodgerblue">**Summarize the results**</font> in practical terms, **in the context of the example**.

    -   If we reject $H_0$, this means there is enough evidence to support the claim in $H_1$.
    -   If we do not reject $H_0$, this means there is not enough evidence to reject $H_0$ or to support $H_1$. The test is inconclusive.
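Step 3 above describes the p-value as the area in one or both tails beyond the test statistic. As a minimal sketch of how the tail is chosen, the code below computes all three possible areas for a single standardized test statistic using a standard normal null distribution. The value `z <- -2.1` is hypothetical and chosen only for illustration.

```r
# Hypothetical standardized test statistic, chosen for illustration only.
z <- -2.1

p.left <- pnorm(z)                        # left-tailed test: area to the left of z
p.right <- pnorm(z, lower.tail = FALSE)   # right-tailed test: area to the right of z
p.two <- 2 * pnorm(-abs(z))               # two-tailed test: area in both tails beyond |z|

c(p.left = p.left, p.right = p.right, p.two = p.two)
```

The two one-tailed areas add up to 1, and the two-tailed p-value is twice the smaller of the two one-tailed areas.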

# <a name="23appendb">Appendix B: Illustrating p-values in R</a>

---

The plots below require the package `ggplot2` that is loaded in the code
cell below. **Be sure to first run the code cell below to load `ggplot2`
before running any of the code cells that follow.**

In [None]:
library(ggplot2)

## <a name="23appendb-known">Illustrating p-values: Known Population Variance</a>

---

If we are performing a hypothesis test on a single mean for a population
whose variance is known, then we can use either:

-   The null distribution ${\color{dodgerblue}{\overline{X} \sim N \left( \mu_0, \frac{\sigma}{\sqrt{n}} \right)}}$ with test statistic $\bar{x}$, or
-   The standardized normal distribution $Z \sim N(0,1)$ with standardized test statistic

$${\color{dodgerblue}{z =  \dfrac{\bar{x} - \mu_0}{\frac{\sigma}{\sqrt{n}}}}}.$$

-   In the code cell below, enter values for the mean and standard error of the null distribution as well as the test statistic.

In [None]:
null.mean <- ??  # mean of the null distribution
null.se <- ??  # standard error of the null distribution
test.stat <- ??  # test statistic

### <a name="23two-known">Two-Tailed Test: Known Variance</a>

---

To illustrate the p-value for a two-tailed test:

-   Be sure you have already loaded the `ggplot2` package and defined `null.mean`, `null.se` and `test.stat`.
-   Run the code cell below. No edits are needed.

In [None]:
################################################
# for a two-tailed test run the code cell below
################################################
x.diff <- abs(null.mean - test.stat)
lower.x <- null.mean - x.diff
upper.x <- null.mean + x.diff

end.diff <- max(x.diff, 4*null.se)
xmax <- null.mean + end.diff
xmin <- null.mean - end.diff


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = NA,
            xlim = c(lower.x, upper.x)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, lower.x)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(upper.x, xmax)) +
  geom_vline(xintercept = c(lower.x, upper.x), linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=c(lower.x,  upper.x)) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

### <a name="23left-known">Left-Tailed Test: Known Variance</a>

---

To illustrate the p-value for a left-tailed test:

-   Be sure you have already loaded the `ggplot2` package and defined `null.mean`, `null.se` and `test.stat`.
-   Run the code cell below. No edits are needed.

In [None]:
################################################
# for a left-tailed test run the code cell below
################################################
xmin <- min(null.mean - 4*null.se, test.stat)
xmax <- null.mean + 4*null.se


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = NA,
            xlim = c(test.stat, xmax)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, test.stat)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

### <a name="23right-known">Right-Tailed Test: Known Variance</a>

---

To illustrate the p-value for a right-tailed test:

-   Be sure you have already loaded the `ggplot2` package and defined `null.mean`, `null.se` and `test.stat`.
-   Run the code cell below. No edits are needed.

In [None]:
################################################
# for a right-tailed test run the code cell below
################################################
xmax <- max(null.mean + 4*null.se, test.stat)
xmin <- null.mean - 4*null.se


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = NA,
            xlim = c(xmin, test.stat)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(test.stat, xmax)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

## <a name="23appendb-unknown">Illustrating p-values: Unknown Population Variance</a>

---

If we are performing a hypothesis test on a single mean for a population
whose variance is unknown, then we use <font color="dodgerblue">**$\mathbf{t_{n-1}}$, a $t$-distribution
with $n-1$ degrees of freedom**</font>, for the null distribution and
have  <font color="dodgerblue">**t-test statistic**</font>

$${\color{dodgerblue}{{\color{tomato}{t}} = \dfrac{\bar{x} - \mu_0}{\frac{{\color{tomato}{s}}}{\sqrt{n}}}}}.$$

-   In the code cell below, enter the value of the t-test statistic and the degrees of freedom.

In [None]:
test.stat <- ??  # t-test statistic
deg.free <- ??  # degrees of freedom

### <a name="23two-unknown">Two-Tailed Test: Unknown Variance</a>

---

To illustrate the p-value for a two-tailed test:

-   Be sure you have already loaded the `ggplot2` package and defined `test.stat` and `deg.free`.
-   Run the code cell below. No edits are needed.

In [None]:
################################################
# for a two-tailed test run the code cell below
################################################
v <- deg.free
end.t <- qt(0.9997, v)
# use a separate variable so test.stat keeps its sign for the one-tailed cells
abs.stat <- abs(test.stat)
xmax <- max(end.t, abs.stat)
xmin <- -xmax


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = NA,
            xlim = c(-abs.stat, abs.stat)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, -abs.stat)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(abs.stat, xmax)) +
  geom_vline(xintercept = c(-abs.stat, abs.stat), linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=c(-abs.stat,  abs.stat)) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

### <a name="23left-unknown">Left-Tailed Test: Unknown Variance</a>

---

To illustrate the p-value for a left-tailed test:

-   Be sure you have already loaded the `ggplot2` package and defined `test.stat` and `deg.free`.
-   Run the code cell below. No edits are needed.

In [None]:
################################################
# for a left-tailed test run the code cell below
################################################
v <- deg.free
end.t <- qt(0.9997, v)
xmin <- min(-end.t, test.stat)
xmax <- end.t


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = NA,
            xlim = c(test.stat, xmax)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, test.stat)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

### <a name="23right-unknown">Right-Tailed Test: Unknown Variance</a>

---

To illustrate the p-value for a right-tailed test:

-   Be sure you have already loaded the `ggplot2` package and defined `test.stat` and `deg.free`.
-   Run the code cell below. No edits are needed.

In [None]:
################################################
# for a right-tailed test run the code cell below
################################################
v <- deg.free
end.t <- qt(0.9997, v)
xmax <- max(end.t, test.stat)
xmin <- -end.t

ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = NA,
            xlim = c(xmin, test.stat)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(test.stat, xmax)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

# <a name="23CC License">Creative Commons License Information</a>
---

![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)

*Statistical Methods: Exploring the Uncertain* by [Adam
Spiegler (University of Colorado Denver)](https://github.com/CU-Denver-MathStats-OER/Statistical-Theory)
is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/). This work is funded by an [Institutional OER Grant from the Colorado Department of Higher Education (CDHE)](https://cdhe.colorado.gov/educators/administration/institutional-groups/open-educational-resources-in-colorado).

For similar interactive OER materials in other courses funded by this project in the Department of Mathematical and Statistical Sciences at the University of Colorado Denver, visit <https://github.com/CU-Denver-MathStats-OER>.