# [Hypothesis Testing](https://stattrek.com/hypothesis-test/hypothesis-testing?tutorial=AP)

### What is Hypothesis Testing?

A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. Hypothesis testing refers to the formal procedures used by statisticians to decide whether to reject statistical hypotheses.

##### Statistical Hypotheses

Ideally, we could examine the entire population. As that is often impractical, we typically examine a random sample from the population. If sample data are not consistent with the statistical hypothesis, the hypothesis is rejected. There are two types of statistical hypotheses.

* **Null Hypothesis**. The null hypothesis, denoted by $H_0$, is usually the hypothesis that the sample observations result purely from chance.
* **Alternative Hypothesis**. The alternative hypothesis, denoted by $H_1$ or $H_a$, is the hypothesis that the sample observations are influenced by some non-random cause.

##### Statistical Significance

Significance has a very specific meaning in statistics. When a result is statistically significant, we can say either of the following:

* We rejected $H_0$
* The results are not likely due to chance (sampling error)

##### Can We Accept the Null Hypothesis?

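Acceptance implies that the null hypothesis is true. Failure to reject implies only that the data are not sufficiently persuasive for us to prefer the alternative hypothesis over the null hypothesis. For this reason, we say that we "fail to reject" the null hypothesis rather than that we "accept" it.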

##### Hypothesis Tests

Statisticians follow a formal process, consisting of four steps, to determine whether to reject a null hypothesis, based on sample data.

1. State the hypotheses.
    1. State both $H_0$ and $H_a$ such that they are mutually exclusive. That is, if one is true, the other must be false.
1. Formulate an analysis plan.
    1. The analysis plan describes how to use sample data to evaluate the null hypothesis. The evaluation often focuses on a single test statistic.
    1. Significance Level.
        1. Researchers often choose significance levels equal to 0.01, 0.05, or 0.10, but any value between 0 and 1 can be used.
    1. Test Method
        1. Typically, the test method involves a test statistic and a sampling distribution. The test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t statistic, chi-square, etc.
1. Analyze sample data.
    1. Find the value of the test statistic (mean, proportion, t statistic, z-score, etc.) described in the analysis plan.
    1. Test statistic.
        1. When $H_0$ involves a mean or proportion, use either of the following equations to compute the test statistic. "Parameter" is the value appearing in the null hypothesis, and "statistic" is the sample estimate of that parameter. Use the first equation when the standard deviation of the statistic is known, and the second when it must be estimated from the sample data (the standard error). $$\text{Test Statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard deviation of statistic}}$$ $$\text{Test Statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard error of statistic}}$$
1. Interpret results.
    1. Apply the decision rule described in the analysis plan. If the value of the test statistic is unlikely, based on the null hypothesis, reject the null hypothesis.
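
The four steps above can be sketched in Python for one simple case: a two-tailed, one-sample z-test of a mean. This is only an illustration under stated assumptions. The function name `one_sample_z_test`, the sample values, and the choice of a known population standard deviation `sigma` are all hypothetical and do not come from the tutorial. It uses only the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample z-test, assuming the population sigma is known.

    Step 1 (hypotheses):   H0: mu == mu0,  Ha: mu != mu0
    Step 2 (plan):         significance level alpha, z statistic
    """
    n = len(sample)

    # Step 3: test statistic = (statistic - parameter) / std. dev. of statistic
    std_dev_of_mean = sigma / sqrt(n)
    z = (mean(sample) - mu0) / std_dev_of_mean

    # P-value: probability, under H0, of a z at least this far from 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Step 4: decision rule from the analysis plan
    reject_h0 = p_value < alpha
    return z, p_value, reject_h0


# Hypothetical data: nine measurements, H0 claims a population mean of 10
sample = [10.8, 11.2, 9.9, 10.5, 11.0, 10.7, 10.9, 11.4, 10.2]
z, p_value, reject_h0 = one_sample_z_test(sample, mu0=10.0, sigma=1.0)
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With these made-up numbers the sample mean is about 10.73, which gives a z-score of about 2.2 and a P-value below 0.05, so $H_0$ would be rejected at the 0.05 level.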

##### Decision Errors

* **Type 1 error**
    * A Type 1 error occurs when the researcher rejects a null hypothesis that is true. The probability of committing a Type 1 error is called the **significance level**. The probability is also called **alpha** and is often denoted by $\alpha$.
    * Note that $\alpha$ is expressed as a proportion, so $\alpha=0.05$ defines a rejection region that contains only $5$% of the sampling distribution of the test statistic when $H_0$ is true.
* **Type 2 error**
    * A Type 2 error occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type 2 error is called **Beta** and is often denoted by $\beta$. The probability of *not* committing a Type 2 error is called the **Power** of the test and is $1-\beta$.


| | Reject $H_0$ | Fail to Reject $H_0$ |
| ---- | ---- | ---- |
| $H_0$ is True | Type 1 Error | Correct decision |
| $H_0$ is False | Correct decision | Type 2 Error |
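
To make $\alpha$, $\beta$, and power concrete, the sketch below computes them for a right-tailed z-test of a mean. It rests on assumptions that are not part of the tutorial: the population standard deviation is known, the true mean `mu_true` is supplied by hand, and the helper name `right_tailed_z_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Return (beta, power) for H0: mu <= mu0 vs Ha: mu > mu0, sigma known."""
    std_dev_of_mean = sigma / sqrt(n)

    # Smallest sample mean that leads to rejection at level alpha
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * std_dev_of_mean

    # beta: chance the sample mean stays below the cutoff when the
    # population mean is really mu_true (a Type 2 error)
    beta = NormalDist(mu_true, std_dev_of_mean).cdf(critical_mean)
    return beta, 1 - beta


# Hypothetical scenario: H0 says mu <= 10, but the true mean is 10.5
beta, power = right_tailed_z_power(mu0=10.0, mu_true=10.5, sigma=1.0, n=25)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

If `mu_true` is set equal to `mu0`, the returned power equals `alpha`, which matches the definition of the significance level as the probability of a Type 1 error.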

##### Decision Rules

The analysis plan for a hypothesis test must include decision rules for rejecting the null hypothesis. In practice, statisticians describe these decision rules in two ways: with reference to a P-value or with reference to a region of acceptance.

* **P-value**
    * The P-value measures the strength of the evidence against the null hypothesis: the smaller the P-value, the stronger the evidence against $H_0$.
    * Suppose the test statistic is equal to $S$. The P-value is the probability of observing a test statistic at least as extreme as $S$, assuming the null hypothesis is true.
    * If the P-value is less than the significance level, we reject the null hypothesis.
* **Region of Acceptance**
    * If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. The region of acceptance is defined so that the chance of making a Type 1 error is equal to the significance level.
    * The set of values outside the region of acceptance is called the **region of rejection**.
    * If the test statistic falls within the region of rejection, the null hypothesis is rejected. In such cases, we say that $H_0$ has been rejected at the $\alpha$ level of significance.

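Example using z-scores (shown here for a right-tailed test): the following three criteria are equivalent, and we reject $H_0$ when any one of them holds: 1) our sample mean falls within the critical region, 2) the z-score of our sample mean is greater than the positive z-critical value, or 3) the probability of obtaining a sample mean at least as extreme as ours, assuming $H_0$ is true, is less than the alpha level.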
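
The equivalence of the three criteria in the z-score example can be checked numerically. The sketch below assumes a right-tailed test with a known population standard deviation. The function name `right_tailed_decisions` and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_decisions(sample_mean, mu0, sigma, n, alpha=0.05):
    """Evaluate the three rejection criteria for a right-tailed z-test."""
    std_normal = NormalDist()
    std_dev_of_mean = sigma / sqrt(n)

    z = (sample_mean - mu0) / std_dev_of_mean
    z_critical = std_normal.inv_cdf(1 - alpha)          # about 1.645 for alpha = 0.05
    critical_mean = mu0 + z_critical * std_dev_of_mean  # boundary of the critical region
    p_value = 1 - std_normal.cdf(z)                     # P(Z >= z) under H0

    return {
        "mean_in_critical_region": sample_mean > critical_mean,
        "z_exceeds_z_critical": z > z_critical,
        "p_value_below_alpha": p_value < alpha,
    }


# Hypothetical inputs: the first sample mean should be rejected, the second should not
print(right_tailed_decisions(sample_mean=10.4, mu0=10.0, sigma=1.0, n=25))
print(right_tailed_decisions(sample_mean=10.2, mu0=10.0, sigma=1.0, n=25))
```

All three entries agree in each case because they are the same comparison expressed on three different scales: the sample-mean scale, the z scale, and the probability scale.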

##### One-Tailed and Two-Tailed Tests

* **One-Tailed Tests**
    * A test of a statistical hypothesis where the region of rejection is on only one side of the sampling distribution is called a **one-tailed test**. Because hypotheses are statements about the population, they are written in terms of the population mean $\mu$ rather than the sample mean $\bar{x}$. For example, suppose $H_0$ states that $\mu \leq 10$. $H_1$ ($H_a$) would be $\mu \gt 10$. The region of rejection would consist of a range of numbers located on the right side of the sampling distribution, i.e., a set of values sufficiently greater than $10$.
* **Two-Tailed Tests**
    * A test of a statistical hypothesis where the region of rejection is on both sides of the sampling distribution is called a **two-tailed test**. For example, suppose $H_0$ states that $\mu=10$. $H_1$ would be that $\mu \lt 10$ or $\mu \gt 10$. The region of rejection would consist of a range of numbers on both sides of the sampling distribution.
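
The practical difference between one-tailed and two-tailed tests shows up in the critical values and P-values. The following sketch assumes a z statistic with a standard normal sampling distribution. The helper names `z_critical` and `z_p_value`, and the `tail` labels, are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_critical(alpha, tail="two"):
    """Positive critical z value; a two-tailed test splits alpha across both tails."""
    if tail == "two":
        return STD_NORMAL.inv_cdf(1 - alpha / 2)
    if tail in ("left", "right"):
        return STD_NORMAL.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'left', 'right', or 'two'")


def z_p_value(z, tail="two"):
    """P-value of an observed z statistic for the chosen region of rejection."""
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "left":
        return STD_NORMAL.cdf(z)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# The same hypothetical z statistic, evaluated both ways at alpha = 0.05
z = 1.8
for tail in ("right", "two"):
    print(tail, round(z_critical(0.05, tail), 3), round(z_p_value(z, tail), 4))
```

For $z = 1.8$ the right-tailed P-value falls below 0.05 while the two-tailed P-value does not, so the same data can lead to different decisions depending on which alternative hypothesis was stated in the analysis plan.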
