## Chapter 07
# Hypothesis Testing with One Sample

Adapted from ["Elementary Statistics - Picturing the World" 6th edition](https://www.amazon.com/Elementary-Statistics-Picturing-World-6th/dp/0321911210/)

```python
from notebook.services.config import ConfigManager

# Slideshow (livereveal) display settings: scrollable, full-size slides.
cm = ConfigManager()
cm.update('livereveal', {
    'scroll': True,
    'width': "100%",
    'height': "100%",
})
# Cell output: {'scroll': True, 'width': '100%', 'height': '100%'}
```


## 7.1 <br/>Introduction to Hypothesis Testing

### Hypothesis Tests

- A **hypothesis test** is a process that uses sample statistics to test a claim about the value of a population parameter.
- Researchers in fields such as medicine, psychology, and business rely on hypothesis testing to make informed decisions about new medicines, treatments, and marketing strategies.

### Stating a Hypothesis

- A statement about a population parameter is called a **statistical hypothesis**.
- To test a population parameter, you should carefully state a pair of hypotheses—one that represents the claim and the other, its complement.
- When one of these hypotheses is false, the other must be true. Either hypothesis, the **null hypothesis** or the **alternative hypothesis**, may represent the original claim.
- The term **null hypothesis** was introduced by Ronald Fisher.
- If the statement in the **null hypothesis** is not true, then the **alternative hypothesis** must be true.

### Stating a Hypothesis: Definition

- A **null hypothesis $H_{0}$** is a statistical hypothesis that contains a statement of equality, such as $\le$, $=$, or $\ge$.
- The **alternative hypothesis $H_{a}$** is the complement of the null hypothesis. It is a statement that must be true if $H_{0}$ is false and it contains a statement of strict inequality, such as $\gt$, $\ne$, or $\lt$.

### Stating a Hypothesis

- To write the null and alternative hypotheses, translate the claim made about the population parameter from a verbal statement to a mathematical statement. Then, write its complement. 
- For instance, if the claim value is $k$ and the population parameter is $\mu$, then some possible pairs of null and alternative hypotheses are:
 - $\left\{ \begin{array}{}
       H_{0}:\mu \le k \\
       H_{a}:\mu \gt k 
     \end{array}\right.$
     
 - $\left\{ \begin{array}{}
       H_{0}:\mu \ge k \\
       H_{a}:\mu \lt k 
     \end{array}\right.$

 - $\left\{ \begin{array}{}
       H_{0}:\mu = k \\
       H_{a}:\mu \ne k 
     \end{array}\right.$

- Regardless of which of the three pairs of hypotheses we use, we always assume $\mu = k$ and examine the sampling distribution on the basis of this assumption. 
- Within this sampling distribution, we will determine whether or not a sample statistic is unusual.
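
The idea of assuming $\mu = k$ and then asking whether a sample statistic is unusual can be sketched with a small simulation. This is only an illustration: the normal population and the values $k = 15$, $\sigma = 3$, $n = 36$, and the observed mean of $13.9$ are hypothetical and do not come from the text.

```python
import random
import statistics

def simulate_sample_means(k, sigma, n, trials=10_000, seed=0):
    """Draw `trials` sample means (size n) from a normal population with mean k."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.gauss(k, sigma) for _ in range(n)])
            for _ in range(trials)]

def proportion_at_or_below(means, observed):
    """Fraction of simulated sample means that are <= the observed mean."""
    return sum(m <= observed for m in means) / len(means)

# Hypothetical values chosen for illustration only.
k, sigma, n = 15, 3, 36
observed_mean = 13.9

# Sampling distribution of the sample mean, built under the assumption mu = k.
means = simulate_sample_means(k, sigma, n)
tail = proportion_at_or_below(means, observed_mean)
print(f"Proportion of sample means <= {observed_mean}: {tail:.4f}")
```

A small proportion suggests that the observed sample mean would be unusual if $\mu = k$ were true.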

### Stating a Hypothesis

- The table shows the relationship between possible verbal statements about the parameter $\mu$ and the corresponding null and alternative hypotheses.
- Similar statements can be made to test other population parameters, such as $p$, $\sigma$, or $\sigma^{2}$.

![](./image/7_1_hypothesis_table.png)

### Stating the Null and Alternative Hypotheses [example 1]

Write the claim as a mathematical statement. State the null and alternative hypotheses, and identify which represents the claim.

- Q1: A school publicizes that the proportion of its students who are involved in at least one extracurricular activity is $61\%$.
- Q2: A car dealership announces that the mean time for an oil change is less than $15$ minutes.
- Q3: A company advertises that the mean life of its furnaces is more than $18$ years.

### Stating the Null and Alternative Hypotheses [solution]

#### Q1: A school publicizes that the proportion of its students who are involved in at least one extracurricular activity is $61\%$.
- The claim can be written as $p = 0.61$. 
- Its complement is $p \ne 0.61$. 
- Because $p = 0.61$ contains the statement of equality, it becomes the **null hypothesis**. 
- In this case, the **null hypothesis** represents the **claim**.
- $\left\{ \begin{array}{}
       H_{0}:p = 0.61\\
       H_{a}:p \ne 0.61 
     \end{array}\right.$
![](./image/7_1_ex_1_q_1_hypothesis.png)

#### Q2: A car dealership announces that the mean time for an oil change is less than $15$ minutes.
- The claim can be written as $\mu \lt 15$.
- Its complement is $\mu \ge 15$. 
- Because $\mu \ge 15$ contains the statement of equality, it becomes the **null hypothesis**. 
- In this case, the **alternative hypothesis** represents the **claim**.
- $\left\{ \begin{array}{}
       H_{0}:\mu \ge 15\\
       H_{a}:\mu \lt 15 
     \end{array}\right.$
![](./image/7_1_ex_1_q_2_hypothesis.png)
 
#### Q3: A company advertises that the mean life of its furnaces is more than $18$ years.
- The claim can be written as $\mu \gt 18$.
- Its complement is $\mu \le 18$. 
- Because $\mu \le 18$ contains the statement of equality, it becomes the **null hypothesis**. 
- In this case, the **alternative hypothesis** represents the **claim**.
- $\left\{ \begin{array}{}
       H_{0}:\mu \le 18\\
       H_{a}:\mu \gt 18 
     \end{array}\right.$
![](./image/7_1_ex_1_q_3_hypothesis.png)
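
The rule used in the three solutions above (the statement that contains equality becomes $H_{0}$, and its complement becomes $H_{a}$) can be sketched as a small helper. The function name `state_hypotheses` and the plain-text relation symbols are assumptions made for this illustration.

```python
# Map each relation to its complement.
COMPLEMENT = {"=": "!=", "!=": "=", "<": ">=", ">=": "<", ">": "<=", "<=": ">"}
# Relations that contain a statement of equality belong in the null hypothesis.
CONTAINS_EQUALITY = {"=", "<=", ">="}

def state_hypotheses(parameter, relation, value):
    """Return (H0, Ha, label of the hypothesis that represents the claim)."""
    claim = f"{parameter} {relation} {value}"
    complement = f"{parameter} {COMPLEMENT[relation]} {value}"
    if relation in CONTAINS_EQUALITY:
        return claim, complement, "H0"
    return complement, claim, "Ha"

examples = [
    ("p", "=", 0.61),  # Q1: proportion involved in extracurricular activities
    ("mu", "<", 15),   # Q2: mean oil change time in minutes
    ("mu", ">", 18),   # Q3: mean furnace life in years
]
for parameter, relation, value in examples:
    h0, ha, claim_is = state_hypotheses(parameter, relation, value)
    print(f"H0: {h0}   Ha: {ha}   claim: {claim_is}")
```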

### Types of Errors and Level of Significance

- No matter which hypothesis represents the claim, we always begin a hypothesis test by assuming that the equality condition in the null hypothesis is true. 
- When we perform a hypothesis test, we make one of two decisions:
  1. reject the null hypothesis or
  2. fail to reject the null hypothesis.
- Because your decision is based on a sample rather than the entire population, there is always the possibility you will make the wrong decision.
- The only way to be absolutely certain of whether $H_{0}$ is true or false is to test the entire population. 
- Because your decision — to reject $H_{0}$ or to fail to reject $H_{0}$ — is based on a sample, you must accept the fact that your decision might be incorrect.

### Types of Errors: Definition


- A **type I error** occurs if the null hypothesis is rejected when it is true; that is, you failed to retain a true $H_{0}$.
- A **type II error** occurs if the null hypothesis is not rejected when it is false; that is, you failed to reject a false $H_{0}$.

![](./image/7_1_type_of_errors_table.png)
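
The four possible outcomes of a test decision can be written out as a short sketch. The function name `classify_decision` is hypothetical, and in practice the truth of $H_{0}$ is unknown, so this only labels the outcomes.

```python
def classify_decision(h0_is_true, reject_h0):
    """Name the outcome of a decision, given the (normally unknown) truth of H0."""
    if h0_is_true and reject_h0:
        return "type I error"
    if not h0_is_true and not reject_h0:
        return "type II error"
    return "correct decision"

# Print all four combinations of truth and decision.
for h0_is_true in (True, False):
    for reject_h0 in (True, False):
        truth = "H0 true" if h0_is_true else "H0 false"
        decision = "reject H0" if reject_h0 else "do not reject H0"
        print(f"{truth:8} | {decision:16} | {classify_decision(h0_is_true, reject_h0)}")
```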

### Types of Errors 

Hypothesis testing is sometimes compared to the legal system used in the United States. Under this system, these steps are used:

1. A carefully worded accusation is written.
2. The defendant is assumed innocent ($H_{0}$) until proven guilty. The burden of proof lies with the prosecution. If the evidence is not strong enough, then there is no conviction. A “not guilty” verdict does not prove that a defendant is innocent.
3. The evidence needs to be conclusive beyond a reasonable doubt. The system assumes that more harm is done by convicting the innocent (**type I error**) than by not convicting the guilty (**type II error**).

![](./image/7_1_types_of_errors_legal.png)

### Identifying Type I and Type II Errors [example 2]

- The USDA limit for salmonella contamination for chicken is $20\%$. 
- A meat inspector reports that the chicken produced by a company exceeds the USDA limit. 
- You perform a hypothesis test to determine whether the meat inspector’s claim is true. 
- When will a type I or type II error occur? 
- Which error is more serious?

### Identifying Type I and Type II Errors [solution]

- Let $p$ represent the proportion of the chicken that is contaminated. 
- The meat inspector’s claim is “more than $20\%$ is contaminated.” 
- We can write the null and alternative hypotheses as:

$\left\{ \begin{array}{}
       H_{0}:p \le 0.2\\
       H_{a}:p \gt 0.2 
     \end{array}\right.$

- In this case, the **alternative hypothesis** represents the **claim**.

![](./image/7_1_ex_2_contaminated_chicken.png)

- A **type I error** will occur when the actual proportion of contaminated chicken is less than or equal to $0.2$, but you reject $H_{0}$.
- A **type II error** will occur when the actual proportion of contaminated chicken is greater than $0.2$, but you do not reject $H_{0}$.
- With a type I error, you might create a health scare and hurt the sales of chicken producers who were actually meeting the USDA limits.
- With a type II error, you could be allowing chicken that exceeded the USDA contamination limit to be sold to consumers. 
- A type II error is more serious because it could result in sickness or even death.

### Level of Significance

- We will reject the null hypothesis when the sample statistic from the sampling distribution is unusual. 
- We have already identified unusual events to be those that occur with a probability of $0.05$ or less. 
- When statistical tests are used, an unusual event is sometimes required to have a probability of $0.10$ or less, $0.05$ or less, or $0.01$ or less. 
- Because there is variation from sample to sample, there is always a possibility that you will reject a null hypothesis when it is actually true. 
- In other words, although the null hypothesis is true, your sample statistic is determined to be an unusual event in the sampling distribution. 
- We can decrease the probability of this happening by lowering the **level of significance**.
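
The link between the level of significance and the chance of rejecting a true null hypothesis can be illustrated with a simulation. This is a minimal sketch under stated assumptions: a right-tailed test on a normal population with known $\sigma$, and hypothetical values $k = 50$, $\sigma = 10$, and $n = 25$ that do not come from the text.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha, k=50, sigma=10, n=25, trials=5_000, seed=1):
    """Estimate how often a true H0 (mu = k) is rejected at level alpha."""
    rng = random.Random(seed)
    # Sample means above this cutoff are treated as unusual (right-tailed).
    cutoff = k + NormalDist().inv_cdf(1 - alpha) * sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # The population mean really is k, so every rejection is a type I error.
        sample_mean = fmean([rng.gauss(k, sigma) for _ in range(n)])
        if sample_mean > cutoff:
            rejections += 1
    return rejections / trials

rates = {alpha: type_one_error_rate(alpha) for alpha in (0.10, 0.05, 0.01)}
for alpha, rate in rates.items():
    print(f"alpha = {alpha:.2f}: true H0 rejected in {rate:.3f} of samples")
```

The estimated rejection rate should land close to each chosen level, so a lower level means fewer false rejections.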

### Level of Significance: Definition

- In a hypothesis test, the level of significance is your maximum allowable probability of making a **type I error**. It is denoted by $\alpha$, the lowercase Greek letter **alpha**.
- The probability of a **type II error** is denoted by $\beta$, the lowercase Greek letter **beta**.

- By setting the level of significance at a small value, we are saying that we want the probability of rejecting a true null hypothesis to be small. 
- Three commonly used levels of significance are $\alpha = 0.10$, $\alpha = 0.05$, and $\alpha = 0.01$.
- For a fixed sample size, decreasing $\alpha$ tends to increase $\beta$.
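
The trade-off between $\alpha$ and $\beta$ can be made concrete with an exact binomial calculation based on the contaminated chicken hypotheses ($H_{0}: p \le 0.2$, $H_{a}: p \gt 0.2$). The sample size of $100$, the true proportion of $0.3$, and the cutoffs below are hypothetical values chosen only for illustration.

```python
from math import comb

def binomial_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c, n + 1))

n = 100        # hypothetical sample size
p_null = 0.20  # boundary value of H0: p <= 0.2
p_true = 0.30  # hypothetical true proportion when Ha holds

results = []
# Reject H0 when at least c of the n sampled chickens are contaminated.
for c in (26, 28, 31):
    alpha = binomial_tail(n, p_null, c)     # P(reject H0 | p = 0.2)
    beta = 1 - binomial_tail(n, p_true, c)  # P(do not reject H0 | p = 0.3)
    results.append((c, alpha, beta))
    print(f"reject when X >= {c}: alpha = {alpha:.4f}, beta = {beta:.4f}")
```

With the sample size held fixed, raising the cutoff lowers $\alpha$ and raises $\beta$.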

### Statistical Tests and P-Values