# Hypothesis testing, p-values, confidence intervals



When we test hypotheses, the fundamental question we are asking is:

**"Is the observed effect or difference in our sample data likely to exist in the larger population, or could it have occurred by random chance?"**

### It basically works like this:

If the quantity we measure follows a Gaussian (normal) distribution:
1. we establish the distribution (the bell curve),
2. then, when we observe an individual trial, we can say how likely it was based on where it falls in that distribution.



### Extremely Simple Example: Coin Toss

**Question:** Is this coin fair?

1. **Formulate Hypotheses:**
   - **H0 (Null Hypothesis):** The coin is fair (50% heads, 50% tails).
   - **H1 (Alternative Hypothesis):** The coin is not fair.

2. **Collect Sample Data:**
   - Toss the coin 10 times. Result: 7 heads and 3 tails.

3. **Analyze Data:**
   - Compare 7 heads (observed) to 5 heads (expected if fair).

4. **Make a Decision:**
   - If getting 7 heads out of 10 is very unlikely by chance (low p-value), we conclude the coin is likely not fair.
   - If it’s not unlikely (high p-value), we conclude the observed result could be due to random chance, and the coin might be fair.
   
   

### p value
**The p-value tells you how likely it is to get the observed result (or a more extreme one) if the coin is fair.**

**Put another way: the p-value is the probability of observing a value at least as extreme as the one you got, given that H0 is true. It is not the probability that H0 is true.**

**You usually choose a significance level $\alpha$ in advance, such as $\alpha = 0.05$. If $p \leq \alpha$, you reject the null hypothesis.**
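
For the coin toss above, the p-value can be computed exactly from the binomial distribution. The sketch below is one way to do it with the standard library only; the variable names are illustrative, and the two-sided rule used (sum every outcome that is no more likely than the observed one) is one common convention.

```python
from math import comb

n_tosses = 10
observed_heads = 7

# Probability of each possible head count under H0 (fair coin, p = 0.5).
pmf = [comb(n_tosses, k) / 2**n_tosses for k in range(n_tosses + 1)]

# Two-sided p-value: total probability of all outcomes that are no more
# likely than the observed one (here 0-3 heads and 7-10 heads).
p_value = sum(p for p in pmf if p <= pmf[observed_heads])

print(f"p-value for {observed_heads} heads in {n_tosses} tosses: {p_value:.5f}")
```

The result is 352/1024, about 0.34, which is well above $\alpha = 0.05$, so 7 heads in 10 tosses gives no reason to reject the fair-coin hypothesis.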

### Simplified Explanation with a Gaussian Distribution

1. **Establish the Distribution:**
   - Assume we know the population distribution. For a Gaussian (normal) population, this means we know the mean ($\mu$) and standard deviation ($\sigma$) of the population.

2. **Collect Sample Data:**
   - We take a sample and calculate the sample mean ($\bar{x}$).

3. **Formulate Hypotheses:**
   - **Null Hypothesis (H0):** The sample comes from the established population distribution.
   - **Alternative Hypothesis (H1):** The sample comes from a different distribution.

4. **Calculate the Test Statistic:**
   - Compute how far the sample mean is from the population mean in units of the standard error ($\sigma / \sqrt{n}$). Use a z-score when $\sigma$ is known (or the sample is large) and a t-score when $\sigma$ has to be estimated from a small sample.

5. **Find the P-Value:**
   - Determine the p-value, which is the probability of obtaining a test statistic at least as extreme as the observed one, under the assumption that the null hypothesis is true.

6. **Interpret the P-Value:**
   - If the p-value is low, the observed sample mean would be unlikely if the null hypothesis were true, which suggests the sample may come from a different distribution.
   - If the p-value is high, the observed sample mean is consistent with the null hypothesis. This does not prove the sample comes from the established distribution; it only means the data give no strong reason to doubt it.

### Example:

1. **Establish Distribution:**
   - Population mean ($\mu$) = 100
   - Population standard deviation ($\sigma$) = 15

2. **Collect Sample Data:**
   - Sample mean ($\bar{x}$) = 110
   - Sample size ($n$) = 30

3. **Formulate Hypotheses:**
   - $H_0$: The population mean is 100 (the sample comes from the established distribution).
   - $H_1$: The population mean is not 100 (the sample comes from a different distribution).

4. **Calculate Test Statistic:**
   - Because $\sigma$ is known, use a z-score:
     $$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{110 - 100}{15 / \sqrt{30}} \approx 3.65 $$

5. **Find the P-Value:**
   - Using the z-score, find the p-value from the standard normal distribution table. For $z = 3.65$, the two-sided p-value is about 0.0003, far below $\alpha = 0.05$.

6. **Interpret the P-Value:**
   - A very low p-value suggests that getting a sample mean of 110 is highly unlikely if the population mean is truly 100. Therefore, we might reject the null hypothesis and conclude that the sample likely comes from a different distribution.

By understanding where the observed data falls within the established distribution, we can assess how likely it is that our sample comes from the assumed population, thereby making informed conclusions about our hypotheses.
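
The worked example above can be reproduced in a few lines. This is a minimal sketch using only the standard library's `statistics.NormalDist`; the variable names are illustrative, and a two-sided test is assumed to match $H_1$.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 15      # population mean and standard deviation under H0
x_bar, n = 110, 30       # observed sample mean and sample size

# z-score: distance of the sample mean from mu in standard-error units.
z = (x_bar - mu) / (sigma / sqrt(n))

# Two-sided p-value: probability of a z at least this far from 0 in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-sided p-value = {p_value:.5f}")
```

The z-score comes out at about 3.65 and the p-value at roughly 0.0003, so the null hypothesis is rejected at any of the usual significance levels.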

To review hypothesis testing, p-values, and confidence intervals, let's go over each concept in detail.

### Hypothesis Testing

**Hypothesis Testing** is a statistical method used to make inferences or draw conclusions about a population based on sample data. The process involves:

1. **Formulating Hypotheses:**
   - **Null Hypothesis (H0):** A statement that there is no effect or no difference. It is the hypothesis that we seek to test.
   - **Alternative Hypothesis (H1 or Ha):** A statement that indicates the presence of an effect or a difference. It is the claim we are looking for evidence to support.

2. **Selecting a Significance Level (α):**
   - Commonly used significance levels are 0.05, 0.01, and 0.10.
   - This is the probability of rejecting the null hypothesis when it is actually true (Type I error).

3. **Choosing the Appropriate Test:**
   - Based on the type of data and the research question (e.g., t-test, chi-square test, ANOVA).

4. **Calculating the Test Statistic:**
   - A value computed from the sample data that is compared against a critical value from a statistical distribution.

5. **Making a Decision:**
   - Compare the test statistic to the critical value or use the p-value to decide whether to reject or fail to reject the null hypothesis.
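
To see what the significance level means in practice, the following sketch simulates many experiments in which the null hypothesis is true and counts how often it is rejected anyway. The seed, sample size, and number of experiments are arbitrary illustrative choices, not values from the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
alpha = 0.05
n_experiments, n = 2000, 20      # illustrative sizes

# Every row is one experiment drawn from N(0, 1), so H0: mu = 0 is true.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_experiments, n))

# One-sample t-test per row; each rejection here is a Type I error.
_, p_values = stats.ttest_1samp(samples, 0.0, axis=1)
type_i_rate = np.mean(p_values <= alpha)

print(f"Fraction of true nulls rejected: {type_i_rate:.3f}")
```

The rejected fraction should land close to $\alpha$, which is exactly what "probability of a Type I error" describes.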

### P-Values

**P-Value:** The probability of obtaining a test statistic at least as extreme as the one observed, assuming that the null hypothesis is true.

- **Interpretation:**
  - A small p-value (typically ≤ 0.05) indicates strong evidence against the null hypothesis, so you reject the null hypothesis.
  - A large p-value (> 0.05) indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.
- **Threshold:** The significance level (α) is the threshold for deciding whether a p-value indicates a significant result.

### Confidence Intervals

**A confidence interval can be read as the set of all hypothesized values $\mu_0$ that a two-sided test would fail to reject, so it is essentially a continuous version of hypothesis testing.**


**Confidence Interval (CI):** A range of values derived from the sample data that is likely to contain the true population parameter.

- **Interpretation:**
  - A 95% confidence interval means that if you were to take 100 different samples and compute a CI for each sample, approximately 95 of the 100 confidence intervals would contain the true population parameter.
- **Calculation:**
  - The CI is typically calculated as: $ \text{CI} = \text{Point Estimate} \pm \text{Margin of Error} $
  - For a mean (large-sample approximation; for small samples replace $z$ with a t critical value): $ \text{CI} = \bar{x} \pm z \left( \frac{s}{\sqrt{n}} \right) $
    - $\bar{x}$ = sample mean
    - $z$ = z-score corresponding to the desired confidence level
    - $s$ = sample standard deviation
    - $n$ = sample size
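
The "95 out of 100 intervals" interpretation can be checked by simulation. The sketch below draws many samples from a population with a known mean, builds the z-based interval from the formula above for each one, and counts how often the interval covers the true mean. The population parameters, seed, and sizes are illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)       # fixed seed for repeatability
true_mean, true_sd = 100.0, 15.0     # illustrative population
n, n_samples = 50, 2000
z = NormalDist().inv_cdf(0.975)      # about 1.96 for a 95% interval

samples = rng.normal(true_mean, true_sd, size=(n_samples, n))
x_bar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)      # sample standard deviation
margin = z * s / np.sqrt(n)

# Count the intervals [x_bar - margin, x_bar + margin] that cover the truth.
covered = (x_bar - margin <= true_mean) & (true_mean <= x_bar + margin)
coverage = covered.mean()

print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The share should be close to 0.95. It tends to fall slightly short because $z$ is combined with an estimated $s$; using a t critical value corrects for that.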

### Example: One-Sample T-Test

1. **Formulate Hypotheses:**
   - $ H_0: \mu = \mu_0 $ (the population mean is equal to a specified value)
   - $ H_1: \mu \neq \mu_0 $ (the population mean is not equal to the specified value)

2. **Significance Level:**
   - Choose $ \alpha = 0.05 $.

3. **Calculate Test Statistic:**
   - $ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $

4. **Calculate P-Value:**
   - Determine the probability of observing a test statistic at least as extreme as $ t $ under the null hypothesis.

5. **Decision:**
   - Compare the p-value to $ \alpha $. If $ p \leq \alpha $, reject $ H_0 $.

6. **Confidence Interval:**
   - $ \text{CI} = \bar{x} \pm t_{\alpha/2, \, df} \left( \frac{s}{\sqrt{n}} \right) $
   - $ t_{\alpha/2, \, df} $ is the critical value from the t-distribution with $ df $ degrees of freedom.
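
As a sketch of step 6, the t-based confidence interval can be computed with `scipy.stats.t.ppf`, which returns the critical value for a given tail probability and degrees of freedom. The sample below is illustrative data, and $\mu_0 = 15$ is an assumed hypothesized value.

```python
import numpy as np
from scipy import stats

data = np.array([12, 14, 15, 16, 15, 14, 13, 14, 16, 18])  # illustrative sample
mu_0 = 15        # assumed hypothesized mean
alpha = 0.05

n = data.size
x_bar = data.mean()
s = data.std(ddof=1)          # sample standard deviation
df = n - 1

# Critical value t_{alpha/2, df} and the resulting margin of error.
t_crit = stats.t.ppf(1 - alpha / 2, df)
margin = t_crit * s / np.sqrt(n)
ci_low, ci_high = x_bar - margin, x_bar + margin

print(f"{100 * (1 - alpha):.0f}% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"mu_0 inside the interval: {ci_low <= mu_0 <= ci_high}")
```

For this sample the interval runs from about 13.48 to 15.92, so $\mu_0 = 15$ lies inside it and $H_0$ would not be rejected.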

### Summary

- **Hypothesis Testing** helps determine if there is enough evidence to reject a null hypothesis in favor of an alternative hypothesis.
- **P-Values** indicate the probability of obtaining results as extreme as the observed results, under the assumption that the null hypothesis is true.
- **Confidence Intervals** provide a range of values that are likely to contain the true population parameter, giving an estimate of the uncertainty around the sample estimate.

By understanding these concepts, you can apply statistical methods to draw meaningful conclusions from data.

## Relationship Between Hypothesis Testing and Confidence Intervals

There is a direct relationship between hypothesis testing and confidence intervals. Both are used to make inferences about population parameters based on sample data, and they complement each other.


1. **Hypothesis Testing:**
   - **Goal:** To determine whether there is enough evidence to reject a null hypothesis ($H_0$) in favor of an alternative hypothesis ($H_1$).
   - **Decision Rule:** Based on a p-value compared to a significance level ($\alpha$). If $p \leq \alpha$, reject $H_0$.

2. **Confidence Intervals:**
   - **Goal:** To provide a range of values within which the true population parameter is likely to fall, with a certain level of confidence (e.g., 95% confidence interval).
   - **Interpretation:** If the null hypothesis value (e.g., $\mu_0$) lies outside the confidence interval, the null hypothesis is rejected at the corresponding significance level.

### Connection:

- **Confidence Interval and Hypothesis Test for the Mean:**
  - When you construct a $(1 - \alpha) \times 100\%$ confidence interval for a population mean, you are creating an interval that, with $(1 - \alpha)$ confidence, contains the true population mean.
  - If the null hypothesis value (e.g., $\mu_0$) is not within this interval, you would reject $H_0$ at the $\alpha$ significance level in a two-tailed test.

### Example:

1. **Hypothesis Test:**
   - **Null Hypothesis:** $H_0: \mu = \mu_0$
   - **Alternative Hypothesis:** $H_1: \mu \neq \mu_0$
   - **Significance Level:** $\alpha = 0.05$
   - **Decision Rule:** Calculate the p-value. If $p \leq \alpha$, reject $H_0$.

2. **Confidence Interval:**
   - **Confidence Level:** $(1 - \alpha) \times 100\% = 95\%$
   - **Construct Confidence Interval:** $\bar{x} \pm z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right)$, where $z_{\alpha/2}$ is the critical value from the standard normal distribution.
   - **Interpretation:** If $\mu_0$ is not within the 95% confidence interval, reject $H_0$ at $\alpha = 0.05$.

### Summary:

- **Equivalence in Decisions:**
  - **Hypothesis Test Result:** If $p \leq \alpha$, reject $H_0$.
  - **Confidence Interval Result:** If $\mu_0$ is outside the $(1 - \alpha) \times 100\%$ confidence interval, reject $H_0$.

Both methods ultimately provide ways to assess the evidence against the null hypothesis and are used to draw conclusions about the population based on sample data.
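
The equivalence can be checked numerically. The sketch below reuses the numbers from the earlier z-score example ($\mu_0 = 100$, $\sigma = 15$, $n = 30$, $\bar{x} = 110$) and makes the decision twice, once from the p-value and once from the confidence interval. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma = 100, 15
x_bar, n = 110, 30
alpha = 0.05

se = sigma / sqrt(n)                       # standard error of the mean

# Decision 1: two-sided z-test via the p-value.
z = (x_bar - mu_0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_by_p = p_value <= alpha

# Decision 2: is mu_0 outside the (1 - alpha) confidence interval?
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci_low, ci_high = x_bar - z_crit * se, x_bar + z_crit * se
reject_by_ci = not (ci_low <= mu_0 <= ci_high)

print(f"p-value = {p_value:.5f}, reject by p-value: {reject_by_p}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f}), reject by CI: {reject_by_ci}")
```

Both routes reject $H_0$ here: the interval is roughly (104.6, 115.4), which excludes 100, and the p-value is far below 0.05.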

## Steps in Hypothesis Testing

1. **Formulate Hypotheses:**
   - **Null Hypothesis (H0):** The statement being tested, typically asserting that there is no effect or no difference. It is the default assumption.
     - Example: $ H_0: \mu = \mu_0 $ (The population mean is equal to a specified value $\mu_0$).
   - **Alternative Hypothesis (H1 or Ha):** The competing statement, suggesting that there is an effect or a difference. It is accepted only if the data give enough evidence against $H_0$.
     - Example: $ H_1: \mu \neq \mu_0 $ (The population mean is not equal to $\mu_0$).

2. **Select a Significance Level (α):**
   - The significance level ($\alpha$) is the probability of rejecting the null hypothesis when it is true (Type I error). Common choices are 0.05, 0.01, and 0.10.
   - Example: $ \alpha = 0.05 $.

3. **Choose the Appropriate Test:**
   - Depending on the data and the hypothesis, choose a statistical test (e.g., t-test, chi-square test, ANOVA).

4. **Calculate the Test Statistic:**
   - Compute the test statistic based on your sample data. The test statistic measures how far your sample statistic is from the null hypothesis parameter, standardized by the sample variability.
   - Example (one-sample t-test):
     $$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
     where:
     - $\bar{x}$ is the sample mean
     - $\mu_0$ is the hypothesized population mean
     - $s$ is the sample standard deviation
     - $n$ is the sample size

5. **Determine the P-Value or Critical Value:**
   - **P-Value:** The probability of obtaining a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis.
     - If $ p \leq \alpha $, reject $ H_0 $.
   - **Critical Value Approach:** Compare the test statistic to a critical value from the statistical distribution corresponding to the test.
     - If the test statistic falls in the rejection region, reject $ H_0 $.

6. **Make a Decision:**
   - Based on the p-value or the critical value comparison, decide whether to reject or fail to reject the null hypothesis.
   - **Reject $ H_0 $:** There is sufficient evidence to support the alternative hypothesis.
   - **Fail to Reject $ H_0 $:** There is not sufficient evidence to support the alternative hypothesis.
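
Steps 4 to 6 can be sketched by hand for a one-sample t-test, making the decision both ways (critical value and p-value). The sample and the value $\mu_0 = 15$ are illustrative; `stats.t.ppf` and `stats.t.sf` give the critical value and the upper-tail probability of the t-distribution.

```python
import numpy as np
from scipy import stats

data = np.array([12, 14, 15, 16, 15, 14, 13, 14, 16, 18])  # illustrative sample
mu_0, alpha = 15, 0.05

n = data.size
df = n - 1

# Step 4: test statistic, standardized by the estimated standard error.
t_stat = (data.mean() - mu_0) / (data.std(ddof=1) / np.sqrt(n))

# Step 5a: critical value approach (two-sided rejection region |t| > t_crit).
t_crit = stats.t.ppf(1 - alpha / 2, df=df)
reject_by_critical = abs(t_stat) > t_crit

# Step 5b: p-value approach (both tails).
p_value = 2 * stats.t.sf(abs(t_stat), df=df)
reject_by_p = p_value <= alpha

# Step 6: both approaches lead to the same decision.
print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}, p = {p_value:.3f}")
print(f"Reject H0: {reject_by_critical} (critical value), {reject_by_p} (p-value)")
```

Here $|t| \approx 0.56$ is well inside the critical value of about 2.26, and the p-value of about 0.59 is above $\alpha$, so both approaches fail to reject $H_0$.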

### Types of Hypothesis Tests

1. **One-Sample T-Test:**
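   - Tests whether the population mean, estimated from a single sample, is equal to a known value.
   - Used when the population standard deviation is unknown and has to be estimated from the sample, which matters most when the sample is small.

2. **Two-Sample T-Test:**
   - Tests whether the means of the populations behind two independent samples are equal.
   - The standard (Student's) version assumes both samples come from populations with the same variance; Welch's version relaxes that assumption.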

3. **Paired T-Test:**
   - Tests whether the mean difference between paired observations is zero.
   - Used when data are collected in pairs or matched samples.

4. **Chi-Square Test:**
   - Tests the association between categorical variables.
   - Used to test the goodness of fit or independence in contingency tables.

5. **ANOVA (Analysis of Variance):**
   - Tests whether the means of three or more groups are equal.
   - Used to analyze differences among group means in a sample.

### Summary

- **Hypothesis Testing** involves formulating a null and alternative hypothesis, selecting a significance level, choosing the appropriate test, calculating the test statistic, and making a decision based on the p-value or critical value.
- **P-Values** indicate the probability of obtaining results as extreme as the observed results, under the assumption that the null hypothesis is true.
- **Confidence Intervals** provide a range of values that are likely to contain the true population parameter, giving an estimate of the uncertainty around the sample estimate.

By understanding and applying these steps, you can perform hypothesis testing to draw meaningful conclusions from your data.

## Example 1: One-Sample T-Test

**Hypothesis:**
- Null Hypothesis ($H_0$): The population mean is equal to a specified value ($\mu_0$).
- Alternative Hypothesis ($H_1$): The population mean is not equal to the specified value.

**Python Code:**

```python
import numpy as np
from scipy import stats

# Sample data
data = np.array([12, 14, 15, 16, 15, 14, 13, 14, 16, 18])

# Hypothesized population mean
mu_0 = 15

# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(data, mu_0)

# Output results
print("One-Sample T-Test Results:")
print(f"T-Statistic: {t_stat}")
print(f"P-Value: {p_value}")

# Decision based on significance level
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis (H0)")
else:
    print("Fail to reject the null hypothesis (H0)")
```

**Output:**

```
One-Sample T-Test Results:
T-Statistic: -0.5570860145311568
P-Value: 0.5910512317836039
Fail to reject the null hypothesis (H0)
```


## Example 2: Two-Sample T-Test

**Hypothesis:**
- Null Hypothesis ($H_0$): The means of the two populations are equal.
- Alternative Hypothesis ($H_1$): The means of the two populations are not equal.

```python
import numpy as np
from scipy import stats

# Sample data
data1 = np.array([12, 14, 15, 16, 15, 14, 13, 14, 16, 18])
data2 = np.array([11, 13, 14, 15, 14, 13, 12, 13, 15, 17])

# Perform two-sample t-test (equal variances assumed by default)
t_stat, p_value = stats.ttest_ind(data1, data2)

# Output results
print("Two-Sample T-Test Results:")
print(f"T-Statistic: {t_stat}")
print(f"P-Value: {p_value}")

# Decision based on significance level
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis (H0)")
else:
    print("Fail to reject the null hypothesis (H0)")
```

**Output:**

```
Two-Sample T-Test Results:
T-Statistic: 1.3130643285972254
P-Value: 0.20565542342928522
Fail to reject the null hypothesis (H0)
```
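
By default `stats.ttest_ind` assumes equal population variances. When that assumption is doubtful, passing `equal_var=False` runs Welch's t-test instead. The sketch below applies it to the same two samples as Example 2; it is a variation for comparison, not part of the original example.

```python
import numpy as np
from scipy import stats

data1 = np.array([12, 14, 15, 16, 15, 14, 13, 14, 16, 18])
data2 = np.array([11, 13, 14, 15, 14, 13, 12, 13, 15, 17])

# Student's t-test (pooled variance) versus Welch's t-test (separate variances).
t_pooled, p_pooled = stats.ttest_ind(data1, data2)
t_welch, p_welch = stats.ttest_ind(data1, data2, equal_var=False)

print(f"Pooled: t = {t_pooled:.4f}, p = {p_pooled:.4f}")
print(f"Welch:  t = {t_welch:.4f}, p = {p_welch:.4f}")
```

These two samples have the same size and the same spread, so both versions agree. The results diverge when the group variances or sizes differ, which is where Welch's version is the safer choice.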


## Example 3: Paired T-Test

**Hypothesis:**
- Null Hypothesis ($H_0$): The mean difference between paired observations is zero.
- Alternative Hypothesis ($H_1$): The mean difference between paired observations is not zero.

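```python
import numpy as np
from scipy import stats

# Paired sample data
data_before = np.array([10, 12, 14, 16, 18])
data_after = np.array([12, 14, 16, 18, 20])

# Perform paired t-test
t_stat, p_value = stats.ttest_rel(data_before, data_after)

# Output results
print("Paired T-Test Results:")
print(f"T-Statistic: {t_stat}")
print(f"P-Value: {p_value}")

# Decision based on significance level
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis (H0)")
else:
    print("Fail to reject the null hypothesis (H0)")
```

**Output:**

```
Paired T-Test Results:
T-Statistic: -inf
P-Value: 0.0
Reject the null hypothesis (H0)
```

Note that every pair in this toy data differs by exactly 2, so the standard deviation of the differences is zero. The t-statistic divides by that zero, which is why it comes out as `-inf` with a p-value of 0.0. Real paired data would show some variation in the differences and give a finite statistic.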
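
A paired t-test is the same thing as a one-sample t-test on the pairwise differences against zero. The sketch below shows this with hypothetical before/after values, chosen so that the differences vary and the statistic is finite; the numbers are made up for illustration only.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements with non-constant differences.
before = np.array([10, 12, 14, 16, 18])
after = np.array([12, 13, 17, 18, 21])

# Paired t-test on the two arrays.
t_rel, p_rel = stats.ttest_rel(before, after)

# Equivalent one-sample t-test on the differences against zero.
t_one, p_one = stats.ttest_1samp(before - after, 0.0)

print(f"Paired:       t = {t_rel:.4f}, p = {p_rel:.4f}")
print(f"One-sample:   t = {t_one:.4f}, p = {p_one:.4f}")
```

Both calls return the same statistic and p-value, which makes clear that only the differences within each pair enter the test.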


## Example 4: Chi-Square Test for Independence

**Hypothesis:**
- Null Hypothesis ($H_0$): The two categorical variables are independent.
- Alternative Hypothesis ($H_1$): The two categorical variables are not independent.

```python
import numpy as np
from scipy import stats

# Contingency table
observed = np.array([[10, 10, 20], [20, 20, 20]])

# Perform chi-square test for independence
chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)

# Output results
print("Chi-Square Test for Independence Results:")
print(f"Chi-Square Statistic: {chi2_stat}")
print(f"P-Value: {p_value}")
print(f"Degrees of Freedom: {dof}")
print(f"Expected Frequencies: \n{expected}")

# Decision based on significance level
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis (H0)")
else:
    print("Fail to reject the null hypothesis (H0)")
```

**Output:**

```
Chi-Square Test for Independence Results:
Chi-Square Statistic: 2.7777777777777777
P-Value: 0.24935220877729622
Degrees of Freedom: 2
Expected Frequencies: 
[[12. 12. 16.]
 [18. 18. 24.]]
Fail to reject the null hypothesis (H0)
```
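
To see where the expected frequencies and the statistic come from, the same numbers can be reproduced by hand with NumPy. This is a sketch of the standard Pearson chi-square computation for the contingency table above; variable names are illustrative.

```python
import numpy as np

observed = np.array([[10, 10, 20], [20, 20, 20]])

# Under independence: expected count = row total * column total / grand total.
row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 3)
expected = row_totals * col_totals / observed.sum()

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected)
print(f"chi2 = {chi2_stat:.4f}, dof = {dof}")
```

The expected table and the statistic of about 2.78 with 2 degrees of freedom match what `stats.chi2_contingency` reports.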


## Example 5: One-Way ANOVA

**Hypothesis:**
- Null Hypothesis ($H_0$): The means of the different groups are equal.
- Alternative Hypothesis ($H_1$): At least one group mean is different from the others.

```python
import numpy as np
from scipy import stats

# Sample data
group1 = np.array([12, 14, 15, 16, 15])
group2 = np.array([11, 13, 14, 15, 14])
group3 = np.array([10, 12, 13, 14, 13])

# Perform one-way ANOVA
f_stat, p_value = stats.f_oneway(group1, group2, group3)

# Output results
print("One-Way ANOVA Results:")
print(f"F-Statistic: {f_stat}")
print(f"P-Value: {p_value}")

# Decision based on significance level
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis (H0)")
else:
    print("Fail to reject the null hypothesis (H0)")
```

**Output:**

```
One-Way ANOVA Results:
F-Statistic: 2.1739130434782608
P-Value: 0.15643265735717113
Fail to reject the null hypothesis (H0)
```
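
The F-statistic is the ratio of between-group to within-group variability. The sketch below recomputes it by hand for the three groups above, as a check on what `stats.f_oneway` does; variable names are illustrative.

```python
import numpy as np

groups = [
    np.array([12, 14, 15, 16, 15]),
    np.array([11, 13, 14, 15, 14]),
    np.array([10, 12, 13, 14, 13]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k, n_total = len(groups), all_values.size

# Between-group sum of squares: spread of the group means around the grand mean.
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within.
f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}, F = {f_stat:.4f}")
```

This gives a between-group sum of squares of 10 and a within-group sum of squares of 27.6, so $F = 5 / 2.3 \approx 2.17$, in agreement with the library result.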


These examples demonstrate how to perform various hypothesis tests in Python using the `scipy.stats` library. Each example includes the steps to formulate hypotheses, calculate the test statistic, determine the p-value, and make a decision based on the significance level.