# Hypothesis Testing

Hypothesis testing is a statistical method used to make decisions about population parameters based on sample data. It involves assessing the validity of a hypothesis regarding a population mean, denoted as \( \mu \). For example, the statement “\( \mu > 11 \)” is a hypothesis about the population mean \( \mu \). To determine how certain we can be that such a hypothesis is true, we must perform a hypothesis test.

A hypothesis test produces a number between 0 and 1, the p-value, that measures how plausible a hypothesis about a quantity, such as a population mean, is in light of the sample data.

In most situations, the **null hypothesis (H₀)** states that the observed effect is due only to random variation between the sample and the population. In contrast, the **alternative hypothesis (H₁)** claims that the observed effect is real and accurately represents the whole population.

## Types of Hypothesis Tests
- **Z-Test**
- **One-Sample T-Test**
- **Two-Sample T-Test**
- **Paired T-Test**

The choice between a Z-test and a T-test primarily depends on two factors:

1. **Sample Size**:
   - **Z-Test**: Used when the sample size is large (usually \( n \geq 30 \)).
   - **T-Test**: More appropriate for small samples (typically \( n < 30 \)).

2. **Population Standard Deviation**:
   - **Z-Test**: Requires knowledge of the population standard deviation.
   - **T-Test**: Used when the population standard deviation is unknown, which is common in real-world scenarios.
---
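One way to see why the sample-size rule of thumb works is to compare the two-tailed critical values of the t distribution with the standard normal one. The sketch below assumes α = 0.05 and a handful of illustrative sample sizes (both are choices made here, not taken from a dataset). As \( n \) grows, the t critical value approaches the z critical value, so the two tests give nearly the same answer for large samples.

```python
from scipy import stats

alpha = 0.05  # assumed significance level for this illustration

# Two-tailed critical value from the standard normal distribution
z_crit = stats.norm.ppf(1 - alpha / 2)
print(f"z critical value: {z_crit:.3f}")

# Two-tailed critical values from the t distribution (df = n - 1)
t_crits = {}
for n in (5, 10, 30, 100, 1000):
    t_crits[n] = stats.t.ppf(1 - alpha / 2, df=n - 1)
    print(f"n = {n:4d}: t critical value = {t_crits[n]:.3f}")
```

For small samples the t critical value is noticeably larger, which reflects the extra uncertainty from estimating the standard deviation from the data.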

## Introduction to Hypothesis Testing

In hypothesis testing, we start with two hypotheses:

1. **Null Hypothesis (H₀)**: The statement representing no effect or no difference in the population.
2. **Alternative Hypothesis (H₁)**: The statement we accept if the null hypothesis is rejected, indicating an effect or a difference.

### Key Concepts

- **Significance Level (α)**: The probability of rejecting the null hypothesis when it is actually true. Common values are **0.05** or **0.01**.
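- **P-Value**: The probability, assuming H₀ is true, of observing a test statistic at least as extreme as the one actually observed. It measures the plausibility of H₀: the smaller the p-value, the stronger the evidence against H₀.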
- **Critical Value**: The value that the test statistic must exceed to reject the null hypothesis.
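The p-value and the critical value are two views of the same decision rule. A minimal sketch, assuming a two-tailed z-test and a hypothetical observed statistic of 2.1 (a value chosen only for illustration):

```python
from scipy import stats

alpha = 0.05       # significance level
z_observed = 2.1   # hypothetical test statistic, for illustration only

# P-value approach: probability of a statistic at least this extreme under H0
p_value = 2 * stats.norm.sf(abs(z_observed))

# Critical-value approach: threshold that |z| must exceed at this alpha
z_critical = stats.norm.ppf(1 - alpha / 2)

reject_by_p = p_value < alpha
reject_by_critical = abs(z_observed) > z_critical

print(f"p-value = {p_value:.4f}, critical value = {z_critical:.3f}")
print("Reject H0 (p-value rule):", reject_by_p)
print("Reject H0 (critical-value rule):", reject_by_critical)
```

Both rules lead to the same decision for any given statistic and α, so either can be used.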

### Steps to Perform Hypothesis Testing

1. **State the Hypotheses**: Define both the null and alternative hypotheses.
2. **Select a Significance Level (α)**: Typically, we use a level of **0.05**.
3. **Choose an Appropriate Test**: Select the test based on data characteristics and assumptions.
4. **Calculate the Test Statistic**: Perform calculations based on the test chosen.
5. **Determine the P-Value or Critical Value**: Compare the p-value against the significance level, or the test statistic against the critical value.
6. **Make a Decision**: If the p-value is less than α, reject the null hypothesis; otherwise, fail to reject it.
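The steps above can be sketched as a small function. This is a minimal illustration, assuming a two-tailed one-sample z-test with a known population standard deviation; `one_sample_z_test` is a hypothetical helper (not part of SciPy) and the scores are made-up data.

```python
import numpy as np
from scipy import stats


def one_sample_z_test(sample, mu_0, sigma, alpha=0.05):
    """Two-tailed one-sample z-test with a known population standard deviation.

    Hypothetical helper for illustration; not part of SciPy.
    """
    sample = np.asarray(sample, dtype=float)
    # Step 4: calculate the test statistic
    z = (sample.mean() - mu_0) / (sigma / np.sqrt(sample.size))
    # Step 5: two-tailed p-value from the standard normal distribution
    p_value = 2 * stats.norm.sf(abs(z))
    # Step 6: decision
    reject = p_value < alpha
    return z, p_value, reject


# Steps 1-3: H0: mu = 100, H1: mu != 100, alpha = 0.05, z-test (sigma assumed known)
scores = [108, 112, 95, 104, 110, 99, 107, 115, 101, 109]  # made-up data
z, p_value, reject = one_sample_z_test(scores, mu_0=100, sigma=15)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```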


### Z-Test
A Z-test is typically used when the sample size is large and the population variance is known. It allows us to test if there is a significant difference between a sample statistic (like a mean) and a known population parameter.

**Null Hypothesis (H₀)**: The mean of the population that the sample represents is equal to the known population mean (no difference).

**Alternative Hypothesis (H₁)**: The mean of the population that the sample represents is different from the known population mean.
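
For reference, the statistic this test relies on can be written as follows, where \( \bar{x} \) is the sample mean, \( \mu_0 \) the hypothesized population mean, \( \sigma \) the known population standard deviation, and \( n \) the sample size (the symbols \( \bar{x} \), \( \mu_0 \), and \( \sigma \) are notation introduced here for convenience):

\[
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\]

For a two-tailed test, the p-value is \( 2\,P(Z \geq |z|) \), where \( Z \) follows the standard normal distribution.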


In [1]:
# Dependencies
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt

In [2]:
# Population parameters
population_mean = 100
population_std_dev = 15

In [3]:
# Sample data (scores of a sample of 50 students)
np.random.seed(0)
sample_data = np.random.normal(loc=population_mean, scale=population_std_dev, size=50)
sample_mean = np.mean(sample_data)
sample_mean

np.float64(102.10838908469648)

In [4]:
# Z-test calculation
z_score = (sample_mean - population_mean) / (population_std_dev / np.sqrt(len(sample_data)))
p_value = stats.norm.sf(abs(z_score)) * 2  # Two-tailed test
print("Z-Score:", z_score)
print("P-Value:", p_value)

Z-Score: 0.9939041461123878
P-Value: 0.32026953566597427


Since the p-value (0.32) is greater than the significance level of 0.05, we fail to reject the null hypothesis. There is not enough evidence to conclude that the sample mean differs significantly from the population mean of 100 at the 5% significance level.
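
To get a feel for what the 0.05 threshold means, the sketch below repeats the same z-test on many simulated samples for which the null hypothesis is true by construction. It assumes the population parameters used above (mean 100, standard deviation 15, \( n = 50 \)); the number of repetitions and the seed are arbitrary choices. Roughly 5% of the tests should reject H₀ purely by chance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # arbitrary seed

population_mean = 100
population_std_dev = 15
n = 50
n_simulations = 5000  # arbitrary number of repeated experiments
alpha = 0.05

# Each row is one simulated sample drawn from the null population
samples = rng.normal(population_mean, population_std_dev, size=(n_simulations, n))

# Z-score and two-tailed p-value for every simulated sample
z_scores = (samples.mean(axis=1) - population_mean) / (population_std_dev / np.sqrt(n))
p_values = 2 * stats.norm.sf(np.abs(z_scores))

# Fraction of simulated experiments that (wrongly) reject the true H0
false_positive_rate = np.mean(p_values < alpha)
print("Share of tests rejecting a true H0:", false_positive_rate)
```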

### One-Sample T-Test
A one-sample t-test is used when we have a sample mean and we want to test if it is different from a known or hypothesized population mean.

Example: Testing Mean Age Difference

Let's consider a population of voter ages and a smaller sample from Minnesota voters. We want to test if the average age in Minnesota differs from the population mean.


To conduct the one-sample t-test, we use `stats.ttest_1samp()`.

- **Null Hypothesis (H₀)**: The mean age of Minnesota voters is equal to the national mean.

- **Alternative Hypothesis (H₁)**: The mean age of Minnesota voters differs from the national mean.

In [5]:
# Set random seed for reproducibility
np.random.seed(6)


In [6]:
# Population data
population_ages1 = stats.poisson.rvs(loc=18, mu=35, size=150000)
population_ages2 = stats.poisson.rvs(loc=18, mu=10, size=100000)
population_ages = np.concatenate((population_ages1, population_ages2))

In [7]:
# Minnesota sample data
minnesota_ages1 = stats.poisson.rvs(loc=18, mu=30, size=30)
minnesota_ages2 = stats.poisson.rvs(loc=18, mu=10, size=20)
minnesota_ages = np.concatenate((minnesota_ages1, minnesota_ages2))


In [8]:
# Calculate means
print("Population Mean Age:", population_ages.mean())
print("Minnesota Sample Mean Age:", minnesota_ages.mean())

Population Mean Age: 43.000112
Minnesota Sample Mean Age: 39.26


In [9]:
# One-Sample T-Test
t_stat, p_value = stats.ttest_1samp(a=minnesota_ages, popmean=population_ages.mean())
print("T-statistic:", t_stat)
print("P-value:", p_value)

T-statistic: -2.5742714883655027
P-value: 0.013118685425061678


Because the p-value (0.0131) is less than our chosen significance level (0.05), we reject the null hypothesis and conclude that the mean age of Minnesota voters is significantly different from the population mean.
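
To make the mechanics of `stats.ttest_1samp()` less opaque, the sketch below recomputes the t-statistic and the two-tailed p-value by hand and compares them with SciPy's result. The sample and the hypothesized mean are made-up values used only for illustration.

```python
import numpy as np
from scipy import stats

ages = np.array([34, 41, 29, 52, 47, 38, 45, 31, 36, 40])  # made-up sample
hypothesized_mean = 43  # made-up reference value

n = ages.size
sample_std = ages.std(ddof=1)  # sample standard deviation (n - 1 denominator)
standard_error = sample_std / np.sqrt(n)

# t = (sample mean - hypothesized mean) / standard error
t_manual = (ages.mean() - hypothesized_mean) / standard_error
# Two-tailed p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

t_scipy, p_scipy = stats.ttest_1samp(ages, popmean=hypothesized_mean)

print("Manual:", t_manual, p_manual)
print("SciPy: ", t_scipy, p_scipy)
```

The only difference from the z-test is that the standard deviation is estimated from the sample and the p-value comes from the t distribution instead of the normal distribution.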

### Two-Sample T-Test

A two-sample t-test investigates whether the means of two independent samples differ significantly. This test is useful when comparing two distinct groups to see if there is a meaningful difference in their average values.

- **Null Hypothesis (H₀)**: The means of both groups are the same.
- **Alternative Hypothesis (H₁)**: The means of the two groups are different.

Unlike the one-sample test, where we compare a sample mean against a known population mean, the two-sample t-test only involves sample means. Here, we'll use the `stats.ttest_ind()` function to conduct a two-sample t-test.

Let's generate a sample of voter age data for Wisconsin and compare it against the Minnesota sample we created earlier.


In [10]:
# Setting a seed for reproducibility
np.random.seed(12)

# Generating voter age data for Wisconsin
wisconsin_ages1 = stats.poisson.rvs(loc=18, mu=33, size=30)
wisconsin_ages2 = stats.poisson.rvs(loc=18, mu=13, size=20)
wisconsin_ages = np.concatenate((wisconsin_ages1, wisconsin_ages2))

# Display the mean age for the Wisconsin sample
print("Wisconsin sample mean age:", wisconsin_ages.mean())

Wisconsin sample mean age: 42.8


In [11]:
# Performing the two-sample t-test
result = stats.ttest_ind(a=minnesota_ages, b=wisconsin_ages, equal_var=False)
print("T-statistic:", result.statistic)
print("P-value:", result.pvalue)


T-statistic: -1.7083870793286842
P-value: 0.09073104343957748


Here we would fail to reject the null hypothesis since the p-value (0.0907) is greater than 0.05.

Conclusion: There is not enough evidence to conclude that the mean ages of voters in Wisconsin and Minnesota differ at the 5% significance level.
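
Because `equal_var=False` is passed, SciPy performs Welch's t-test, which does not pool the two sample variances. A minimal sketch of that calculation on two made-up samples (the data and variable names are illustrative assumptions), checked against `stats.ttest_ind()`:

```python
import numpy as np
from scipy import stats

group_a = np.array([39, 44, 28, 51, 36, 42, 30, 47])      # made-up sample
group_b = np.array([45, 49, 38, 55, 41, 52, 47, 44, 50])  # made-up sample

# Variance of each sample mean (sample variance / sample size)
var_a = group_a.var(ddof=1) / group_a.size
var_b = group_b.var(ddof=1) / group_b.size

# Welch's t-statistic
t_manual = (group_a.mean() - group_b.mean()) / np.sqrt(var_a + var_b)

# Welch-Satterthwaite approximation of the degrees of freedom
df = (var_a + var_b) ** 2 / (
    var_a ** 2 / (group_a.size - 1) + var_b ** 2 / (group_b.size - 1)
)
p_manual = 2 * stats.t.sf(abs(t_manual), df=df)

result = stats.ttest_ind(a=group_a, b=group_b, equal_var=False)
print("Manual:", t_manual, p_manual)
print("SciPy: ", result.statistic, result.pvalue)
```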


### Paired T-Test

The basic two-sample t-test is designed for testing differences between independent groups. However, in some cases, you might be interested in testing differences within the same group over time. For instance, a hospital might want to test whether a weight-loss drug is effective by comparing patients' weights before and after treatment.

The **Paired T-Test** is useful in this context, as it allows us to test whether the means of samples from the same group differ at two different points in time.

- **Null Hypothesis (H₀)**: There is no difference in the mean values of the paired samples (e.g., weights before and after treatment).
- **Alternative Hypothesis (H₁)**: There is a significant difference in the mean values of the paired samples.

We can conduct a paired t-test using the `scipy.stats.ttest_rel()` function.

In [12]:
# Setting a seed for reproducibility
np.random.seed(11)

# Generating weight data before and after treatment
before = stats.norm.rvs(scale=30, loc=250, size=100)
after = before + stats.norm.rvs(scale=5, loc=-1.25, size=100)

In [13]:
# Creating a DataFrame to store and summarize data
weight_df = pd.DataFrame({
    "weight_before": before,
    "weight_after": after,
    "weight_change": after - before
})

In [14]:
# Displaying a summary of the data
weight_df.describe()

| | weight_before | weight_after | weight_change |
|---|---|---|---|
| count | 100.0 | 100.0 | 100.0 |
| mean | 250.345546 | 249.115171 | -1.230375 |
| std | 28.132539 | 28.422183 | 4.783696 |
| min | 170.400443 | 165.91393 | -11.495286 |
| 25% | 230.421042 | 229.148236 | -4.046211 |
| 50% | 250.830805 | 251.134089 | -1.413463 |
| 75% | 270.637145 | 268.927258 | 1.738673 |
| max | 314.700233 | 316.720357 | 9.759282 |


In [15]:
# Performing the paired t-test
result = stats.ttest_rel(a=before, b=after)
print("T-statistic:", result.statistic)
print("P-value:", result.pvalue)


T-statistic: 2.5720175998568284
P-value: 0.011596444318439857


With a significance level of 0.05 (95% confidence level), we would reject the null hypothesis since the p-value (0.0116) is less than 0.05.

Conclusion: The result suggests that there is a statistically significant difference in weight before and after the treatment, indicating that the treatment may be effective.
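
A paired t-test is equivalent to a one-sample t-test on the per-subject differences against a mean of zero. The sketch below checks this on a small made-up data set (the weights and variable names are illustrative assumptions, not study data):

```python
import numpy as np
from scipy import stats

weights_before = np.array([250.0, 231.5, 268.2, 245.1, 259.8, 240.3])  # made-up
weights_after = np.array([247.9, 232.0, 264.5, 243.8, 255.1, 241.0])   # made-up

# Paired t-test on the two related samples
paired = stats.ttest_rel(a=weights_before, b=weights_after)

# One-sample t-test on the differences, testing against a mean of 0
differences = weights_before - weights_after
one_sample = stats.ttest_1samp(differences, popmean=0)

print("Paired:    ", paired.statistic, paired.pvalue)
print("One-sample:", one_sample.statistic, one_sample.pvalue)
```

This is why the pairing matters: the test looks only at the within-subject changes, so the large variation in weight between subjects does not inflate the standard error.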