# Introduction to Hypothesis Testing

Hypothesis testing is a statistical method used to make decisions about a population parameter based on sample data. It involves making an assumption (the null hypothesis) and then determining whether the observed data provides enough evidence to reject that assumption in favor of an alternative hypothesis.

### Concepts:
- **Null Hypothesis (H0):** The hypothesis that there is no effect or difference. It is the assumption that we start with.
- **Alternative Hypothesis (H1):** The hypothesis that there is an effect or difference.
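- **Significance Level (α):** The probability of a Type I error we are willing to accept, chosen before looking at the data. It is the threshold the p-value is compared against and is commonly set at 0.05.
- **P-value:** The probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed. If the p-value is less than α, we reject the null hypothesis.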
- **Type I Error:** Rejecting the null hypothesis when it is actually true.
- **Type II Error:** Failing to reject the null hypothesis when it is actually false.
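
The link between α and the Type I error rate can be checked with a small simulation. The sketch below is illustrative only: the normal population, the sample size of 30, the number of experiments, and the seed are all assumptions made for the example. Because the null hypothesis is true in every simulated experiment, each rejection is a Type I error, and the share of rejections should land close to α.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha = 0.05
n_experiments = 2000

# Each experiment draws 30 values from a population whose true mean is 0,
# so the null hypothesis "mean = 0" is true in every experiment.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_experiments, 30))
p_values = stats.ttest_1samp(samples, popmean=0.0, axis=1).pvalue

# Every rejection here is a Type I error, because H0 is actually true.
type_i_rate = np.mean(p_values < alpha)
print(f"Share of experiments that rejected H0: {type_i_rate:.3f}")
```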

# Steps in Hypothesis Testing

1. **Formulate the Hypotheses:**
   - **H0 (Null Hypothesis):** There is no difference or effect.
   - **H1 (Alternative Hypothesis):** There is a difference or effect.
   
2. **Select a Significance Level (α):** Typically 0.05.

3. **Choose the Appropriate Test:** Depending on the data and the hypothesis, select a test like a t-test, chi-square test, etc.

4. **Compute the Test Statistic and P-value:** Use statistical software or formulas to calculate the test statistic, and the p-value that goes with it, from the sample data.

5. **Make a Decision:** Compare the p-value with the significance level to decide whether to reject H0.
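
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a template for every test: the sample values and the hypothesized mean of 50 are made up, and a one-sample t-test is assumed to be the appropriate choice for them.

```python
from scipy import stats

# Step 1: H0: population mean = 50, H1: population mean != 50 (hypothetical claim)
hypothesized_mean = 50

# Step 2: significance level
alpha = 0.05

# Hypothetical sample, made up for this sketch
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56, 52]

# Steps 3 and 4: a one-sample t-test suits a single sample with an unknown
# population standard deviation; SciPy returns the statistic and the p-value
result = stats.ttest_1samp(sample, popmean=hypothesized_mean)

# Step 5: compare the p-value with alpha
reject_h0 = result.pvalue < alpha
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}, reject H0: {reject_h0}")
```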


To make these ideas concrete, let's work through an example. We'll test the hypothesis that you have mystical abilities to predict the outcome of coin flips.

## Scenario

- We test your claim by having you predict the outcome of 100 coin flips.
- You correctly predict 57 out of 100 flips.

We'll use a hypothesis test to determine whether your performance is significantly better than random guessing (50%).

## Step 1: Define the Hypotheses

We'll consider two hypotheses:

- **Null Hypothesis (H0):** You have no predictive power, so your true success rate is 50%.
- **Alternative Hypothesis (H1):** You have predictive power, so your true success rate is greater than 50%.

In [12]:
import scipy.stats as stats

# Number of trials
n = 100

# Number of successes
k = 57

# Probability under the null hypothesis
p = 0.5

# Perform the binomial test
test = stats.binomtest(k, n, p, alternative='greater')
p_value = test.pvalue

print(f"P-value: {p_value}")

# significance level
alpha = 0.05

if p_value < alpha:
    print("Reject the null hypothesis. You might have mystical abilities!")
else:
    print("Fail to reject the null hypothesis. There's no evidence to suggest you're special.")

P-value: 0.0966739522478214
Fail to reject the null hypothesis. There's no evidence to suggest you're special.
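
The p-value printed above is the upper tail of a Binomial(100, 0.5) distribution, P(X ≥ 57). As a rough cross-check, the sketch below recomputes it with `stats.binom.sf` and then searches for the smallest number of correct predictions that would have led to rejection; the name `k_crit` is introduced here purely for illustration.

```python
from scipy import stats

n, k, p = 100, 57, 0.5

# P(X >= 57) for X ~ Binomial(100, 0.5); sf(k - 1) gives P(X > k - 1) = P(X >= k)
p_manual = stats.binom.sf(k - 1, n, p)
print(f"P(X >= {k}) = {p_manual}")

# Smallest number of correct predictions whose one-sided p-value falls below alpha
alpha = 0.05
k_crit = next(x for x in range(n + 1) if stats.binom.sf(x - 1, n, p) < alpha)
print(f"Correct predictions needed to reject H0 at alpha = {alpha}: {k_crit}")
```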


The following sections cover some commonly used hypothesis tests.

# Z-Test

## When to Use:
- When you want to compare the mean of a sample to a known population mean.
- The population standard deviation is known, and the sample is large enough (a common rule of thumb is n > 30) for the sample mean to be approximately normally distributed.

### Example:
Testing if the average height of a sample of men in a city is different from the national average height of 175 cm.

- **Null Hypothesis (H₀):** The average height of men in the city is equal to the national average height.
$$H_0: \mu = 175$$
- **Alternative Hypothesis (H₁):** The average height of men in the city is different from the national average height.
$$H_1: \mu \neq 175$$


In [2]:
import numpy as np
from scipy import stats


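sample_mean = 180       # observed sample mean (cm)
population_mean = 175   # national average under H0 (cm)
std_dev = 10            # known population standard deviation (cm)
n = 50                  # sample size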

z = (sample_mean - population_mean) / (std_dev / np.sqrt(n))
p_value = 2 * (1 - stats.norm.cdf(abs(z)))

print(f"Z-Value: {z}")
print(f"P-Value: {p_value}")

Z-Value: 3.5355339059327378
P-Value: 0.00040695201744500586
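
For reference, the calculation in the cell above can be written out as a formula. The symbols $\bar{x}$ (sample mean), $\mu_0$ (mean under the null hypothesis), $\sigma$ (known population standard deviation) and $\Phi$ (standard normal CDF) are notation introduced here, not taken from the code:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{180 - 175}{10 / \sqrt{50}} \approx 3.54$$

$$p = 2\,\bigl(1 - \Phi(|z|)\bigr) \approx 0.0004$$

Since the p-value is well below 0.05, the null hypothesis would be rejected at that significance level.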


# T-Test

## When to Use:

- When comparing the means of two groups (independent or paired) or when comparing the mean of a sample to a known population mean.
- The population standard deviation is unknown and has to be estimated from the sample. This matters most when the sample size is small (n < 30).
  
## Types of T-Tests:
- **One-Sample T-Test:** Compare sample mean to a known population mean.
- **Independent Two-Sample T-Test:** Compare the means of two independent groups.
- **Paired T-Test:** Compare the means of two related groups (e.g., before and after treatment).
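
Each of the three variants corresponds to a different SciPy function. The sketch below shows the calls side by side; the scores and the reference value of 70 are hypothetical and exist only to make the example runnable.

```python
from scipy import stats

# Hypothetical scores, made up for this sketch
pre_scores = [72, 75, 68, 80, 77, 74]
post_scores = [78, 79, 70, 85, 80, 79]   # same subjects, measured again
other_group = [70, 65, 74, 69, 72, 68]   # unrelated subjects

# One-sample: is the mean of pre_scores different from a reference value of 70?
one_sample = stats.ttest_1samp(pre_scores, popmean=70)

# Independent two-sample: do two unrelated groups have different means?
independent = stats.ttest_ind(pre_scores, other_group)

# Paired: did the same subjects change between the two measurements?
paired = stats.ttest_rel(pre_scores, post_scores)

for name, res in [("one-sample", one_sample),
                  ("independent", independent),
                  ("paired", paired)]:
    print(f"{name}: t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```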

### Example:
Testing if the average scores of students from two different classes are significantly different.

- Null Hypothesis (H₀): The average scores of students in Class A and Class B are equal.
$$H_0: \mu_A = \mu_B$$
 
- Alternative Hypothesis (H₁): The average scores of students in Class A and Class B are not equal.
$$H_1: \mu_A \neq \mu_B$$
 

In [8]:
class_A_scores = [85, 87, 88, 90, 85]
class_B_scores = [78, 89, 85, 80, 82]

# Independent Two-Sample T-Test
t_stat, p_value = stats.ttest_ind(class_A_scores, class_B_scores)

print(f"T-Statistic: {t_stat}")
print(f"P-Value: {p_value}")

T-Statistic: 1.9498010508590455
P-Value: 0.08701892760875735
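
By default `stats.ttest_ind` assumes both groups share the same variance. In this data the scores of Class B are noticeably more spread out than those of Class A (sample variances of 18.7 and 4.5), so one reasonable alternative is Welch's t-test, which drops that assumption. The sketch below is a variation on the example, not a replacement for it:

```python
from scipy import stats

class_A_scores = [85, 87, 88, 90, 85]
class_B_scores = [78, 89, 85, 80, 82]

# equal_var=False switches ttest_ind from the pooled-variance test to Welch's t-test
t_welch, p_welch = stats.ttest_ind(class_A_scores, class_B_scores, equal_var=False)

print(f"Welch T-Statistic: {t_welch}")
print(f"Welch P-Value: {p_welch}")
```

Because both classes have the same number of students, the t-statistic is unchanged; only the degrees of freedom, and therefore the p-value, differ.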


# Chi-Square Test

## When to Use:
- When you want to test the independence of two categorical variables or the goodness of fit between observed and expected frequencies.
- Each category (or each cell of the contingency table) should have an expected frequency of at least 5.
  
## Types of Chi-Square Tests:
- **Chi-Square Test for Independence:** Tests if two categorical variables are independent.
- **Chi-Square Goodness-of-Fit Test:** Tests if observed frequencies match expected frequencies.
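
The example further down covers the test for independence. For the goodness-of-fit variant, a minimal sketch with `stats.chisquare` might look like the following. The die-roll counts are hypothetical, and the null hypothesis is assumed to be a fair die (each face expected 20 times in 120 rolls).

```python
from scipy import stats

# Hypothetical counts for faces 1..6 over 120 rolls, made up for this sketch
observed = [18, 22, 16, 25, 19, 20]

# Under H0 (a fair die) every face is expected 120 / 6 = 20 times
expected = [20] * 6

chi2_stat, p_value = stats.chisquare(observed, f_exp=expected)

print(f"Chi-Square Statistic: {chi2_stat}")
print(f"P-Value: {p_value}")
```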

### Example:
Testing if there is an association between gender and preference for a new product.
- **Null Hypothesis (H₀):** There is no association between gender and preference for the new product (they are independent).
$$H_0: \text{Gender and Preference are independent}$$
- **Alternative Hypothesis (H₁):** There is an association between gender and preference for the new product (they are not independent).
$$H_1: \text{Gender and Preference are not independent}$$


In [9]:
import pandas as pd


data = pd.DataFrame({
    'Male': [30, 10],
    'Female': [15, 45]
}, index=['Like', 'Dislike'])

chi2, p, dof, expected = stats.chi2_contingency(data)

print(f"Chi-Square Statistic: {chi2}")
print(f"P-Value: {p}")

Chi-Square Statistic: 22.264309764309765
P-Value: 2.375816149275661e-06
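
The `expected` array returned by `chi2_contingency` can be used to check the rule of thumb that every expected frequency should be at least 5. The sketch below repeats the test on the same counts as a plain NumPy array, and also shows the `correction` parameter, since SciPy applies Yates' continuity correction to 2x2 tables by default.

```python
import numpy as np
from scipy import stats

# Same counts as the table above: rows are Like/Dislike, columns are Male/Female
observed = np.array([[30, 15],
                     [10, 45]])

chi2, p, dof, expected = stats.chi2_contingency(observed)
print("Expected frequencies under independence:")
print(expected)
print(f"All expected counts >= 5: {(expected >= 5).all()}")

# correction=False gives the uncorrected Pearson statistic instead
chi2_raw, p_raw, _, _ = stats.chi2_contingency(observed, correction=False)
print(f"Uncorrected Chi-Square Statistic: {chi2_raw}")
```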


# ANOVA (Analysis of Variance)

## When to Use:

- When comparing the means of three or more groups.
- Observations are independent, the data within each group is approximately normally distributed, and variances are similar across groups.

### Example:
Testing if the average scores are different across three teaching methods.

- **Null Hypothesis (H₀):** The average scores are the same across all three teaching methods.
$$H_0: \mu_A = \mu_B = \mu_C$$
- **Alternative Hypothesis (H₁):** At least one teaching method has a different average score.
$$H_1: \text{At least one } \mu \text{ is different}$$

In [11]:
method_A = [85, 86, 88, 75, 78]
method_B = [82, 79, 85, 89, 90]
method_C = [92, 94, 89, 88, 91]


f_stat, p_value = stats.f_oneway(method_A, method_B, method_C)

print(f"F-Statistic: {f_stat}")
print(f"P-Value: {p_value}")


F-Statistic: 4.741880341880339
P-Value: 0.03036866078635405
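
The assumptions listed under "When to Use" can be checked, at least roughly, before trusting the F-test. One possible approach, sketched below, uses Levene's test for equal variances and the Shapiro-Wilk test for normality. With only five scores per group these checks have little power, so they are a sanity check rather than proof.

```python
from scipy import stats

method_A = [85, 86, 88, 75, 78]
method_B = [82, 79, 85, 89, 90]
method_C = [92, 94, 89, 88, 91]

# Levene's test: H0 is that all groups have equal variances
levene_stat, levene_p = stats.levene(method_A, method_B, method_C)
print(f"Levene P-Value: {levene_p}")

# Shapiro-Wilk test per group: H0 is that the group comes from a normal distribution
shapiro_p_values = {}
for name, scores in [("A", method_A), ("B", method_B), ("C", method_C)]:
    shapiro_stat, shapiro_p = stats.shapiro(scores)
    shapiro_p_values[name] = shapiro_p
    print(f"Method {name} Shapiro-Wilk P-Value: {shapiro_p}")
```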


# Mann-Whitney U Test

## When to Use:
- When comparing two independent groups, especially when the data is not normally distributed.
- It is a non-parametric test: it does not assume normality. Reading the result as a difference in medians does, however, require the two distributions to have a similar shape.

### Example:
Comparing the satisfaction scores of two different customer groups.

- **Null Hypothesis (H₀):** The distribution of satisfaction scores is the same in both customer groups.
$$H_0: \text{Distribution of scores in Group 1} = \text{Distribution of scores in Group 2}$$
- **Alternative Hypothesis (H₁):** The distribution of satisfaction scores differs between the two customer groups.
$$H_1: \text{Distribution of scores in Group 1} \neq \text{Distribution of scores in Group 2}$$

In [14]:
group_1 = [55, 65, 70, 75, 80]
group_2 = [60, 62, 68, 74, 78]


u_stat, p_value = stats.mannwhitneyu(group_1, group_2)

print(f"U-Statistic: {u_stat}")
print(f"P-Value: {p_value}")

U-Statistic: 14.0
P-Value: 0.8412698412698413
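
The U statistic has a simple interpretation: it counts the pairs (one score from each group) in which the `group_1` score is the larger one, with ties counted as half. The sketch below reproduces the value by direct counting; `u_manual` is a name introduced here for illustration.

```python
group_1 = [55, 65, 70, 75, 80]
group_2 = [60, 62, 68, 74, 78]

# Count the pairs where the group_1 value beats the group_2 value; ties count as half
u_manual = sum(
    1.0 if x > y else 0.5 if x == y else 0.0
    for x in group_1
    for y in group_2
)
print(f"U computed by counting pairs: {u_manual}")
print(f"Largest possible U: {len(group_1) * len(group_2)}")
```

A U close to half of the maximum (here 12.5 out of 25) means neither group tends to score higher than the other, which is consistent with the large p-value.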


# Wilcoxon Signed-Rank Test

## When to Use:
- When comparing the medians of two related groups, especially when data is not normally distributed.
- Non-parametric, for paired samples.

### Example:
Comparing the performance of students before and after a training program.

- **Null Hypothesis (H₀):** The median performance before and after the training program is the same.
$$H_0: \text{Median}_{\text{Before}} = \text{Median}_{\text{After}}$$
- **Alternative Hypothesis (H₁):** The median performance before and after the training program is different.
$$H_1: \text{Median}_{\text{Before}} \neq \text{Median}_{\text{After}}$$

In [17]:
before = [50, 55, 60, 62, 65]
after = [60, 65, 70, 72, 75]


w_stat, p_value = stats.wilcoxon(before, after)

print(f"Wilcoxon Statistic: {w_stat}")
print(f"P-Value: {p_value}")

Wilcoxon Statistic: 0.0
P-Value: 0.0625
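
Every student in this example improved, yet the p-value stays above 0.05. That is a consequence of the sample size: with five pairs, "all differences share the same sign" is the most extreme outcome an exact two-sided test can see. The sketch below works out that floor, assuming the exact method was used (which the printed p-value is consistent with).

```python
before = [50, 55, 60, 62, 65]
after = [60, 65, 70, 72, 75]

differences = [b - a for b, a in zip(before, after)]
print(f"Differences (before - after): {differences}")

# Under H0 each difference is equally likely to be positive or negative,
# so "all n differences share one sign" has probability 2 * 0.5**n (two-sided)
n_pairs = len(differences)
smallest_p = 2 * 0.5 ** n_pairs
print(f"Smallest possible exact two-sided p-value with {n_pairs} pairs: {smallest_p}")
```

In practice this means at least six pairs are needed before a two-sided exact test of this kind can reject at α = 0.05.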


# Kruskal-Wallis H Test

## When to Use:
- When comparing three or more independent groups, especially when the data is not normally distributed.
- It is a non-parametric test: it does not assume normality. Reading the result as a difference in medians does, however, require the group distributions to have a similar shape.

### Example:
Comparing the performance scores across three different departments in a company.

- **Null Hypothesis (H₀):** The distribution of performance scores is the same across all three departments.

- **Alternative Hypothesis (H₁):** The distribution of performance scores is different in at least one department.
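
A minimal sketch of this test with `stats.kruskal` is shown below. The department names and scores are hypothetical, made up so that the example runs; they are not data from a real company.

```python
from scipy import stats

# Hypothetical performance scores, made up for this sketch
dept_sales = [72, 75, 78, 80, 69]
dept_engineering = [85, 88, 90, 84, 87]
dept_support = [70, 74, 76, 73, 71]

h_stat, p_value = stats.kruskal(dept_sales, dept_engineering, dept_support)

print(f"H-Statistic: {h_stat}")
print(f"P-Value: {p_value}")
```

As with ANOVA, a small p-value only says that at least one department differs; it does not say which one.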