# Key Symbols and Terms in Hypothesis Testing

## 1. Null Hypothesis (H0)
- **Definition**: The null hypothesis states that there is no effect or no difference. It represents the default assumption.
- **Example**: H0: μ = 100 (The mean of a population is 100)
- **Python Example**: To test H0 we need data. The cell below simulates a sample of 30 observations from a normal distribution with mean 100 and standard deviation 15.

In [None]:
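# Null hypothesis: mean = 100
import numpy as np

np.random.seed(42)  # fixed seed so the results below are reproducible
# Draw 30 observations from a normal distribution (mean 100, standard deviation 15)
sample = np.random.normal(100, 15, 30)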

## 2. Alternative Hypothesis (H1 or Ha)
- **Definition**: The alternative hypothesis is the claim you are looking for evidence to support. It contradicts the null hypothesis and is favored only when the data give enough evidence to reject H0.
- **Example**: Ha: μ ≠ 100 (The mean is not equal to 100)
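
The alternative can be two-sided (μ ≠ 100) or one-sided (for example μ > 100). The sketch below runs both on a small, made-up sample; the data values are hypothetical and chosen only for illustration. It assumes SciPy 1.6 or later, where `ttest_1samp` accepts the `alternative` argument.

```python
import numpy as np
from scipy import stats

# Hypothetical fixed sample (illustration only), so the result is reproducible
sample = np.array([102, 98, 110, 105, 99, 107, 111, 95, 104, 108], dtype=float)

# Two-sided alternative, Ha: mu != 100 (the default)
t_two, p_two = stats.ttest_1samp(sample, 100, alternative="two-sided")

# One-sided alternative, Ha: mu > 100
t_greater, p_greater = stats.ttest_1samp(sample, 100, alternative="greater")

print(f"two-sided p-value: {p_two:.4f}")
print(f"one-sided p-value: {p_greater:.4f}")
```

Both tests share the same t-statistic. Because the sample mean here is above 100, the one-sided p-value is half the two-sided one.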

## 3. Significance Level (α)
- **Definition**: The probability of rejecting the null hypothesis when it is actually true (the Type I error rate). It is chosen before running the test. Common values: 0.05, 0.01.
- **Example**: α = 0.05 (5% significance level)
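
One way to see what α means is to repeat a test many times on data for which H0 is true and count how often it is rejected. The following is a minimal simulation sketch; the seed, the number of trials, and the sample size are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
alpha = 0.05
n_trials = 5000  # number of simulated experiments (assumed)
n = 30           # observations per experiment (assumed)

# H0 (mean = 100) is true in every simulated experiment
samples = rng.normal(100, 15, size=(n_trials, n))

# One t-test per row
_, p_values = stats.ttest_1samp(samples, 100, axis=1)

# Fraction of experiments in which a true H0 was rejected
type_i_rate = np.mean(p_values <= alpha)
print(f"Estimated Type I error rate: {type_i_rate:.3f}")
```

The estimated rate should land close to α, up to simulation noise.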

## 4. p-value
- **Definition**: The probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true.
- **Decision Rule**: If p ≤ α, reject H0.

In [None]:
from scipy import stats

alpha = 0.05  # significance level
# Perform a two-sided one-sample t-test of H0: mean = 100
t_stat, p_value = stats.ttest_1samp(sample, 100)
print(f"T-statistic: {t_stat}, p-value: {p_value}")
# Decision rule: reject H0 when p <= alpha
if p_value <= alpha:
    print("Reject H0")
else:
    print("Fail to reject H0")
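
To connect the definition to the number a test reports, the two-sided p-value can also be computed by hand as the tail area of the t-distribution beyond the observed statistic. This is a sketch on a hypothetical fixed sample, with `mu0` as an illustrative name for the hypothesized mean.

```python
import math
import numpy as np
from scipy import stats

# Hypothetical fixed sample (illustration only)
sample = np.array([102, 98, 110, 105, 99, 107, 111, 95, 104, 108], dtype=float)
mu0 = 100  # mean under H0
n = len(sample)

# t-statistic: distance of the sample mean from mu0 in standard-error units
t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / math.sqrt(n))

# Two-sided p-value: probability of a |t| at least this large under H0
p_manual = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# Reference values from SciPy's built-in test
t_ref, p_ref = stats.ttest_1samp(sample, mu0)
print(f"manual p-value: {p_manual:.4f}, scipy p-value: {p_ref:.4f}")
```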

## 5. Test Statistic
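- **Definition**: A standardized value computed from the sample. It is compared with a critical value, or converted to a p-value, to decide whether to reject H0.
- **Common Test Statistics**:
  - z-score: Used when the population variance is known, or as an approximation for large samples.
  - t-score: Used when the population variance is unknown and is estimated from the sample, which matters most for small samples.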

In [None]:
import math
sample_mean = np.mean(sample)
population_mean = 100
sample_std = np.std(sample, ddof=1)
n = len(sample)

# z-score (assumes the population standard deviation is known; 15 is the value used to generate the sample)
population_std = 15
z_score = (sample_mean - population_mean) / (population_std / math.sqrt(n))
print(f"z-score: {z_score}")

# t-score (sample standard deviation)
t_score = (sample_mean - population_mean) / (sample_std / math.sqrt(n))
print(f"t-score: {t_score}")

## 6. Critical Value
- **Definition**: The threshold value that defines the rejection region for H0.
- **z-critical values (for a two-tailed test)**:
  - α = 0.05 → z = ±1.96
  - α = 0.01 → z = ±2.58

In [None]:
# Find critical z-value for alpha = 0.05 (two-tailed)
z_critical = stats.norm.ppf(1 - 0.05/2)
print(f"z-critical: {z_critical}")
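
When the population standard deviation is unknown and a t-score is used, the critical value comes from the t-distribution instead. A short sketch, assuming a sample size of 30 (so 29 degrees of freedom):

```python
from scipy import stats

alpha = 0.05
n = 30  # assumed sample size

# Two-tailed critical values at the same significance level
z_critical = stats.norm.ppf(1 - alpha / 2)
t_critical = stats.t.ppf(1 - alpha / 2, df=n - 1)

print(f"z-critical: {z_critical:.3f}")
print(f"t-critical (df={n - 1}): {t_critical:.3f}")
```

The t-critical value is larger than the z-critical value because the t-distribution has heavier tails; the gap shrinks as the sample size grows.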

## 7. Power of the Test
- **Definition**: The probability of correctly rejecting a false H0. It equals 1 - β, where β is the Type II error rate.
- **Python**: Power can be estimated by simulation or computed with statistical libraries such as statsmodels.
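
A minimal simulation sketch of power: generate many samples from a population where H0 is false and count how often the t-test rejects. The true mean of 108, the seed, and the number of trials are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # arbitrary seed for reproducibility
alpha = 0.05
n = 30            # observations per experiment (assumed)
true_mean = 108   # hypothetical true mean, so H0: mean = 100 is false
n_trials = 5000   # number of simulated experiments (assumed)

# Every simulated sample comes from the alternative, not from H0
samples = rng.normal(true_mean, 15, size=(n_trials, n))
_, p_values = stats.ttest_1samp(samples, 100, axis=1)

# Power: fraction of experiments in which the false H0 was rejected
power = np.mean(p_values <= alpha)
print(f"Estimated power: {power:.3f}")
```

Changing `true_mean`, `n`, or `alpha` and rerunning shows how power depends on effect size, sample size, and significance level.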

## 8. Confidence Interval (CI)
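- **Definition**: A range of values, computed from the sample, that contains the true parameter with a stated level of confidence. For a 95% interval, about 95% of intervals built this way would cover the true value.
- **Confidence Interval Formula**: CI = x̄ ± z(α/2) * (σ / √n), where σ is the known population standard deviation and z(α/2) ≈ 1.96 for a 95% interval.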

In [None]:
# 95% confidence interval for the mean, assuming a known population standard deviation of 15
conf_interval = stats.norm.interval(0.95, loc=sample_mean, scale=15 / math.sqrt(n))
print(f"95% Confidence Interval: {conf_interval}")
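
If σ is not known, a common alternative is to replace it with the sample standard deviation and use the t-distribution. The following sketch does this on a hypothetical fixed sample.

```python
import math
import numpy as np
from scipy import stats

# Hypothetical fixed sample (illustration only)
sample = np.array([102, 98, 110, 105, 99, 107, 111, 95, 104, 108], dtype=float)
n = len(sample)
sample_mean = sample.mean()

# Standard error estimated from the sample (ddof=1 gives the sample standard deviation)
standard_error = sample.std(ddof=1) / math.sqrt(n)

# 95% t-based interval with n - 1 degrees of freedom
t_interval = stats.t.interval(0.95, df=n - 1, loc=sample_mean, scale=standard_error)
print(f"95% t-based Confidence Interval: {t_interval}")
```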

## 9. Type I and Type II Errors
- **Type I Error (α)**: Rejecting H0 when it’s true.
- **Type II Error (β)**: Failing to reject H0 when it’s false.
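
For a two-tailed z-test with known σ, β can be computed directly once a specific true mean is assumed. The sketch below uses a hypothetical true mean of 108; `mu_true` and `shift` are illustrative names.

```python
import math
from scipy import stats

alpha = 0.05    # Type I error rate, fixed by the analyst
mu0 = 100       # mean under H0
mu_true = 108   # hypothetical true mean (H0 is false)
sigma = 15      # population standard deviation, assumed known
n = 30

z_critical = stats.norm.ppf(1 - alpha / 2)

# How far the true mean sits from mu0, in standard-error units
shift = (mu_true - mu0) / (sigma / math.sqrt(n))

# beta: probability that the z-statistic falls inside the non-rejection region
beta = stats.norm.cdf(z_critical - shift) - stats.norm.cdf(-z_critical - shift)
print(f"Type I error rate (alpha): {alpha}")
print(f"Type II error rate (beta): {beta:.3f}")
```

Lowering α widens the non-rejection region, which raises β for the same sample size.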

## 10. Effect Size
- **Definition**: A measure of the magnitude of the difference (e.g., Cohen’s d for mean differences).
- **Cohen’s d Formula**: d = (x̄ - μ) / s, where μ is the mean under H0 and s is the sample standard deviation.

In [None]:
# Cohen's d calculation
cohen_d = (sample_mean - population_mean) / sample_std
print(f"Cohen's d: {cohen_d}")