# Two-Sample Tests

### 1. Two-Sample Z-Test
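Used to compare the means of two independent samples when the population standard deviations \(\sigma_1\) and \(\sigma_2\) are known. Under the usual null hypothesis, \(\mu_1 - \mu_2 = 0\).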

$$
Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}
$$
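As a minimal sketch of the formula above, the statistic can be computed with the Python standard library alone. The function name `two_sample_z`, the sample values, and the "known" standard deviations of 2.5 are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_sample_z(x1, x2, sigma1, sigma2, delta0=0.0):
    """Return (z, two-sided p-value) for H0: mu1 - mu2 = delta0, sigmas known."""
    # Standard error of the difference in means, using the known sigmas
    se = sqrt(sigma1**2 / len(x1) + sigma2**2 / len(x2))
    z = (mean(x1) - mean(x2) - delta0) / se
    # Two-sided p-value from the standard normal distribution
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Illustrative data; sigma1 and sigma2 are assumed values, not estimates
sample_1 = [85, 88, 90, 92, 87, 85, 89, 91, 86, 88]
sample_2 = [82, 84, 80, 83, 81, 79, 78, 85, 84, 83]

z, p = two_sample_z(sample_1, sample_2, sigma1=2.5, sigma2=2.5)
print(f"z = {z:.3f}, p = {p:.2e}")
```

In practice the population standard deviations are rarely known, which is why the t-test is the more common choice.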


### 2. Two-Sample t-Test
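Used when population standard deviations are unknown and are estimated by the sample standard deviations \(s_1\) and \(s_2\). The statistic below does not pool the two variances (Welch's form).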

$$
T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
$$
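The statistic above can be reproduced by hand in a few lines. This is a sketch under the assumption that the null hypothesis is \(\mu_1 - \mu_2 = 0\); the helper name `welch_t_statistic` and the sample values are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def welch_t_statistic(x1, x2, delta0=0.0):
    """Unpooled two-sample t statistic for H0: mu1 - mu2 = delta0."""
    # statistics.variance is the sample variance (divides by n - 1)
    se = sqrt(variance(x1) / len(x1) + variance(x2) / len(x2))
    return (mean(x1) - mean(x2) - delta0) / se

# Illustrative data
sample_1 = [85, 88, 90, 92, 87, 85, 89, 91, 86, 88]
sample_2 = [82, 84, 80, 83, 81, 79, 78, 85, 84, 83]

t_manual = welch_t_statistic(sample_1, sample_2)
print(f"t = {t_manual:.4f}")
```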

### 3. Paired t-Test
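Used when the same group is measured twice (for example, before and after a treatment). Here \(\bar{D}\) is the mean of the paired differences, \(s_D\) is their sample standard deviation, and \(n\) is the number of pairs; the statistic has \(n - 1\) degrees of freedom.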

$$
t = \frac{\bar{D}}{s_D / \sqrt{n}}
$$
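A small sketch of the paired statistic follows. The before/after values are hypothetical, and the helper name `paired_t` is an assumption for illustration. SciPy's `stats.ttest_rel` performs the same test and also returns a p-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for the paired t-test on after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # t = mean difference divided by its standard error s_D / sqrt(n)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical measurements on the same six subjects
before = [72, 75, 68, 80, 77, 70]
after = [75, 78, 70, 84, 79, 74]

t_paired, df_paired = paired_t(before, after)
print(f"t = {t_paired:.4f}, df = {df_paired}")
```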


## Degrees of Freedom in Two-Sample t-Test

### 1. **Welch–Satterthwaite Equation (Approximation)**
Used when variances are **unequal**:

$$
df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}}
$$

- Produces a **non-integer df** (e.g., 16.7)  
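- R's `t.test` uses this by default; SciPy's `stats.ttest_ind` uses it only when `equal_var=False` is passed (its default is the pooled test); SPSS reports it alongside the pooled result  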
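The Welch–Satterthwaite formula is tedious by hand but short in code. The sketch below uses only the standard library; the function name `welch_df` and the sample values are assumptions for illustration.

```python
from statistics import variance

def welch_df(x1, x2):
    """Welch–Satterthwaite degrees of freedom for two independent samples."""
    n1, n2 = len(x1), len(x2)
    a = variance(x1) / n1  # s1^2 / n1
    b = variance(x2) / n2  # s2^2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Illustrative data
sample_1 = [85, 88, 90, 92, 87, 85, 89, 91, 86, 88]
sample_2 = [82, 84, 80, 83, 81, 79, 78, 85, 84, 83]

df_welch = welch_df(sample_1, sample_2)
print(f"Welch df = {df_welch:.2f}")  # non-integer in general
```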

---

### 2. **Conservative Approximation (Minimum Rule)**
$$
df \approx \min(n_1 - 1,\; n_2 - 1)
$$

- Easier to compute by hand  
- Gives **smaller df** → larger critical \(t\) value  
- Keeps the **Type I error** rate (false positives) at or below the nominal level, at the cost of some power  
- Common in older statistics textbooks  
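To see why the minimum rule is conservative, one can compare two-sided critical values at the smallest and largest df a two-sample t-test can use. The Welch df always falls between these two bounds. The sample sizes and the helper name `critical_t` below are assumptions for illustration.

```python
from scipy import stats

n1, n2 = 10, 10   # illustrative sample sizes (assumed)
alpha = 0.05

df_min = min(n1 - 1, n2 - 1)   # minimum rule (lower bound on Welch df)
df_pooled = n1 + n2 - 2        # pooled df (upper bound on Welch df)

def critical_t(df, alpha=0.05):
    """Two-sided critical value of Student's t for the given df."""
    return stats.t.ppf(1 - alpha / 2, df)

crit_min = critical_t(df_min, alpha)
crit_pooled = critical_t(df_pooled, alpha)

# The smaller df demands a larger |t| before rejecting H0
print(f"df = {df_min}: critical t = {crit_min:.3f}")
print(f"df = {df_pooled}: critical t = {crit_pooled:.3f}")
```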

---

### 3. **Equal Variance Assumption (Pooled t-test)**
If variances are assumed **equal**:

$$
df = n_1 + n_2 - 2
$$
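For completeness, here is a sketch of the pooled test that goes with this df. It assumes equal population variances and a null hypothesis of no difference in means; the function name `pooled_t` and the data are illustrative. SciPy's `stats.ttest_ind` with its default `equal_var=True` runs this version of the test.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x1, x2):
    """Return (t, df) for the equal-variance (pooled) two-sample t-test."""
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their own df
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(x1) - mean(x2)) / se, df

# Illustrative data
sample_1 = [85, 88, 90, 92, 87, 85, 89, 91, 86, 88]
sample_2 = [82, 84, 80, 83, 81, 79, 78, 85, 84, 83]

t_pooled, df_pooled = pooled_t(sample_1, sample_2)
print(f"t = {t_pooled:.4f}, df = {df_pooled}")
```

When the two samples have the same size, the pooled and unpooled statistics coincide, and only the degrees of freedom differ.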

---

- **Welch–Satterthwaite formula** → Recommended general choice when variances may differ; available in modern software  
- **Min rule** → Conservative shortcut  
- **Pooled df** → Only if variances are assumed equal  




### Two-Sample t-Test in Python

In [1]:
import numpy as np
from scipy import stats

In [2]:
group_A = [85, 88, 90, 92, 87, 85, 89, 91, 86, 88] 
group_B = [82, 84, 80, 83, 81, 79, 78, 85, 84, 83]

In [5]:
# equal_var=False selects Welch's t-test (unequal variances)
t_stats, p_value = stats.ttest_ind(group_A, group_B, equal_var=False)
p_value

1.610475598965881e-05

In [7]:
alpha = 0.05
if p_value < alpha:
    print("I will reject the null hypothesis")
else:
    print("I fail to reject the null hypothesis")

I will reject the null hypothesis


## **Created By:** *Hafiz Muhammad Talal*