## Paired t-test

A **paired t-test** is used when the same subjects are measured twice (for example, *before and after* an intervention) to test whether the mean of the paired differences differs from zero.

### Formula

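Let the difference for each pair $i$ be (either order works as long as it is applied consistently; the code in this notebook uses After minus Before):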

$$
d_i = X_{1,i} - X_{2,i}
$$

Mean and standard deviation of differences:

$$
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \quad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2}
$$

Test statistic:

$$
t = \frac{\bar{d}}{s_d / \sqrt{n}}
$$

Degrees of freedom: $df = n - 1$, where $n$ is the number of pairs.
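
As a quick sanity check of the formulas above, here is a minimal sketch on a small made-up sample (the numbers are illustrative only and are not taken from the dataset used later). It computes $\bar{d}$, $s_d$, $t$ and $df$ step by step with NumPy.

```python
import numpy as np

# Hypothetical paired measurements (illustrative values only)
x1 = np.array([12.0, 15.0, 11.0, 14.0, 13.0])
x2 = np.array([10.0, 14.0, 10.0, 11.0, 12.0])

d = x1 - x2                          # pairwise differences d_i
n = len(d)                           # number of pairs
d_bar = d.mean()                     # mean of the differences
s_d = d.std(ddof=1)                  # sample standard deviation (divides by n - 1)
t_stat = d_bar / (s_d / np.sqrt(n))  # test statistic
df = n - 1                           # degrees of freedom

print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t_stat:.4f}, df = {df}")
```

Note the `ddof=1` argument: NumPy's `std` divides by $n$ by default, while the formula for $s_d$ divides by $n - 1$.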

### Hypotheses

$$
H_0: \mu_d = 0 \quad \text{(no mean difference)}
$$
$$
H_1: \mu_d \ne 0 \quad \text{(nonzero mean difference, two-sided)}
$$
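
For a two-sided alternative like the one above, the p-value is the probability, under $H_0$, of observing a $|t|$ at least as large as the one computed. A minimal sketch, assuming a hypothetical helper named `two_sided_p_value` and an illustrative statistic of $t = 4.0$ from $n = 5$ pairs:

```python
from scipy.stats import t

def two_sided_p_value(t_stat, n):
    """Two-sided p-value for a paired t statistic with n pairs (hypothetical helper)."""
    df = n - 1
    # sf(x) = 1 - cdf(x); doubling the upper tail of |t| covers both tails
    return 2 * t.sf(abs(t_stat), df)

p = two_sided_p_value(4.0, 5)
print(f"p = {p:.4f}")
```

A one-sided (right-tailed) test would use `t.sf(t_stat, df)` without the doubling and without the absolute value.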


In [1]:
import pandas as pd
import numpy as np
from scipy.stats import t

In [3]:
data = pd.read_csv('paired_students_marks_fixed.csv')


In [5]:
data.head()

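|   | StudentID | Group | Before | After |
|---|-----------|-------|--------|-------|
| 0 | 1 | A | 63.973713 | 66.89686 |
| 1 | 2 | A | 58.893886 | 66.790659 |
| 2 | 3 | A | 65.181508 | 73.467936 |
| 3 | 4 | A | 72.184239 | 78.172853 |
| 4 | 5 | A | 58.126773 | 67.320344 |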


In [6]:
data['Group'].value_counts()

Group
A    100
B    100
Name: count, dtype: int64

In [7]:
data.shape

(200, 4)

In [8]:
group_a = data[data['Group'] == "A"]
group_b = data[data['Group'] == "B"]

## Paired t-test on Students' Marks

We created two groups of students to test the effect of training on performance.

- **Group A (Trained):** Students who received training. Their marks are expected to improve after training.  
- **Group B (Untrained):** Students who did not receive any training. Their marks are expected to remain almost the same.

We will apply the **paired t-test** separately to each group, as a right-tailed test, to check whether the **After** marks are significantly higher than the **Before** marks.

### Hypotheses

- **Null Hypothesis (H₀):** After marks are not higher than before marks on average (mean difference ≤ 0).  
- **Alternative Hypothesis (H₁):** After marks are higher than before marks on average (mean difference > 0).

The paired t-test will help us assess whether marks rose after training; Group B serves as the untrained baseline.


In [9]:

def T_test_paired_right(sample_before, sample_after, alpha=0.05):
    """Right-tailed paired t-test: H1 is mean(After - Before) > 0."""
    # Step 1: difference (After - Before); pandas aligns the two Series by index
    diff = sample_after - sample_before

    # Step 2: mean and sample standard deviation (ddof=1 divides by n - 1)
    diff_mean = diff.mean()
    diff_std = diff.std(ddof=1)

    # Step 3: t-value
    n = len(diff)
    T_value = diff_mean / (diff_std / np.sqrt(n))

    # Step 4: one-tailed (right side) p-value
    # t.sf is the survival function, 1 - cdf, and stays accurate for tiny p-values
    P_value = t.sf(T_value, n - 1)

    # Step 5: print result
    print(f"T-value: {T_value:.4f}")
    print(f"P-value: {P_value:.6f}")

    if P_value < alpha:
        print("✅ H1 Wins → Marks significantly increased")
    else:
        print("❌ H0 Wins → No significant increase in marks")


In [10]:
T_test_paired_right(group_a['Before'], group_a['After'])

T-value: 21.2055
P-value: 0.000000
✅ H1 Wins → Marks significantly increased


In [11]:
T_test_paired_right(group_b['Before'], group_b['After'])

T-value: 0.6810
P-value: 0.248730
❌ H0 Wins → No significant increase in marks
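
As a cross-check, the manual calculation in `T_test_paired_right` should agree with SciPy's built-in `scipy.stats.ttest_rel`. The sketch below assumes SciPy 1.6 or later (for the `alternative` keyword) and uses synthetic stand-in data, not the notebook's CSV, so the printed numbers will differ from the results above.

```python
import numpy as np
from scipy.stats import t, ttest_rel

rng = np.random.default_rng(0)
# Synthetic stand-in data (NOT the notebook's CSV): 30 students, roughly 5-mark gain
before = rng.normal(60, 8, size=30)
after = before + rng.normal(5, 4, size=30)

# Manual right-tailed paired t-test on After - Before
diff = after - before
n = len(diff)
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
p_manual = t.sf(t_manual, n - 1)  # upper-tail probability, equal to 1 - cdf

# Built-in paired test; alternative="greater" tests mean(after - before) > 0
res = ttest_rel(after, before, alternative="greater")

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.6g}")
print(f"scipy : t = {res.statistic:.4f}, p = {res.pvalue:.6g}")
```

The argument order matters: `ttest_rel(after, before, ...)` tests the first sample minus the second, which matches the After minus Before convention used in this notebook.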


# 🧪 Paired Sample T-Test Result Summary

We conducted a **right-tailed paired t-test** to check whether students’ marks increased **after training**.

## 📊 Groups Overview
| Group | Description | Expected Effect |
|--------|--------------|----------------|
| **A** | Trained students | Marks should increase |
| **B** | Untrained students | Marks should not increase |

---

## 🧠 Hypotheses
- **Null Hypothesis (H₀):** There is no improvement (mean difference ≤ 0)  
- **Alternative Hypothesis (H₁):** After marks are greater than before marks (mean difference > 0)

---

## 📈 Test Results

| Group | Mean (Before) | Mean (After) | t-value | p-value | Result |
|--------|----------------|---------------|----------|----------|---------|
| **A** | 59.78 | 69.83 | 21.21 | < 0.0001 | ✅ **H₁ Wins: Significant Improvement** |
| **B** | 60.52 | 60.82 | 0.68 | 0.249 | ❌ **H₀ Wins: No Significant Increase** |
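
The Mean (Before) and Mean (After) columns are not computed in the cells shown. One way they could be obtained is a `groupby` on `Group`; the sketch below uses a tiny made-up DataFrame (named `toy` here, a hypothetical stand-in with the same column names as `data`), so its values are illustrative only.

```python
import pandas as pd

# Tiny made-up frame with the same column names as the notebook's `data`
toy = pd.DataFrame({
    "Group":  ["A", "A", "B", "B"],
    "Before": [60.0, 62.0, 61.0, 59.0],
    "After":  [70.0, 72.0, 61.5, 59.5],
})

# Per-group means of the Before and After marks
group_means = toy.groupby("Group")[["Before", "After"]].mean()

# Mean paired difference per group (After - Before)
group_means["MeanDiff"] = group_means["After"] - group_means["Before"]

print(group_means.round(2))
```

Applying the same two lines to the real `data` frame should give the per-group means for the table.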

---

## 🧾 Conclusion
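- **Group A (trained)** students showed a **statistically significant improvement** in marks (t = 21.21, p < 0.0001).  
- **Group B (untrained)** students showed **no significant increase** (t = 0.68, p = 0.249), which is consistent with training, rather than the passage of time alone, driving the improvement in Group A.

Hence, the results suggest that the **training program was effective** 🎯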
