# A paired samples t-test (aka a dependent samples t-test)

### Purpose
The purpose of this analysis is to evaluate the effectiveness of a new drug in reducing blood pressure. 
By comparing the blood pressure measurements of the same group of patients before and after taking the drug, we can determine whether the observed changes are statistically significant and indicative of the drug’s efficacy.

### Intuition
A paired sample t-test is used to analyze the differences between two sets of measurements taken from the same individuals. 
In this case, the test compares:
- Before Treatment (mmHg): Blood pressure measurements before taking the new drug.
- After Treatment (mmHg): Blood pressure measurements after taking the new drug.

The key question is: did the drug consistently reduce blood pressure across the patients, or could the observed changes be due to random chance?

   

In [2]:
# Dataset
patients = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
before = [140, 145, 155, 160, 150, 148, 153, 165, 149, 155]
after = [132, 138, 149, 153, 145, 140, 146, 158, 141, 147]

# Method 1

In [4]:
from scipy import stats

t_stat, p_val = stats.ttest_rel(before, after)
print(f"T-statistic: {t_stat}, P-value: {p_val}")

alpha = 0.05
if p_val < alpha:
    print("Reject the null hypothesis: There is a statistically significant difference.")
else:
    print("Fail to reject the null hypothesis: There is no statistically significant difference.")

T-statistic: 22.577954844135466, P-value: 3.1089373891812632e-09
Reject the null hypothesis: There is a statistically significant difference.
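
Because the paired test works only on the per-patient differences, it should be equivalent to a one-sample t-test of those differences against zero. The sketch below checks this with `scipy.stats.ttest_1samp`; the names `diffs`, `t_paired`, and `t_one` are introduced here for illustration and are not part of the original analysis.

```python
from scipy import stats

before = [140, 145, 155, 160, 150, 148, 153, 165, 149, 155]
after = [132, 138, 149, 153, 145, 140, 146, 158, 141, 147]

# Per-patient differences, in the same order as ttest_rel(before, after)
diffs = [b - a for b, a in zip(before, after)]

# Paired test on the two lists
t_paired, p_paired = stats.ttest_rel(before, after)

# One-sample test of the differences against a mean of zero
t_one, p_one = stats.ttest_1samp(diffs, popmean=0)

print(f"Paired:     t = {t_paired:.4f}, p = {p_paired:.3e}")
print(f"One-sample: t = {t_one:.4f}, p = {p_one:.3e}")
```

Swapping the argument order to `ttest_rel(after, before)` flips the sign of the t-statistic but leaves the two-sided p-value unchanged.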


# Method 2

## Hypotheses

- **Null Hypothesis $H_0$**: The mean blood pressure is the same before and after treatment: $ \mu_{\text{before}} = \mu_{\text{after}} $
- **Alternative Hypothesis $H_a$**: The mean blood pressure differs before and after treatment: $ \mu_{\text{before}} \neq \mu_{\text{after}} $

---

## Calculations

### 1. Sample Mean of Differences
The sample mean $\bar{d}$ is calculated as: $ \bar{d} = \frac{1}{n} \sum_{i=1}^n d_i $

Where:
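- $ n $: Number of patients (paired observations)
- $ d_i $: Difference for the $ i $-th patient ($ d_i = \text{after}_i - \text{before}_i $, so a negative value means blood pressure dropped)

Method 1 passed `before` first, so its differences and t-statistic have the opposite sign; the magnitude and the two-sided p-value are the same.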


In [7]:
# Calculate the per-patient differences (after - before)
difference = [a - b for b, a in zip(before, after)]
print("Differences (d):", difference)


Differences (d): [-8, -7, -6, -7, -5, -8, -7, -7, -8, -8]


| Patient | Before (mmHg) | After (mmHg) | Difference (d) |
|---------|---------------|--------------|----------------|
| 1       | 140           | 132          | -8             |
| 2       | 145           | 138          | -7             |
| 3       | 155           | 149          | -6             |
| 4       | 160           | 153          | -7             |
| 5       | 150           | 145          | -5             |
| 6       | 148           | 140          | -8             |
| 7       | 153           | 146          | -7             |
| 8       | 165           | 158          | -7             |
| 9       | 149           | 141          | -8             |
| 10      | 155           | 147          | -8             |


### 2. Sample Standard Deviation of Differences
The sample standard deviation $ s_d $ is calculated as: $ s_d = \sqrt{\frac{1}{n-1} \sum_{i=1}^n (d_i - \bar{d})^2} $

Where:
- $ \bar{d} $: Sample mean of differences
- $ d_i $: Individual differences
- $ n $: Number of samples

In [10]:
# Sample mean of the differences (d-bar from step 1), needed for s_d
d_bar = sum(difference) / len(patients)
d_bar

-7.1

### 3. Test Statistic
The test statistic t for a paired t-test is calculated as: $ t = \frac{\bar{d}}{s_d / \sqrt{n}} $

Where:
- $ \bar{d} $: Sample mean of differences
- $ s_d $: Sample standard deviation of differences
- $ n $: Number of samples


In [12]:
import math

n = len(patients)

# Squared deviations of each difference from the mean difference
squared_diffs = [(d - d_bar) ** 2 for d in difference]
print('squared_diffs: ' + str(squared_diffs))

# Sample standard deviation with Bessel's correction: dividing by n - 1
# instead of n makes the sample variance an unbiased estimate of the population variance
s_d = math.sqrt(sum(squared_diffs) / (n - 1))
print('\n s_d: ' + str(s_d))

# Test statistic: mean difference divided by its standard error
t = d_bar / (s_d / math.sqrt(n))
t

squared_diffs: [0.8100000000000006, 0.009999999999999929, 1.2099999999999993, 0.009999999999999929, 4.409999999999998, 0.8100000000000006, 0.009999999999999929, 0.009999999999999929, 0.8100000000000006, 0.8100000000000006]

 s_d: 0.9944289260117531


-22.57795484413547
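
As a cross-check on the manual arithmetic, the same quantities can be obtained from Python's standard `statistics` module, whose `stdev` function already applies the n - 1 divisor. This is a sketch that re-declares the data so it runs on its own; `se` and `t_check` are names introduced here for illustration.

```python
import math
import statistics

before = [140, 145, 155, 160, 150, 148, 153, 165, 149, 155]
after = [132, 138, 149, 153, 145, 140, 146, 158, 141, 147]

# Same convention as the manual calculation: after - before
difference = [a - b for b, a in zip(before, after)]

d_bar = statistics.mean(difference)    # sample mean of the differences
s_d = statistics.stdev(difference)     # sample standard deviation (divides by n - 1)
se = s_d / math.sqrt(len(difference))  # standard error of the mean difference
t_check = d_bar / se                   # paired t-statistic

print(f"d_bar = {d_bar}, s_d = {s_d}, t = {t_check}")
```

Note that `statistics.pstdev` divides by n rather than n - 1, so it would not reproduce the value of `s_d` computed above.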

## Significance Test

Compare the absolute value of the computed $ t $-statistic to the critical value from the $ t $-distribution table with $ n - 1 $ degrees of freedom, or use the $ p $-value approach:
- If $ p < \alpha $, reject $H_0$
- If $ p \geq \alpha $, fail to reject $H_0$
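
Both routes can be sketched with `scipy.stats.t`, using n - 1 = 9 degrees of freedom. The block assumes a two-sided test at alpha = 0.05 and hard-codes the t-statistic from the manual calculation; `t_crit` and `p_two_sided` are illustrative names.

```python
from scipy import stats

t_stat = -22.57795484413547  # test statistic from the manual calculation
n = 10                       # number of patients
df = n - 1                   # degrees of freedom for a paired t-test
alpha = 0.05

# Critical-value approach: reject H0 when |t| exceeds the two-sided cutoff
t_crit = stats.t.ppf(1 - alpha / 2, df)

# p-value approach: probability of a statistic at least this extreme under H0
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)

print(f"critical value: {t_crit:.3f}, |t| = {abs(t_stat):.3f}")
print(f"two-sided p-value: {p_two_sided:.3e}")
print("Reject H0" if p_two_sided < alpha else "Fail to reject H0")
```

Taking the absolute value matters here because the statistic is negative when differences are computed as after minus before.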

---


## Conclusion
- The p-value from the paired t-test ($ \approx 3.1 \times 10^{-9} $) is far below $ \alpha = 0.05 $, so we reject the null hypothesis.
- Blood pressure fell by 7.1 mmHg on average, and this reduction is statistically significant.
- Therefore, we conclude that the treatment had a measurable effect in reducing blood pressure.
