### Independent samples t test

We have two samples drawn from two populations, and we wish to know whether the populations' parameters differ. For example, we might compare the lung capacity of a population of smokers with that of non-smokers.

$ \bar{x_1}, \bar{x_2} $ : the sample means (the sample statistics)

$ S_1, S_2 $ : the sample standard deviations

$ n_1, n_2 $ : the sizes of the samples

In [1]:
x_1 = 4.8
S_1 = 1.2
n_1 = 40

x_2 = 5.9
S_2 = 1.9
n_2 = 45

Suppose we wish to test whether the second population's mean is significantly larger than the first's. The alternative hypothesis is then $ \mu_1 - \mu_2 < 0 $

$ h_0 : \mu_1 - \mu_2 = 0 $

$ h_a : \mu_1 - \mu_2 < 0 $

$ \alpha = 0.01 $

In [2]:
h_0_value = 0
alpha = 0.01 # One-tailed test: use alpha directly (a two-tailed test would put alpha/2 in each tail)

Find the test statistic

Note that the formula below corresponds to the "equal variances not assumed" case (Welch's t test). Assuming equal variances would instead require a pooled variance estimate.

$ t = \large \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}}$

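Note that $ \sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}} $ is the standard error of the difference between the sample means. It is not equal to $ \frac{S_1}{\sqrt{n_1}}+\frac{S_2}{\sqrt{n_2}} $, because the square root of a sum is not the sum of the square roots.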

In [3]:
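# Standard error of the difference between the two sample means
error = sqrt(S_1^2 / n_1 + S_2^2 / n_2)
# Test statistic: observed difference minus hypothesized difference, in standard errors
test_statistic = ((x_1 - x_2) - h_0_value) / error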
test_statistic
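
As a hand check on the cell above, substituting the sample values into the formula gives (intermediate values rounded):

$ t = \large \frac{(4.8 - 5.9) - 0}{\sqrt{\frac{1.2^2}{40}+\frac{1.9^2}{45}}} = \frac{-1.1}{\sqrt{0.0360 + 0.0802}} \approx \frac{-1.1}{0.3409} \approx -3.23 $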

Find the degrees of freedom

$ v = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{\left(\frac{S_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{S_2^2}{n_2}\right)^2}{n_2 - 1}} $

In [4]:
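# Welch-Satterthwaite approximation to the degrees of freedom
dof = (S_1^2 / n_1 + S_2^2 / n_2)^2 / ((S_1^2 / n_1)^2 / (n_1 - 1) + (S_2^2 / n_2)^2 / (n_2 - 1))
# Round to a whole number of degrees of freedom
dof = round(dof)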
dof
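
As a hand check, substituting the sample values (intermediate values rounded) gives a value just above 75, which `round` takes to 75:

$ v \approx \frac{\left(0.0360 + 0.0802\right)^2}{\frac{0.0360^2}{39} + \frac{0.0802^2}{44}} \approx \frac{0.013508}{0.00017949} \approx 75.25 $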

In [5]:
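# Upper-tail critical value of the t distribution at level alpha
critical_value = qt(1-alpha, dof)
# h_a is "less than", so the region of rejection lies below the negated critical value
region_of_rejection = -critical_value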
region_of_rejection

Since the test statistic (approximately -3.23) falls below the lower-tail critical value, it lies in the region of rejection, so we reject $ h_0 $ in favor of $ h_a $
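
The critical-value comparison can also be framed as a p-value. The following is a minimal sketch, assuming the summary statistics from the first cell; it uses R's `pt` for the lower-tail probability. The names `v_1`, `v_2`, `se`, `t_stat`, `welch_df` and `p_value` are illustrative and not part of the original notebook, and the degrees of freedom are left unrounded here.

```r
# Summary statistics from the example (repeated so the block is self-contained)
x_1 = 4.8; S_1 = 1.2; n_1 = 40
x_2 = 5.9; S_2 = 1.9; n_2 = 45

# Per-sample variance of the mean
v_1 = S_1^2 / n_1
v_2 = S_2^2 / n_2

# Standard error and test statistic under h_0: mu_1 - mu_2 = 0
se = sqrt(v_1 + v_2)
t_stat = ((x_1 - x_2) - 0) / se

# Welch-Satterthwaite degrees of freedom (unrounded; an assumption of this sketch)
welch_df = (v_1 + v_2)^2 / (v_1^2 / (n_1 - 1) + v_2^2 / (n_2 - 1))

# Lower-tail p-value, matching h_a: mu_1 - mu_2 < 0
p_value = pt(t_stat, welch_df)
p_value
```

A `p_value` smaller than `alpha` corresponds to the test statistic falling inside the lower-tail region of rejection.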

In [8]:
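# Point estimate of mu_1 - mu_2
sample_statistic = x_1 - x_2

# t quantile for a two-sided 99% confidence interval: 1 - 0.01/2 = 0.995
T = qt(1 - 0.01/2, dof)

# Lower bound of the confidence interval
CI_lower = sample_statistic - T*error
CI_lower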
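
To obtain both bounds at once, the interval can be wrapped in a small function. This is a sketch under stated assumptions: `welch_ci` and its argument names are hypothetical helpers, not part of the original notebook; it assumes a two-sided interval and leaves the degrees of freedom unrounded.

```r
# Hypothetical helper: two-sided Welch confidence interval for mu_1 - mu_2,
# computed from summary statistics only.
welch_ci = function(m_1, s_1, n_1, m_2, s_2, n_2, conf_level = 0.99) {
  v_1 = s_1^2 / n_1
  v_2 = s_2^2 / n_2
  se = sqrt(v_1 + v_2)  # standard error of the difference in means
  # Welch-Satterthwaite degrees of freedom (left unrounded in this sketch)
  welch_df = (v_1 + v_2)^2 / (v_1^2 / (n_1 - 1) + v_2^2 / (n_2 - 1))
  # Quantile that leaves (1 - conf_level)/2 in each tail
  t_quantile = qt(1 - (1 - conf_level) / 2, welch_df)
  estimate = m_1 - m_2
  c(lower = estimate - t_quantile * se, upper = estimate + t_quantile * se)
}

# Interval for the smokers vs non-smokers summary statistics
ci = welch_ci(4.8, 1.2, 40, 5.9, 1.9, 45)
ci
```

If the resulting interval excludes 0, that is consistent with rejecting $ h_0 : \mu_1 - \mu_2 = 0 $ in a two-sided test at the matching significance level.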