Hypothesis Testing For Differences Between Means

Tests of Significance for Two Unknown Means and Known Standard Deviations

Given independent samples of sizes $n_1$ and $n_2$ from two normal populations with unknown means $\mu_1$ and $\mu_2$ and known standard deviations $\sigma_1$ and $\sigma_2$, the test statistic for comparing the means is the two-sample z statistic:

$$Z_{score} = \frac{(\bar{X_1} - \bar{X_2}) - (\mu_{1} - \mu_{2})}{\sqrt{\frac{\sigma_{1}^2}{n_{1}}+\frac{\sigma_{2}^2}{n_{2}}}}$$


Exercise 1: A sample of 87 men showed that the average calcium depletion per year is 3352 µg. The population standard deviation is 1100 µg.
A sample of 76 women showed that the average calcium depletion per year is 5727 µg, with a population standard deviation of 1700 µg.
A researcher wants to “prove” that women lose more calcium. If they use $\alpha$ = 0.01 and these sample data, will they be able to reject the null hypothesis that women annually lose as much calcium as men do, or less?

$$H_o: \mu_{women} - \mu_{men} \le 0$$

$$H_1: \mu_{women} - \mu_{men} > 0$$

In [20]:
# Set variable for women
X_bar_women = 5727
sigma_women = 1700
n_women = 76

In [21]:
# Set variable for men
X_bar_men = 3352
sigma_men = 1100
n_men = 87

$$Z_{score} = \frac{(\bar{X_{women}} - \bar{X_{men}}) - (\mu_{women} - \mu_{men})}{\sqrt{\frac{\sigma_{women}^2}{n_{women}}+\frac{\sigma_{men}^2}{n_{men}}}}$$

Under $H_o$, the hypothesized difference is taken at its boundary value:

$$\mu_{women} - \mu_{men} = 0$$

In [22]:
# Compute test statistic Z_score
import math
Z_score = ((X_bar_women - X_bar_men)-0)/(math.sqrt(sigma_women**2/n_women + sigma_men**2/n_men))
print("Z_score =",Z_score)

Z_score = 10.42164353961526


In [23]:
# Significance level stated in the exercise
alpha = 0.01

# Compute critical value for the upper-tailed test
from scipy.stats import norm
Z_crit = norm.ppf(1 - alpha)
print("Critical value = ",Z_crit)

Critical value =  2.3263478740408408


In [24]:
# Decision rule for the upper-tailed test:
# reject Ho if Z_score > Z_crit, otherwise do not reject Ho
if Z_score > Z_crit:
    print("Reject Ho")
else:
    print("Do not reject Ho")

Reject Ho


Conclusion: At the 0.01 significance level we reject $H_o$; the data support the claim that women annually lose more calcium than men.
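As a cross-check on the critical-value approach, the same decision can be reached from a p-value. The sketch below wraps the two-sample z formula in a helper and computes the upper-tail probability for Exercise 1. The function name `two_sample_z` and its parameter names are illustrative choices, not part of the original notebook.

```python
import math
from scipy.stats import norm

def two_sample_z(x_bar_1, x_bar_2, sigma_1, sigma_2, n_1, n_2, delta_0=0.0):
    """Two-sample z statistic for H0: mu_1 - mu_2 = delta_0 (known sigmas).

    Hypothetical helper written for this illustration.
    """
    standard_error = math.sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
    return ((x_bar_1 - x_bar_2) - delta_0) / standard_error

# Exercise 1 data: population 1 = women, population 2 = men
z = two_sample_z(5727, 3352, 1700, 1100, 76, 87)

# Upper-tailed test: p-value is P(Z > z) under the standard normal
p_value = norm.sf(z)

print("z =", z)
print("p-value =", p_value)
print("Reject Ho" if p_value < 0.01 else "Do not reject Ho")
```

Comparing the p-value with $\alpha$ is equivalent to comparing the z statistic with the critical value, so both routes lead to the same decision.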

t Test for Differences in Population Means

Assumptions:

+ Each of the two populations is normally distributed

+ The two samples are independent

+ The values of the population variances are unknown



t Formula to Test the Difference in Means Assuming $\sigma_{1}^2 = \sigma_{2}^2$

$$t_{score} = \frac{(\bar{X_{1}}-\bar{X_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{s_1^2(n_1 -1)+s_2^2(n_2-1)}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$

$$df = n_{1} + n_{2} - 2$$
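The pooled formula can be sketched as a small function working from summary statistics. The helper name `pooled_t` and the numbers passed to it are made up for illustration; the result is compared against `scipy.stats.ttest_ind_from_stats` with `equal_var=True`, which implements the same equal-variance test.

```python
import math
from scipy.stats import ttest_ind_from_stats

def pooled_t(mean_1, var_1, n_1, mean_2, var_2, n_2, delta_0=0.0):
    """Equal-variance t statistic and degrees of freedom (hypothetical helper)."""
    # Pooled variance: weighted average of the two sample variances
    pooled_var = (var_1 * (n_1 - 1) + var_2 * (n_2 - 1)) / (n_1 + n_2 - 2)
    standard_error = math.sqrt(pooled_var) * math.sqrt(1 / n_1 + 1 / n_2)
    t_stat = ((mean_1 - mean_2) - delta_0) / standard_error
    return t_stat, n_1 + n_2 - 2

# Made-up summary statistics, used only to exercise the formula
t_stat, df = pooled_t(10.0, 4.0, 12, 8.5, 5.0, 15)
print("t =", t_stat, "df =", df)

# Cross-check: SciPy expects standard deviations, not variances
check = ttest_ind_from_stats(10.0, math.sqrt(4.0), 12,
                             8.5, math.sqrt(5.0), 15, equal_var=True)
print("SciPy t =", check.statistic)
```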

t Formula to Test the Difference in Means Assuming $\sigma_{1}^2 \neq \sigma_{2}^2$

$$t_{score} = \frac{(\bar{X_1}-\bar{X_2})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$

$$df = \frac{(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2})^2}{\frac{(\frac{s_1^2}{n_1})^2}{n_1-1}+\frac{(\frac{s_2^2}{n_2})^2}{n_2 - 1}}$$
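A minimal sketch of the unequal-variance statistic and its degrees of freedom follows. The helper name `unequal_variance_t` and the input numbers are assumptions made for illustration. Unlike the pooled case, the degrees of freedom here are generally not an integer.

```python
import math

def unequal_variance_t(mean_1, var_1, n_1, mean_2, var_2, n_2, delta_0=0.0):
    """t statistic and approximate df when variances are not assumed equal.

    Hypothetical helper written for this illustration.
    """
    v_1 = var_1 / n_1  # estimated variance of the first sample mean
    v_2 = var_2 / n_2  # estimated variance of the second sample mean
    t_stat = ((mean_1 - mean_2) - delta_0) / math.sqrt(v_1 + v_2)
    df = (v_1 + v_2) ** 2 / (v_1 ** 2 / (n_1 - 1) + v_2 ** 2 / (n_2 - 1))
    return t_stat, df

# Made-up summary statistics, used only to exercise the formula
t_stat, df = unequal_variance_t(10.0, 4.0, 12, 8.5, 5.0, 15)
print("t =", t_stat, "df =", df)

# With equal sample sizes and equal sample variances the df reduces to 2(n - 1)
_, df_equal = unequal_variance_t(10.0, 4.0, 8, 9.0, 4.0, 8)
print("df with equal n and equal variances =", df_equal)
```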

F Test for the Difference in Variances

$$F_{score} = \frac{s_1^2}{s_2^2}$$

$$df_{numerator} = n_1 - 1$$

$$df_{denominator} = n_2 - 1$$
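The F statistic and its degrees of freedom can be sketched as below. The helper name `variance_f_test` and the input numbers are illustrative assumptions. The two-sided p-value is computed by doubling the smaller tail probability, which is one common convention rather than the only one.

```python
from scipy.stats import f

def variance_f_test(var_1, n_1, var_2, n_2):
    """F statistic, degrees of freedom and a two-sided p-value.

    Hypothetical helper written for this illustration.
    """
    f_stat = var_1 / var_2
    df_num, df_den = n_1 - 1, n_2 - 1
    # Two-sided p-value: double the smaller of the two tail probabilities
    lower_tail = f.cdf(f_stat, df_num, df_den)
    upper_tail = f.sf(f_stat, df_num, df_den)
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return f_stat, df_num, df_den, p_value

# Made-up sample variances and sample sizes
f_stat, df_num, df_den, p_value = variance_f_test(9.0, 10, 4.0, 12)
print("F =", f_stat, "df =", (df_num, df_den), "p-value =", p_value)
```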

A Paired Samples t Test

A paired t-test is used to compare two population means where you have two samples in
which observations in one sample can be paired with observations in the other sample.

Examples:

+ Before and after measurements on the same individual 

+ Studies of twins 

+ Studies of spouses

$$t_{score} = \frac{\bar{d}-D}{\frac{s_d}{\sqrt{n}}}$$

where:

+ $\bar{d}$ = mean sample difference

+ $D$ = hypothesized mean population difference

+ $s_d$ = standard deviation of the sample differences

+ df = n - 1

+ n = number of pairs
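The quantities defined above translate directly into code. The sketch below uses a hypothetical helper `paired_t` and made-up before/after measurements to show how $\bar{d}$, $s_d$, the t statistic and a two-sided p-value fit together.

```python
import math
import statistics
from scipy.stats import t

def paired_t(first, second, hypothesized_diff=0.0):
    """Paired t statistic, df and two-sided p-value (hypothetical helper)."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)                   # number of pairs
    d_bar = statistics.mean(differences)   # mean sample difference
    s_d = statistics.stdev(differences)    # sample standard deviation of differences
    t_stat = (d_bar - hypothesized_diff) / (s_d / math.sqrt(n))
    df = n - 1
    p_value = 2 * t.sf(abs(t_stat), df)    # two-sided p-value
    return t_stat, df, p_value

# Made-up paired measurements (five pairs)
before = [12, 15, 11, 14, 13]
after = [10, 14, 12, 11, 12]

t_stat, df, p_value = paired_t(before, after)
print("t =", t_stat, "df =", df, "p-value =", p_value)
```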

Exercise 2: Employees working in a factory were tested for fatigue after working in hot and cold conditions. Eight men and eight women were tested in both environments. 

|Men, $16^\circ C$|Men, $25^\circ C$|Women, $16^\circ C$|Women, $25^\circ C$|
|--|--|--|--|
|34|42|51|56|
|37|39|53|59|
|43|46|41|45|
|25|25|59|57|
|51|50|37|37|
|37|41|42|46|
|21|26|61|45|
|28|30|52|50|

a/ Is there any evidence that there was a difference in mean fatigue for the men due to the temperature? 

b/ Is there any evidence that men differ from women with respect to fatigue when working at $16^\circ C$?

a/ Use a paired t test, because each man was measured at both temperatures.

|Men, $16^\circ C$|Men, $25^\circ C$|Difference ($16^\circ C$ - $25^\circ C$)|
|--|--|--|
|34|42|-8|
|37|39|-2|
|43|46|-3|
|25|25|0|
|51|50|1|
|37|41|-4|
|21|26|-5|
|28|30|-2|

$$H_o: D = 0$$

$$H_1: D \not= 0$$

In [2]:
import numpy as np  
import statistics
# Create arrays 
Men_16oC = np.array([34,37,43,25,51,37,21,28])
Men_25oC = np.array([42,39,46,25,50,41,26,30])
d = Men_16oC - Men_25oC
d = d.tolist()
print(d)

# Compute sample mean difference d_bar
d_bar = statistics.mean(d)
print("Sample mean difference d_bar =", d_bar)

# Compute standard deviation of sample difference
sd = statistics.stdev(d)
print("Standard deviation sd =", sd)

[-8, -2, -3, 0, 1, -4, -5, -2]
Sample mean difference d_bar = -2.875
Standard deviation sd = 2.850438562747845


In [5]:
D = 0
n = 8 
alpha = 0.05
df = n - 1
import math
# Compute test statistic t_score
t_score = (d_bar - D)/(sd/math.sqrt(n))
print("t_score", t_score)

# Compute critical value
from scipy.stats import t
t_crit1 = t.ppf(alpha/2, df)
print("Critical value 1 =", t_crit1)
t_crit2 = t.ppf(1 - alpha/2,df)
print("Critical value 2 =", t_crit2)

t_score -2.852798895551795
Critical value 1 = -2.3646242510103
Critical value 2 = 2.3646242510102993


In [6]:
# Decision making
if t_score > t_crit1 and t_score < t_crit2:
    print("Do not reject Ho")
else:
    print("Reject Ho")

Reject Ho


Conclusion: At the 0.05 significance level we reject $H_o$; there is evidence of a difference in mean fatigue for the men between $16^\circ C$ and $25^\circ C$.
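The manual calculation can be cross-checked with `scipy.stats.ttest_rel`, which runs a two-sided paired t test by default. This is offered as a sanity check on the steps above; the variable names `men_16` and `men_25` are chosen here for illustration.

```python
from scipy.stats import ttest_rel

# Fatigue scores for the same eight men at the two temperatures
men_16 = [34, 37, 43, 25, 51, 37, 21, 28]
men_25 = [42, 39, 46, 25, 50, 41, 26, 30]

# Paired (related samples) t test on the differences men_16 - men_25
result = ttest_rel(men_16, men_25)

print("t statistic =", result.statistic)
print("two-sided p-value =", result.pvalue)
print("Reject Ho" if result.pvalue < 0.05 else "Do not reject Ho")
```

The t statistic should agree with the hand-computed value, and a p-value below 0.05 corresponds to the statistic falling outside the two critical values.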

b/ Use an unpaired t test, because the men and the women are separate groups.

|Men, $16^\circ C$|Women, $16^\circ C$|
|--|--|
|34|51|
|37|53|
|43|41|
|25|59|
|51|37|
|37|42|
|21|61|
|28|52|

To choose the appropriate unpaired t test, we first use an F test to check whether the two population variances are equal.

Variance F test

$$H_o: \sigma_{1}^2 = \sigma_{2}^2$$

$$H_1: \sigma_{1}^2 \not= \sigma_{2}^2$$

In [9]:
# Create two lists of value for Men and Women
Men = [34, 37, 43, 25, 51, 37, 21, 28]
Women = [51, 53, 41, 59, 37, 42, 61, 52]

# Compute sample variance 
import statistics
variance_men = statistics.variance(Men)
print("Sample variance for men =", variance_men)

variance_women = statistics.variance(Women)
print("Sample variance for women =", variance_women)

Sample variance for men = 96.0
Sample variance for women = 75.42857142857143


In [13]:
n_men = 8
n_women = 8
df_1 = n_men - 1
df_2 = n_women - 1
alpha = 0.05

# Compute test statistic F_score
F_score = variance_men/variance_women
print("F_score =", F_score)

# Compute critical value
from scipy.stats import f
F_crit1 = f.ppf(alpha/2,df_1,df_2)
print("F_crit1 =", F_crit1)

F_crit2 = f.ppf(1 - alpha/2,df_1,df_2)
print("F_crit2 =", F_crit2)

F_score = 1.2727272727272727
F_crit1 = 0.20020383877718267
F_crit2 = 4.994909219063238


In [14]:
# Decision making
if F_score > F_crit1 and F_score < F_crit2:
    print("Do not reject Ho")
else:
    print("Reject Ho")

Do not reject Ho


Conclusion: We do not reject $H_o$; there is no evidence that the two population variances differ, so we proceed as if they are equal.

We use the t test formula for equal variances to test for a difference in mean fatigue between men and women.

$$H_o: \mu_{men} - \mu_{women} = 0$$

$$H_1: \mu_{men} - \mu_{women} \not= 0$$

In [15]:
# Create two lists of value for Men and Women
Men = [34, 37, 43, 25, 51, 37, 21, 28]
Women = [51, 53, 41, 59, 37, 42, 61, 52]
n_men = 8
n_women = 8
# Compute sample variance and sample mean
import statistics
average_men = statistics.mean(Men)
print("Sample mean for men =", average_men)
variance_men = statistics.variance(Men)
print("Sample variance for men =", variance_men)

average_women = statistics.mean(Women)
print("Sample mean for women =", average_women)
variance_women = statistics.variance(Women)
print("Sample variance for women =", variance_women)

Sample mean for men = 34.5
Sample variance for men = 96.0
Sample mean for women = 49.5
Sample variance for women = 75.42857142857143


In [38]:
# Compute test statistic t_score using the pooled standard deviation
import math
pooled_sd = math.sqrt((variance_men*(n_men-1)+variance_women*(n_women-1))/(n_men+n_women-2))
t_score = ((average_men - average_women) - 0)/(pooled_sd*math.sqrt(1/n_men+1/n_women))
print("t_score =", t_score)

# Compute critical value
alpha = 0.05
df = n_men + n_women - 2
from scipy.stats import t
t_crit1 = t.ppf(alpha/2, df)
print("Critical value 1 =", t_crit1)
t_crit2 = t.ppf(1 - alpha/2,df)
print("Critical value 2 =", t_crit2)

t_score = -3.2403703492039297
Critical value 1 = -2.1447866879169277
Critical value 2 = 2.1447866879169273


In [39]:
# Decision making
if t_score > t_crit1 and t_score < t_crit2:
    print("Do not reject Ho")
else:
    print("Reject Ho")

Reject Ho


Conclusion: At the 0.05 significance level we reject $H_o$; there is evidence of a difference in mean fatigue between men and women when working at $16^\circ C$.
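As a sanity check, the same test can be run with `scipy.stats.ttest_ind`, where `equal_var=True` selects the pooled-variance formula used above. The lower-case variable names are chosen here for illustration.

```python
from scipy.stats import ttest_ind

# Fatigue scores at 16 degrees C for the two independent groups
men = [34, 37, 43, 25, 51, 37, 21, 28]
women = [51, 53, 41, 59, 37, 42, 61, 52]

# Two-sided independent samples t test assuming equal population variances
result = ttest_ind(men, women, equal_var=True)

print("t statistic =", result.statistic)
print("two-sided p-value =", result.pvalue)
print("Reject Ho" if result.pvalue < 0.05 else "Do not reject Ho")
```

Because both groups have eight observations, passing `equal_var=False` would give the same t statistic here; only the degrees of freedom, and therefore the p-value, would change.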