## Statistical Significance -- Implies only that the null hypothesis is probably false, and not whether it’s false because of a large or small difference between population means.

Tests of hypotheses often are referred to as tests of significance, and test results are described as being statistically significant (if the null hypothesis has been rejected) or as not being statistically significant (if the null hypothesis has been retained). Rejecting the null hypothesis and statistically significant both signify that the test result can’t be attributed to chance.

### STATISTICAL HYPOTHESES

Null Hypothesis

$H_{0}$ : $\mu_{1} - \mu_{2} \leq 0$

Alternative Hypothesis (Research Hypothesis)

$H_{1}$ : $\mu_{1} - \mu_{2} > 0$

##### Two Other Possible Alternative Hypotheses

1. Another directional hypothesis, expressed as

$H_{1}$ : $\mu_{1} - \mu_{2} < 0$ translates into a one-tailed test with the lower tail critical.

2. A nondirectional hypothesis, expressed as

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$ translates into a two-tailed test.

Statistical hypotheses for the difference between two population means must be selected from among the following three possibilities:

Nondirectional:
$H_{0}$ : $\mu_{1} - \mu_{2} = 0 ; H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

Directional, lower tail critical:
$H_{0}$ : $\mu_{1} - \mu_{2} \geq 0 ; H_{1}$ : $\mu_{1} - \mu_{2} < 0$

Directional, upper tail critical:
$H_{0}$ : $\mu_{1} - \mu_{2} \leq 0 ; H_{1}$ : $\mu_{1} - \mu_{2} > 0$
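Each of the three hypothesis pairs above fixes where the rejection region lies. The sketch below is one way to express that decision rule in Python. The function name `decide`, its `tail` labels, and the sample t ratios are illustrative assumptions, not part of the text, and `critical_t` is assumed to be the positive value read from a t table.

```python
def decide(t_ratio, critical_t, tail):
    """Return the decision about H0 for an observed t ratio.

    tail: 'two' (nondirectional), 'lower' or 'upper' (directional).
    critical_t: positive tabled value; the sign is applied here.
    """
    if tail == "two":
        reject = abs(t_ratio) >= critical_t
    elif tail == "upper":
        reject = t_ratio >= critical_t
    elif tail == "lower":
        reject = t_ratio <= -critical_t
    else:
        raise ValueError("tail must be 'two', 'lower', or 'upper'")
    return "reject H0" if reject else "retain H0"


# Hypothetical t ratios, used only to exercise the three cases
print(decide(2.15, 1.812, "upper"))
print(decide(-2.15, 1.812, "upper"))
print(decide(-2.15, 1.812, "lower"))
print(decide(1.95, 2.179, "two"))
```

Note that a large negative t never leads to rejection when the upper tail is critical, which is why the direction has to be chosen before the data are examined.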

##### Progress Check *14.1 Identifying the treatment group with μ1, specify both the null and alternative hypotheses for each of the following studies. Select a directional alternative hypothesis only when a word or phrase justifies an exclusive concern about population mean differences in a particular direction.

(a) After randomly assigning migrant children to two groups, a school psychologist determines whether there is a difference in the mean reading scores between groups exposed to either a special bilingual or a traditional reading program.

Answer:

$H_{0}$ : $\mu_{1} - \mu_{2} = 0$

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

(b) On further reflection, the school psychologist decides that, because of the extra expense of the special bilingual program, the null hypothesis should be rejected only if there is evidence that reading scores are improved, on average, for the group exposed to the special bilingual program.

Answer:

$H_{0}$ : $\mu_{1} - \mu_{2} \leq 0$

$H_{1}$ : $\mu_{1} - \mu_{2} > 0$

(c) An investigator wishes to determine whether, on average, cigarette consumption is reduced for smokers who chew caffeine gum. Smokers in attendance at an antismoking workshop are randomly assigned to two groups—one that chews caffeine gum and one that does not—and their daily cigarette consumption is monitored for six months after the workshop.

Answer:

$H_{0}$ : $\mu_{1} - \mu_{2} \geq 0$

$H_{1}$ : $\mu_{1} - \mu_{2} < 0$

(d) A political scientist determines whether males and females differ, on average, about the amount of money that, in their opinion, should be spent by the U.S. government on homeland security. After being informed about the size of the current budget for homeland security, in billions of dollars, randomly selected males and females are asked to indicate the percent by which they would alter this amount—for example, –8 percent for an 8 percent reduction, 0 percent for no change, 4 percent for a 4 percent increase.

Answer:

$H_{0}$ : $\mu_{1} - \mu_{2} = 0$

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

### Mean of the Sampling Distribution, $\mu_{\overline{X}_1-\overline{X}_2}$

The mean of the sampling distribution of $\overline{X}$ equals the population mean, that is,

$\mu_{\overline{X}} = \mu$

Similarly, the mean of the new sampling distribution of $\overline{X}_1-\overline{X}_2$ equals the difference between population means, that is,

$\mu_{\overline{X}_1-\overline{X}_2} = \mu_{1} - \mu_{2}$

### Standard Error of the Sampling Distribution, (standard error of the difference between means) $\sigma_{\overline{X}_1-\overline{X}_2}$

$\sigma_{\overline{X}} = \dfrac{\sigma}{\sqrt{n}} = \sqrt{\dfrac{\sigma^2}{n}}$

The standard deviation of the new sampling distribution of $\overline{X}_1 - \overline{X}_2$ equals

$\sigma_{\overline{X}_1-\overline{X}_2} = \sqrt{\dfrac{\sigma^2_1}{n_1} + \dfrac{\sigma^2_2}{n_2}}$
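The two formulas for the mean and standard error of the sampling distribution can be checked informally by simulation. The sketch below assumes two normal populations with made-up parameters (none of these numbers come from the text), draws many pairs of samples, and compares the simulated mean and standard deviation of $\overline{X}_1-\overline{X}_2$ with the theoretical values.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters and sample sizes (assumptions for illustration)
mu1, sigma1, n1 = 10.0, 4.0, 8
mu2, sigma2, n2 = 7.0, 3.0, 12

diffs = []
for _ in range(20000):
    sample1 = [random.gauss(mu1, sigma1) for _ in range(n1)]
    sample2 = [random.gauss(mu2, sigma2) for _ in range(n2)]
    diffs.append(sum(sample1) / n1 - sum(sample2) / n2)

simulated_mean = sum(diffs) / len(diffs)
simulated_sd = statistics.stdev(diffs)
theoretical_se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print(f'simulated mean = {simulated_mean:.3f}; theoretical mean = {mu1 - mu2}')
print(f'simulated SD = {simulated_sd:.3f}; theoretical standard error = {theoretical_se:.3f}')
```

The agreement is only approximate, since it depends on the number of simulated sample pairs.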

### t TEST

t Ratio

$t = \dfrac{(difference~between~sample~means)-(hypothesized~difference~between~population~means)}{estimated~standard~error}$

Expressed in symbols,

### t RATIO FOR TWO POPULATION MEANS (TWO INDEPENDENT SAMPLES)

$t = \dfrac{(\overline{X}_1 - \overline{X}_2)-(\mu_1 - \mu_2)_{hyp}}{S_{\overline{X}_1-\overline{X}_2}}$

### ESTIMATED STANDARD ERROR OF THE MEAN
$S_{\overline{X}} = \dfrac{S}{\sqrt{n}}$

### Sample Sums of Squares
$SS$ = $\Sigma({X}-\overline{X})^2$ = $\Sigma{X}^2$ $-$ $\dfrac{(\Sigma{X})^2}{n}$

### THE POOLED VARIANCE

$S_P^2 = \dfrac{SS_1 + SS_2}{n_1 + n_2 - 2}$

$S_P^2 = \dfrac{({n_1 - 1})s_1^2 + ({n_2 - 1})s_2^2}{n_1 + n_2 - 2}$

where $s_1^2$ and $s_2^2$ are the sample variances; because $s^2 = \frac{SS}{n - 1}$, it follows that $SS = s^2(n-1)$

### ESTIMATED STANDARD ERROR

$S_{\overline{X}_1-\overline{X}_2} = \sqrt{\dfrac{S^2_P}{n_1} + \dfrac{S^2_P}{n_2}}$
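The two forms of the pooled variance above should agree, since $SS = s^2(n-1)$. The sketch below checks this numerically on two small, made-up samples of unequal size (the scores and the helper name `sum_of_squares` are assumptions for illustration) and then computes the estimated standard error.

```python
import math
import statistics

# Hypothetical scores, deliberately with unequal sample sizes
x1 = [12, 5, 11, 11, 9, 18]
x2 = [7, 3, 4, 6, 3]


def sum_of_squares(xs):
    """Computational formula: SS = sum(X^2) - (sum(X))^2 / n."""
    return sum(x**2 for x in xs) - sum(xs)**2 / len(xs)


n1, n2 = len(x1), len(x2)
ss1, ss2 = sum_of_squares(x1), sum_of_squares(x2)

# Form 1: pooled variance from the sums of squares
pooled_from_ss = (ss1 + ss2) / (n1 + n2 - 2)

# Form 2: pooled variance from the sample variances (n - 1 in the denominator)
s1_sq, s2_sq = statistics.variance(x1), statistics.variance(x2)
pooled_from_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

std_error = math.sqrt(pooled_from_ss / n1 + pooled_from_ss / n2)

print(f'pooled variance (SS form) = {pooled_from_ss:.4f}')
print(f'pooled variance (s^2 form) = {pooled_from_var:.4f}')
print(f'estimated standard error = {std_error:.4f}')
```

Each sum of squares is divided by its own sample size inside `sum_of_squares`, which matters as soon as the two groups differ in size.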

##### Progress Check *14.2 Using Table B in Appendix C, find the critical t values for each of the following hypothesis tests:
(a) two-tailed test; α = .05; n1 = 12; n2 = 11; df = 12+11-2 = 21; t_value = $\pm$2.080

(b) one-tailed test, upper tail critical; α = .05; n1 = 15; n2 = 13; df = 15+13-2 = 26; t_value = +1.706

(c) one-tailed test, lower tail critical; α = .01; n1 = n2 = 25; df = 25+25-2 = 48 (not listed in the table, so the row for df = 40 is used); t_value = -2.423

(d) two-tailed test; α = .01; n1 = 8; n2 = 10; df=8+10-2=16; t_value = $\pm$2.921
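The degrees of freedom in these answers all come from $n_1 + n_2 - 2$, and when that value has no row of its own the answer to (c) falls back on a smaller tabled df. The sketch below spells out that lookup. It assumes the table lists df 1 through 30 and then 40, 60, and 120, which is a common layout but should be checked against the table actually in use; the function names are hypothetical.

```python
# Assumed rows of the t table: df 1-30, then 40, 60, 120
TABLED_DF = list(range(1, 31)) + [40, 60, 120]


def df_two_samples(n1, n2):
    """Degrees of freedom for the t test with two independent samples."""
    return n1 + n2 - 2


def tabled_df(df):
    """Largest tabled df that does not exceed the actual df (a conservative choice)."""
    return max(d for d in TABLED_DF if d <= df)


for label, (n1, n2) in {"a": (12, 11), "b": (15, 13), "c": (25, 25), "d": (8, 10)}.items():
    df = df_two_samples(n1, n2)
    print(f'({label}) df = {df}; table row used = {tabled_df(df)}')
```

Rounding down to the next smaller row gives a slightly larger critical value, so the test errs on the side of retaining the null hypothesis.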

### Table 14.1 CALCULATIONS FOR THE t TEST: TWO INDEPENDENT SAMPLES (EPO EXPERIMENT)

In [1]:
import statistics
import math

X1 = [12,5,11,11,9,18]
X2 = [7,3,4,6,3,13]

n1 = len(X1)
n2 = len(X2)
u_hyp = 0

mean_X1 = statistics.mean(X1)
mean_X2 = statistics.mean(X2)
print(f'mean_X1 = {mean_X1}; mean_X2 = {mean_X2}')

sum_of_X1_list = sum(X1)
sum_of_X2_list = sum(X2)
print(f'sum_of_X1_list = {sum_of_X1_list}; sum_of_X2_list = {sum_of_X2_list}')

square_of_each_X1 = [num1**2 for num1 in X1]
square_of_each_X2 = [num2**2 for num2 in X2]

sum_of_square_of_each_X1 = sum(square_of_each_X1)
sum_of_square_of_each_X2 = sum(square_of_each_X2)
print(f'sum_of_square_of_each_X1 = {sum_of_square_of_each_X1}; sum_of_square_of_each_X2 = {sum_of_square_of_each_X2}')

SS1 = sum_of_square_of_each_X1 - (sum_of_X1_list**2/n1)
SS2 = sum_of_square_of_each_X2 - (sum_of_X2_list**2/n2)
print(f'SS1 = {SS1}; SS2 = {SS2}')

pooled_variance = (SS1+SS2)/(n1+n2-2)
print(f'pooled_variance = {pooled_variance}')

std_error = math.sqrt((pooled_variance/n1)+(pooled_variance/n2))
print(f'std_error = {std_error}')

t_ratio = ((mean_X1-mean_X2)-u_hyp)/std_error
print(f't_ratio = {t_ratio}')

mean_X1 = 11; mean_X2 = 6
sum_of_X1_list = 66; sum_of_X2_list = 36
sum_of_square_of_each_X1 = 816; sum_of_square_of_each_X2 = 288
SS1 = 90.0; SS2 = 72.0
pooled_variance = 16.2
std_error = 2.32379000772445
t_ratio = 2.151657414559676


#### Markdown for Table 14.1 CALCULATIONS FOR THE t TEST: TWO INDEPENDENT SAMPLES (EPO EXPERIMENT)

$\overline{X}_1 = 11; \overline{X}_2 = 6$

$\Sigma{X}_1 = 66; \Sigma{X}_2 = 36$

$\Sigma{X}_1^2 = 816; \Sigma{X}_2^2 = 288$

$SS_1 = 90; SS_2 = 72$
    
$S_P^2 = 16.2$

$S_{\overline{X}_1-\overline{X}_2} = 2.32379000772445$

$t = 2.151657414559676$

For a one-tailed test (upper tail critical) at the .05 level of significance, with degrees of freedom df = 6+6-2 = 10, the critical t_value = 1.812.

Decision: Reject H0 at the .05 level of significance because t = 2.15 exceeds 1.812.

Interpretation: The difference between population means is greater than zero. There is evidence that EPO increases the mean endurance scores of treatment patients.

##### Progress Check *14.3 A psychologist investigates the effect of instructions on the time required to solve a puzzle. Each of 20 volunteers is given the same puzzle to be solved as rapidly as possible. Subjects are randomly assigned, in equal numbers, to receive two different sets of instructions prior to the task. One group is told that the task is difficult (X1), and the other group is told that the task is easy (X2). The score for each subject reflects the time in minutes required to solve the puzzle. Use t to test the null hypothesis at the .05 level of significance.

In [2]:
import statistics
import math

X1 = [5,20,7,23,30,24,9,8,20,12]
X2 = [13,6,6,5,3,6,10,20,9,12]

n1 = len(X1)
n2 = len(X2)
u_hyp = 0

mean_X1 = statistics.mean(X1)
mean_X2 = statistics.mean(X2)
print(f'mean_X1 = {mean_X1}; mean_X2 = {mean_X2}')

sum_of_X1_list = sum(X1)
sum_of_X2_list = sum(X2)
print(f'sum_of_X1_list = {sum_of_X1_list}; sum_of_X2_list = {sum_of_X2_list}')

square_of_each_X1 = [num1**2 for num1 in X1]
square_of_each_X2 = [num2**2 for num2 in X2]

sum_of_square_of_each_X1 = sum(square_of_each_X1)
sum_of_square_of_each_X2 = sum(square_of_each_X2)
print(f'sum_of_square_of_each_X1 = {sum_of_square_of_each_X1}; sum_of_square_of_each_X2 = {sum_of_square_of_each_X2}')

SS1 = sum_of_square_of_each_X1 - (sum_of_X1_list**2/n1)
SS2 = sum_of_square_of_each_X2 - (sum_of_X2_list**2/n2)
print(f'SS1 = {SS1}; SS2 = {SS2}')

pooled_variance = (SS1+SS2)/(n1+n2-2)
print(f'pooled_variance = {pooled_variance}')

std_error = math.sqrt((pooled_variance/n1)+(pooled_variance/n2))
print(f'std_error = {std_error}')

t_ratio = ((mean_X1-mean_X2)-u_hyp)/std_error
print(f't_ratio = {t_ratio}')

mean_X1 = 15.8; mean_X2 = 9
sum_of_X1_list = 158; sum_of_X2_list = 90
sum_of_square_of_each_X1 = 3168; sum_of_square_of_each_X2 = 1036
SS1 = 671.5999999999999; SS2 = 226.0
pooled_variance = 49.86666666666666
std_error = 3.158058475287203
t_ratio = 2.1532216876958206


#### Markdown for Progress Check 14.3
Statistical Hypothesis

$H_{0}$ : $\mu_{1} - \mu_{2} = 0$

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

$\overline{X}_1 = 15.8; \overline{X}_2 = 9$

$\Sigma{X}_1 = 158; \Sigma{X}_2 = 90$

$\Sigma{X}_1^2 = 3168; \Sigma{X}_2^2 = 1036$

$SS_1 = 671.5999999999999; SS_2 = 226.0$
    
$S_P^2 = 49.86666666666666$

$S_{\overline{X}_1-\overline{X}_2} = 3.158058475287203$

$t = 2.1532216876958206$

For a two-tailed test at the .05 level of significance, with degrees of freedom df = 10+10-2 = 18, the critical t_value = $\pm$2.101. (The nondirectional alternative hypothesis calls for a two-tailed test; the one-tailed critical value for df = 18 would be 1.734.)

Decision: Reject H0 at the .05 level of significance because t = 2.15 exceeds 2.101.

Interpretation: Puzzle-solving times are longer, on average, for subjects who are told that the puzzle is difficult than for those who are told that the puzzle is easy.

In [3]:
# Function to compute for t Test for two samples
X1 = [5,20,7,23,30,24,9,8,20,12]
X2 = [13,6,6,5,3,6,10,20,9,12]
def t_Test_two_samples(X1,X2):
    import statistics
    import math
    n1 = len(X1)
    n2 = len(X2)
    u_hyp = 0
    
    mean_X1 = statistics.mean(X1)
    mean_X2 = statistics.mean(X2)
    print(f'mean_X1 = {mean_X1}; mean_X2 = {mean_X2}')
    
    sum_of_X1_list = sum(X1)
    sum_of_X2_list = sum(X2)
    print(f'sum_of_X1_list = {sum_of_X1_list}; sum_of_X2_list = {sum_of_X2_list}')
    
    square_of_each_X1 = [num1**2 for num1 in X1]
    square_of_each_X2 = [num2**2 for num2 in X2]
    
    sum_of_square_of_each_X1 = sum(square_of_each_X1)
    sum_of_square_of_each_X2 = sum(square_of_each_X2)
    print(f'sum_of_square_of_each_X1 = {sum_of_square_of_each_X1}; sum_of_square_of_each_X2 = {sum_of_square_of_each_X2}')
    
    SS1 = sum_of_square_of_each_X1 - (sum_of_X1_list**2/n1)
    SS2 = sum_of_square_of_each_X2 - (sum_of_X2_list**2/n2)
    print(f'SS1 = {SS1}; SS2 = {SS2}')
    
    pooled_variance = (SS1+SS2)/(n1+n2-2)
    print(f'pooled_variance = {pooled_variance}')
    
    std_error = math.sqrt((pooled_variance/n1)+(pooled_variance/n2))
    print(f'std_error = {std_error}')
    
    t_ratio = ((mean_X1-mean_X2)-u_hyp)/std_error
    print(f't_ratio = {t_ratio}')
t_Test_two_samples(X1,X2)

mean_X1 = 15.8; mean_X2 = 9
sum_of_X1_list = 158; sum_of_X2_list = 90
sum_of_square_of_each_X1 = 3168; sum_of_square_of_each_X2 = 1036
SS1 = 671.5999999999999; SS2 = 226.0
pooled_variance = 49.86666666666666
std_error = 3.158058475287203
t_ratio = 2.1532216876958206


#### Progress Check *14.4 Find the approximate p-value for each of the following test results:
(a) one-tailed test, upper tail critical; df = 12; t = 4.61 Answer: p < 0.001

(b) one-tailed test, lower tail critical; df = 19; t = –2.41 Answer: p < 0.05

(c) two-tailed test; df = 15; t = 3.76 Answer: p < 0.01

(d) two-tailed test; df = 42; t = 1.305 Answer: p > 0.05

(e) one-tailed test, upper tail critical; df = 11; t = –4.23 (Be careful!) Answer: p > 0.05
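The approximate p-values above come from comparing the observed t with the tabled critical values at the .05, .01, and .001 levels. The sketch below automates that comparison. The function name `approximate_p` is hypothetical, and the critical values in the dictionaries are copied by hand from a standard t table for the stated df and tail type, so they should be verified against the table in use.

```python
def approximate_p(t_ratio, tail, criticals):
    """Bracket the p-value using tabled critical values.

    criticals maps a significance level to the positive critical t
    for this test's df and tail type.
    """
    if tail == "upper":
        extremeness = t_ratio        # only large positive t counts
    elif tail == "lower":
        extremeness = -t_ratio       # only large negative t counts
    elif tail == "two":
        extremeness = abs(t_ratio)
    else:
        raise ValueError("tail must be 'two', 'lower', or 'upper'")
    for level in sorted(criticals):  # try the smallest level first
        if extremeness >= criticals[level]:
            return f"p < {level}"
    return f"p > {max(criticals)}"


# Critical values assumed from a standard t table
one_tailed_df12 = {0.05: 1.782, 0.01: 2.681, 0.001: 3.930}
one_tailed_df11 = {0.05: 1.796, 0.01: 2.718, 0.001: 4.025}
two_tailed_df15 = {0.05: 2.131, 0.01: 2.947, 0.001: 4.073}

print(approximate_p(4.61, "upper", one_tailed_df12))   # part (a)
print(approximate_p(3.76, "two", two_tailed_df15))     # part (c)
print(approximate_p(-4.23, "upper", one_tailed_df11))  # part (e)
```

Part (e) shows why the direction matters: a t of –4.23 is extreme, but it falls in the wrong tail for an upper-tail test, so it is not evidence against the null hypothesis.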

#### Progress Check *14.5 Indicate which member of each of the following pairs of p-values describes the more rare test result:
(a1) p > .05 (a2) p < .05

(b1) p < .001 (b2) p < .01

(c1) p < .05 (c2) p < .01

(d1) p < .10 (d2) p < .20

(e1) p = .04 (e2) p = .02

Answer: a2, b1, c2, d1, e2

#### Progress Check *14.6 Treating each of the p-values in the previous exercise separately, indicate those that would cause you to reject the null hypothesis at the .05 level of significance.
Answer: a2, b1, b2, c1, c2, e1, e2

### CONFIDENCE INTERVAL (CI) FOR $\mu_1 - \mu_2$ (TWO INDEPENDENT SAMPLES)

$\overline{X}_1 - \overline{X}_2~\pm~(t_{conf})(S_{\overline{X}_1-\overline{X}_2})$
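As a worked instance of this formula, the sketch below builds a 95 percent confidence interval from the EPO results in Table 14.1 (sample means of 11 and 6, estimated standard error of about 2.32). The value `t_conf = 2.228` is the two-tailed .05 critical t for df = 10 and is entered by hand rather than computed.

```python
# Values taken from the EPO calculations in Table 14.1
mean_X1, mean_X2 = 11, 6
std_error = 2.32379000772445
t_conf = 2.228  # two-tailed critical t at the .05 level, df = 6+6-2 = 10

margin = t_conf * std_error
lower = (mean_X1 - mean_X2) - margin
upper = (mean_X1 - mean_X2) + margin

print(f'margin = {margin:.2f}')
print(f'95% CI for mu1 - mu2: {lower:.2f} to {upper:.2f}')
```

The lower limit comes out slightly below zero even though the one-tailed test rejected the null hypothesis. The two results rest on different critical values (1.812 one-tailed versus 2.228 two-tailed), so this is a feature of the comparison rather than an arithmetic error.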

#### Progress Check *14.7 Imagine that one of the following 95 percent confidence intervals is based on an EPO experiment. (Because of the appearance of pairs of limits with dissimilar signs, a statistically significant result wasn’t required as a preliminary screen for constructing the confidence interval—possibly because, in the early stages of research, the investigator simply wanted to know the range of estimates, whether positive or negative, for any possible effect of EPO.)

| 95% CONFIDENCE INTERVAL | LOWER LIMIT | UPPER LIMIT | WIDTH (UPPER - LOWER) |
|:---------:|:--------:|:--------:|:--------:|
|  1  |  -3.45  |  4.25  | 4.25 - (-3.45) = 7.70 |
|  2  |  1.89  |  2.21  | 2.21 - 1.89 = 0.32 |
|  3  |  -1.54  |  -0.32  | -0.32 - (-1.54) = 1.22 |
|  4  |  0.21  |  1.53  | 1.53 - 0.21 = 1.32 |
|  5  |  -2.53  |  1.78  | 1.78 - (-2.53) = 4.31 |

(a) Which confidence interval is most precise? 2, because it is the narrowest interval (width 0.32).

(b) Which confidence interval most strongly supports the conclusion that EPO facilitates endurance? 2, because both limits are positive and its lower limit (1.89) lies farthest above zero.

(c) Which confidence interval most strongly supports the conclusion that EPO hinders endurance? 3, because it is the only interval with both limits negative.

(d) Which confidence interval would most likely stimulate the investigator to conduct an additional experiment using larger sample sizes? 1, because it is the widest interval (width 7.70) and it includes zero, so the direction of any effect remains unsettled.
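The reasoning behind these answers rests on two properties of each interval: its width and the signs of its limits. The sketch below computes both for the five intervals in the table. The variable names and the three sign labels are illustrative choices.

```python
# Limits copied from the table of 95 percent confidence intervals
intervals = {
    1: (-3.45, 4.25),
    2: (1.89, 2.21),
    3: (-1.54, -0.32),
    4: (0.21, 1.53),
    5: (-2.53, 1.78),
}

widths = {}
signs = {}
for label, (lower, upper) in intervals.items():
    widths[label] = upper - lower
    if lower > 0:
        signs[label] = "entirely positive"
    elif upper < 0:
        signs[label] = "entirely negative"
    else:
        signs[label] = "includes zero"
    print(f'interval {label}: width = {widths[label]:.2f}; {signs[label]}')

narrowest = min(widths, key=widths.get)
widest = max(widths, key=widths.get)
print(f'narrowest = {narrowest}; widest = {widest}')
```

Width speaks to precision, while the signs of the limits speak to the direction of the effect, so the two questions are answered by different columns of this output.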

### ESTIMATING EFFECT SIZE: COHEN’S d

STANDARDIZED EFFECT SIZE, COHEN’S d (TWO INDEPENDENT SAMPLES)

$d = \dfrac{mean~difference}{standard~deviation} = \dfrac{\overline{X}_1 - \overline{X}_2}{\sqrt{S_P^2}}$

d refers to a standardized estimate of the effect size; $\overline{X}_1$ and $\overline{X}_2$ are the two sample means; and $\sqrt{S_P^2}$ is the sample standard deviation obtained from the square root of the pooled variance estimate.

### Cohen's Guidelines for d

| d | EFFECT SIZE |
|:---------:|:--------:|
|  0.20 |  small  |
|  0.50  | medium |
|  0.80  |  large  |
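The guidelines can be applied in code once d has been computed. The sketch below calculates d for the puzzle data from Progress Check 14.3 and attaches a label. Cohen's table gives only three benchmark values, so the cutoffs used here (the midpoints 0.35 and 0.65) are an assumption made purely to turn the benchmarks into ranges; both function names are hypothetical.

```python
import math


def cohens_d(mean1, mean2, pooled_variance):
    """Standardized effect size: mean difference over the pooled standard deviation."""
    return (mean1 - mean2) / math.sqrt(pooled_variance)


def describe_effect(d):
    """Label |d| by the nearest benchmark (cutoffs 0.35 and 0.65 are assumed midpoints)."""
    size = abs(d)
    if size < 0.35:
        return "small"
    if size < 0.65:
        return "medium"
    return "large"


# Means and pooled variance from the Progress Check 14.3 output
d = cohens_d(15.8, 9, 49.86666666666666)
print(f'd = {d:.2f}; effect size = {describe_effect(d)}')
```

Because the benchmarks are rough conventions, a value that falls near a cutoff is better reported as a number than as a label.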


#### Progress Check *14.8 Recall that in Question 14.3, a psychologist determined the effect of instructions on the time required by subjects to solve the same puzzle. For two independent samples of ten subjects per group, mean solution times, in minutes, were longer for subjects given “difficult” instructions (X = 15.8, s = 8.64) than for subjects given “easy” instructions (X = 9.0, s = 5.01). A t ratio of 2.15 led to the rejection of the null hypothesis.

(a) Given a standard deviation, sp, of 7.06, calculate the value of the standardized effect size, d.

Answer: d = (15.8 - 9.0)/7.06 = 0.96

(b) Indicate how these results might be described in the literature.

Answer: Puzzle-solving times are longer, on average, for subjects who are told that the puzzle is difficult (X = 15.8, s = 8.64) than for those who are told that the puzzle is easy (X = 9.0, s = 5.01), according to the t test [t (18) = 2.15, p < .05, d = .96].

$d = \dfrac{mean~difference}{standard~deviation} = \dfrac{\overline{X}_1 - \overline{X}_2}{\sqrt{S_P^2}}$

### Sample Sums of Squares
$SS$ = $\Sigma({X}-\overline{X})^2$ = $\Sigma{X}^2$ $-$ $\dfrac{(\Sigma{X})^2}{n}$

### THE POOLED VARIANCE

$S_P^2 = \dfrac{SS_1 + SS_2}{n_1 + n_2 - 2}$

In [4]:
# a)
X1 = [5,20,7,23,30,24,9,8,20,12]
X2 = [13,6,6,5,3,6,10,20,9,12]

def t_Test_two_samples(X1,X2):
    import statistics
    import math
    n1 = len(X1)
    n2 = len(X2)
    u_hyp = 0
    
    mean_X1 = statistics.mean(X1)
    mean_X2 = statistics.mean(X2)
    print(f'mean_X1 = {mean_X1}; mean_X2 = {mean_X2}')
    
    sum_of_X1_list = sum(X1)
    sum_of_X2_list = sum(X2)
    print(f'sum_of_X1_list = {sum_of_X1_list}; sum_of_X2_list = {sum_of_X2_list}')
    
    square_of_each_X1 = [num1**2 for num1 in X1]
    square_of_each_X2 = [num2**2 for num2 in X2]
    
    sum_of_square_of_each_X1 = sum(square_of_each_X1)
    sum_of_square_of_each_X2 = sum(square_of_each_X2)
    print(f'sum_of_square_of_each_X1 = {sum_of_square_of_each_X1}; sum_of_square_of_each_X2 = {sum_of_square_of_each_X2}')
    
    SS1 = sum_of_square_of_each_X1 - (sum_of_X1_list**2/n1)
    SS2 = sum_of_square_of_each_X2 - (sum_of_X2_list**2/n2)
    print(f'SS1 = {SS1}; SS2 = {SS2}')
    
    pooled_variance = (SS1+SS2)/(n1+n2-2)
    print(f'pooled_variance = {pooled_variance}')
    
    std_error = math.sqrt((pooled_variance/n1)+(pooled_variance/n2))
    print(f'std_error = {std_error}')
    
    t_ratio = ((mean_X1-mean_X2)-u_hyp)/std_error
    print(f't_ratio = {t_ratio}')
t_Test_two_samples(X1,X2)

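import math

# Summary values from Progress Check 14.3 (see the output of t_Test_two_samples)
X_difficult = 15.8
s_difficult = 8.64
X_easy = 9
s_easy = 5.01
SS1 = 671.5999999999999
SS2 = 226.0
mean_X1 = 15.8
mean_X2 = 9
n1 = n2 = 10

# t_Test_two_samples only prints its results, so the pooled variance is recomputed here
pooled_variance = (SS1+SS2)/(n1+n2-2)

print(f'standard deviation, sp = {math.sqrt(pooled_variance)}')
d = (mean_X1 - mean_X2)/math.sqrt(pooled_variance)
print(f'd = {d}')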

mean_X1 = 15.8; mean_X2 = 9
sum_of_X1_list = 158; sum_of_X2_list = 90
sum_of_square_of_each_X1 = 3168; sum_of_square_of_each_X2 = 1036
SS1 = 671.5999999999999; SS2 = 226.0
pooled_variance = 49.86666666666666
std_error = 3.158058475287203
t_ratio = 2.1532216876958206
standard deviation, sp = 7.061633427661525
d = 0.9629500128629355


#### *14.10 Figure 4.2 on page 62 describes the results for two fictitious experiments, each with the same mean difference of 2 but with noticeably different variabilities. Unresolved was the question “Once variability has been considered, should the difference between each pair of means be viewed as real or merely transitory?” A t test for two independent samples permits us to answer this question for each experimental result.

(a) Referring to Figure 4.2, again decide which of the two identical differences between pairs of means (that for Experiment B or for Experiment C) is more likely to be viewed as real. Answer: The difference between means for experiment B is more likely to be viewed as real because of its smaller variability.

(b) Given that $s_p^2$ = .33 for Experiment B, test the null hypothesis at the .05 level of significance.

Statistical Hypothesis

$H_{0}$ : $\mu_{1} - \mu_{2} = 0$

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

### STANDARD ERROR

$S_{\overline{X}_1-\overline{X}_2} = \sqrt{\dfrac{S^2_P}{n_1} + \dfrac{S^2_P}{n_2}}$

### t TEST

t Ratio

$t = \dfrac{(difference~between~sample~means)-(hypothesized~difference~between~population~means)}{estimated~standard~error}$

Expressed in symbols,

### t RATIO FOR TWO POPULATION MEANS (TWO INDEPENDENT SAMPLES)

$t = \dfrac{(\overline{X}_1 - \overline{X}_2)-(\mu_1 - \mu_2)_{hyp}}{S_{\overline{X}_1-\overline{X}_2}}$


(c) Given that $s_p^2$ = 3.67 for Experiment C, test the null hypothesis at the .05 level of significance. You needn’t repeat the usual step-by-step hypothesis test procedure, but specify the observed value of t and the decision about the null hypothesis.

(d) Specify the approximate p-values for both t tests.

(e) Answer the original question about whether the difference between each pair of means is real or merely transitory. Answer: The difference between the means for experiment B is probably real, while that for experiment C is merely transitory.

(f) If a difference is real, use Cohen’s d to estimate the effect size.

STANDARDIZED EFFECT SIZE, COHEN’S d (TWO INDEPENDENT SAMPLES)

$d = \dfrac{mean~difference}{standard~deviation} = \dfrac{\overline{X}_1 - \overline{X}_2}{\sqrt{S_P^2}}$

In [5]:
# b) Two-tailed test at the .05 level of significance; n1 = n2 = 7; df = 7+7-2 = 12; t_value = ±2.179
import math
sp2 = 0.33
mean_difference = 2
X_hyp = 0
std_err = math.sqrt((sp2/7)+(sp2/7))
t_ratio = (mean_difference - X_hyp)/std_err

print(f'std_err = {std_err}')
print(f't_ratio = {t_ratio}')
# Reject null hypothesis because 6.513389472789295 is larger than +2.179

std_err = 0.3070597894314954
t_ratio = 6.513389472789295


In [6]:
# c)
sp2 = 3.67
mean_difference = 2
X_hyp = 0
std_err = math.sqrt((sp2/7)+(sp2/7))
t_ratio = (mean_difference - X_hyp)/std_err

print(f'std_err = {std_err}')
print(f't_ratio = {t_ratio}')
# Retain null hypothesis because 1.953129257488548 is smaller than +2.179

# d)
# For experiment B, p < .001, while for experiment C, p > .05

std_err = 1.0239977678547099
t_ratio = 1.953129257488548


In [7]:
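# f) Cohen's d applies only to Experiment B, the one difference judged to be real;
#    the value for Experiment C is printed for comparison only.
import math
mean_difference = 2
d1 = mean_difference/math.sqrt(0.33)  # Experiment B
d2 = mean_difference/math.sqrt(3.67)  # Experiment C (not statistically significant)
print(d1,d2)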

3.481553119113957 1.0439915019437611


#### 14.11 To test compliance with authority, a classical experiment in social psychology requires subjects to administer increasingly painful electric shocks to seemingly helpless victims who agonize in an adjacent room.* Each subject earns a score between 0 and 30, depending on the point at which the subject refuses to comply with authority—an investigator, dressed in a white lab coat, who orders the administration of increasingly intense shocks. A score of 0 signifies the subject’s unwillingness to comply at the very outset, and a score of 30 signifies the subject’s willingness to comply completely with the experimenter’s orders.

#### Ignore the very real ethical issues raised by this type of experiment, and assume that you want to study the effect of a “committee atmosphere” on compliance with authority. In one condition, shocks are administered only after an affirmative decision by the committee, consisting of one real subject and two associates of the investigator, who act as subjects but, in fact, merely go along with the decision of the real subject. In the other condition, shocks are administered only after an affirmative decision by a solitary real subject.

#### A total of 12 subjects are randomly assigned, in equal numbers, to the committee condition (X1) and to the solitary condition (X2). A compliance score is obtained for each subject. Use t to test the null hypothesis at the .05 level of significance.

### Compliance Scores

| COMMITTEE | SOLITARY |
|:---------:|:--------:|
|  2 |  3  |
|  5  | 8 |
|  20  |  7  |
|  15 |  10  |
|  4  | 14 |
|  10 |  0  |

Statistical Hypothesis

$H_{0}$ : $\mu_{1} - \mu_{2} = 0$

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

### THE POOLED VARIANCE

$S_P^2 = \dfrac{SS_1 + SS_2}{n_1 + n_2 - 2}$

### Sample Sums of Squares
$SS$ = $\Sigma({X}-\overline{X})^2$ = $\Sigma{X}^2$ $-$ $\dfrac{(\Sigma{X})^2}{n}$

### STANDARD ERROR

$S_{\overline{X}_1-\overline{X}_2} = \sqrt{\dfrac{S^2_P}{n_1} + \dfrac{S^2_P}{n_2}}$

### t RATIO FOR TWO POPULATION MEANS (TWO INDEPENDENT SAMPLES)

$t = \dfrac{(\overline{X}_1 - \overline{X}_2)-(\mu_1 - \mu_2)_{hyp}}{S_{\overline{X}_1-\overline{X}_2}}$

In [8]:
X1 = [2,5,20,15,4,10]
X2 = [3,8,7,10,14,0]

def t_Test_two_samples(X1,X2):
    import statistics
    import math
    n1 = len(X1)
    n2 = len(X2)
    u_hyp = 0
    
    mean_X1 = statistics.mean(X1)
    mean_X2 = statistics.mean(X2)
    print(f'mean_X1 = {mean_X1}; mean_X2 = {mean_X2}')
    
    sum_of_X1_list = sum(X1)
    sum_of_X2_list = sum(X2)
    print(f'sum_of_X1_list = {sum_of_X1_list}; sum_of_X2_list = {sum_of_X2_list}')
    
    square_of_each_X1 = [num1**2 for num1 in X1]
    square_of_each_X2 = [num2**2 for num2 in X2]
    
    sum_of_square_of_each_X1 = sum(square_of_each_X1)
    sum_of_square_of_each_X2 = sum(square_of_each_X2)
    print(f'sum_of_square_of_each_X1 = {sum_of_square_of_each_X1}; sum_of_square_of_each_X2 = {sum_of_square_of_each_X2}')
    
    SS1 = sum_of_square_of_each_X1 - (sum_of_X1_list**2/n1)
    SS2 = sum_of_square_of_each_X2 - (sum_of_X2_list**2/n2)
    print(f'SS1 = {SS1}; SS2 = {SS2}')
    
    pooled_variance = (SS1+SS2)/(n1+n2-2)
    print(f'pooled_variance = {pooled_variance}')
    
    std_error = math.sqrt((pooled_variance/n1)+(pooled_variance/n2))
    print(f'std_error = {std_error}')
    
    t_ratio = ((mean_X1-mean_X2)-u_hyp)/std_error
    print(f't_ratio = {t_ratio}')
t_Test_two_samples(X1,X2)

mean_X1 = 9.333333333333334; mean_X2 = 7
sum_of_X1_list = 56; sum_of_X2_list = 42
sum_of_square_of_each_X1 = 770; sum_of_square_of_each_X2 = 418
SS1 = 247.33333333333337; SS2 = 124.0
pooled_variance = 37.13333333333334
std_error = 3.51820661385567
t_ratio = 0.6632166866334747


For a two-tailed test at the .05 level of significance, with df = 6+6-2 = 10, the critical t_value = $\pm$2.228. Retain the null hypothesis because t_ratio = 0.66 is smaller than 2.228. There is no evidence that a committee atmosphere affects mean compliance scores.

#### 14.12 To determine whether training in a series of workshops on creative thinking increases IQ scores, a total of 70 students are randomly divided into treatment and control groups of 35 each. After two months of training, the sample mean IQ $\overline{X}_1$ for the treatment group equals 110, and the sample mean IQ $\overline{X}_2$ for the control group equals 108. The estimated standard error equals 1.80.

(a) Using t, test the null hypothesis at the .01 level of significance.

(b) If appropriate (because the null hypothesis has been rejected), estimate the standardized effect size, construct a 99 percent confidence interval for the true population mean difference, and interpret these estimates.

Statistical Hypothesis

Because the question is whether the workshops increase IQ scores, the alternative hypothesis is directional, with the upper tail critical:

$H_{0}$ : $\mu_{1} - \mu_{2} \leq 0$

$H_{1}$ : $\mu_{1} - \mu_{2} > 0$

### t RATIO FOR TWO POPULATION MEANS (TWO INDEPENDENT SAMPLES)

$t = \dfrac{(\overline{X}_1 - \overline{X}_2)-(\mu_1 - \mu_2)_{hyp}}{S_{\overline{X}_1-\overline{X}_2}}$

The problem supplies the estimated standard error, $S_{\overline{X}_1-\overline{X}_2}$ = 1.80, directly, so it serves as the denominator of the t ratio as given. It is not a standard deviation, and no pooled variance should be derived from it.

In [9]:
# (a) Using t, test the null hypothesis at the .01 level of significance.
n1 = n2 = 35
sample_mean1 = 110
sample_mean2 = 108
std_error = 1.8  # estimated standard error of the difference, given in the problem

df = n1 + n2 - 2
print(f'df = {df}')

t_ratio = ((sample_mean1 - sample_mean2) - 0)/std_error
print(f't_ratio = {t_ratio:.2f}')
# One-tailed test, upper tail critical, .01 level of significance:
# df = 68 is not tabled, so the row for df = 60 is used, t_value = 2.390
# Retain the null hypothesis because t_ratio = 1.11 is smaller than t_value = 2.390

df = 68
t_ratio = 1.11

Decision: Retain H0 at the .01 level of significance because t = 1.11 is less than 2.390. (The same decision would follow from a two-tailed test, whose critical value is $\pm$2.660.)

Interpretation: There is no evidence that training in the creative thinking workshops increases mean IQ scores.

(b) Not appropriate. Because the null hypothesis has been retained, neither the standardized effect size nor a 99 percent confidence interval for the true population mean difference is estimated. In any case, Cohen’s d could not be computed from the information given, because the problem supplies the standard error but not the pooled standard deviation, $\sqrt{S_P^2}$.


#### 14.13 Is the performance of college students affected by the grading policy? In an introductory biology class, a total of 40 student volunteers are randomly assigned, in equal numbers, to take the course for either letter grades or a simple pass/fail. At the end of the  academic term, the mean achievement score for the letter grade students $\overline{X}_1$ equals 86.2, and the mean achievement score for pass/fail students $\overline{X}_2$ equals 81.6. The estimated standard error is 1.50.

Hypothesis Statement

$H_{0}$ : $\mu_{1} - \mu_{2} = 0$

$H_{1}$ : $\mu_{1} - \mu_{2} \neq 0$

(a) Use t to test the null hypothesis at the .05 level of significance.

(b) How would the above hypothesis test change if the roles of $X_1$ and $X_2$ were reversed—that is, if $X_1$ were identified with pass/fail students and $X_2$ were identified with letter grade students?

(c) Most students would doubtless prefer to select their favorite grading policy rather than be randomly assigned to a particular grading policy. Therefore, why not replace random assignment with self-selection?

(d) Specify the p-value for this test result.

(e) If the test result is statistically significant, estimate the standardized effect size, given that the standard deviation, $s_P$, equals 5.

(f) State how the test results might be reported in the literature, given that $s_1$ = 5.39 and $s_2$ = 4.58.

### THE POOLED VARIANCE

$S_P^2 = \dfrac{SS_1 + SS_2}{n_1 + n_2 - 2}$

$S_P^2 = \dfrac{({n_1 - 1})s_1^2 + ({n_2 - 1})s_2^2}{n_1 + n_2 - 2}$

### ESTIMATED STANDARD ERROR

$S_{\overline{X}_1-\overline{X}_2} = \sqrt{\dfrac{S^2_P}{n_1} + \dfrac{S^2_P}{n_2}}$

### t RATIO FOR TWO POPULATION MEANS (TWO INDEPENDENT SAMPLES)

$t = \dfrac{(\overline{X}_1 - \overline{X}_2)-(\mu_1 - \mu_2)_{hyp}}{S_{\overline{X}_1-\overline{X}_2}}$

In [13]:
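# (a) Use t to test the null hypothesis at the .05 level of significance; two-tailed test, df = 20+20-2 = 38.
# df = 38 is not tabled, so the row for df = 30 is used: t_value = ±2.042
import math
n1 = n2 = 20
X1 = 86.2
X2 = 81.6
std_error = 1.5

# Incorrect approach (from ChatGPT), kept for comparison: it treats 1.5 as if it were
# each group's standard deviation and derives a new, much smaller standard error from it.
pooled_variance = (std_error**2*(n1-1) + std_error**2*(n2-1))/(n1+n2-2)
print(f'pooled_variance = {pooled_variance}')

est_std_error = math.sqrt((pooled_variance/n1) + (pooled_variance/n2))
print(f'est_std_error = {est_std_error}')

t_ratio = ((X1-X2)-0)/est_std_error
print(t_ratio)

# Correct approach (from the book): 1.5 is already the estimated standard error
# of the difference between means, so it is the denominator of the t ratio.
t_ratio2 = ((X1-X2)-0)/std_error
print(t_ratio2)
# Reject the null hypothesis because t = 3.07 exceeds 2.042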

pooled_variance = 2.25
est_std_error = 0.4743416490252569
9.697651491183048
3.066666666666672


In [15]:
# (b) How would the above hypothesis test change if the roles of X1 and X2 were reversed, that is, if X1 were identified
# with pass/fail students and X2 were identified with letter grade students?
import math
n1 = n2 = 20
X1 = 81.6
X2 = 86.2
std_error = 1.5

# Incorrect approach (from ChatGPT), kept for comparison: it treats 1.5 as if it were
# each group's standard deviation.
pooled_variance = (std_error**2*(n1-1) + std_error**2*(n2-1))/(n1+n2-2)
print(f'pooled_variance = {pooled_variance}')

est_std_error = math.sqrt((pooled_variance/n1) + (pooled_variance/n2))
print(f'est_std_error = {est_std_error}')

t_ratio = ((X1-X2)-0)/est_std_error
print(t_ratio)

# Correct approach (from the book): divide by the given estimated standard error.
t_ratio2 = ((X1-X2)-0)/std_error
print(t_ratio2)

# The calculated t ratio would have been equal to –3.07 rather than 3.07. Most important, however, the same interpretation would have been
# appropriate: Introductory biology students have higher achievement scores, on average, when awarded letter grades rather than a simple
# pass/fail.

pooled_variance = 2.25
est_std_error = 0.4743416490252569
-9.697651491183048
-3.066666666666672
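Part (e) of question 14.13 follows directly from the Cohen's d formula once the test result is known to be significant. The sketch below is one way to compute it, using the sample means and the pooled standard deviation of 5 given in the question; the variable names are illustrative.

```python
# Values given in question 14.13
mean_letter_grade = 86.2   # X1
mean_pass_fail = 81.6      # X2
s_p = 5                    # pooled standard deviation, given in part (e)
std_error = 1.5            # estimated standard error, given in the question

t_ratio = (mean_letter_grade - mean_pass_fail) / std_error
d = (mean_letter_grade - mean_pass_fail) / s_p

print(f't = {t_ratio:.2f}')
print(f"Cohen's d = {d:.2f}")
```

By Cohen's guidelines a d of this size would count as a large effect, though the label is only a rough convention.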
