### $\bullet$ SUM OF SQUARES (REPEATED MEASURES)
$SS_{total}=SS_{between}+SS_{within}$

$SS_{total}=SS_{between}+SS_{subject}+SS_{error}$

$SS_{within}=SS_{subject}+SS_{error}$

### $\bullet$ WORD, DEFINITION, AND COMPUTATION FORMULAS FOR SS TERMS (REPEATED-MEASURES ANOVA)
$\bullet$ $\bullet$ For the total sum of squares, $\bullet$ $\bullet$

$SS_{total}$ = the sum of squared deviations for scores about the grand mean

$=\Sigma(X-\overline{X}_{grand})^2$

$SS_{total}=\Sigma{X}^2-\dfrac{G^2}{N}$, where $G$ is the grand total and $N$ is the grand (combined) sample size.

$\bullet$ $\bullet$ For the between sum of squares, $\bullet$ $\bullet$

$SS_{between}$ = the sum of squared deviations for group means about the grand mean

$=n\Sigma(\overline{X}_{group}-\overline{X}_{grand})^2$

$SS_{between}=\Sigma{\dfrac{T^2}{n}}-\dfrac{G^2}{N}$, where $T$ is the group total and $n$ is its sample size (and also the number of subjects).

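$\bullet$ $\bullet$ For the within sum of squares, $\bullet$ $\bullet$

$SS_{within}$ = the sum of squared deviations of scores about their respective group means

$=\Sigma(X-\overline{X}_{group})^2$

$SS_{within}=\Sigma{X}^2-\Sigma{\dfrac{T^2}{n}}$

$\bullet$ $\bullet$ For the subject sum of squares, $\bullet$ $\bullet$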

$SS_{subject}=$ the sum of squared deviations of subject means about the grand mean

$=k\Sigma(\overline{X}_{subject}-\overline{X}_{grand})^2$

$SS_{subject}=\Sigma{\dfrac{T^2_{subject}}{k}}-\dfrac{G^2}{N}$, where $T_{subject}$ is the total for each subject and $k$ equals the number of repeated measures (and also the number of levels of the independent variable).

$\bullet$ $\bullet$ For the error sum of squares, $\bullet$ $\bullet$

$SS_{error}=SS_{within}-SS_{subject}$

REMINDER:

$X=score$

$T=group~total$

$n = group~sample~size;~number~of~subjects$

$G = grand~total$

$N = grand~(combined)~sample~size$

$T_{subject} = subject~total$

$k = number~of~repeated~measures$
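
The definition formulas above can be checked numerically. The following is a minimal sketch, assuming the sleep-deprivation scores used later in this notebook (one list per condition, one position per subject); the helper name `ss_about_grand_mean` is hypothetical and only serves this illustration.

```python
# Sleep-deprivation scores: one inner list per condition, one position per subject.
groups = [[0, 4, 2], [3, 6, 6], [6, 8, 10]]

k = len(groups)       # number of repeated measures (levels)
n = len(groups[0])    # number of subjects
scores = [x for group in groups for x in group]
N = len(scores)       # grand (combined) sample size
grand_mean = sum(scores) / N


def ss_about_grand_mean(means, weight):
    """Weighted sum of squared deviations of a set of means about the grand mean."""
    return weight * sum((m - grand_mean) ** 2 for m in means)


group_means = [sum(group) / n for group in groups]
subject_means = [sum(subject) / k for subject in zip(*groups)]

ss_total = sum((x - grand_mean) ** 2 for x in scores)
ss_between = ss_about_grand_mean(group_means, n)
ss_subject = ss_about_grand_mean(subject_means, k)
ss_within = ss_total - ss_between
ss_error = ss_within - ss_subject

print(f'SS_total = {ss_total}, SS_between = {ss_between}, SS_subject = {ss_subject}')
print(f'SS_within = {ss_within}, SS_error = {ss_error}')
```

The three components should add back up to $SS_{total}$, which is a quick way to catch an arithmetic slip.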

#### $\bullet$ FORMULAS FOR df TERMS: REPEATED-MEASURES ANOVA

$df_{total}=N-1$, the number of all scores$-$1

$df_{between}=k-1$, the number of repeated measures (or levels of the independent variable)$-$1

$df_{within}=N-k$, the number of all scores$-$number of levels

$df_{subject}=n-1$, the number of subjects$-$1

$df_{error}=df_{within}-df_{subject}=(N-k)-(n-1)$

When looking up the critical $F$ in the $F$ table:

$df_{between}$ = column (numerator) degrees of freedom

$df_{error}$ = row (denominator) degrees of freedom
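
As a small illustration of the df formulas, the sketch below computes every term from the number of subjects and the number of levels. The function name `repeated_measures_df` is hypothetical, and the design is assumed to be complete (every subject measured at every level), so that $N = n \times k$.

```python
def repeated_measures_df(n_subjects, k_levels):
    """Return the df terms for a complete repeated-measures design."""
    N = n_subjects * k_levels  # every subject is measured at every level
    df = {
        "total": N - 1,
        "between": k_levels - 1,
        "within": N - k_levels,
        "subject": n_subjects - 1,
    }
    df["error"] = df["within"] - df["subject"]
    return df


df = repeated_measures_df(n_subjects=3, k_levels=3)
print(df)

# Two consistency checks that follow from the formulas:
print(df["error"] == (3 - 1) * (3 - 1))                          # df_error = (n - 1)(k - 1)
print(df["total"] == df["between"] + df["subject"] + df["error"])
```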

#### $\bullet$ PROPORTION OF EXPLAINED VARIANCE (REPEATED MEASURES)

$\eta^2_p=\dfrac{SS_{between}}{SS_{total}-SS_{subject}}=\dfrac{SS_{between}}{(SS_{between}+SS_{subject}+SS_{error})-SS_{subject}}$

$\eta^2_p=\dfrac{SS_{between}}{SS_{between}+SS_{error}}$

where $\eta^2_p$ is referred to as a partial $\eta^2$ or, more technically, as a partial squared curvilinear correlation.

$\bullet$ Guidelines for $\eta^2_p$

\begin{array}{cc}
\eta^2_p & \text{EFFECT} \\
\hline
.01 & \text{small} \\
.09 & \text{medium} \\
.25 & \text{large} \\
\end{array}
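
A short sketch of how $\eta^2_p$ and the guideline labels might be applied in code. It assumes each guideline value is read as the smallest value that earns the label; the function names and the "below small" label are hypothetical and not part of the guidelines.

```python
def partial_eta_squared(ss_between, ss_error):
    """Proportion of explained variance with subject variability removed."""
    return ss_between / (ss_between + ss_error)


def effect_label(eta_sq_p):
    """Map a partial eta squared onto the guideline labels."""
    # Assumption: each guideline value is treated as a lower bound for its label.
    if eta_sq_p >= 0.25:
        return "large"
    if eta_sq_p >= 0.09:
        return "medium"
    if eta_sq_p >= 0.01:
        return "small"
    return "below small"  # hypothetical label for values under .01


# SS terms from the sleep-deprivation experiment
eta = partial_eta_squared(ss_between=54.0, ss_error=4.0)
print(f'partial eta squared = {round(eta, 3)} ({effect_label(eta)})')
```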

#### $\bullet$ TUKEY’S HSD TEST (REPEATED MEASURES)

$HSD=q\sqrt{\dfrac{MS_{error}}{n}}$

$\bullet$ To obtain $q$ from the table, use $df_{error}$ and $k$.
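
The HSD formula can be sketched as follows. To keep the example free of third-party imports, $q$ is passed in as a number; the value 5.033 is the one this notebook computes for $k = 3$ and $df_{error} = 4$ at the .05 level. The function name `tukey_hsd` and the condition labels are hypothetical.

```python
import math
from itertools import combinations


def tukey_hsd(q, ms_error, n_subjects):
    """HSD = q * sqrt(MS_error / n), with n the number of subjects."""
    return q * math.sqrt(ms_error / n_subjects)


# Sleep-deprivation experiment: condition means, MS_error = 1.0, n = 3 subjects
means = {"0 h": 2, "24 h": 5, "48 h": 8}
hsd = tukey_hsd(q=5.033, ms_error=1.0, n_subjects=3)
print(f'HSD = {round(hsd, 3)}')

# A pair of means differs significantly when its absolute difference reaches HSD.
significant = {
    (a, b): abs(means[a] - means[b]) >= hsd
    for a, b in combinations(means, 2)
}
for pair, is_significant in significant.items():
    print(pair, is_significant)
```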

#### $\bullet$ STANDARDIZED EFFECT SIZE, COHEN’S d (ADAPTED FOR REPEATED-MEASURES ANOVA)

$d=\dfrac{\overline{X}_1-\overline{X}_2}{\sqrt{s^2_p}}=\dfrac{\overline{X}_1-\overline{X}_2}{\sqrt{MS_{error}}}$
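
A minimal sketch of this adaptation of Cohen's $d$, applied to every pair of condition means rather than to a single pair. The function name `cohens_d` and the condition labels are hypothetical; the means and $MS_{error}$ are those of the sleep-deprivation experiment.

```python
import math
from itertools import combinations


def cohens_d(mean_1, mean_2, ms_error):
    """Standardized mean difference using sqrt(MS_error) as the standard deviation."""
    return (mean_1 - mean_2) / math.sqrt(ms_error)


means = {"0 h": 2, "24 h": 5, "48 h": 8}
ms_error = 1.0

# Report the absolute effect size for each pair of conditions.
d_values = {
    (a, b): abs(cohens_d(means[a], means[b], ms_error))
    for a, b in combinations(means, 2)
}
for pair, d in d_values.items():
    print(pair, round(d, 3))
```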

In [1]:
# Regroup scores by subject (transpose the list of lists) with nested for loops
import statistics
b = [[0,4,2],[3,6,6],[6,8,10]]
final = []
for index, item in enumerate(b):
    # print(index)
    temp = []
    final.append(temp)
    for n in range(len(b)):
        # print(f'n = {n}')
        # print(b[n][index])
        temp.append(b[n][index])
    print(f'temp = {temp}')
print(f'final = {final}')

mean = [statistics.mean(column) for column in final]
print(f'mean = {mean}')

# Another way, using a while loop and a for loop
silence = [6,11,5,7,4,10]
white_noise = [4,3,4,6,6,7]
rock = [2,0,1,2,4,5]
the_data = [silence,white_noise,rock]
# list = [[0,4,2],[3,6,6],[6,8,10]]
final = []
n = 0
while n < len(the_data[0]):
    # print(f'n = {n}')
    temp = []
    final.append(temp)
    for i in range(len(the_data)):
        # print(f'i = {i}')
        # print(list[i][n])
        temp.append(the_data[i][n])
    # print(temp)
    n+=1
    # print('+++++')
print(final)

temp = [0, 3, 6]
temp = [4, 6, 8]
temp = [2, 6, 10]
final = [[0, 3, 6], [4, 6, 8], [2, 6, 10]]
mean = [3, 6, 6]
[[6, 4, 2], [11, 3, 0], [5, 4, 1], [7, 6, 2], [4, 6, 4], [10, 7, 5]]
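
The same regrouping can be written more compactly with the built-in `zip`, which pairs up the i-th score from each condition. This is only an alternative sketch of the loops above; the variable names `by_condition` and `by_subject` are hypothetical.

```python
import statistics

silence = [6, 11, 5, 7, 4, 10]
white_noise = [4, 3, 4, 6, 6, 7]
rock = [2, 0, 1, 2, 4, 5]
by_condition = [silence, white_noise, rock]

# zip(*by_condition) yields one tuple per subject: (silence, white noise, rock)
by_subject = [list(scores) for scores in zip(*by_condition)]
print(by_subject)

subject_means = [round(statistics.mean(scores), 3) for scores in by_subject]
print(subject_means)
```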


In [4]:
def SS_Terms_Repeated_ANOVA(the_data,alpha):
    import statistics, numpy
    from scipy.stats import f
    import statsmodels.stats.libqsturng as qsturng

    joined_data = numpy.concatenate(the_data)
    print(f'joined_data = {joined_data}')

    k = len(the_data)
    print(f'k = {k}')

    N = len(joined_data)
    print(f'N = {N}')

    n = len(the_data[0])
    print(f'n = {n}')

    df_total = N-1
    print(f'df_total = {df_total}')

    df_between = k-1
    print(f'df_between = {df_between}')

    df_within = N-k
    print(f'df_within = {df_within}')

    df_subject = n-1
    print(f'df_subject = {df_subject}')

    df_error = df_within - df_subject
    print(f'df_error = {df_error}')
    
    T = [sum(data) for data in the_data]
    print(f'T = {T}')

    G = sum(T)
    print(f'G = {G}')

    T_sq_over_n = [round(t**2/n,3) for t in T]
    print(f'T_sq_over_n = {T_sq_over_n}')

    G_sq_over_N = G**2/N
    print(f'G_sq_over_N = {G_sq_over_N}')

    SS_between = round(sum(T_sq_over_n) - G_sq_over_N,3)
    print(f'SS_between = {SS_between}')

    sum_X_sq = sum([num**2 for num in joined_data])
    print(f'sum_X_sq = {sum_X_sq}')

    SS_within = round(sum_X_sq - sum(T_sq_over_n),3)
    print(f'SS_within = {SS_within}')

    # Regroup the scores by subject: one inner list per subject, one score per condition.
    # zip avoids reusing n as a loop counter, so n still means "number of subjects" below.
    subjects = [list(scores) for scores in zip(*the_data)]
    print(f'subjects = {subjects}')

    subject_totals = [sum(scores) for scores in subjects]
    print(f'subject_totals = {subject_totals}')

    T_sq_subject = [round(num**2/k,3) for num in subject_totals]
    print(f'T_sq_subject = {T_sq_subject}')

    SS_subject = round(sum(T_sq_subject) - G_sq_over_N,3)
    print(f'SS_subject = {SS_subject}')

    SS_error = round(SS_within - SS_subject,3)
    print(f'SS_error = {SS_error}')

    SS_total = sum_X_sq - G_sq_over_N
    print(f'SS_total = {SS_total}')

    MS_between = SS_between/df_between
    print(f'MS_between = {MS_between}')

    MS_error = round(SS_error/df_error,3)
    print(f'MS_error = {MS_error}')

    F_ratio = round(MS_between/MS_error,3)
    print(f'F_ratio = {F_ratio}')

    F_critical_value = round(f.ppf(1 - alpha, df_between, df_error),3)
    print(f'F_critical_value = {F_critical_value}')

    if F_ratio >= F_critical_value:
        print(f'F_ratio = {F_ratio} is GREATER or EQUAL to F_critical_value = {F_critical_value}, therefore we will REJECT the Null Hypothesis')
    else:
        print(f'F_ratio = {F_ratio} is LESS than the F_critical_value = {F_critical_value}, therefore we will RETAIN the Null Hypothesis')

    eta_sq_p = round(SS_between/(SS_between + SS_error),3)
    print(f'proportion of explained variance = {eta_sq_p}')

    means = [round(statistics.mean(scores),3) for scores in the_data]
    print(f'means = {means}')

    q = round(qsturng.qsturng(1 - alpha, k, df_error),3)
    print(f'q = {q}')

    HSD = round(q*((MS_error/n)**0.5),3)
    print(f'HSD = {HSD}')
    
    # Cohen's d for the largest mean versus each smaller mean (means in descending order)
    means.sort(reverse=True)
    Cohens_ds = [round((means[0]-means[i])/MS_error**0.5,3) for i in range(1,len(means))]
    print(f"Cohen's d's = {Cohens_ds}")

X0 = [0,4,2]
X24 = [3,6,6]
X48 = [6,8,10]
a = [X0,X24,X48]
SS_Terms_Repeated_ANOVA(a,0.05)
print('===========')
# Using SLEEP-DEPRIVATION EXPERIMENT: REPEATED MEASURES (means)
MS_error = 1.0
d48_0 = 6/(MS_error**0.5)
d48_24 = 3/(MS_error**0.5)
print(f'd48_0 = {d48_0}\nd48_24 = {d48_24}')

joined_data = [ 0  4  2  3  6  6  6  8 10]
k = 3
N = 9
n = 3
df_total = 8
df_between = 2
df_within = 6
df_subject = 2
df_error = 4
T = [6, 15, 24]
G = 45
T_sq_over_n = [12.0, 75.0, 192.0]
G_sq_over_N = 225.0
SS_between = 54.0
sum_X_sq = 301
SS_within = 22.0
subjects = [[0, 3, 6], [4, 6, 8], [2, 6, 10]]
subject_totals = [9, 18, 18]
T_sq_subject = [27.0, 108.0, 108.0]
SS_subject = 18.0
SS_error = 4.0
SS_total = 76.0
MS_between = 27.0
MS_error = 1.0
F_ratio = 27.0
F_critical_value = 6.944
F_ratio = 27.0 is GREATER or EQUAL to F_critical_value = 6.944, therefore we will REJECT the Null Hypothesis
proportion of explained variance = 0.931
means = [2, 5, 8]
q = 5.033
HSD = 2.906
Cohen's d's = [3.0, 6.0]
d48_0 = 6.0
d48_24 = 3.0


#### $\bullet$ REPORTS IN THE LITERATURE (From sleep deprivation experiment)

Mean aggression scores of 2, 5, and 8 were obtained when the same subjects were exposed to 0, 24, and 48 hours of sleep deprivation, respectively. There is evidence that, on average, aggression scores increase with hours of sleep deprivation [F(2, 4) = 27, MSE = 1.0, p < .01, $\eta^2_p$ = .93]. According to Tukey’s HSD test, all pairs of differences were significant (HSD = 2.91, p < .05 with 3 ≤ d ≤ 6).

ALL POSSIBLE ABSOLUTE DIFFERENCES BETWEEN PAIRS OF MEANS

SLEEP-DEPRIVATION EXPERIMENT: REPEATED MEASURES
\begin{array}{cccc}
 & \overline{X}_0=2 & \overline{X}_{24}=5 & \overline{X}_{48}=8 \\
\hline
\overline{X}_0=2 & - & 3 & 6 \\
\overline{X}_{24}=5 &  & - & 3 \\
\overline{X}_{48}=8 &  &  & - \\
\end{array}

#### Progress Check *17.2

A school psychologist tests the effects of environmental noises on the reading comprehension scores of high school students who rotate, with the customary controls, through three different conditions: silence, white noise, and rock music. The reading comprehension scores for six subjects are as follows:

\begin{array}{ccccc}
\text{SUBJECT} & \text{SILENCE} & \text{WHITE NOISE} & \text{ROCK} & T_{subject} \\
\hline
\text{A} & 6 & 4 & 2 & 12 \\
\text{B} & 11 & 3 & 0 & 14 \\
\text{C} & 5 & 4 & 1 & 10 \\
\text{D} & 7 & 6 & 2 & 15 \\
\text{E} & 4 & 6 & 4 & 14 \\
\text{F} & 10 & 7 & 5 & 22 \\
\text{Total} & 43 & 30 & 14 & \\
\end{array}
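
One way to work through the $F$ test for these data is sketched below, using the computation formulas for the SS terms. The critical value 4.10 is an assumption taken from a standard $F$ table for 2 and 10 degrees of freedom at the .05 level; it is not computed here.

```python
conditions = {
    "silence": [6, 11, 5, 7, 4, 10],
    "white noise": [4, 3, 4, 6, 6, 7],
    "rock": [2, 0, 1, 2, 4, 5],
}
groups = list(conditions.values())

k = len(groups)        # number of conditions
n = len(groups[0])     # number of subjects
N = k * n
G = sum(sum(group) for group in groups)
correction = G ** 2 / N  # the G^2 / N term shared by the computation formulas

ss_between = sum(sum(group) ** 2 / n for group in groups) - correction
ss_subject = sum(sum(subject) ** 2 / k for subject in zip(*groups)) - correction
ss_total = sum(x ** 2 for group in groups for x in group) - correction
ss_error = ss_total - ss_between - ss_subject

df_between = k - 1
df_error = (N - k) - (n - 1)
ms_between = ss_between / df_between
ms_error = ss_error / df_error
f_ratio = ms_between / ms_error

F_CRITICAL = 4.10  # assumption: table value for df = (2, 10), alpha = .05
print(f'F({df_between}, {df_error}) = {round(f_ratio, 3)}, MS_error = {round(ms_error, 3)}')
print('reject H0' if f_ratio >= F_CRITICAL else 'retain H0')
```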

#### Progress Check *17.4 
(a) Since the null hypothesis was rejected in Question 17.2, use Tukey’s HSD test to identify which pairs of population means differ significantly at the .05 level, given that the means for silence, white noise, and rock equal 7.17, 5.00, and 2.33, respectively.

(b) Use Cohen’s d to estimate the effect size for any statistically significant pairs of observed means.

(c) Interpret the results. The partial eta-squared, $\eta^2_p$, equals .64 (computed proportion of explained variance = 0.636), a large effect. Mean reading comprehension is significantly higher when silence is compared with rock, with a standardized effect size, $d$, equivalent to almost two and one-half standard deviations.
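
Parts (a) and (b) can be sketched as follows. Two inputs are assumptions rather than values stated in the exercise: $q = 3.88$ is read from a standard studentized range table for $k = 3$ and $df_{error} = 10$ at the .05 level, and $MS_{error} \approx 4.033$ follows from the Progress Check 17.2 scores ($SS_{error} \approx 40.33$ on 10 degrees of freedom).

```python
import math
from itertools import combinations

means = {"silence": 7.17, "white noise": 5.00, "rock": 2.33}
n_subjects = 6
ms_error = 4.033   # assumption: derived from the Progress Check 17.2 scores
q = 3.88           # assumption: table value for k = 3, df_error = 10, alpha = .05

hsd = q * math.sqrt(ms_error / n_subjects)
print(f'HSD = {round(hsd, 2)}')

significant = []
for a, b in combinations(means, 2):
    difference = abs(means[a] - means[b])
    if difference >= hsd:
        # Cohen's d is only reported for pairs that pass Tukey's HSD test.
        d = difference / math.sqrt(ms_error)
        significant.append((a, b))
        print(f'{a} vs {b}: difference = {round(difference, 2)}, d = {round(d, 2)}')
```

Under these assumptions only the silence versus rock comparison reaches the HSD, which agrees with the interpretation given in part (c).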