# Chapter 4 Analysis of Variance
     

In [1]:
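# Load the Spock trial data and treat Judge as a factor
Spock <- read.csv(file = "../Data/SpockTrial.csv", header = TRUE, sep = ",")
Spock$Judge <- as.factor(Spock$Judge)
# One-way ANOVA of perc.women on Judge
anova.fit <- aov(perc.women ~ Judge, data = Spock)
ns <- as.numeric(table(Spock$Judge))                         # group sizes n_i
mse <- sum(anova.fit$residuals^2) / anova.fit$df.residual    # mean squared error (MSE)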


### 4.1.10 Testing equal variance 

Estimate the group variances $\sigma_1^2,\sigma_2^2,\ldots, \sigma_r^2$ separately by the sample variances: for $i=1,\ldots, r$,
\[
s_i^2 = \sum_{j=1}^{n_i} \frac{\big(Y_{ij}-\bar{Y}_{i\cdot}\big)^2}{n_i-1}.
\]
We want to test the null hypothesis $H_0: \sigma_1=\cdots =\sigma_r$ against the alternative $H_a:$ not all $\sigma_i$ are equal.


In [5]:
vars <- tapply(Spock$perc.women, Spock$Judge, var)   # sample variances s_i^2, one per judge
alpha <- 0.05                                        # significance level
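
As a sanity check on the definition of $s_i^2$, the sketch below computes the group variances both with `var()` and directly from the formula, and also checks that the MSE of the one-way ANOVA is the weighted average of the $s_i^2$ with weights $(n_i-1)/(n_T-r)$. It uses simulated data rather than the Spock data; the names `toy`, `ns.toy`, `vars.toy` and the group sizes are hypothetical and chosen only for illustration.

```r
# Simulated unbalanced one-way layout (hypothetical sizes, not the Spock data)
set.seed(10)
ns.toy <- c(4, 5, 6)
toy <- data.frame(
  y = rnorm(sum(ns.toy), mean = 20, sd = 5),
  g = factor(rep(c("A", "B", "C"), times = ns.toy))
)

# s_i^2 via var() and via the defining formula
vars.toy    <- tapply(toy$y, toy$g, var)
vars.manual <- tapply(toy$y, toy$g,
                      function(y) sum((y - mean(y))^2) / (length(y) - 1))

# MSE from the fitted ANOVA and as a pooled (weighted) average of the s_i^2
fit.toy    <- aov(y ~ g, data = toy)
mse.toy    <- sum(residuals(fit.toy)^2) / df.residual(fit.toy)
mse.pooled <- sum((ns.toy - 1) * vars.toy) / (sum(ns.toy) - length(ns.toy))

all.equal(as.numeric(vars.toy), as.numeric(vars.manual))
all.equal(mse.toy, mse.pooled)
```

The second identity is what connects the pooled estimate `mse` to the separate estimates `vars` in the tests that follow.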

**Hartley test.** The test statistic is 
\[
H=\frac{\max(s_1^2,\ldots, s_r^2)}{\min(s_1^2,\ldots, s_r^2)}.
\]
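When all $n_i$ are equal to a common value $n$ (balanced design), reject $H_0$ if $H>H(1-\alpha;r,n-1)$, where $H(1-\alpha;r,n-1)$ denotes the $(1-\alpha)$ quantile of the distribution of $H$ under $H_0$ with $r$ groups and $n-1$ degrees of freedom per group.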

In [25]:
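(H.stat <- max(vars) / min(vars))   # Hartley statistic: largest / smallest s_i^2

# The Hartley critical value assumes a balanced design. For an unbalanced
# ANOVA, compare critical values under several choices of df:
library(SuppDists)
qmaxFratio(1 - alpha, df = ns[2] - 1, k = length(ns))   # df from group 2
qmaxFratio(1 - alpha, df = ns[1] - 1, k = length(ns))   # df from group 1
qmaxFratio(1 - alpha, df = sum(ns) / length(ns) - 1, k = length(ns))   # average df

qmaxFratio(1 - alpha, df = floor(sum(ns) / length(ns) - 1), k = length(ns))     # 7.25 -> 7
qmaxFratio(1 - alpha, df = ceiling(sum(ns) / length(ns) - 1), k = length(ns))   # 7.25 -> 8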
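
To make the meaning of the Hartley critical value concrete, here is a minimal base-R sketch that approximates it by simulation for a balanced design. It relies on the fact that, under $H_0$ with normal errors, each $(n-1)s_i^2/\sigma^2$ is an independent $\chi^2_{n-1}$ variable, so $\sigma^2$ cancels in the ratio. The design ($r=4$ groups of $n=8$), the simulated data, and all variable names are assumptions made for illustration; this is a Monte Carlo stand-in for the tabulated quantile, not a replacement for it.

```r
set.seed(1)
r <- 4; n <- 8; alpha <- 0.05          # hypothetical balanced design

# Simulated data with equal variances (H0 is true by construction)
y <- rnorm(r * n, mean = 20, sd = 5)
g <- factor(rep(seq_len(r), each = n))
s2 <- tapply(y, g, var)
H.obs <- max(s2) / min(s2)             # observed Hartley statistic

# Null distribution of H: max/min of r independent chi-square(n - 1) draws
H.null <- replicate(20000, {
  v <- rchisq(r, df = n - 1)
  max(v) / min(v)
})
H.crit <- unname(quantile(H.null, 1 - alpha))   # simulated (1 - alpha) quantile

c(H = H.obs, critical = H.crit, reject = H.obs > H.crit)
```

Because the null distribution depends only on $r$ and $n-1$, the same simulated quantile can be reused for any balanced data set of that shape.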

**Bartlett test.** The test statistic is 
\[
K^2=(n_T-r) \log({\rm MSE})-\sum_{i=1}^r (n_i-1) \log (s_i^2).
\]
By Jensen's inequality, $K^2 \geq 0$. Under $H_0$, $K^2$ approximately follows a $\chi^2_{r-1}$ distribution, provided that the $n_i$ are not small. Reject $H_0$ if $K^2> \chi^2(1-\alpha;r-1)$. The Bartlett test is related to the likelihood ratio test.
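
The nonnegativity of $K^2$ can be sketched as follows. The MSE of the one-way ANOVA is a weighted average of the group variances,
\[
{\rm MSE}=\sum_{i=1}^r w_i s_i^2, \qquad w_i=\frac{n_i-1}{n_T-r}, \qquad \sum_{i=1}^r w_i=1.
\]
Since $\log$ is concave, Jensen's inequality gives
\[
\log({\rm MSE})=\log\Big(\sum_{i=1}^r w_i s_i^2\Big) \geq \sum_{i=1}^r w_i \log(s_i^2).
\]
Multiplying both sides by $n_T-r$ yields
\[
(n_T-r)\log({\rm MSE}) \geq \sum_{i=1}^r (n_i-1)\log(s_i^2),
\]
that is, $K^2\geq 0$, with equality if and only if all $s_i^2$ are equal.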


In [22]:
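# Bartlett statistic K^2 = (n_T - r) log(MSE) - sum_i (n_i - 1) log(s_i^2)
(K.stat <- (sum(ns) - length(ns)) * log(mse) - sum((ns - 1) * log(vars)))
qchisq(1 - alpha, df = length(ns) - 1)   # critical value chi^2(1 - alpha; r - 1)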
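
For comparison, base R provides `bartlett.test()`, which reports the same quantity divided by a correction factor $C = 1 + \frac{1}{3(r-1)}\big(\sum_{i}\frac{1}{n_i-1} - \frac{1}{n_T-r}\big)$. The sketch below checks this relationship on simulated data; the group sizes, the data, and the variable names are hypothetical, and the formula for $C$ is stated here as a description of what `bartlett.test()` computes rather than something taken from the text above.

```r
set.seed(2)
ns.sim <- c(5, 6, 9, 13)                        # hypothetical group sizes
g <- factor(rep(seq_along(ns.sim), times = ns.sim))
y <- rnorm(sum(ns.sim), mean = 25, sd = 6)      # equal variances by construction

s2   <- tapply(y, g, var)                       # s_i^2
df.i <- ns.sim - 1                              # n_i - 1
df.E <- sum(df.i)                               # n_T - r
mse.sim <- sum(df.i * s2) / df.E                # pooled variance = MSE

# Uncorrected statistic, as in the formula above
K2 <- df.E * log(mse.sim) - sum(df.i * log(s2))

# Correction factor used by bartlett.test()
C <- 1 + (sum(1 / df.i) - 1 / df.E) / (3 * (length(ns.sim) - 1))

bt <- bartlett.test(y, g)
c(K2 = K2, K2.corrected = K2 / C, bartlett.test = unname(bt$statistic))
```

Since $C > 1$, the corrected statistic is slightly smaller than $K^2$, so the uncorrected version rejects a little more often at the same critical value.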

**Levene test.** 
1. Create new data $d_{ij}=|Y_{ij}-\bar{Y}_{i\cdot}|$.
2. Treat $\{d_{ij}\}$ as the response variable.
3. Calculate the one-way ANOVA $F$-statistic $F^*$ for $H_0: \mathbb{E}[d_{1\cdot}]=\mathbb{E}[d_{2\cdot}] = \cdots =\mathbb{E}[d_{r\cdot}]$.

Reject $H_0$ if $F^*>F(1-\alpha; r-1, n_T-r)$. 


In [24]:
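# d_ij = |Y_ij - Ybar_i.| are the absolute residuals of the one-way ANOVA fit
Spock$res.abs <- abs(anova.fit$residuals)
# One-way ANOVA with d_ij as the response
summary(aov(res.abs ~ Judge, data = Spock))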

            Df Sum Sq Mean Sq F value Pr(>F)
Judge        3   5.64    1.88   0.173  0.914
Residuals   29 314.70   10.85               
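
The three Levene steps can also be carried out from scratch, without a previously fitted model. The following is a minimal sketch on simulated data (hypothetical group sizes and variable names, not the Spock data), which extracts $F^*$, the critical value $F(1-\alpha; r-1, n_T-r)$, and the p-value explicitly.

```r
set.seed(3)
alpha  <- 0.05
ns.sim <- c(6, 8, 10, 12)                       # hypothetical group sizes
dat <- data.frame(
  y = rnorm(sum(ns.sim), mean = 25, sd = 6),    # equal variances by construction
  g = factor(rep(seq_along(ns.sim), times = ns.sim))
)

# Step 1: absolute deviations from the group means (ave() uses mean by default)
dat$d <- abs(dat$y - ave(dat$y, dat$g))

# Steps 2-3: one-way ANOVA with d as the response
lev.tab <- anova(lm(d ~ g, data = dat))
F.star  <- lev.tab[["F value"]][1]
p.val   <- lev.tab[["Pr(>F)"]][1]

# Critical value F(1 - alpha; r - 1, n_T - r)
F.crit <- qf(1 - alpha, df1 = lev.tab$Df[1], df2 = lev.tab$Df[2])

c(F.star = F.star, F.crit = F.crit, p.value = p.val, reject = F.star > F.crit)
```

Rejecting when `F.star > F.crit` is equivalent to rejecting when `p.val < alpha`.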