# Chapter 4 Analysis of Variance
     

In [2]:
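# Load the Spock trial data and treat Judge as a categorical factor
Spock <- read.csv(file="../Data/SpockTrial.csv", header=TRUE, sep=",")
Spock$Judge <- as.factor(Spock$Judge)
# One-way ANOVA of the percentage of women on the judge
anova.fit <- aov(perc.women ~ Judge, data=Spock)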

### 4.1.11 Remedies for departures from model assumptions 

**Weighted least squares**

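Idea: if the errors have group-specific variances, $\epsilon_{ij} \sim N(0,\sigma_i^2)$, then $\sqrt{w_i}\epsilon_{ij} \sim N(0,1)$ for $w_i = 1/\sigma_i^2$, so the rescaled errors have equal variance. 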

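Under the reduced model (all factor level means equal to a common mean $\mu$), the weighted least squares estimator minimizes $\sum_{i=1}^r \sum_{j=1}^{n_i} w_i(Y_{ij}-\mu)^2$, which gives 
$\tilde{\mu}=\sum n_i w_i \bar{Y}_{i\cdot}/\sum n_i w_i$. Under the full model, the weighted least squares estimates are still the sample means $\bar{Y}_{i\cdot}$. This gives a new set of sums of squares, ${\rm SSTR}_w=\sum_{i=1}^r n_i w_i (\bar{Y}_{i\cdot}-\tilde{\mu})^2$ and ${\rm SSE}_w=\sum_{i=1}^r \sum_{j=1}^{n_i} w_i (Y_{ij}-\bar{Y}_{i\cdot})^2$. 

With the true weights, we still have $F^* = \frac{ {\rm SSTR}_w/(r-1) }{ {\rm SSE}_w/(n_T-r) } \sim F(r-1,n_T-r)$ under the null hypothesis. In practice, we plug in $w_i = 1/s_i^2$, where $s_i^2$ is the sample variance of group $i$. The null distribution then holds only approximately, and the approximation is reasonable when the $n_i$ are large.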
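The weighted fit can be sketched in `R` with the `weights` argument of `lm()`. The block below is a minimal illustration on simulated data, not the Spock data: the group sizes, standard deviations, and all object names (`dat`, `fit_reduced`, `fit_full`, `mu_tilde`) are assumptions made for the example.

```r
# Simulated one-way layout with unequal variances (illustrative values only)
set.seed(1)
n_i     <- c(10, 12, 8)                 # assumed group sizes
sigma_i <- c(1, 3, 6)                   # assumed group standard deviations
group   <- factor(rep(seq_along(n_i), times = n_i))
y       <- rnorm(sum(n_i), mean = 20, sd = rep(sigma_i, times = n_i))
dat     <- data.frame(y = y, group = group)

# Plug-in weights w_i = 1 / s_i^2, one weight per observation
s2_i  <- tapply(dat$y, dat$group, var)
dat$w <- as.numeric(1 / s2_i[as.integer(dat$group)])

# Reduced model (common mean) and full model (one mean per group)
fit_reduced <- lm(y ~ 1,     data = dat, weights = w)
fit_full    <- lm(y ~ group, data = dat, weights = w)

# Closed-form weighted estimate of the common mean
y_bar_i  <- tapply(dat$y, dat$group, mean)
w_i      <- 1 / s2_i
mu_tilde <- sum(n_i * w_i * y_bar_i) / sum(n_i * w_i)

# F test comparing the two weighted fits
anova(fit_reduced, fit_full)
```

With these plug-in weights, the weighted error sum of squares of the full model equals $n_T-r$ by construction, so the weighted mean squared error is 1; `deviance(fit_full)` returns this weighted sum of squares.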






**Rank test** 
\[
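F^*=\frac{ {\rm MSTR}(R) }{ {\rm MSE} (R)} = \frac{\sum_{i=1}^r n_i (\bar{R}_{i\cdot}-\bar{R}_{\cdot\cdot})^2 /(r-1)}{\sum_{i=1}^r\sum_{j=1}^{n_i} (R_{ij}-\bar{R}_{i\cdot})^2/(n_T-r) } \sim F(r-1, n_T-r),
\]
where $R_{ij}$ is the rank of $Y_{ij}$ among all $n_T$ observations, $\bar{R}_{i\cdot}$ is the mean rank in group $i$, and $\bar{R}_{\cdot\cdot}=(n_T+1)/2$ is the overall mean rank. The $F$ distribution is an approximation under the null hypothesis that works when the sample sizes are large. 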

In [6]:
# Rank the responses among all observations (ties receive average ranks)
Spock$rank.perc <- rank(Spock$perc.women)
anova.fit.np <- aov(rank.perc ~ Judge, data=Spock)
summary(anova.fit.np)

            Df Sum Sq Mean Sq F value   Pr(>F)    
Judge        3   1846   615.4    15.6 3.15e-06 ***
Residuals   29   1144    39.5                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

**Kruskal-Wallis test**
\[
X^2_{KW}=(n_T-1)\frac{ \sum_{i=1}^r n_i (\bar{R}_{i\cdot}-\bar{R}_{\cdot \cdot})^2 }{ \sum_{i=1}^r\sum_{j=1}^{n_i} (R_{ij}-\bar{R}_{\cdot\cdot})^2} \sim \chi^2_{r-1},
\]
approximately under the null hypothesis if $n_T$ is large. 

In [8]:
kruskal.test(perc.women~Judge,data=Spock)


	Kruskal-Wallis rank sum test

data:  perc.women by Judge
Kruskal-Wallis chi-squared = 19.757, df = 3, p-value = 0.0001906
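The Kruskal-Wallis statistic is $(n_T-1)\,{\rm SSTR}(R)/{\rm SSTO}(R)$, so it can be recovered from the rank ANOVA table. Using the rounded sums of squares reported above, $32 \times 1846/(1846+1144) \approx 19.76$, which agrees with the `kruskal.test()` output. The sketch below repeats this check on simulated data; the group sizes, means, and object names are assumptions made for the example.

```r
# Simulated one-way layout (illustrative values only, no ties)
set.seed(3)
n_i   <- c(8, 10, 7, 9)                          # assumed group sizes
group <- factor(rep(seq_along(n_i), times = n_i))
y     <- rnorm(sum(n_i), mean = c(0, 0.5, 1, 2)[group])
dat   <- data.frame(y = y, group = group)

# Ranks among all n_T observations and their group means
dat$R   <- rank(dat$y)
n_T     <- nrow(dat)
r       <- nlevels(dat$group)
R_bar_i <- tapply(dat$R, dat$group, mean)
R_bar   <- mean(dat$R)                           # equals (n_T + 1) / 2

# Kruskal-Wallis statistic from the rank sums of squares
sstr_R    <- sum(n_i * (R_bar_i - R_bar)^2)
ssto_R    <- sum((dat$R - R_bar)^2)
kw_manual <- (n_T - 1) * sstr_R / ssto_R

# Built-in test for comparison
kw_builtin <- unname(kruskal.test(y ~ group, data = dat)$statistic)

# The rank F statistic is a monotone function of the Kruskal-Wallis statistic
f_rank    <- summary(aov(R ~ group, data = dat))[[1]][["F value"]][1]
f_from_kw <- (kw_manual / (r - 1)) / ((n_T - 1 - kw_manual) / (n_T - r))

c(kw_manual = kw_manual, kw_builtin = kw_builtin)
c(f_rank = f_rank, f_from_kw = f_from_kw)
```

Because the two statistics are monotone transformations of each other, the rank test and the Kruskal-Wallis test differ only in the reference distribution used for the p-value.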


**Box-Cox transformation**

\[
 Y(\lambda)=\frac{Y^{\lambda}-1}{\lambda}, 
\]
and $Y(0)\equiv \log(Y)$ for $\lambda=0$, which is the limit of the expression above as $\lambda \to 0$. The transformation requires $Y>0$. 

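To tune the parameter $\lambda$, we assume the transformed responses follow the normal ANOVA model and compute the profile likelihood $L(\lambda) \equiv  \max_{\mu,\sigma} L(\lambda;\mu,\sigma)$. Then $\lambda^* = \arg \max_{\lambda} L(\lambda)$. 
It can be shown that maximizing $L(\lambda)$ is equivalent to minimizing ${\rm SSE}[Y^*(\lambda)]$, the error sum of squares from fitting the ANOVA model to the standardized responses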
\[
Y^*_{ij}(\lambda)\equiv \begin{cases}
\frac{Y_{ij}^\lambda-1}{\lambda \dot{Y}^{\lambda -1}} & \lambda\neq 0\\
\dot{Y}\log(Y_{ij}) & \lambda =0
\end{cases},
\]
where $\dot{Y}=\left(\prod_{i=1}^r\prod_{j=1}^{n_i} Y_{ij}\right)^{1/n_T}$ is the geometric mean of all observations. 

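Choosing $\lambda$ to stabilize the variances across groups amounts to minimizing the Bartlett statistic, or another test statistic for equal variances, over $\lambda$. In `R`, the `boxcox()` function in the `MASS` package computes the profile log-likelihood over a grid of $\lambda$ values. 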
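The equivalence between maximizing the profile likelihood and minimizing the SSE of the standardized responses can be checked numerically. The sketch below uses simulated log-normal data, so a value of $\lambda$ near 0 is expected but not guaranteed; the data, the grid, and the helper `sse_lambda()` are assumptions made for the example, while `boxcox()` is the `MASS` function with plotting turned off.

```r
library(MASS)

# Simulated positive responses: log-normal within each of three groups
set.seed(2)
group <- factor(rep(1:3, each = 15))
y     <- exp(rnorm(45, mean = c(1, 1.5, 2)[group], sd = 0.4))
dat   <- data.frame(y = y, group = group)

lambda_grid <- seq(-2, 2, by = 0.1)
gm <- exp(mean(log(dat$y)))        # geometric mean of all observations

# Hypothetical helper: SSE of the one-way ANOVA fit to the standardized
# Box-Cox responses Y*(lambda)
sse_lambda <- function(lambda) {
  ystar <- if (abs(lambda) < 1e-8) {
    gm * log(dat$y)
  } else {
    (dat$y^lambda - 1) / (lambda * gm^(lambda - 1))
  }
  sum(resid(lm(ystar ~ dat$group))^2)
}
sse        <- sapply(lambda_grid, sse_lambda)
lambda_sse <- lambda_grid[which.min(sse)]

# Profile log-likelihood from MASS::boxcox on the same grid
bc        <- boxcox(y ~ group, data = dat, lambda = lambda_grid, plotit = FALSE)
lambda_ml <- bc$x[which.max(bc$y)]

c(lambda_sse = lambda_sse, lambda_ml = lambda_ml)
```

Both searches are restricted to the same grid, so they should select the same grid point; a finer grid, or the default plot from `boxcox()` with its confidence interval, gives a more precise picture of the plausible range of $\lambda$.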
