# Chapter 4 Analysis of Variance
     

### 4.1.4 Alternative forms of ANOVA

**Factor effect form**

Let $\mu=\sum_{i=1}^r w_i \mu_i$ and $\tau_i = \mu_i -\mu$, where the weights satisfy $\sum_{i=1}^r w_i=1$. Then $\sum_{i=1}^r w_i \tau_i=\sum_{i=1}^r w_i \mu_i-\mu\sum_{i=1}^r w_i=0$. We can rewrite the ANOVA model as
\[
Y_{ij}=\mu+\tau_i+\epsilon_{ij}, 
\]
where $\{\epsilon_{ij}\}$ are i.i.d. $N(0,\sigma^2)$. 

The least squares estimators are $\hat{\mu}=\sum_{i=1}^r w_i \bar{Y}_{i\cdot}$ and $\hat{\tau}_i = \bar{Y}_{i\cdot} -\hat{\mu}$. 
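
As a minimal sketch of these estimators, the following `R` snippet computes $\hat{\mu}$ and $\hat{\tau}_i$ under two choices of weights. The simulated data, the group sizes, and the variable names (`y`, `g`, `w_eq`, `w_n`) are illustrative assumptions and are not taken from the Spock trial data.

```r
# Simulated unbalanced one-way layout (hypothetical data, for illustration only)
set.seed(1)
n_i <- c(5, 8, 12)                               # group sizes n_1, ..., n_r
g   <- factor(rep(seq_along(n_i), times = n_i))  # group label of each observation
y   <- rnorm(sum(n_i), mean = c(10, 12, 15)[g], sd = 1)

ybar <- sapply(split(y, g), mean)                # group means Ybar_i.

# Choice 1: equal weights w_i = 1/r
w_eq   <- rep(1 / length(n_i), length(n_i))
mu_eq  <- sum(w_eq * ybar)
tau_eq <- ybar - mu_eq

# Choice 2: weights proportional to sample size, w_i = n_i / n_T
w_n   <- n_i / sum(n_i)
mu_n  <- sum(w_n * ybar)                         # coincides with the overall mean of y
tau_n <- ybar - mu_n

# The weighted constraint sum_i w_i * tau_i = 0 holds (up to rounding) in both cases
sum(w_eq * tau_eq)
sum(w_n * tau_n)
```

With unequal group sizes the two choices generally give different values of $\hat{\mu}$, and hence different $\hat{\tau}_i$, even though the group means are the same.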

Notes in `R`:
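1. By default, `R` uses treatment contrasts, which set the $\tau_i$ of the first factor level to zero; the reported intercept is then the mean of that level.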
2. `R` assumes $w_i=n_i/n_T$  for one-way ANOVA, but equal weights for higher-order ANOVAs. 
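
The default coding described in the first note can be checked on a small example. The sketch below uses simulated data (hypothetical, not the Spock data) with made-up level names `A`, `B`, `C`, and compares the coefficients reported by `aov` with the group sample means.

```r
# Hypothetical three-group data set, for illustration only
set.seed(2)
g <- factor(rep(c("A", "B", "C"), times = c(4, 6, 5)))
y <- rnorm(length(g), mean = c(A = 20, B = 25, C = 30)[as.character(g)], sd = 2)

fit  <- aov(y ~ g)
ybar <- sapply(split(y, g), mean)   # group sample means

coef(fit)                           # (Intercept), gB, gC
ybar[["A"]]                         # equals the intercept
ybar[c("B", "C")] - ybar[["A"]]     # equal gB and gC
```

Under this coding the coefficient for the first level `A` does not appear in the output because it is fixed at zero.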
    


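```r
# Load the Spock trial data and treat Judge as a factor
Spock <- read.csv(file = "../Data/SpockTrial.csv", header = TRUE, sep = ",")
Spock$Judge <- as.factor(Spock$Judge)
# Fit the one-way ANOVA of perc.women on Judge
anova.fit <- aov(perc.women ~ Judge, data = Spock)
# Estimated coefficients under R's default coding
coef(anova.fit)
# Documentation for aov()
?aov
```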

**Regression form**
  
A cell-means model can be written in the usual linear regression form in several equivalent ways. 
Taking $r=4$ for concreteness, the regression equation is, for $j=1,\ldots, n_i,\ \quad i=1,\ldots, 4,$ 
 \[
  Y_{ij}=\mu +\tau_2 X_{2,ij}+\tau_3 X_{3,ij} + \tau_4 X_{4,ij} +\epsilon_{ij}, 
 \]
where $\{\epsilon_{ij}\}$ are i.i.d. $N(0,\sigma^2)$. There are several choices for coding $\{X_{2,ij}, X_{3,ij}, X_{4,ij}\}$. In each, $l=2,3,4$ indexes the regressor and $i$ is the group of observation $Y_{ij}$. 
1. _Dummy variables_, e.g., $X_{l,ij}=1$ when $i=l$ and 0 otherwise. Group 1 is the reference level, so $\mu=\mu_1$ and $\tau_l=\mu_l-\mu_1$. 
2. $X_{l,ij}=1$ when $i=l$, $X_{l,ij}=-1$ when $i=1$, and $X_{l,ij}=0$ otherwise. This is equivalent to the factor effect form with equal weights $w_i=1/4$. 
3. $X_{l,ij}=1$ when $i=l$, $X_{l,ij}=-n_l/n_1$ when $i=1$, and $X_{l,ij}=0$ otherwise. This is equivalent to the factor effect form with weights $w_i=n_i/n_T$. 

All three regression models are equivalent because each is a reparameterization of the same ANOVA model, so they give the same fitted values $\hat{Y}_{ij}=\bar{Y}_{i\cdot}$. The interpretation of the model parameters (i.e., $\mu$ and the $\tau$s) differs across codings, however.
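
The equivalence can be illustrated numerically. The sketch below is one possible construction under stated assumptions: $r=4$ groups, regressors $X_2, X_3, X_4$, and group 1 as the level that receives the $0$, $-1$, or $-n_l/n_1$ entries. The data are simulated, and the helper `make_X` is a hypothetical function written for this example.

```r
# Simulated unbalanced data with r = 4 groups (hypothetical, for illustration only)
set.seed(3)
n_i <- c(6, 9, 4, 11)                      # group sizes
i   <- rep(1:4, times = n_i)               # group index of each observation
y   <- rnorm(sum(n_i), mean = c(5, 7, 6, 9)[i], sd = 1)

# Build X_2, X_3, X_4; ref_value[l - 1] is the value of X_l for observations in group 1
make_X <- function(ref_value) {
  X <- sapply(2:4, function(l) ifelse(i == l, 1, ifelse(i == 1, ref_value[l - 1], 0)))
  colnames(X) <- paste0("X", 2:4)
  X
}

X_dummy <- make_X(c(0, 0, 0))              # coding 1: dummy variables
X_equal <- make_X(c(-1, -1, -1))           # coding 2: equal weights
X_prop  <- make_X(-n_i[2:4] / n_i[1])      # coding 3: weights n_i / n_T

fit1 <- lm(y ~ X_dummy)
fit2 <- lm(y ~ X_equal)
fit3 <- lm(y ~ X_prop)

# Identical fitted values across the three codings
all.equal(fitted(fit1), fitted(fit2))
all.equal(fitted(fit1), fitted(fit3))

# The intercepts differ: reference-group mean, unweighted mean of the
# group means, and overall mean of y, respectively
ybar <- sapply(split(y, i), mean)
c(coef(fit1)[[1]], coef(fit2)[[1]], coef(fit3)[[1]])
c(ybar[[1]], mean(ybar), mean(y))
```

The last two lines print matching triples, which shows how the meaning of the intercept changes with the coding while the fit itself does not.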