# Chapter 4 Analysis of Variance
     

### 4.1.2 One-way ANOVA

In the Spock trial data, let $r=4$ denote the number of judges and let $Y_{ij}$ be the percentage of women in the $j$th panel for the $i$th judge. Let Judge 1 be the judge in the Spock trial. We propose the following model for $j=1,\ldots, n_i$ and $i=1,\ldots, r$:
\[
Y_{ij}=\mu_i + \epsilon_{ij}, 
\]
where $\{\epsilon_{ij}\}$ are i.i.d. $N(0,\sigma^2)$. In the Spock trial data, $r=4$, $n_2=6$, and $n_1=n_3=n_4=9$. This model is a one-way **an**alysis **o**f **va**riance model in its cell means form.  We will discuss other forms later in this chapter.  

Because $n_1, n_2, n_3$ and $n_4$ are not all equal, this is an **unbalanced** ANOVA model. If $n_1=n_2=\cdots=n_r$, then the ANOVA model is **balanced**.
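
As a minimal sketch of what the cell means model describes, the code below simulates one data set with the panel counts given above and checks whether the design is balanced. The cell means `mu` and the standard deviation `sigma` are made-up values chosen only for illustration; they are not estimates from the Spock data, and the names `sim`, `group.sizes` and `is.balanced` are hypothetical.

```r
# Hypothetical illustration: simulate from Y_ij = mu_i + eps_ij.
# mu and sigma are made-up values, NOT estimates from the Spock data.
set.seed(1)
n     <- c(9, 6, 9, 9)       # panel counts n_1, ..., n_4 from the text
mu    <- c(15, 30, 30, 30)   # hypothetical cell means
sigma <- 5                   # hypothetical common standard deviation

# One factor level per judge, repeated n_i times
judge <- factor(rep(seq_along(n), times = n))

# Each observation is its judge's mean plus i.i.d. normal noise
y   <- rnorm(sum(n), mean = mu[as.integer(judge)], sd = sigma)
sim <- data.frame(judge = judge, y = y)

# The design is balanced only if every group has the same size
group.sizes <- table(sim$judge)
is.balanced <- length(unique(as.vector(group.sizes))) == 1
print(group.sizes)
print(is.balanced)
```

With these panel counts the group sizes differ, so `is.balanced` is `FALSE`.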

In this model, $\mu_i$ represents the mean percentage of women in the panels for the $i$th judge, and $\sigma^2$ represents the variance in the percentages across panels. By using a single $\sigma^2$ for all judges, we assume that the panels of all four judges have the same amount of variability.

The question of interest in the Spock trial data can now be translated to whether $\{\mu_i\}_{i=1}^r$ are the same, where $\{\mu_i\}$ and $\sigma^2$ are unknown. 
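
Written as a pair of hypotheses in the notation above (the labels $H_0$ and $H_a$ are introduced here for illustration only), the question reads
\[
H_0: \mu_1=\mu_2=\cdots=\mu_r \quad \text{versus} \quad H_a: \text{not all } \mu_i \text{ are equal}.
\]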


The estimators for $\mu_i, i=1,2,\ldots, r$ are simply the within-group sample means, i.e., for $i=1,\ldots, r$,
\[
\hat{\mu}_i = \bar{Y}_{i\cdot} = \frac{1}{n_i}\sum_{j=1}^{n_i} Y_{ij}.
\]
We have two observations on our estimators $\hat{\mu}_i$, $i=1,\ldots, r$.

1. The estimator $\hat{\mu}_i$ is also the maximum likelihood estimator for $\mu_i$. 
2. $\hat{\mu}_i$ is the best linear unbiased estimator if $\{Y_{ij}\}$ are mutually uncorrelated with common variance $\sigma^2$, even if they are not normally distributed.
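
The following sketch checks numerically that the within-group sample means coincide with the least squares estimates in the cell means form. It uses a small made-up data set (`toy`, not the Spock data), and the formula `y ~ 0 + group` drops the intercept so that `lm()` reports one coefficient per group.

```r
# Made-up toy data (NOT the Spock data): three groups of unequal size
toy <- data.frame(
  group = factor(c("a", "a", "a", "b", "b", "c", "c", "c", "c")),
  y     = c(12, 15, 18, 30, 34, 25, 27, 29, 31)
)

# Within-group sample means, i.e. the estimators of mu_i
cell.means <- tapply(toy$y, toy$group, mean)

# Least squares fit in cell means form: no intercept, one mean per group
fit <- lm(y ~ 0 + group, data = toy)
ols <- coef(fit)

print(cell.means)
print(ols)

# The two sets of estimates agree up to floating-point error
print(all.equal(unname(ols), as.vector(cell.means)))
```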

We can call the `aov()` function to fit a one-way ANOVA model in `R`. 

In [15]:
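# Read the Spock trial data and treat Judge as a categorical factor
Spock <- read.csv(file = "../Data/SpockTrial.csv", header = TRUE, sep = ",")
Spock$Judge <- as.factor(Spock$Judge)

# Fit the one-way ANOVA model and print the ANOVA table
anova.fit <- aov(perc.women ~ Judge, data = Spock)
summary(anova.fit)

# List the components of the fitted object, then show the coefficients.
# With R's default treatment contrasts, the intercept is the fitted mean of
# the first Judge level and the other coefficients are differences from it.
names(anova.fit)
coef(anova.fit)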

            Df Sum Sq Mean Sq F value   Pr(>F)    
Judge        3 1591.3   530.4   17.61 1.06e-06 ***
Residuals   29  873.5    30.1                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
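
To see how the columns of this table relate to one another, the sketch below recomputes the mean squares, the F value and the p-value from the printed sums of squares and degrees of freedom. The inputs are the rounded values copied from the table, so the results should match the printed ones only up to rounding; the variable names are hypothetical.

```r
# Rounded values copied from the ANOVA table above
ss.judge <- 1591.3   # Sum Sq for Judge
df.judge <- 3        # r - 1 = 4 - 1
ss.resid <- 873.5    # Sum Sq for Residuals
df.resid <- 29       # total sample size minus r = 33 - 4

# Mean square = sum of squares divided by its degrees of freedom
ms.judge <- ss.judge / df.judge
ms.resid <- ss.resid / df.resid

# F value = ratio of the two mean squares
f.stat <- ms.judge / ms.resid

# p-value = upper tail probability of the F(3, 29) distribution
p.value <- pf(f.stat, df.judge, df.resid, lower.tail = FALSE)

print(c(ms.judge = ms.judge, ms.resid = ms.resid, f.stat = f.stat))
print(p.value)
```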