# Definitions

When we collect measurements of a variable from 3 or more populations, we need to do multi-sample hypothesis testing: an analysis of variance (ANOVA).

Without getting into too many details, it is improper to employ multiple *t*-tests in such a situation. The issue is [multiple comparisons](https://colab.research.google.com/drive/1AmfvDhhfviRQFvONiUVUba_6hof8RDp6?usp=sharing). Let's imagine you want to test the following hypothesis: $H_0:\mu_1=\mu_2=\mu_3$ at $\alpha=0.05$ using three pairwise *t*-tests (1 vs. 2, 1 vs. 3, and 2 vs. 3). For each two-sample *t*-test performed, there is a 95% probability that we will correctly conclude not to reject $H_0$ when the two population means are equal. For the set of 3 tests, treating them as independent, the probability of correctly declining to reject all of them is only $0.95\times0.95\times0.95=0.86$. This means the probability of incorrectly rejecting at least one of the $H_0$s is $1-0.86=0.14$ (or 14%).
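The arithmetic above can be checked with a short sketch. The function name `familywise_error_rate` is hypothetical, and the calculation assumes the tests are independent, which is only an approximation for pairwise *t*-tests that share data:

```python
from math import comb


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false rejection across n_tests tests.

    Assumes the tests are independent, each run at significance level alpha.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


k = 3                    # number of groups
n_pairs = comb(k, 2)     # number of pairwise t-tests needed for k groups
fwer = familywise_error_rate(0.05, n_pairs)
print(f"{n_pairs} pairwise tests, familywise error rate = {fwer:.2f}")
```

The same function shows how quickly the problem grows: with more groups, `comb(k, 2)` increases and so does the chance of at least one false rejection.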

# One-factor ANOVA and the *F* statistic

One-factor (or "one-way") Analysis of Variance is used to decide whether two or more groups (or "levels", typically denoted as *k*) of a factor have the same mean value. It is a generalization of the [two-sample *t*-test](https://colab.research.google.com/drive/1M7xjaMwJUEyULPHfXc3tWG6-WVjCl-uQ?usp=sharing), which is limited to *k*=2.

Specifically, we test the Null hypothesis: 

$\quad H_0:\mu_1=\mu_2=\dots=\mu_k$, where *k* is the number of experimental groups.

For example, you present 4 different visual stimuli and record a neuron's firing rate in response to multiple presentations of each stimulus. Here the "factor" is visual stimulus and the measured variable is firing rate. Alternatively, you could take a group of students and randomly assign them to different groups to test the effect of 4 different neuroenhancers on a cognitive test. See [here](https://stats.stackexchange.com/questions/6350/anova-assumption-normality-normal-distribution-of-residuals) for a discussion of the assumptions of an ANOVA.

We can summarize this four-group example as follows:

&nbsp; | **Group 1** | **Group 2** | **Group 3** | **Group 4**
-- | -- | -- | -- | -- 
**Sample Size** | $n_1$ | $n_2$ | $n_3$ | $n_4$
**Sample Mean** | $\bar{X_1}$ | $\bar{X_2}$ | $\bar{X_3}$ | $\bar{X_4}$
**Sample Standard Deviation** | $s_1$ | $s_2$ | $s_3$ | $s_4$

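We assume that the underlying populations have equal standard deviations ($\sigma_1=\sigma_2=\sigma_3=\sigma_4$); that is, the groups might differ in terms of their mean value, but not their variability. The sample values $s_1$, $s_2$, $s_3$, and $s_4$ are each estimates of this common standard deviation.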

The test statistic is the *F* statistic, which is the ratio:

$\quad F=\frac{\text{explained variance}}{\text{unexplained variance}}=\frac{\text{between-group variability}}{\text{within-group variability}}=\frac{MSB}{MSE}$

This ratio is often summarized in a standard ANOVA table, as follows:

**Source of Variation** | **DF** | **SS** | **MS** | **F**
-- | -- | -- | -- | -- 
**Between Groups** | *k*-1 | SSB | MSB | $\frac{MSB}{MSE}$
**Error (or Residual)** | *N*-*k* | SSE | MSE | &nbsp;
**Total** | *N*-1 | SSB+SSE | &nbsp; | &nbsp;

*N* is the total sample size, equal to $n_1+n_2+n_3+n_4$ in the four-group example above.

MSB is the between-group mean square, which quantifies how much variability is accounted for by the groups and is defined as:

$\quad MSB=\frac{SSB}{df_b}$, where SSB is the between-group sum of squares defined as:

$\quad SSB=\sum^k_{j=1}{n_j(\bar{X_j}-\bar{X})^2}$,

where $\bar{X_j}$ are the per-group means, $\bar{X}$ is the overall mean, and $df_b=k-1$ is the between-group degrees of freedom.

MSE is the mean square error, which quantifies how much variability is left over and is defined as:

$\quad MSE=\frac{SSE}{df_e}$, where SSE is the error sum of squares, summed across groups, defined as:

$\quad SSE=\sum^k_{j=1}{\sum^{n_j}_{i=1}{(X_{ij}-\bar{X_j})^2}}$,

where $X_{ij}$ is the *i*th observation in group *j* and $df_e=N-k$ is the error degrees of freedom.
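As a minimal sketch of these definitions, the following computes SSB, SSE, MSB, MSE, and *F* using only the Python standard library, with $df_b=k-1$ and $df_e=N-k$ as in the ANOVA table. The function name `one_way_anova` and the data values are made up for illustration:

```python
from statistics import mean


def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of groups (lists of numbers)."""
    k = len(groups)                                 # number of groups
    n_total = sum(len(g) for g in groups)           # N, the total sample size
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Between-group sum of squares: n_j * (group mean - overall mean)^2
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Error sum of squares: squared deviations of each value from its own group mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_b = k - 1
    df_e = n_total - k
    msb = ssb / df_b
    mse = sse / df_e
    return {"SSB": ssb, "SSE": sse, "df_b": df_b, "df_e": df_e,
            "MSB": msb, "MSE": mse, "F": msb / mse}


# Hypothetical data: 3 groups with 3 observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

For these made-up data, SSB is 26 and SSE is 6, so MSB = 13, MSE = 1, and *F* = 13 (up to floating-point rounding). Turning *F* into a *p*-value requires the *F* distribution with ($df_b$, $df_e$) degrees of freedom, which the standard library does not provide.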


# Multiple comparisons for 1-factor ANOVA

An ANOVA, if the null hypothesis is rejected, just tells you that at least one of the means differs from the others. Which ones are different? Are they all different? Or is just one different from the rest? To answer these questions, you need to do post-hoc multiple comparisons. Unfortunately, there is no single universally accepted method or test. Two very common ones are the Tukey test, which is also known as the "honestly significant difference" test, and the Scheffé test. Each test has somewhat different assumptions and is more or less robust to departures from these assumptions.

The Scheffé test allows you to test any pairwise null hypothesis of the form $H_0: \mu_B - \mu_A=0$. Also, it is a very easy statistic to compute:

$\quad S=\frac{|\bar{X_B}-\bar{X_A}|}{\sqrt{MSE\left(\frac{1}{n_A}+\frac{1}{n_B}\right)}}$

if $S > S_{crit}$, then reject the null hypothesis, where

$\quad S_{crit}=\sqrt{(k-1)\times F_{\alpha, k-1, N-k}}$, *k* is the number of groups, and *N* is the total sample size (the number of groups $\times$ the number of samples per group when the groups are equal in size).
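A hedged sketch of the Scheffé procedure follows. It assumes the standardized form of the statistic, $S=|\bar{X_B}-\bar{X_A}|/\sqrt{MSE(1/n_A+1/n_B)}$, compared against $S_{crit}$. The function name `scheffe_pairwise` and the data are hypothetical, and the critical value $F_{0.05,2,6}\approx5.14$ is read from a standard *F* table rather than computed (with SciPy installed, `scipy.stats.f.ppf(1 - alpha, k - 1, N - k)` would give the same quantity):

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def scheffe_pairwise(groups, f_crit):
    """Scheffe test for every pair of groups.

    f_crit is the critical value F_{alpha, k-1, N-k}, supplied by the caller.
    Returns (s_crit, {(a, b): (S, reject_H0)}) with 0-based group indices.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [mean(g) for g in groups]
    # MSE from the one-way ANOVA: within-group sum of squares over N - k
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = sse / (n_total - k)

    s_crit = sqrt((k - 1) * f_crit)
    results = {}
    for a, b in combinations(range(k), 2):
        se = sqrt(mse * (1 / len(groups[a]) + 1 / len(groups[b])))
        s = abs(means[b] - means[a]) / se
        results[(a, b)] = (s, s > s_crit)
    return s_crit, results


# Hypothetical data: 3 groups of 3, so k - 1 = 2 and N - k = 6
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
F_CRIT = 5.14  # assumed table value for alpha = 0.05, df = (2, 6)
s_crit, results = scheffe_pairwise(groups, F_CRIT)
print(f"S_crit = {s_crit:.3f}")
for (a, b), (s, reject) in results.items():
    print(f"group {a + 1} vs group {b + 1}: S = {s:.3f}, reject H0: {reject}")
```

With these made-up numbers, group 3 differs from both group 1 and group 2, while groups 1 and 2 are not distinguishable.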



# *N*-factor ANOVA

One might have several factors or variables of interest in an ANOVA. Let's go back to our idea about neuroenhancers. Imagine that you were interested not only in whether each neuroenhancer affected the results of a cognitive test, but also in whether the amount of sleep on the previous night affected the results. So, now there would be two factors: sleep ('a lot' versus 'little') and neuroenhancer (each one). The number of levels would be 2 ('a lot' versus 'little') for the sleep factor and the number of tested neuroenhancers for the neuroenhancer factor. These can be the same or can be different: you could have 2 sleep levels and 4 different neuroenhancer levels.

Now, we have 3 different null hypotheses to test: one for each factor (or main effect) and one for the interaction between the two factors. The first $H_0$ is that there is no effect of sleep on the mean result of the cognitive test. The second $H_0$ is that there is no effect of neuroenhancer on the mean result of the cognitive test. These two are the hypotheses for each main effect. The interaction $H_0$ is that there is no interaction of sleep and neuroenhancer on the mean result of the cognitive test; that is, the effect of the neuroenhancer does not depend on the amount of sleep.
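To make the three hypotheses concrete, here is a minimal sketch of a two-factor ANOVA that yields one *F* statistic per hypothesis. It assumes a balanced design (the same number of observations in every cell) and uses the usual partition of the total sum of squares into two main effects, an interaction, and error. The function name `two_way_anova`, the level labels, and the scores are all made up for illustration:

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-factor ANOVA.

    cells maps (a_level, b_level) -> list of observations.
    Returns the F statistic for factor A, factor B, and their interaction.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    assert len(cells) == len(a_levels) * len(b_levels), "every cell must be present"
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    grand = mean([x for v in cells.values() for x in v])
    cell_mean = {key: mean(v) for key, v in cells.items()}
    a_mean = {a: mean([x for (ai, _), v in cells.items() if ai == a for x in v])
              for a in a_levels}
    b_mean = {b: mean([x for (_, bi), v in cells.items() if bi == b for x in v])
              for b in b_levels}

    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    sse = sum((x - cell_mean[key]) ** 2 for key, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_e = len(a_levels) * len(b_levels) * (n - 1)
    mse = sse / df_e
    return {"F_A": (ss_a / df_a) / mse,
            "F_B": (ss_b / df_b) / mse,
            "F_AB": (ss_ab / df_ab) / mse}


# Hypothetical scores: factor A is sleep, factor B is neuroenhancer
scores = {
    ("little", "drug1"): [1, 3],
    ("little", "drug2"): [3, 5],
    ("lot", "drug1"): [5, 7],
    ("lot", "drug2"): [11, 13],
}
f_stats = two_way_anova(scores)
print(f"sleep: F = {f_stats['F_A']:.1f}")
print(f"neuroenhancer: F = {f_stats['F_B']:.1f}")
print(f"interaction: F = {f_stats['F_AB']:.1f}")
```

Each *F* is compared against the *F* distribution with its own numerator degrees of freedom and the shared error degrees of freedom. Unbalanced designs need a more careful treatment than this sketch provides.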

Another important two-factor ANOVA is a repeated-measures design, in which the second factor is the participant. Here, we would ask each participant to try each neuroenhancer and then take the cognitive test. But we have only one $H_0$: the mean test score is the same across all 3 drugs.

# Additional Resources


Using ANOVAs in [Matlab](https://www.mathworks.com/help/stats/analysis-of-variance-anova-1.html), [R](https://www.scribbr.com/statistics/anova-in-r/), and [Python](https://www.marsja.se/four-ways-to-conduct-one-way-anovas-using-python/).

# Credits

Copyright 2021 by Joshua I. Gold, University of Pennsylvania