## The Null and Alternative Hypotheses

The null hypothesis for a one-way ANOVA test is that the population means of all groups are equal. Symbolically, this can be represented as:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \ldots = \mu_k$$

where $\mu_i$ represents the population mean of the $i^{th}$ group.

The alternative hypothesis for a one-way ANOVA is that the population means are not all equal. It can be written in two equivalent ways:

- At least one population mean is different from the others:
$$H_A: \text{not all } \mu_i \text{ are equal}$$
- At least one pair of groups has different means, although the test does not identify which pair:
$$H_A: \text{there exists } i \neq j \text{ such that } \mu_i \neq \mu_j$$

A research question may instead call for a more specific alternative, such as a pattern or trend in the population means:

$$H_A: \mu_1 < \mu_2 < \mu_3 < \ldots < \mu_k$$

The omnibus F-test of a one-way ANOVA does not test such an ordering directly; a directional alternative of this kind is usually addressed with planned contrasts or a dedicated trend test. The appropriate alternative hypothesis depends on the research question and the specific context of the study.

## Assumptions

One-way ANOVA makes several assumptions about the data and the sampling process, including:

1. Normality: The dependent variable within each group should be normally distributed.
2. Homogeneity of variance: The variance of the dependent variable within each group should be approximately equal across all groups.
3. Independence: The observations must be independent of each other, both within each group and between groups.
4. Random sampling: The samples should be obtained through random sampling.
5. Interval or ratio scale: The dependent variable should be measured on an interval or ratio scale.

Violations of these assumptions can lead to incorrect conclusions or loss of power in the statistical test. For example, if the assumption of normality is violated, a non-parametric test such as the Kruskal-Wallis test may be more appropriate. If the assumption of homogeneity of variance is violated, Welch's ANOVA or another robust ANOVA may be used instead.

It is important to verify the assumptions of one-way ANOVA before conducting the analysis. Techniques such as visual inspection of histograms, normal probability plots, and box plots can help to assess normality and homogeneity of variance. Residual plots can also be used to check for violations of the independence assumption.


## Using multiple t-tests instead of a one-way ANOVA

Using multiple t-tests instead of a one-way ANOVA to compare three or more independent groups is not recommended because it can lead to an increased risk of type I errors (false positives).

Suppose you want to compare the means of three independent groups. If you conduct three pairwise t-tests (Group 1 vs. Group 2, Group 1 vs. Group 3, Group 2 vs. Group 3), each test carries its own chance of a type I error (rejecting the null hypothesis when it is actually true), so the probability of at least one false positive across the whole family of tests increases. For example, if you use an alpha level of .05 for each t-test and treat the three tests as independent, the probability of making at least one type I error is $1 - (1 - .05)^3 \approx 0.143$, or about a 14.3% chance. Because the pairwise tests share data, they are not strictly independent, so this figure is an approximation, but the error rate is still inflated above .05.
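The arithmetic above generalizes to any number of comparisons. The following is a minimal Python sketch, assuming the tests are independent as in the calculation above; the function name `familywise_error_rate` is illustrative and not taken from any library.

```python
from math import comb


def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one type I error across num_tests tests.

    Assumes every test uses the same level alpha and that the tests
    are independent, which is only an approximation for pairwise
    t-tests that share groups.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    alpha = 0.05
    for num_groups in (3, 4, 5, 10):
        # Number of distinct pairwise comparisons among the groups.
        num_pairs = comb(num_groups, 2)
        rate = familywise_error_rate(alpha, num_pairs)
        print(f"{num_groups} groups, {num_pairs} t-tests: {rate:.3f}")
```

With three groups this reproduces the 0.143 figure, and the rate climbs quickly as the number of groups (and therefore pairwise tests) grows.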

In contrast, a one-way ANOVA tests all groups simultaneously with a single F-test, which holds the overall type I error rate at the specified alpha level. The F-statistic is the ratio of the variance between groups to the variance within groups: a large ratio indicates that the group means differ by more than the variation within groups alone would explain. By using a one-way ANOVA instead of multiple t-tests, you reduce the risk of mistakenly concluding that there are significant differences between groups when there are not.

To see how the F-statistic is calculated, we first need to know what the pooled variance is.

## Pooled Variance

The pooled variance is a combined estimate of the within-group variance in a one-way ANOVA. It is a weighted average of the group sample variances, where each group is weighted by its degrees of freedom, $n_i - 1$:

$$s_p^2 = \frac{\sum_{i=1}^k (n_i - 1)s_i^2}{\sum_{i=1}^k (n_i - 1)}$$

where $s_i^2$ is the sample variance of the $i^{th}$ group, $n_i$ is the sample size of the $i^{th}$ group, and $k$ is the total number of groups.
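The formula can be checked by hand on a small data set. The sketch below uses only the Python standard library; the function name `pooled_variance` and the sample values are made up for illustration.

```python
from statistics import variance


def pooled_variance(groups):
    """Weighted average of the group sample variances.

    Each group is weighted by its degrees of freedom, n_i - 1.
    Every group needs at least two observations, because
    statistics.variance computes the sample variance (n - 1 divisor).
    """
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data: three groups of unequal size.
    groups = [
        [4.0, 5.0, 6.0],            # s^2 = 1.0,  n - 1 = 2
        [7.0, 9.0, 11.0, 13.0],     # s^2 = 20/3, n - 1 = 3
        [1.0, 2.0, 3.0, 4.0, 5.0],  # s^2 = 2.5,  n - 1 = 4
    ]
    # (2*1.0 + 3*(20/3) + 4*2.5) / (2 + 3 + 4) = 32 / 9
    print(round(pooled_variance(groups), 4))
```

Larger groups contribute more to the result, which is why the third group's variance of 2.5 pulls the pooled value further than the first group's variance of 1.0.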

The pooled variance is an important value in the calculation of the F-statistic in a one-way ANOVA, which is used to test for significant differences between the means of the groups. It is used to estimate the common population variance when assuming that the population variances are equal among all groups.

Intuitively, the pooled variance estimates how much observations vary around their own group mean, averaged over all groups. It takes into account both the sample size and the sample variance of each group, and provides a single number that summarizes the within-group variability. It does not measure how far the group means are from one another; that between-group variation is captured separately in the numerator of the F-statistic.
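To connect the pooled variance back to the F-statistic mentioned earlier, the sketch below computes the ratio of the between-group mean square to the within-group mean square, where the within-group mean square is the pooled variance. It assumes the standard one-way ANOVA definitions; the function name `f_statistic` and the data are hypothetical, and the code stops at the statistic itself without computing a p-value.

```python
from statistics import mean, variance


def f_statistic(groups):
    """One-way ANOVA F-statistic: between-group MS / within-group MS."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group mean square: spread of the group means around the
    # grand mean, weighted by group size, on k - 1 degrees of freedom.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ms_between = ss_between / (k - 1)

    # Within-group mean square: the pooled variance, on
    # n_total - k degrees of freedom.
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    ms_within = ss_within / (n_total - k)

    return ms_between / ms_within


if __name__ == "__main__":
    # Hypothetical data: three groups of unequal size.
    groups = [
        [4.0, 5.0, 6.0],
        [7.0, 9.0, 11.0, 13.0],
        [1.0, 2.0, 3.0, 4.0, 5.0],
    ]
    print(f_statistic(groups))
```

The resulting value would then be compared against an F distribution with $k - 1$ and $N - k$ degrees of freedom, where $N$ is the total number of observations.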
