# Test 29: The Link–Wallace test for multiple comparison of K population means (equal sample sizes)

## Objective

- You have samples from $K$ populations
- You want to test whether the $K$ population means are all equal

## Assumptions

- The $K$ populations are normally distributed
- The $K$ populations have equal variances
- The $K$ populations have equal sample sizes $n$

## Method

- You have $K$ samples, each of size $n$: $x_1, x_2, \ldots, x_K$

- Compute the sample means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K$

- Compute the range of values in each sample, and call this $w(x_i)$
    - i.e. find the difference between the maximum and minimum of each sample

- Compute the range of the $K$ sample means, and call this $w(\bar{x})$

- The test statistic is
$$\begin{aligned}
    K_L &= \frac{n \cdot w(\bar{x})}{\sum_{i=1}^{K} w(x_i)}
\end{aligned}$$

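- Under the null hypothesis of equal means, this test statistic follows the Link–Wallace distribution, whose critical values are given in Table 10
    - Reject the null hypothesis if $K_L$ exceeds the critical value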
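
The steps above can be sketched as a small function. This is only an illustrative sketch: the name `link_wallace_statistic` and the example data are made up for this note, and the input is assumed to be $K$ samples of equal size $n$.

```python
import numpy as np

def link_wallace_statistic(samples):
    """Return K_L = n * w(x_bar) / sum_i w(x_i) for K equal-sized samples."""
    data = np.asarray(samples, dtype=float)  # expected shape: (K, n)
    if data.ndim != 2:
        raise ValueError("expected K samples of equal size n")
    n = data.shape[1]
    # w(x_i): range (max - min) within each sample
    sample_ranges = data.max(axis=1) - data.min(axis=1)
    # w(x_bar): range of the K sample means
    sample_means = data.mean(axis=1)
    means_range = sample_means.max() - sample_means.min()
    return n * means_range / sample_ranges.sum()

# Hypothetical data: K = 3 samples of size n = 3
example = [[1, 2, 3], [2, 4, 6], [3, 3, 3]]
k_l = link_wallace_statistic(example)
print(k_l)  # 3 * (4 - 2) / (2 + 4 + 0) = 1.0
```

The resulting value would then be compared with the tabulated critical value for the given $K$, $n$ and $\alpha$.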

## Proof that the test statistic follows the Link–Wallace table

- Using $\alpha=0.05$, we will check by simulation that the critical value for $K=4$ and $n=50$ is 1.45

In [1]:
import numpy as np
import scipy
import seaborn as sns
import matplotlib.pyplot as plt

In [9]:
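K = 4
# All K populations share the same mean and standard deviation, so the null hypothesis holds
MEANS = [5] * K
SIGMA = [2] * K
SAMPLE_SIZE = [50] * K

def get_test_statistic():
    """Simulate K samples under the null hypothesis and return K_L."""
    samples = [np.random.normal(mean, sigma, size) for mean, sigma, size in zip(MEANS, SIGMA, SAMPLE_SIZE)]
    sample_means = [np.mean(x) for x in samples]
    # w(x_i): range of each sample
    sample_ranges = [np.max(x) - np.min(x) for x in samples]
    # w(x_bar): range of the sample means
    sample_means_range = max(sample_means) - min(sample_means)

    # K_L = n * w(x_bar) / sum of w(x_i)
    test_statistic = (
        (SAMPLE_SIZE[0] * sample_means_range) /
        np.sum(sample_ranges)
    )
    return test_statistic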

In [15]:
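# Simulate the null distribution of K_L; its 95th percentile estimates the alpha = 0.05 critical value
test_statistic_distribution = [get_test_statistic() for _ in range(3_000)]
np.percentile(test_statistic_distribution, q=95)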

1.4498520053636055
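
Because the simulation above is unseeded, the exact figure will change from run to run. A reproducible, vectorized variant might look like the sketch below. The function name `simulate_critical_value`, the seed, and the number of replications are assumptions made for illustration. It draws standard normal samples, which should be equivalent here because $K_L$ is a ratio of ranges and so is unchanged by a common shift or rescaling of all samples.

```python
import numpy as np

def simulate_critical_value(k=4, n=50, alpha=0.05, reps=20_000, seed=0):
    """Estimate the upper-alpha critical value of K_L under equal means."""
    rng = np.random.default_rng(seed)
    # One row per replication: k samples of size n from a standard normal
    data = rng.normal(size=(reps, k, n))
    # w(x_i) for every sample in every replication, shape (reps, k)
    sample_ranges = data.max(axis=2) - data.min(axis=2)
    # w(x_bar) for every replication, shape (reps,)
    sample_means = data.mean(axis=2)
    means_range = sample_means.max(axis=1) - sample_means.min(axis=1)
    stats = n * means_range / sample_ranges.sum(axis=1)
    return np.quantile(stats, 1 - alpha)

critical_value = simulate_critical_value()
print(round(float(critical_value), 2))
```

With more replications the estimate should vary less between seeds, at the cost of more memory for the `(reps, k, n)` array.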