# Test 30: Dunnett’s test for comparing K treatments with a control

## Objective

- You have $K$ treatments and 1 control
- Which of the treatment means differ significantly from the control?

## Assumptions

- The $K+1$ samples all have the same size $n$
- The samples are all independent
- The samples are all from normally distributed populations
- The samples all have equal variances

## Method

- You have $K$ treatment samples and 1 control sample

- Let $S_0$ be the control sample, and $S_1 ... S_K$ be the treatment samples

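- Compute the within-group sum of squares $S_i^2$ for each sample, where $x_{il}$ is the $l$-th observation in sample $i$ and $\bar{x}_i$ is the mean of sample $i$
$$\begin{aligned}
    S_i^2 &= \sum_{l=1}^{n} (x_{il} - \bar{x}_i)^2
\end{aligned}$$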

- Then, compute the pooled within-group variance across the $K+1$ groups as
$$\begin{aligned}
    S_W^2 &= \frac{S_0^2 + S_1^2 + \dots + S_K^2}{n(K+1) - (K+1)}
\end{aligned}$$

- Compute the standard deviation of the difference between a treatment mean and the control mean, $S(\bar{d})$
$$\begin{aligned}
    S(\bar{d}) &= \sqrt{\frac{2 S_W^2}{n}}
\end{aligned}$$

- Next, compute the quotient $D_j$ for each treatment sample. This is your test statistic
$$\begin{aligned}
    D_j &= \frac{\bar{x}_j - \bar{x}_0}{S(\bar{d})} \quad j = 1, \dots, K
\end{aligned}$$

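- Compare each $D_j$ with the critical value of the Dunnett distribution (Table 11) for the chosen $\alpha$, $K$ and $\nu$. For a one-sided test, treatment $j$ differs significantly from the control if $D_j$ exceeds the critical value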

- The degrees of freedom are $\nu = n(K+1) - (K+1) = (K+1)(n-1)$, the denominator of $S_W^2$
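
The steps above can be sketched as a small helper. This is a minimal illustration under the stated assumptions (equal sample sizes, pooled variance), not a reference implementation; the function name `dunnett_statistics` and the toy data are made up for this example.

```python
import numpy as np

def dunnett_statistics(control, treatments):
    """Return the quotients D_j (one per treatment) and the degrees of freedom nu.

    Hypothetical helper: assumes all K+1 samples have the same size n.
    """
    control = np.asarray(control, dtype=float)
    treatments = [np.asarray(t, dtype=float) for t in treatments]
    n = control.size
    if any(t.size != n for t in treatments):
        raise ValueError("all samples must have the same size n")

    groups = [control] + treatments
    # Within-group sum of squares S_i^2 for each of the K+1 samples
    sum_squares = [np.sum((g - g.mean()) ** 2) for g in groups]
    # Degrees of freedom: n(K+1) - (K+1) = (K+1)(n-1)
    nu = len(groups) * (n - 1)
    # Pooled within-group variance S_W^2
    s_w2 = np.sum(sum_squares) / nu
    # Standard deviation of a treatment-minus-control difference of means
    s_d = np.sqrt(2.0 * s_w2 / n)
    # Quotients D_j, j = 1..K
    d = np.array([(t.mean() - control.mean()) / s_d for t in treatments])
    return d, nu

# Toy data, chosen only so the arithmetic is easy to follow by hand
control = [1.0, 2.0, 3.0]
treatments = [[2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
d, nu = dunnett_statistics(control, treatments)
print(d, nu)
```

Each $D_j$ returned here would then be compared with the tabulated critical value for the same $K$ and $\nu$.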

## Checking the Dunnett critical value by simulation for given $\nu$ and $K$

- Let's assume $K = 4$
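- Let's assume $n = 25$, so $\nu = (K+1)(n-1) = 120$
- Then, the one-sided critical value at $\alpha=0.05$ is 2.18
- Let's check this by simulation: when all groups share the same mean, the proportion of experiments in which at least one $D_j$ exceeds 2.18 should be close to 0.05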

In [4]:
import numpy as np
import scipy 
import matplotlib.pyplot as plt
import seaborn as sns

In [413]:
K = 4  # number of treatment groups; index 0 is the control
MEAN = [5] * (K+1)  # identical means, so the null hypothesis holds
SIGMA = [2] * (K+1)  # identical standard deviations (equal-variance assumption)
SAMPLE_SIZE = [25] * (K+1)  # common sample size n

CRITICAL_VALUE_5PCT = 2.18
# CRITICAL_VALUE_5PCT = 2.23
DEGREES_OF_FREEDOM = np.sum(SAMPLE_SIZE) - (K+1)  # (K+1)(n-1) = 120

def get_test_statistic():
    """Simulate one experiment; return True if any D_j exceeds the critical value."""
    samples = [np.random.normal(x,y,z) for x,y,z in zip(MEAN, SIGMA, SAMPLE_SIZE)]
    sample_means = [np.mean(x) for x in samples]
    sum_squares_within_group = [np.sum((x - np.mean(x))**2) for x in samples]
    
    # Pooled within-group variance S_W^2
    # mean_squares_within_group = np.sum(sum_squares_within_group) / ((SAMPLE_SIZE[0]-1) * (K-1))
    mean_squares_within_group = np.sum(sum_squares_within_group) / DEGREES_OF_FREEDOM
    
    # Standard deviation of a treatment-minus-control difference of means
    sd_bar = np.sqrt((2 * mean_squares_within_group)/SAMPLE_SIZE[0])
    # Quotients D_j for the K treatments (samples[0] is the control)
    quotients = [(x - sample_means[0])/sd_bar for x in sample_means[1:]]
    any_value_exceeds_critical_value = any(x > CRITICAL_VALUE_5PCT for x in quotients)
    # any_value_exceeds_critical_value = len([x for x in quotients if x > CRITICAL_VALUE_5PCT])

    return any_value_exceeds_critical_value

In [421]:
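# Proportion of simulated experiments with at least one D_j above the critical value;
# under the null hypothesis this should be close to 0.05
test_statistic_distribution = [get_test_statistic() for _ in range(3_000)]
np.mean(test_statistic_distribution)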

# test_statistic_distribution = [get_test_statistic() for _ in range(3_000)]
# np.sum(test_statistic_distribution) / (3_000*K)
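
A complementary check is to estimate the critical value itself rather than the rejection rate. The sketch below is a vectorized variant of the simulation above, assuming the same setup ($K = 4$, $n = 25$, common mean 5 and standard deviation 2): it draws many experiments under the null hypothesis and takes the 95th percentile of $\max_j D_j$. The function name `simulate_max_statistic`, the seed, and the number of simulations are arbitrary choices for illustration.

```python
import numpy as np

def simulate_max_statistic(K=4, n=25, n_sims=20_000, seed=0):
    """Return max_j D_j for n_sims simulated experiments under the null hypothesis."""
    rng = np.random.default_rng(seed)
    # Shape (n_sims, K+1, n); group 0 is the control, all groups share mean and variance
    x = rng.normal(loc=5.0, scale=2.0, size=(n_sims, K + 1, n))
    means = x.mean(axis=2)
    # Total within-group sum of squares per experiment
    ss_within = ((x - means[:, :, None]) ** 2).sum(axis=(1, 2))
    nu = (K + 1) * (n - 1)
    # Standard deviation of a treatment-minus-control difference of means
    s_d = np.sqrt(2.0 * (ss_within / nu) / n)
    # D_j for each treatment, then the largest one per experiment
    d = (means[:, 1:] - means[:, [0]]) / s_d[:, None]
    return d.max(axis=1)

max_d = simulate_max_statistic()
# One-sided 5% critical value: the 95th percentile of the largest quotient
estimated_critical_value = np.quantile(max_d, 0.95)
# Share of experiments with at least one D_j above the tabulated value
rejection_rate = np.mean(max_d > 2.18)
print(estimated_critical_value, rejection_rate)
```

If the tabulated value is appropriate for this $K$ and $\nu$, the estimated percentile should land near 2.18 and the rejection rate near 0.05, up to Monte Carlo error.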