## How can one account for multiple testing?

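**Bonferroni correction**: If there are $m$ tests, multiply each p-value by $m$ (capping the result at 1). Equivalently, compare each raw p-value with $5\%/m$ instead of $5\%$.

The Bonferroni correction ensures that $P(\text{any of the } m \text{ tests rejects in error}) \leq 5\%$. In other words, it bounds the probability of making at least one Type 1 error across all $m$ tests (the familywise error rate).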

The Bonferroni correction is often very restrictive. It guards against having even one false positive among the $m$ tests.

As a consequence, the adjusted p-values may no longer be significant even when a noticeable effect is present.
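As a minimal sketch of the correction, the snippet below adjusts a handful of p-values and checks them against the 5% level. The p-values and the helper name `bonferroni_adjust` are made up for illustration.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests m, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from m = 5 tests (illustrative only).
raw = [0.001, 0.012, 0.03, 0.04, 0.2]
adjusted = bonferroni_adjust(raw)

# A test is rejected if its adjusted p-value is at most 5%.
rejected = [p <= 0.05 for p in adjusted]
print(adjusted)
print(rejected)
```

Four of the five raw p-values are below 5%, but only the smallest one survives the correction, which shows how restrictive it can be.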

Alternatively, we can try to control the **False Discovery Proportion (FDP)**:

$$\text{FDP} = \frac{\text{number of false discoveries}}{\text{total number of discoveries}}$$

where a 'discovery' occurs when a test rejects the null hypothesis.

As an example, suppose we test 1,000 hypotheses.

In 900 cases the null hypothesis is true ("Nothing is going on"), and in 100 cases an alternative hypothesis is true ("There is an effect: something is going on").

<img src="imgs/img0.PNG">


Doing 1,000 tests results in Discoveries (Red boxes) and Non-discoveries (Gray boxes).

Type 1 Error: reject the null hypothesis even though it's true<br>
Type 2 Error: fail to reject the null even if the alternative is true

<img src="imgs/img1.PNG">

With 900 true null hypotheses, each tested at the 5% level, we expect to have:<br>
$900\times 5\% = 45$ false discoveries

In this example we made 80 true discoveries and 41 false discoveries, so the false discovery proportion is $\frac{41}{80+41} = \frac{41}{121} \approx 0.34$.
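The arithmetic of this example can be checked with a few lines of Python. The function name `false_discovery_proportion` is just an illustrative helper, and returning 0 when there are no discoveries is a common convention assumed here, not something stated above.

```python
def false_discovery_proportion(false_discoveries, true_discoveries):
    """FDP = false discoveries / total discoveries (0 if nothing was discovered)."""
    total = false_discoveries + true_discoveries
    return false_discoveries / total if total else 0.0


n_true_nulls = 900   # hypotheses where "nothing is going on"
level = 0.05         # each test is run at the 5% level

# Expected number of false discoveries among the true nulls.
expected_false = n_true_nulls * level

# Observed counts from the example: 41 false and 80 true discoveries.
fdp = false_discovery_proportion(41, 80)

print(round(expected_false), round(fdp, 2))
```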

## Accounting for multiple testing with FDR

**False discovery rate (FDR)**: The expected proportion of discoveries that are false, i.e. the expected value of the FDP.

Benjamini-Hochberg procedure to control the FDR at level $\sigma$ (say, $\sigma = 5\%$):

1. Sort the p-values: $p_{(1)} \leq \ldots \leq p_{(m)}$

2. Find the largest $k$ such that $p_{(k)} \leq \frac{k}{m} \sigma$

3. Declare discoveries for the $k$ tests with the smallest p-values, $p_{(1)}, \ldots, p_{(k)}$. If no such $k$ exists, declare no discoveries.
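The three steps above can be sketched in plain Python as follows. This is an illustrative implementation, not a reference one: the function name `benjamini_hochberg`, the parameter name `level` (the FDR level from the procedure), and the example p-values are all assumptions.

```python
def benjamini_hochberg(p_values, level=0.05):
    """Return one boolean per test: True where a discovery is declared."""
    m = len(p_values)

    # Step 1: test indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step 2: largest 1-based rank k with p_(k) <= (k / m) * level.
    # k stays 0 if no p-value meets its threshold.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * level:
            k = rank

    # Step 3: the k tests with the smallest p-values are discoveries.
    discoveries = [False] * m
    for idx in order[:k]:
        discoveries[idx] = True
    return discoveries


# Hypothetical p-values, given in arbitrary (unsorted) order.
raw = [0.2, 0.001, 0.035, 0.012, 0.038]
print(benjamini_hochberg(raw))
```

In this made-up example the third-smallest p-value (0.035) exceeds its own threshold of 0.03, but it is still declared a discovery because the fourth-smallest (0.038) falls below its threshold of 0.04. Taking the largest qualifying $k$ is what makes this happen.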

**Using a validation set**: Split the data into a _model-building_ set and a _validation_ set before the analysis.

You may use data snooping on the model-building set to find something interesting.

Then test this hypothesis on the validation set.

This approach requires strict discipline: You are not allowed to look at the validation set during the exploratory step!
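One way to enforce this discipline is to fix the split once, with a recorded random seed, before any exploration starts. The sketch below only produces the two disjoint sets of row indices. The function name `split_indices`, the 30% validation fraction, and the seed are illustrative assumptions.

```python
import random


def split_indices(n_rows, validation_fraction=0.5, seed=0):
    """Randomly partition row indices into model-building and validation sets."""
    indices = list(range(n_rows))
    # A dedicated, seeded generator makes the split reproducible.
    random.Random(seed).shuffle(indices)
    n_validation = int(round(n_rows * validation_fraction))
    validation = sorted(indices[:n_validation])
    building = sorted(indices[n_validation:])
    return building, validation


# Hypothetical data set with 10 rows, 30% held out for validation.
building, validation = split_indices(10, validation_fraction=0.3, seed=42)
print(len(building), len(validation))
```

All exploratory work would then use only the rows in `building`. The rows in `validation` are touched once, to test the hypothesis that the exploration produced.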