Hypothesis Testing Framework
$$
\begin{array}{|c|c|c|c|}
\hline & \begin{array}{c}
\text { Declared non- } \\
\text { significant }
\end{array} & \begin{array}{c}
\text { Declared } \\
\text { significant }
\end{array} & \text { Total } \\
\hline \begin{array}{c}
\text { True Null } \\
\text { Hypothesis }
\end{array} & \begin{array}{c}
\mathbf{U} \\
\text { Correct }
\end{array} & \begin{array}{c}
\mathbf{V} \\
\text { Type I Error }
\end{array} & m_{0} \\
\hline \begin{array}{c}
\text { Non-true Null } \\
\text { Hypothesis }
\end{array} & \begin{array}{c}
\mathbf{T} \\
\text { Type II Error }
\end{array} & \begin{array}{c}
\mathbf{S} \\
\text { Correct }
\end{array} & m-m_{0} \\
\hline \text { Total } & m-\mathbf{R} & \mathbf{R} & m \\
\hline
\end{array}
$$
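
The cell counts in the table can be tallied directly once the truth of each null is known, which in practice is only the case in simulations. The sketch below uses made-up truth labels and decisions for eight tests (both lists are hypothetical) to show how $U$, $V$, $T$, $S$ and $R$ relate.

```python
# Hypothetical ground truth and test decisions for m = 8 hypotheses.
null_is_true = [True, True, True, True, True, False, False, False]
rejected = [False, False, False, False, True, False, True, True]

pairs = list(zip(null_is_true, rejected))
U = sum(1 for t, r in pairs if t and not r)      # true null, not rejected (correct)
V = sum(1 for t, r in pairs if t and r)          # true null, rejected (Type I error)
T = sum(1 for t, r in pairs if not t and not r)  # false null, not rejected (Type II error)
S = sum(1 for t, r in pairs if not t and r)      # false null, rejected (correct)

m = len(pairs)
m0 = U + V   # number of true nulls
R = V + S    # total number of rejections

print(U, V, T, S, R, m0, m)
```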

FWER (Familywise/Experimentwise Error Rate):
<br>
-Probability of making at least one Type I error among m comparisons: $Pr(V\ge 1)$.
<br>
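
To see why a correction is needed, consider the special case where the m tests are independent, every null is true, and each test is run at the same per-test level $\alpha$ (these are assumptions of this sketch, and `fwer_independent` is a hypothetical helper name). In that case $Pr(V\ge 1) = 1-(1-\alpha)^m$, which grows quickly with m:

```python
def fwer_independent(alpha, m):
    """FWER for m independent tests at per-test level alpha, all nulls true."""
    return 1.0 - (1.0 - alpha) ** m

# Print the uncorrected FWER for a few family sizes at alpha = 0.05.
for m in (1, 5, 10, 50, 100):
    print(m, round(fwer_independent(0.05, m), 4))
```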
Methods of Controlling FWER:
1. Bonferroni Correction (overly conservative)
2. Holm-Bonferroni method
3. Many more examples: Šidák, Scheffé, Dunnett
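
As a rough illustration of the first two methods, the sketch below implements Bonferroni (reject when $p_i \le \alpha/m$) and Holm's step-down rule (compare the i-th smallest p-value with $\alpha/(m-i+1)$ and stop at the first failure). The function names and the four p-values are made up for this example.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def holm(pvals, alpha=0.05):
    """Holm step-down: test p-values from smallest to largest against
    alpha / (m - rank + 1) and stop at the first one that fails."""
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx])
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] > alpha / (m - rank + 1):
            break
        reject[idx] = True
    return reject

# Illustrative p-values (not from a real experiment).
pvals = [0.001, 0.013, 0.02, 0.3]
print(bonferroni(pvals))
print(holm(pvals))
```

On these values Holm rejects more hypotheses than Bonferroni at the same $\alpha$, which is the sense in which Bonferroni is the more conservative of the two.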

FWER control is designed for a handful of comparisons. With hundreds of hypothesis tests it becomes too strict, so we control the False Discovery Rate (FDR) instead.

FDR is defined as the expected proportion of rejected hypotheses that are erroneous: $E[V/R]$, where $V/R$ is taken to be 0 when $R=0$.

Benjamini-Hochberg Procedure:
<br>
 $Q=V/(V+S)=V/R$
 <br>
 B-H controls the expectation of $Q$, keeping $E[Q] \le q^{*}$.
 
 1. Order the m unadjusted p-values from the m hypothesis tests: $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$.
 2. Let k be the largest i for which
 $$p_{(i)} \le \frac{i}{m} q^{*}$$ where $q^*$ is set by the user (commonly 0.01, 0.05, or 0.1).
 3. Reject all $H_{(i)}$ for $i = 1, 2, \dots, k$.
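
The three steps can be sketched as a small function that also accepts unsorted p-values. The name `benjamini_hochberg` and the four p-values are hypothetical, chosen to show the step-up behaviour: a p-value that misses its own threshold is still rejected when a larger p-value meets its threshold.

```python
def benjamini_hochberg(pvals, q=0.05):
    """Return a list of booleans (in input order): True where H_i is rejected."""
    m = len(pvals)
    # Step 1: rank the p-values from smallest to largest.
    order = sorted(range(m), key=lambda idx: pvals[idx])
    # Step 2: k is the largest rank whose p-value is <= (rank / m) * q.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * q:
            k = rank
    # Step 3: reject every hypothesis with rank <= k.
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject

# Illustrative values: 0.03 exceeds its own threshold (2/4 * 0.05 = 0.025),
# but 0.035 and 0.04 meet theirs, so k = 4 and all four are rejected.
pvals = [0.04, 0.001, 0.035, 0.03]
print(benjamini_hochberg(pvals, q=0.05))
```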

In [1]:
import numpy as np
import pandas as pd

In [2]:
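data_vec=[0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344, 0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000]
# q^* is the target FDR level
q=0.05
m=len(data_vec)
# BH critical value (i/m)*q^* for each rank i; data_vec is already sorted in ascending order
hold_list=[i/m*q for i in range(1,m+1)]

# Note: the 'adjusted_p' column holds the BH critical value for rank i, not an adjusted p-value
combo=pd.DataFrame({'i':range(1,m+1),
                   'p_value':data_vec,
                   'adjusted_p':hold_list})
# k is the largest rank with p_(i) <= (i/m)*q^*; every hypothesis with rank <= k is rejected
below=combo['p_value']<=combo['adjusted_p']
k=combo.loc[below,'i'].max() if below.any() else 0
combo['rejected']=np.where(combo['i']<=k,1,0)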


In [3]:
combo

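| | i | p_value | adjusted_p | rejected |
|---|---|---|---|---|
| 0 | 1 | 0.0001 | 0.003333 | 1 |
| 1 | 2 | 0.0004 | 0.006667 | 1 |
| 2 | 3 | 0.0019 | 0.01 | 1 |
| 3 | 4 | 0.0095 | 0.013333 | 1 |
| 4 | 5 | 0.0201 | 0.016667 | 0 |
| 5 | 6 | 0.0278 | 0.02 | 0 |
| 6 | 7 | 0.0298 | 0.023333 | 0 |
| 7 | 8 | 0.0344 | 0.026667 | 0 |
| 8 | 9 | 0.0459 | 0.03 | 0 |
| 9 | 10 | 0.324 | 0.033333 | 0 |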


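* BH has more power than FWER-controlling procedures when many hypotheses are tested, at the cost of tolerating a controlled fraction of false discoveries
* The FDR guarantee of BH relies on the test statistics being independent (or positively dependent)
* The result depends on the choice of $q^*$, which is left to the user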

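Benjamini-Krieger-Yekutieli's Adaptive FDR control
* Run BH at level $q'$ to obtain $k$ rejections, then estimate the number of true nulls as $\hat{m}_0=m-k$
* Run BH again at the adjusted level $q^* = q' m/\hat{m}_0$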
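
The two bullets above can be sketched as a two-stage procedure. This is only an illustration under stated assumptions: the stage-one level is taken as $q' = q/(1+q)$, following the two-stage procedure of Benjamini, Krieger and Yekutieli (2006), and the helper names `bh_num_rejections` and `bky_two_stage` are made up here. The input is the same list of 15 p-values used in the cells of this notebook.

```python
def bh_num_rejections(sorted_p, q):
    """Number of BH rejections: the largest i with p_(i) <= (i / m) * q."""
    m = len(sorted_p)
    k = 0
    for i, p in enumerate(sorted_p, start=1):
        if p <= i / m * q:
            k = i
    return k

def bky_two_stage(pvals, q=0.05):
    """Return the number of rejections of the two-stage adaptive procedure."""
    m = len(pvals)
    sorted_p = sorted(pvals)
    q_prime = q / (1.0 + q)              # assumed stage-one level q'
    k = bh_num_rejections(sorted_p, q_prime)
    if k == 0 or k == m:                 # nothing to adapt: reject none or all
        return k
    m0_hat = m - k                       # estimated number of true nulls
    q_star = q_prime * m / m0_hat        # adjusted level q* = q' m / m0_hat
    return bh_num_rejections(sorted_p, q_star)

data_vec = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
            0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000]
print(bh_num_rejections(sorted(data_vec), 0.05), bky_two_stage(data_vec, q=0.05))
```

Because $\hat{m}_0 < m$ whenever stage one rejects something, the second stage runs at a level above $q'$, so the adaptive procedure can reject more hypotheses than plain BH.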

Other methods: 
* Storey's positive FDR (pFDR)
* Local FDR
* Exceedance Control

Next, we use Python's statsmodels module to handle FDR.

In [13]:
import numpy as np
import pandas as pd
from statsmodels.stats.multitest import multipletests
# IPython-only syntax: display the docstring of multipletests
multipletests?

In [14]:
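data_vec=[0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344, 0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000]
q=0.05
m=len(data_vec)
# multipletests returns (reject, pvals_corrected, alphacSidak, alphacBonf)
result=multipletests(data_vec,alpha=q,method='fdr_bh')
# adjusted_p holds the BH-adjusted p-values; rejected is True where adjusted_p <= alpha
combo=pd.DataFrame({'i':range(1,m+1),
                   'p_value':data_vec,
                   'adjusted_p':result[1],
                   'rejected':result[0]})
combo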

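| | i | p_value | adjusted_p | rejected |
|---|---|---|---|---|
| 0 | 1 | 0.0001 | 0.0015 | True |
| 1 | 2 | 0.0004 | 0.003 | True |
| 2 | 3 | 0.0019 | 0.0095 | True |
| 3 | 4 | 0.0095 | 0.035625 | True |
| 4 | 5 | 0.0201 | 0.0603 | False |
| 5 | 6 | 0.0278 | 0.063857 | False |
| 6 | 7 | 0.0298 | 0.063857 | False |
| 7 | 8 | 0.0344 | 0.0645 | False |
| 8 | 9 | 0.0459 | 0.0765 | False |
| 9 | 10 | 0.324 | 0.486 | False |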
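
The `adjusted_p` column returned by `multipletests` can be reproduced by hand, which helps in reading it. A minimal sketch, assuming the usual BH adjustment $\tilde{p}_{(i)} = \min_{j \ge i} \frac{m}{j} p_{(j)}$ capped at 1 (the helper name `bh_adjusted` is hypothetical): a hypothesis is rejected exactly when its adjusted p-value is at most $q^*$.

```python
def bh_adjusted(pvals):
    """BH-adjusted p-values, returned in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx])
    adjusted = [0.0] * m
    running_min = 1.0  # cap at 1 and enforce monotonicity from the largest rank down
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

data_vec = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
            0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.000]
adj = bh_adjusted(data_vec)
rejected = [a <= 0.05 for a in adj]
print([round(a, 6) for a in adj[:10]])
print(sum(rejected))
```

The running minimum explains why ranks 6 and 7 share the same adjusted value in the table above: the raw value for rank 6 is larger than the one for rank 7, so it is pulled down to it.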
