# Fisher's Exact Test for Independence

Fisher's Exact Test is an exact alternative to the Chi-square test of independence that makes fewer assumptions, but it is less "famous" because, for larger tables, it is only practical with computational power. Basically, the Chi-square test makes a normality (large-sample) assumption that only holds with "large" numbers of observations in every cell, which is often interpreted to mean that every cell must have an expected count of at least 5. This assumption allows it to obtain a p-value analytically for any $n \times m$ contingency table. However, in this case, there are multiple cells where the expected count is less than 5.
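As a quick check of the rule of thumb above, here is a minimal sketch that computes the expected counts under independence (row total times column total, divided by the grand total) for the contingency table printed later in this notebook. The variable names `expected` and `n_low` are only illustrative.

```python
import numpy as np

# Observed counts, copied from the contingency table printed in this notebook
cont_table = np.array([
    [277, 15, 28, 9, 37, 0],
    [777, 62, 88, 8, 118, 26],
    [982, 148, 52, 32, 80, 19],
    [412, 42, 34, 12, 60, 0],
    [907, 54, 43, 0, 104, 6],
    [1592, 218, 84, 42, 308, 9],
    [872, 122, 59, 34, 154, 30],
    [120, 4, 3, 0, 18, 0],
    [167, 20, 24, 14, 10, 21],
])

# Expected count for each cell under independence:
# (row total * column total) / grand total
row_totals = cont_table.sum(axis=1, keepdims=True)
col_totals = cont_table.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / cont_table.sum()

# Count the cells that fall below the usual threshold of 5
n_low = int((expected < 5).sum())
print(f"{n_low} of {expected.size} cells have an expected count below 5")
```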

Instead of relying on a normality assumption, Fisher's Exact Test (as run here, with simulated p-values) is similar to the technique I used in the interrater_agreement notebook. Under the null hypothesis (that the variables `category` and `funding_source` are independent), it generates a large number (`n_simulations`) of random contingency tables with the same row and column totals as the observed table. It then reports the proportion of these simulated tables that are as extreme as or more extreme than the observed contingency table.
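The simulation idea can be sketched in plain NumPy. This is an illustration under my own assumptions, not the R implementation: the function name `simulated_fisher_p` is hypothetical, random tables are drawn by shuffling column labels (which keeps both margins fixed), and "as extreme or more extreme" is taken to mean "as probable as or less probable than the observed table under the null".

```python
import numpy as np
from scipy.special import gammaln

def simulated_fisher_p(table, n_simulations=2000, seed=0):
    """Monte Carlo p-value for independence with fixed row/column totals."""
    table = np.asarray(table)
    rng = np.random.default_rng(seed)
    n_rows, n_cols = table.shape

    # One row label and one column label per observation
    row_labels = np.repeat(np.arange(n_rows), table.sum(axis=1))
    col_labels = np.repeat(np.arange(n_cols), table.sum(axis=0))

    def log_factorial_sum(t):
        # With fixed margins, P(table) is proportional to 1 / prod(n_ij!),
        # so a larger sum of log-factorials means a less probable table.
        return gammaln(t + 1).sum()

    observed = log_factorial_sum(table)
    n_extreme = 0
    for _ in range(n_simulations):
        # Shuffling column labels breaks any association but keeps both margins
        shuffled = rng.permutation(col_labels)
        sim = np.zeros((n_rows, n_cols), dtype=int)
        np.add.at(sim, (row_labels, shuffled), 1)
        if log_factorial_sum(sim) >= observed - 1e-9:
            n_extreme += 1

    # Counting the observed table itself keeps the estimate above zero
    return (1 + n_extreme) / (n_simulations + 1)

p_strong = simulated_fisher_p([[10, 0], [0, 10]])   # strong association
p_none = simulated_fisher_p([[5, 5], [5, 5]])       # no association
```

A perfectly balanced table such as `[[5, 5], [5, 5]]` is the most probable table for its margins, so every simulated table counts as at least as extreme and the p-value is 1.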

Note: This notebook runs Python in Jupyter. However, the scipy implementation of Fisher's Exact Test only handled $2 \times 2$ contingency tables when this was written, so I'm using rpy2 to call R code from Python. You will need to have R installed (including its `stats` package, which ships with base R) to run the notebook.

In [55]:
import pandas as pd
import numpy as np
import scipy as sp
import itertools
import rpy2.robjects.numpy2ri
from rpy2.robjects.packages import importr
rpy2.robjects.numpy2ri.activate()

stats = importr('stats')

In [102]:
cat_data = pd.read_csv("categories by mechanism.csv") #TOCHANGE

categories = cat_data.iloc[:, 0].tolist()
grant_types = cat_data.columns[1:]
cont_table = np.asarray(cat_data.iloc[:, 1:cat_data.shape[1]])

cat_pairwise = list(itertools.combinations(range(0, len(categories)), 2))

print(cont_table)

[[ 277   15   28    9   37    0]
 [ 777   62   88    8  118   26]
 [ 982  148   52   32   80   19]
 [ 412   42   34   12   60    0]
 [ 907   54   43    0  104    6]
 [1592  218   84   42  308    9]
 [ 872  122   59   34  154   30]
 [ 120    4    3    0   18    0]
 [ 167   20   24   14   10   21]]


In [125]:
n_simulations = 100000 #TOCHANGE higher n_simulations leads to more accurate results - 100,000 is probably sufficient, but can do 1,000,000 if it is "final" data
alpha = 0.05 #TOCHANGE 0.05 is a good default for upper range of p-value

grant_cont_tables = []
for i in range(0, len(grant_types)):
    is_grant_i = cont_table[:, i]
    not_grant_i = np.delete(cont_table, i, axis = 1).sum(axis = 1)
    
    bin_cont_table = np.column_stack((is_grant_i, not_grant_i))
    
    grant_type_p = stats.fisher_test(bin_cont_table, 
                        simulate_p_value = True,
                        B = n_simulations)[0]
    
    grant_cont_tables.append(bin_cont_table)
    
    print("P-value for " + grant_types[i] + " vs Not is: " + str(grant_type_p))


P-value for #R01 vs Not is: [9.9999e-06]
P-value for #U01 vs Not is: [9.9999e-06]
P-value for #R44 vs Not is: [9.9999e-06]
P-value for #U24 vs Not is: [9.9999e-06]
P-value for #R21 vs Not is: [9.9999e-06]
P-value for #U54 vs Not is: [9.9999e-06]
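The identical p-values above are worth a short note. A simulated p-value of the form (1 + number of extreme simulations) / (B + 1) can never be smaller than 1 / (B + 1), so a value sitting exactly at that floor means that none of the simulated tables were as extreme as the observed one. A small sketch of the arithmetic (`p_floor` is just an illustrative name):

```python
# Smallest p-value a simulation with B draws can report: 1 / (B + 1)
n_simulations = 100000
p_floor = 1 / (n_simulations + 1)
print(f"{p_floor:.4e}")  # 9.9999e-06
```

In other words, these results are best read as "p < 1e-5" rather than as exact values; increasing `n_simulations` lowers the floor.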


In [144]:
grant_col = []
cat1_col = []
cat2_col = []
pval_col = []
sig_posthoc = []
cat1_prop = []
cat2_prop = []

for grant_index in range(0, len(grant_cont_tables)):
    bin_cont_table = grant_cont_tables[grant_index]
    
    for pairwise_index in range(0, len(cat_pairwise)):
        comparison = cat_pairwise[pairwise_index]
        
        pairwise_cont_table = bin_cont_table[comparison, :]
        cat_props = pairwise_cont_table[:, 0] / pairwise_cont_table.sum(axis = 1)
        category_1 = categories[comparison[0]]
        category_2 = categories[comparison[1]]

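        # For 2x2 tables R's fisher.test computes an exact p-value; the
        # simulation arguments only take effect for larger tables.
        # The result is a length-1 vector, so unwrap it into a plain float.
        pairwise_p = float(stats.fisher_test(pairwise_cont_table, 
                        simulate_p_value = True,
                        B = n_simulations)[0][0])
        # Bonferroni correction over the pairwise comparisons within one grant type
        bonf_sig = pairwise_p < alpha / len(cat_pairwise)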

        grant_col.append(grant_types[grant_index])
        cat1_col.append(category_1)
        cat2_col.append(category_2)
        pval_col.append(pairwise_p)
        sig_posthoc.append(bonf_sig)
        cat1_prop.append(cat_props[0])
        cat2_prop.append(cat_props[1])
        
pairwise_fishers = pd.DataFrame({
    "grant_type" : grant_col,
    "category_1" : cat1_col,
    "category_2" : cat2_col,
    "uncorrected_pvalues" : pval_col,
    "sig_post_correction" : sig_posthoc,
    "category_1_prop" : cat1_prop,
    "category_2_prop" : cat2_prop
},
    index = list(range(len(grant_col))))

In [148]:
pairwise_fishers.to_csv("pairwise_fishers.csv", encoding='utf-8', index=False)
pairwise_fishers  # last expression in the cell, so Jupyter displays the table

Assuming this is actual data, a number of the pairwise comparisons have significant p-values at alpha = 0.05 even after correcting for multiple comparisons. These pairwise comparisons have `sig_post_correction` of `True`. Fisher's Exact Test for a 2x2 contingency table can be interpreted as a test comparing two binomial proportions, which I think has a more intuitive interpretation than a Chi-square test of independence.
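To make the 2x2 reading concrete, here is a minimal sketch for a single pairwise comparison using `scipy.stats.fisher_exact` instead of R. It uses the first two rows of the contingency table and the first grant-type column (R01 versus every other grant type); the "not R01" counts are the row totals minus the R01 counts. The Bonferroni divisor of 36 assumes the 9 categories shown in the table, and the result is not guaranteed to match the R output digit for digit.

```python
from math import comb

import numpy as np
from scipy.stats import fisher_exact

# Rows: first and second category; columns: R01 awards, all other awards
pairwise_cont_table = np.array([
    [277, 366 - 277],
    [777, 1079 - 777],
])

# Proportion of each category's awards that were R01s
cat_props = pairwise_cont_table[:, 0] / pairwise_cont_table.sum(axis=1)

# Two-sided exact test on the 2x2 table
_, pairwise_p = fisher_exact(pairwise_cont_table, alternative="two-sided")

# Bonferroni threshold over all pairs of the 9 categories
alpha = 0.05
n_comparisons = comb(9, 2)
bonf_sig = pairwise_p < alpha / n_comparisons

print(cat_props, pairwise_p, bonf_sig)
```

The two entries of `cat_props` are the proportions being compared, which is what the `category_1_prop` and `category_2_prop` columns hold.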

An example interpretation for row 11 is: "A higher proportion of successful grant submissions in the Human behavior and interaction category were awarded R01s when compared to those in the Data, model, and device types category."

Note that the `category_1` and `category_2` columns are interchangeable. For example, to find all pairwise comparisons involving the category Data, model, and device types, you need to find every row where that category appears in either `category_1` or `category_2`.
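One way to do that lookup in pandas is sketched below. The small `toy` table is made up purely for illustration: it only mimics the column names of `pairwise_fishers`, and "Category C" and the `sig_post_correction` values are placeholders, not real results.

```python
import pandas as pd

# Hypothetical stand-in for pairwise_fishers (values are placeholders)
toy = pd.DataFrame({
    "grant_type": ["R01", "R01", "R01"],
    "category_1": [
        "Human behavior and interaction",
        "Human behavior and interaction",
        "Data, model, and device types",
    ],
    "category_2": [
        "Data, model, and device types",
        "Category C",
        "Category C",
    ],
    "sig_post_correction": [True, False, True],
})

target = "Data, model, and device types"

# Keep rows where the target category appears on either side of the comparison
involves_target = (toy["category_1"] == target) | (toy["category_2"] == target)
subset = toy[involves_target]
print(subset)
```

The same boolean mask should work on the real `pairwise_fishers` DataFrame, since it has the same `category_1` and `category_2` columns.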