
# Fisher's Exact Test

### Table of Contents
- Introduction
- Fisher's Exact Test
- Fisher’s exact test vs Pearson’s chi-squared test
- Application
- Conclusion

### Introduction
In statistics, an exact (significance) test is a test for which, if the null hypothesis is true, all assumptions made in deriving the distribution of the test statistic are met. An exact test therefore keeps the Type I error rate at or below the chosen significance level (α). For example, an exact test at significance level α = 5%, repeated over many samples for which the null hypothesis is true, will reject at most 5% of the time.

Most statistical tests calculate a P-value from a theoretical, often approximate, sampling distribution of a statistic (e.g. a mean or a proportion). In contrast, exact tests calculate the P-value directly: they compute the exact probability, under the null hypothesis, of getting an outcome at least as extreme as the outcome observed in the data.


### Fisher's Exact Test
![Ronald Fisher as a young man](https://upload.wikimedia.org/wikipedia/commons/a/aa/Youngronaldfisher2.JPG)

Sir Ronald Aylmer Fisher (17 February 1890 – 29 July 1962) was a British statistician, widely regarded as one of the most important figures in 20th-century statistics. One of his contributions to modern statistics is “Fisher’s exact test”.

Fisher’s exact test is used to determine whether there is a significant association between two categorical variables in a contingency table. In statistics, a contingency table is a table in matrix format that displays the (multivariate) frequency distribution of the variables.
Fisher’s exact test calculates the probability, assuming the null hypothesis is true, of observing proportions as extreme as or more extreme than the proportions actually observed. With the row and column totals held fixed, the probability of every possible contingency table can be found by counting combinations, and it follows the hypergeometric distribution:

![Image of Yaktocat](https://drive.google.com/uc?id=16qHi7-lJDFdGMvYkktCvPyR74QcmYUME)
![Image of Yaktocat](https://drive.google.com/uc?id=1loKyuF8yzqkyaYSjTTS9gkUW9zvhfGU0)
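
For reference, the hypergeometric probability can also be written out in text. This is a sketch in the usual textbook notation, which is an assumption here: $a$ and $b$ are the cells of the first row, $c$ and $d$ the cells of the second row, and $n = a + b + c + d$ is the grand total.

$$
p = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}
$$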

The null hypothesis for the test is that there is no association between the rows and columns of the 2 × 2 table, such that the probability of a subject being in a particular row is not influenced by being in a particular column. An important assumption for Fisher’s Exact test is that the binary data are independent. If the proportions are correlated then more advanced techniques should be applied.

The test is based upon calculating directly the probability of obtaining the results that we have shown (or results more extreme) if the null hypothesis is actually true, using all possible 2 × 2 tables that could have been observed, for the same row and column totals as the observed data. These row and column totals are also known as marginal totals. What we are trying to establish is how extreme our particular table (combination of cell frequencies) is in relation to all the possible ones that could have occurred given the marginal totals.
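
To make the idea of "all possible 2 × 2 tables for the same marginal totals" concrete, here is a minimal sketch that enumerates them and attaches the hypergeometric probability to each. The function name, its parameters, and the small margins used in the demo are hypothetical, chosen only for illustration.

```python
from math import comb

def tables_with_fixed_margins(row1, row2, col1):
    """Yield (table, probability) for every 2x2 table with the given margins.

    row1, row2: row totals; col1: total of the first column.
    """
    n = row1 + row2
    lo = max(0, col1 - row2)   # smallest feasible upper-left cell
    hi = min(row1, col1)       # largest feasible upper-left cell
    for a in range(lo, hi + 1):
        b = row1 - a
        c = col1 - a
        d = row2 - c
        prob = comb(row1, a) * comb(row2, c) / comb(n, col1)
        yield [[a, b], [c, d]], prob

# Made-up margins: row totals 3 and 2, first column total 2.
for table, prob in tables_with_fixed_margins(3, 2, 2):
    print(table, round(prob, 3))
```

Because the tables cover every outcome compatible with the margins, their probabilities sum to 1.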

### Fisher’s exact test vs Pearson’s chi-squared test
Fisher’s exact test is an alternative to Pearson’s chi-squared test for independence. While actually valid for all sample sizes, Fisher’s exact test is practically applied when sample sizes are small. A general recommendation is to use Fisher’s exact test, instead of the chi-squared test, whenever more than 20% of cells in a contingency table have expected frequencies < 5.
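
The rule of thumb above can be checked mechanically. The following is a minimal sketch, assuming the usual definition of an expected frequency under independence (row total × column total / grand total). The helper names and the sparse example table are hypothetical.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def prefers_fisher(table, threshold=5, max_share=0.20):
    """Return True if more than 20% of the expected counts are below 5."""
    expected = [e for row in expected_frequencies(table) for e in row]
    share_small = sum(e < threshold for e in expected) / len(expected)
    return share_small > max_share

# Made-up sparse table, used only for illustration.
sparse = [[3, 1], [1, 3]]
print(expected_frequencies(sparse))
print(prefers_fisher(sparse))
```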

With large samples, a chi-squared test can be used in this situation. However, the significance value it provides is only an approximation, because the sampling distribution of the calculated test statistic is only approximately equal to the theoretical chi-squared distribution. The approximation is inadequate when sample sizes are small, because the cell counts predicted under the null hypothesis (the “expected values”) are then low. In fact, for small, sparse, or unbalanced data, the exact and asymptotic P-values can be quite different and may lead to opposite conclusions concerning the hypothesis of interest. In contrast, Fisher’s exact test is exact as long as the experimental procedure keeps the row and column totals fixed, and it can therefore be used regardless of the sample characteristics. It becomes difficult to calculate with large samples or well-balanced tables, but fortunately these are exactly the conditions where the chi-squared test is appropriate.
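
As an illustration of how far the asymptotic and exact P-values can drift apart on a small table, the sketch below runs both tests with SciPy on a made-up table. The table is hypothetical, and Yates' continuity correction is switched off so that the plain Pearson statistic is shown.

```python
from scipy import stats

# Made-up small table, used only for illustration.
small_table = [[3, 1], [1, 3]]

# Pearson's chi-squared test without continuity correction (asymptotic P-value).
chi2_stat, p_chi2, dof, expected = stats.chi2_contingency(small_table, correction=False)

# Fisher's exact test (exact two-sided P-value).
_, p_fisher = stats.fisher_exact(small_table)

print(f"chi-squared P-value: {p_chi2:.4f}")
print(f"Fisher exact P-value: {p_fisher:.4f}")
```

On this table every expected count is 2, so the chi-squared approximation should not be trusted.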


### Application
Suppose a sample of 32 teenagers is classified by gender and by whether or not they are studying. Knowing that 15 of these 32 teenagers are studying, and that 15 of the 32 are female, and assuming the null hypothesis that men and women are equally likely to study, what is the probability that these 15 teenagers who are studying would be so unevenly distributed between the women and the men? If we were to choose 15 of the teenagers at random, what is the probability that 12 or more of them would be among the 15 women, and only 3 or fewer from among the 17 men?
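
The question in the last sentence is a hypergeometric tail probability, and it can be sketched with the standard library alone. The variable and function names below are illustrative, not part of the original notebook.

```python
from math import comb

n_total = 32      # teenagers in the sample
n_women = 15      # women among them
n_studying = 15   # teenagers who are studying

def prob_women_among_studying(k):
    """P(exactly k of the studying teenagers are women) if they are picked at random."""
    return (comb(n_women, k) * comb(n_total - n_women, n_studying - k)
            / comb(n_total, n_studying))

# Probability that 12 or more of the 15 studying teenagers are women.
p_12_or_more = sum(prob_women_among_studying(k) for k in range(12, n_studying + 1))
print(p_12_or_more)
```

The result is roughly 0.00057, so such an uneven split would be very unusual under random selection.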

![Image of Yaktocat](https://drive.google.com/uc?id=1ADeB_nVwUSrWKgDDMcN1Xl1vFIblxFw_)

In [0]:
# import libraries
import numpy as np 
import pandas as pd
import scipy
import scipy.special

The null hypothesis ($H_{0}$) is that men and women are equally likely to study, i.e. the proportion of studying individuals is the same among the women and the men. The alternative hypothesis ($H_{a}$) is that the proportion of studying individuals is higher or lower among the women than among the men.

In [0]:
# create contingency table with floats to avoid datatype issues with pd.DataFrame.at 
ar=np.array([[12.0, 3.0],[3.0, 14.0]])    
df=pd.DataFrame(ar, columns=["Women", "Men"])
df.index=["Studying", "Not-studying"] 
df

| | Women | Men |
|---|---|---|
| Studying | 12.0 | 3.0 |
| Not-studying | 3.0 | 14.0 |


In [0]:
# create contingency table with the marginal totals and the grand total. 
df2=df.copy()
df2.loc['Column_Total']= df2.sum(numeric_only=True, axis=0)
df2.loc[:,'Row_Total'] = df2.sum(numeric_only=True, axis=1)
df2

| | Women | Men | Row_Total |
|---|---|---|---|
| Studying | 12.0 | 3.0 | 15.0 |
| Not-studying | 3.0 | 14.0 | 17.0 |
| Column_Total | 15.0 | 17.0 | 32.0 |


In the table, only 3 of the 17 men are studying (18 per cent), whereas 12 of the 15 women are studying (80 per cent).
Keeping the marginal totals the same, there are 16 different ways of arranging the cell frequencies, because the upper-left cell can take any value from 0 to 15. The table that corresponds to our observed cell frequencies is `(xiii)`.

![Image of Yaktocat](https://drive.google.com/uc?id=1rBw1sRjPewrJT3Z-68CwDLbP49FrJoi3)
![Image of Yaktocat](https://drive.google.com/uc?id=1NAoGI0TVm40LgY-7j_5UrkEwVTuaQFqu)
![Image of Yaktocat](https://drive.google.com/uc?id=1y9eA5224GNdhyy4Dpz8kJVy_lkSHe7Ux)
![Image of Yaktocat](https://drive.google.com/uc?id=17NCWj7WcHsBUgkpAPxGvx7qv1cPfSNnu)
![Image of Yaktocat](https://drive.google.com/uc?id=1tKZ9l0TnPK8Bm0hqkT2EPwFUSPLR7ypG)
![Image of Yaktocat](https://drive.google.com/uc?id=1Bt5amSWba9RiTDK0qD9fcDnDLHt671NL)
![Image of Yaktocat](https://drive.google.com/uc?id=1rfrwGhc2_M3YNA_KM8dO6UMihiQxwjXi)
![Image of Yaktocat](https://drive.google.com/uc?id=1V3YNmthj7KbcdP1OIsZQwZNwvF1rOlhv)
![Image of Yaktocat](https://drive.google.com/uc?id=1-NiIh8v-02au4GOKfph1pK7N19EkBiAM)
![Image of Yaktocat](https://drive.google.com/uc?id=1gWsaWFedWuzBvSPU-1tKp93AP1jLKNeH)
![Image of Yaktocat](https://drive.google.com/uc?id=1A83oDILbn57ykt61fw3lP4XtotZcApJw)
![Image of Yaktocat](https://drive.google.com/uc?id=1iIajaSi_HsMovDiSJwfo_DHIYEUpHIXK)
![Image of Yaktocat](https://drive.google.com/uc?id=1tXkx-dCKQA64QwzdQCJODAnGqkb00EBI)
![Image of Yaktocat](https://drive.google.com/uc?id=1x3tXmn5pE6Bev0yfemAkYXIfHj9saJse)
![Image of Yaktocat](https://drive.google.com/uc?id=19vAgJ7NBB-753mVU8EQzEADmsvwYCIVS)
![Image of Yaktocat](https://drive.google.com/uc?id=1gRaHe2iSw9DkDIMuzBWXDpwGqniGm7Xq)

For example, the probability of obtaining `(i)` is
![Image of Yaktocat](https://drive.google.com/uc?id=1JdmDiGQn-TeYilWa9Ukg8IOZ_PwZgK6e)
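
Written out in text, and assuming table `(i)` is the one with no women among the 15 studying teenagers (upper-left cell $a = 0$), the calculation with row totals 15 and 17 and grand total 32 would be:

$$
p_{(i)} = \frac{\binom{15}{0}\binom{17}{15}}{\binom{32}{15}} = \frac{1 \times 136}{565{,}722{,}720} \approx 2.404 \times 10^{-7}
$$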

In [0]:
# create a function that takes the upper-left cell of a contingency table as input and
# returns the probability of observing that particular contingency table.
def p(a):
    row1_total = int(df2.iloc[0, 2])  # studying
    row2_total = int(df2.iloc[1, 2])  # not studying
    col1_total = int(df2.iloc[2, 0])  # women
    n = int(df2.iloc[2, 2])           # grand total
    v = (scipy.special.binom(row1_total, a) * scipy.special.binom(row2_total, col1_total - a)) / scipy.special.binom(n, col1_total)
    return v

p(0) # if we try "a=0" we get the following probability of (i)

2.404004562517836e-07

The probabilities for all the different tables can be summarized as follows:

![Image of Yaktocat](https://drive.google.com/uc?id=12CDRjeDVpcDs5-UeMoJrQBtTj8doySgO)

The table above shows that the probability of obtaining our observed frequencies, table `(xiii)`, is `P = 0.0005469`. The probability of obtaining our results or results more extreme (a difference that is at least as large) is the sum of the probabilities for `(xiii)` to `(xvi)`, which is `0.000573`. This is the **one-sided P-value**.

In order to obtain the **two-sided P-value** there are several approaches.
- Simply double the one-sided value, which gives `P = 0.001145`.
- Add together all the probabilities that are the same size as or smaller than that of our observed result. In this case, the probabilities that are less than or equal to `0.0005469` are those of `(i)`, `(ii)`, `(iii)`, `(xiii)`, `(xiv)`, `(xv)` and `(xvi)`. This gives a two-sided value of `P = 0.001033`.
- Swinscow and Campbell describe a *mid-P method*, which adds half the probability of the observed table to the sum of the probabilities of the more extreme tables (those with a smaller probability, in either tail). This gives `P = 0.000759`.
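
The three approaches can be sketched side by side. This is a minimal standalone version that hard-codes the margins of the example (row totals 15 and 17, first column total 15). The function and variable names are illustrative.

```python
from math import comb

def table_prob(a, row1=15, row2=17, col1=15):
    """Hypergeometric probability of the table whose upper-left cell is a."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

probs = [table_prob(a) for a in range(16)]  # tables (i) to (xvi)
p_obs = probs[12]                           # observed table (xiii)

one_sided = sum(probs[12:])                              # (xiii) to (xvi)
doubled = 2 * one_sided                                  # approach 1
small_or_equal = sum(q for q in probs if q <= p_obs)     # approach 2
mid_p = 0.5 * p_obs + sum(q for q in probs if q < p_obs) # approach 3 (mid-P)

for name, value in [("one-sided", one_sided), ("doubled", doubled),
                    ("sum of small p", small_or_equal), ("mid-P", mid_p)]:
    print(f"{name:>15}: {value:.6f}")
```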


In [0]:
# In our contingency table, a was equal to 12.
p_observed = p(12)

# calculate p(a) for every possible table given the fixed margins (a = 0, ..., 15)
# and keep only the probabilities that are <= p_observed
p_list = []
for i in range(int(df2.iloc[0, 2]) + 1):
    p_i = p(i)
    if p_i <= p_observed:
        p_list.append(p_i)

# the sum of this list corresponds to the two-sided p-value
p_val = np.sum(p_list)
p_val

0.0010326118774229185

In [0]:
# Using SciPy's built-in function
import scipy.stats as stats

oddsratio, pvalue = stats.fisher_exact([[12, 3],[3, 14]])
pvalue

# this p-value agrees, up to floating-point rounding, with the second approach to the two-sided p-value

0.0010326118774229133
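
`scipy.stats.fisher_exact` also accepts an `alternative` argument, so the one-sided P-value can be obtained the same way. A short sketch follows, assuming that `"greater"` corresponds to the direction observed here (a larger upper-left cell, i.e. more studying women than expected).

```python
from scipy import stats

table = [[12, 3], [3, 14]]

# Two-sided test (the default) and the one-sided test in the observed direction.
_, p_two_sided = stats.fisher_exact(table, alternative="two-sided")
_, p_greater = stats.fisher_exact(table, alternative="greater")

print(f"two-sided: {p_two_sided:.6f}")
print(f"one-sided (greater): {p_greater:.6f}")
```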

For our example, the P-value is less than 0.05, the nominal level for statistical significance, and **we can conclude that there is evidence of a statistically significant difference between the men and the women in the proportion who are studying**. However, in common with other non-parametric tests, Fisher’s exact test is simply a hypothesis test. It merely tells you whether the observed difference would be unlikely if the null hypothesis (of no difference) were true. It gives you no information about the likely size of the difference, so although **we can conclude that there is a significant difference between the genders with respect to studying or not**, we can draw no conclusions about the possible size of the difference.

#### Notes: 
- The one-sided Fisher’s test asks whether the association lies in one specified direction (for example, whether the proportion studying is higher among the women than among the men). The two-sided Fisher’s test asks whether the proportions differ in either direction. In most cases, you’ll probably use a two-sided test.
- The criticism of the first two methods is that they are too conservative, i.e. if the null hypothesis were true, over repeated studies they would reject it less than 5 per cent of the time.
- The *mid-P value method* is less conservative, and gives approximately the correct rate of type I errors (false positives).


### Conclusion
Fisher’s Exact test is used for analyzing simple 2 × 2 contingency tables when the assumptions for the Chi-squared test are not met.

An alternative exact test, Barnard's exact test, has been developed and proponents of it suggest that this method is more powerful, particularly in 2 × 2 tables. Another alternative is to use maximum likelihood estimates to calculate a p-value from the exact binomial or multinomial distributions and reject or fail to reject based on the P-value.
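
For readers who want to try Barnard's test on the example table, recent SciPy releases (1.7 and later, to the best of our knowledge) provide `scipy.stats.barnard_exact`. The sketch below is only a starting point. It assumes the default settings and that the columns of the table (women, men) are the two groups being compared.

```python
from scipy import stats

# rows: studying / not studying; columns: women / men
table = [[12, 3], [3, 14]]

res = stats.barnard_exact(table, alternative="two-sided")
print(res.statistic, res.pvalue)
```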

### References
- [Exact test](https://en.wikipedia.org/wiki/Exact_test)
- [Fisher's exact test](https://en.wikipedia.org/wiki/Fisher's_exact_test)
- [Fisher's exact test from scratch with python](https://towardsdatascience.com/fishers-exact-test-from-scratch-with-python-2b907f29e593)
- [Fisher's exact test independence](https://www.statisticshowto.com/fishers-exact-test-independence/) 
- [Tutorial Fisher's exact test](https://www.sheffield.ac.uk/polopoly_fs/1.43998!/file/tutorial-9-fishers.pdf) 
