# scipy.stats.fisher_exact #

The null hypothesis is that the true odds ratio of the populations underlying the observations is one, and the observations were sampled from these populations under a condition: the marginals of the resulting table must equal those of the observed table. The statistic returned is the unconditional maximum likelihood estimate of the odds ratio, and the p-value is the probability under the null hypothesis of obtaining a table at least as extreme as the one that was actually observed.
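As a rough illustration of the two returned values, the sketch below uses a hypothetical 2×2 table (the counts are invented for illustration, not taken from any experiment) and checks that the returned statistic matches the sample odds ratio a·d / (b·c).

```python
import numpy as np
from scipy.stats import fisher_exact

# Hypothetical counts, chosen only so that no cell is zero
table = np.array([[3, 1], [1, 3]])
(a, b), (c, d) = table

# Unconditional maximum likelihood estimate: the sample odds ratio
sample_odds_ratio = (a * d) / (b * c)

oddsr, p = fisher_exact(table)  # alternative='two-sided' is the default
print(sample_odds_ratio, oddsr, p)
```

With these counts the statistic is 9.0, and the two-sided p-value is large, so such a table would give little reason to doubt an odds ratio of one.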

**Lady Tasting Tea Experiment**

Our experiment consists in mixing eight cups of tea, four in one way and four in the other, and presenting them to the subject for judgment in a random order. The subject has been told in advance of what the test will consist, namely that she will be asked to taste eight cups, that these shall be four of each kind, and that they shall be presented to her in a random order, that is in an order not determined arbitrarily by human choice, but by the actual manipulation of the physical apparatus used in games of chance, cards, dice, roulettes, etc., or, more expeditiously, from a published collection of random sampling numbers purporting to give the actual results of such manipulation. Her task is to divide the 8 cups into two sets of 4, agreeing, if possible, with the treatments received.

**Null Hypothesis:** the subject cannot tell whether the milk was poured into the cup first or last.


Once we have collected the experimental data, we evaluate how likely such data would be if the null hypothesis were true.

If it is very unlikely, then we may reject the null hypothesis.

Typically we also state an alternative hypothesis, and rejecting the null hypothesis is taken as evidence in its favor.


**Alternative Hypothesis:** the subject can tell.


If the subject correctly picks all four cups that had milk poured first, the chance of her doing so by pure guessing is only 1 in 70 (~1.4%): there are 70 ways to choose 4 cups out of 8, and only one of them is entirely correct.

Fisher considered that unlikely enough to reject the null hypothesis should she manage it.
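The 1 in 70 figure can be checked by counting. A minimal sketch, using only the standard library:

```python
from math import comb

# Number of ways to choose which 4 of the 8 cups are "milk first"
n_ways = comb(8, 4)

# Only one of those choices is entirely correct
p_all_correct = 1 / n_ways
print(n_ways, round(p_all_correct, 4))  # 70 0.0143
```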


![table](ProbTable.jpg)

![Example Table](TableExam.jpg)

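For a 2×2 table with cells a, b in the first row, c, d in the second row, and total n = a + b + c + d, the probability of observing exactly that table, given fixed marginals, is:

p = (a+b)!(c+d)!(a+c)!(b+d)! / (a!b!c!d!n!)

The one-tailed p value for Fisher’s Exact Test is the sum of this probability over the observed table and every table with the same marginals that is more extreme in the direction of the alternative. When all eight cups are classified correctly, no more extreme table exists, so the one-tailed p value equals this single probability.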
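The formula above can be evaluated directly with exact integer arithmetic. In this sketch `table_probability` is a hypothetical helper name, not part of SciPy, and the cell layout (a, b in the first row, c, d in the second) is assumed.

```python
from fractions import Fraction
from math import factorial as f

def table_probability(a, b, c, d):
    """Probability of one 2x2 table given its marginals (factorial formula)."""
    n = a + b + c + d
    numerator = f(a + b) * f(c + d) * f(a + c) * f(b + d)
    denominator = f(a) * f(b) * f(c) * f(d) * f(n)
    return Fraction(numerator, denominator)

# All eight cups classified correctly: a = d = 4, b = c = 0
p_table = table_probability(4, 0, 0, 4)
print(p_table, float(p_table))
```

For this table the result is exactly 1/70, which agrees with the counting argument for a perfect guess.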

The one-tailed p value is the same as the one obtained from the **CDF of the hypergeometric distribution** (or its complement, depending on the direction of the test) with the following parameters:

- population size = n
- population “successes” = a + b
- sample size = a + c
- sample “successes” = a

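```python
import numpy as np
from scipy.stats import hypergeom

# 2x2 table for a subject who classifies all eight cups correctly
table = np.array([[4, 0], [0, 4]])
M = table.sum()        # population size
n = table[0].sum()     # population "successes": first-row total
N = table[:, 0].sum()  # sample size: first-column total
start, end = hypergeom.support(M, n, N)
# Probability of each possible value of the top-left cell
hypergeom.pmf(np.arange(start, end+1), M, n, N)
```

```
array([0.01428571, 0.22857143, 0.51428571, 0.22857143, 0.01428571])
```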

```python
from scipy.stats import fisher_exact
oddsr, p = fisher_exact(table, alternative='two-sided')
p
```

```
0.028571428571428536
```

```python
# One-sided test: probability of guessing all 4 cups correctly by chance
oddsr, p = fisher_exact(table, alternative='greater')
p
```

```
0.014285714285714268
```

```python
oddsr, p = fisher_exact(table, alternative='less')
p
```

```
1.0
```
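The three p-values above can be reproduced from the hypergeometric distribution. This is a minimal sketch rather than SciPy's actual implementation. It assumes the two-sided p-value sums the probabilities of all tables that are no more likely than the observed one, and the small relative tolerance is an assumption added here to absorb floating-point rounding.

```python
import numpy as np
from scipy.stats import fisher_exact, hypergeom

table = np.array([[4, 0], [0, 4]])
M, n, N = table.sum(), table[0].sum(), table[:, 0].sum()
x = table[0, 0]  # observed top-left cell

# 'greater': probability of a top-left count at least as large as observed
p_greater = hypergeom.sf(x - 1, M, n, N)
# 'less': probability of a top-left count at most as large as observed
p_less = hypergeom.cdf(x, M, n, N)

# 'two-sided': total probability of tables no more likely than the observed one
start, end = hypergeom.support(M, n, N)
pmf = hypergeom.pmf(np.arange(start, end + 1), M, n, N)
p_obs = hypergeom.pmf(x, M, n, N)
p_two_sided = pmf[pmf <= p_obs * (1 + 1e-7)].sum()  # tolerance is an assumption

for alt, manual in [("greater", p_greater), ("less", p_less),
                    ("two-sided", p_two_sided)]:
    _, p = fisher_exact(table, alternative=alt)
    print(alt, manual, p)
```

Each line prints the hand-computed value next to the one returned by `fisher_exact`, so the two columns can be compared directly.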

![probabilities](ProbTea.jpg)

# References #

https://towardsdatascience.com/fishers-exact-test-from-scratch-with-python-2b907f29e593

https://pythonguides.com/scipy-stats/
