# Assignment 1 - Fisher's Exact Test

### Null hypothesis:
Samples **A** and **B** were drawn from the same probability distribution, and any differences between them are due to chance alone. In other words, drugs **A** and **B** are equally effective.


### Table 1

| | Improvement YES | Improvement NO | Total |
|:---:|:---:|:---:|:---:|
| Drug $A$ | a | b | a+b |
| Drug $B$ | c | d | c+d |
| Total | a+c | b+d | N |

### Task 1: 
Derive the formula for computing the probability that the table with results will have the same values as in Table 1 for given values $a, b, c$ and $d$ $(N=a+b+c+d)$ assuming that the null hypothesis is true. 

Remark: Stating that the probability corresponds to the hypergeometric probability distribution is not enough. You should explain the meaning of all binomial coefficients or factorials in the formulas you use!



### Task 1 Solution:
Conditional on the margins of the table, $a$ follows a hypergeometric distribution with $a+b$ draws from a population with $a+c$ successes and $b+d$ failures, as explained on Wikipedia (https://en.wikipedia.org/wiki/Fisher%27s_exact_test).

The probability of obtaining this set of values under the hypergeometric distribution is:
$$ p = \frac{\binom{a+c}{a} \binom{b+d}{b}} {\binom{N}{a+b}} = \frac{\binom{a+c}{c} \binom{b+d}{d}} {\binom{N}{c+d}} = \frac{(a+c)! (b+d)! (a+b)! (c+d)!}{a! b! c! d! N!} $$

In my own words:
* There are $\binom{a+c}{a}$ ways to choose the $a$ patients who were given drug $A$ from the $(a+c)$ patients with improvement.
* There are $\binom{b+d}{b}$ ways to choose the $b$ patients who were given drug $A$ from the $(b+d)$ patients with**OUT** improvement.
* There are $\binom{N}{a+b}$ ways to choose the $a+b$ patients who were given drug $A$ from the set of all $N$ patients.

Under the null hypothesis every one of the $\binom{N}{a+b}$ assignments is equally likely, so the probability is the number of assignments that produce Table 1 (the product of the first two coefficients) divided by the number of all assignments.

By **x ways to choose k from n** I mean the number of different combinations of k patients that can be formed from n patients.
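As a quick sanity check of the derivation, the three forms of the formula can be evaluated side by side. This is only a minimal sketch: the helper name `table_prob_forms` and the illustrative counts 8, 1, 4, 5 are assumptions made here, not part of the assignment.

```python
from math import comb, factorial


def table_prob_forms(a, b, c, d):
    """Return the three algebraically equivalent forms of the table probability."""
    n = a + b + c + d
    # Choose the drug-A patients among the improved and the non-improved patients.
    by_drug_a = comb(a + c, a) * comb(b + d, b) / comb(n, a + b)
    # Same argument, but choosing the drug-B patients instead.
    by_drug_b = comb(a + c, c) * comb(b + d, d) / comb(n, c + d)
    # Fully expanded factorial form.
    by_factorials = (
        factorial(a + c) * factorial(b + d) * factorial(a + b) * factorial(c + d)
    ) / (factorial(a) * factorial(b) * factorial(c) * factorial(d) * factorial(n))
    return by_drug_a, by_drug_b, by_factorials


forms = table_prob_forms(8, 1, 4, 5)
print(forms)
```

All three values should agree up to floating-point rounding, which is what the chain of equalities in the formula claims.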

### Table 2

| | Improvement YES | Improvement NO | Total |
|:---:|:---:|:---:|:---:|
| Drug $A$ | 8 | 1 | 9 |
| Drug $B$ | 4 | 5 | 9 |
| Total | 12 | 6 | 18 |

### Task 2:
Implement function $TabProb(a,b,c,d)$ that computes the probability of Table 1 assuming the null hypothesis is true using the formula you have derived in Task 1. Using the function, compute the probability of Table 2.

In [1]:
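import math

import numpy as np

def TabProb(a, b, c, d):
    # Probability of the 2x2 table (a, b, c, d) given its marginal sums.
    # Exact integer factorials come from the standard library (np.math is not available in recent NumPy releases).
    numerator = math.factorial(a+c) * math.factorial(b+d) * math.factorial(a+b) * math.factorial(c+d)
    denominator = math.factorial(a) * math.factorial(b) * math.factorial(c) * math.factorial(d) * math.factorial(a+b+c+d)
    return numerator / denominator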

In [2]:
table = np.array([[8,1], [4,5]])
a = 8
b = 1
c = 4
d = 5
p = TabProb(a, b, c, d)
print(p)

0.06108597285067873


The difference between the drugs A and B is evident. Is this difference statistically significant? That is, assuming that both samples A and B come from the same probability distribution, what is the probability that two samples differ to the same or an even higher extent? If this probability is small, e.g., at most α=0.05, we can reject the null hypothesis at the significance level α (confidence 1−α=0.95). From the marginal sums (a+b, c+d, a+c and b+d), the expected value of the field $a$ is $(a+b)(a+c)/N = 9 \cdot 12 / 18 = 6$. The notion "differing to the same or even higher extent" can be understood in two ways:

1. one-sided - only the values of a that lie on the same side of the expected value as the observed one; in our case, the values 8 and 9, or
2. two-sided - all the values of a such that |a−6|≥8−6; in our case, the values 0, 1, 2, 3, 4, 8 and 9 (with the margins of Table 2 only 3, 4, 8 and 9 can actually occur, since a ≥ (a+b) − (b+d) = 3).

In case 1 we use a one-sided test; in case 2 we use a two-sided test.
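The two readings of "differing to the same or even higher extent" can be sketched as a small enumeration over the attainable values of a. The function name `extreme_values` is hypothetical, and the two-sided rule used here is the distance criterion |a − E[a]| from the text; library implementations may define the two-sided region differently.

```python
def extreme_values(a, b, c, d):
    """List the attainable values of the top-left cell that count as
    'at least as extreme' as the observed a, for fixed marginal sums."""
    row_a, col_yes, col_no = a + b, a + c, b + d
    n = a + b + c + d
    expected = row_a * col_yes / n

    # With fixed margins the top-left cell is bounded on both sides.
    lowest = max(0, row_a - col_no)
    highest = min(row_a, col_yes)
    feasible = range(lowest, highest + 1)

    # One-sided: same side of the expected value as the observation.
    if a >= expected:
        one_sided = [k for k in feasible if k >= a]
    else:
        one_sided = [k for k in feasible if k <= a]

    # Two-sided: at least as far from the expected value, in either direction.
    two_sided = [k for k in feasible if abs(k - expected) >= abs(a - expected)]
    return expected, one_sided, two_sided


result = extreme_values(8, 1, 4, 5)
print(result)  # (6.0, [8, 9], [3, 4, 8, 9])
```

Summing the table probabilities over either list gives the corresponding p-value.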

### Is this difference statistically significant? 
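The probability of Table 2 alone is approximately 0.061, which already exceeds α=0.05. The probability that two samples differ to the same or an even higher extent also includes the more extreme tables, so it can only be larger. The difference between the drugs A and B is therefore **NOT** statistically significant.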

### Task 3:

* Answer the following question:
    * In general, which of the four combinations of tests {one-sided, two-sided}×{Fisher's test, χ2-test} are meaningful?

* While ignoring the requirement that χ2-test can be used only if all counts in the contingency table are at least 5, perform and evaluate all meaningful combinations of the above tests at the significance level α=0.05. Compare the results of the tests. For computing the test use suitable functions from Python scipy library and scipy.stats.chi2.cdf(), scipy.stats.chi2.sf() and scipy.stats.chi2.isf(). Of course, scipy contains functions for computing Fisher's exact test and χ2-test. Compare your results with the results obtained by using the functions scipy.stats.fisher_exact() and scipy.stats.chi2_contingency().

### Answers Task 3:
1. The χ2-test is a two-sided test that uses a one-sided critical region.
The χ2-test statistic is:
$$ X^2 = \sum_{i,j} \frac{(o_{ij}-e_{ij})^2}{e_{ij}}$$
where $o_{ij}$ is the observed count in cell $[i,j]$ and $e_{ij}$ is the expected count in cell $[i,j]$. The numerator of each term is the **squared** difference between the observed and the expected value, so it makes no difference whether $o_{ij} < e_{ij}$ or $o_{ij} > e_{ij}$. Therefore the χ2-test is always a two-sided test, even though its critical region lies in one (the right) tail of the χ2 distribution. A one-sided χ2-test is thus not meaningful.
(My main source: https://stats.stackexchange.com/questions/171074/chi-square-test-why-is-the-chi-squared-test-a-one-tailed-test/171084#171084)
The other three combinations (one-sided Fisher's test, two-sided Fisher's test and two-sided χ2-test) are meaningful.

2. Performing and evaluating all meaningful combinations:


In [24]:
import scipy.stats

print("table:\n", table)
# Marginal sums of the observed table
row_sum = np.sum(table, axis=1)
column_sum = np.sum(table, axis=0)
n = np.sum(table)

# Expected counts under the null hypothesis: (row total * column total) / n
expected_table = np.outer(row_sum, column_sum) / n

print("expected table:\n", expected_table)

table:
 [[8 1]
 [4 5]]
expected table:
 [[6. 3.]
 [6. 3.]]


Fisher's exact tests:

In [14]:
# two-sided
# with the margins of Table 2, a ranges from 3 to 9, so a=0, a=1 and a=2 are impossible
p_a3 = TabProb(3,6,9,0) # a=3
p_a4 = TabProb(4,5,8,1) # a=4

p_a8 = TabProb(8,1,4,5) # a=8
p_a9 = TabProb(9,0,3,6) # a=9

p2 = p_a3 + p_a4 + p_a8 + p_a9
print("Fisher two-sided:",p2)
print("control:", scipy.stats.fisher_exact(table, alternative="two-sided"))

# one-sided

p1 = p_a8 + p_a9 
print("Fisher one-sided:", p1)
print("control:", scipy.stats.fisher_exact(table, alternative="greater"))

Fisher two-sided: 0.13122171945701358
control: (10.0, 0.13122171945701377)
Fisher one-sided: 0.06561085972850679
control: (10.0, 0.06561085972850689)
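The same tail sums can also be cross-checked against the hypergeometric distribution in SciPy instead of adding individual table probabilities by hand. This is a sketch under the assumption that `scipy.stats.hypergeom` is parameterised as (population size, number of successes in the population, number of draws); the variable names are chosen here for illustration.

```python
from scipy.stats import hypergeom

obs_a, obs_b, obs_c, obs_d = 8, 1, 4, 5
n_total = obs_a + obs_b + obs_c + obs_d   # all patients
n_improved = obs_a + obs_c                # "successes" in the population
n_drug_a = obs_a + obs_b                  # number of draws (patients on drug A)

dist = hypergeom(n_total, n_improved, n_drug_a)

p_table = dist.pmf(obs_a)                 # probability of Table 2 itself
p_one_sided = dist.sf(obs_a - 1)          # P(a >= 8)
p_two_sided = dist.cdf(4) + dist.sf(7)    # P(a <= 4) + P(a >= 8)

print("table:", p_table)
print("one-sided:", p_one_sided)
print("two-sided:", p_two_sided)
```

The printed values should match the hand-computed sums and the `fisher_exact` results up to rounding.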


χ2-test

In [40]:
# Pearson's statistic: sum of (observed - expected)^2 / expected over all cells
test = np.sum(np.square(table - expected_table) / expected_table)
# A 2x2 table has (2-1)*(2-1) = 1 degree of freedom; the p-value is the right-tail area
p_sf = scipy.stats.chi2.sf(test, df=1)
p_cdf = 1 - scipy.stats.chi2.cdf(test, df=1)
print("Chi squared p-value using sf func.: ", p_sf)
print("Chi squared p-value using cdf func.:", p_cdf)

# Reference values from scipy, without and with Yates' continuity correction
chi2, p, dof, expected = scipy.stats.chi2_contingency(table, correction=False)
print("Control p-value without correction: ", p)
chi2, p, dof, expected = scipy.stats.chi2_contingency(table, correction=True)
print("Control p-value with correction:    ", p)


Chi squared p-value using sf func.:  0.04550026389635857
Chi squared p-value using cdf func.: 0.04550026389635853
Control p-value without correction:  0.04550026389635857
Control p-value with correction:     0.13361440253771584
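The task also mentions `scipy.stats.chi2.isf()`, which is not used above. One possible way to use it is to compare the test statistic with the critical value at α=0.05 instead of comparing p-values. This is a minimal sketch for the uncorrected statistic only; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2

observed = np.array([[8, 1], [4, 5]])
# Expected counts under the null hypothesis: (row total * column total) / n
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# Pearson's statistic without continuity correction
statistic = np.sum((observed - expected) ** 2 / expected)

alpha = 0.05
# isf is the inverse survival function: the value with right-tail area alpha
critical_value = chi2.isf(alpha, df=1)

reject_null = statistic > critical_value
print("statistic:", statistic)
print("critical value:", critical_value)
print("reject null hypothesis:", reject_null)
```

The statistic exceeds the critical value exactly when the uncorrected p-value is below α, so this decision agrees with the uncorrected p-value printed above.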
