# Why Non-Parametric Tests?
- Some experiments yield response measurements that defy exact quantification.
    - e.g. it is impossible to make statements such as “teacher A is twice as good as teacher B.”
- There is often insufficient knowledge to judge whether the population(s) satisfy the assumptions required by parametric tests.

In [3]:
import pandas as pd
import numpy as np
import pingouin as pg
from scipy import stats
from statsmodels.stats.descriptivestats import sign_test  # NOTE: shadowed by the custom sign_test defined later in this notebook
from pydataset import data

In [2]:
df = data('iris')

# Paired Sign Test 
- Let $D_i = X_i - Y_i$. The number of positive differences, $M$, follows a binomial distribution, $Bin(n, p)$, where $p = P(X > Y)$.
- Test statistic: $M$, the number of positive differences.
- $H_0$: $p=0.5$
- $H_a$: $p\neq0.5$

In [30]:
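def sign_test(x, y, tail='two-sided', p0=0.5):
    """Paired sign test: returns (M, p-value); x and y must have equal length."""
    M = (x > y).sum()  # number of positive differences D_i = x_i - y_i
    left_p = stats.binom.cdf(M, len(x), p0)  # P(X <= M) under Bin(n, p0)
    if tail == 'greater':
        # NOTE: 1 - cdf(M) is P(X > M); the inclusive tail P(X >= M) is stats.binom.sf(M - 1, n, p0)
        return M, 1 - left_p
    if tail == 'less':
        return M, left_p
    # two-sided: double the smaller tail (zero differences are kept in n here)
    return M, (left_p if left_p < 0.5 else 1 - left_p) * 2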

In [32]:
M, pval = sign_test(
    x = df.loc[df['Species']=='setosa','Sepal.Width'].values, 
    y = df.loc[df['Species']=='versicolor','Sepal.Width'].values,
    tail='two-sided', 
    p0=0.5
)
print("M:", M, "p-value:", pval)

M: 43 p-value: 3.243740565039843e-08
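
The function above takes the upper tail as $P(X > M)$ and keeps zero differences in $n$. A minimal sketch of a stricter variant follows, assuming the common convention of dropping tied pairs and including the observed count in both tails. The name `exact_sign_test` and the sample numbers are made up for illustration and are not part of the original notebook.

```python
import numpy as np
from scipy import stats

def exact_sign_test(x, y, tail="two-sided", p0=0.5):
    """Paired sign test with ties dropped and inclusive tails (illustrative sketch)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    d = d[d != 0]                      # zero differences carry no sign information
    n = d.size
    m = int((d > 0).sum())             # number of positive differences
    p_less = stats.binom.cdf(m, n, p0)         # P(X <= m)
    p_greater = stats.binom.sf(m - 1, n, p0)   # P(X >= m)
    if tail == "greater":
        return m, n, p_greater
    if tail == "less":
        return m, n, p_less
    # two-sided: double the smaller tail and cap at 1
    return m, n, min(1.0, 2 * min(p_less, p_greater))

# Hypothetical paired measurements, for illustration only
before = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
after = [4.8, 4.9, 5.9, 5.9, 5.4, 5.1, 5.0, 5.7]
m, n, pval = exact_sign_test(before, after)
print("M:", m, "n:", n, "p-value:", pval)
```

With one tied pair removed, the test uses $n = 7$ pairs, of which $M = 6$ are positive.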


# Wilcoxon Signed-Rank Test (Paired Samples)

- $H_0$: The population distributions for the X’s and Y’s are identical.
- $H_a$: 
    - The two population distributions differ in location (two-tailed),
    - The population relative frequency distribution for the X’s is shifted to the right/left of that for the Y’s (one-tailed).
    
- Procedure: Rank the absolute values of the paired differences $D_i = X_i - Y_i$, then sum the ranks by sign:
    - $W^-$ is the sum of the ranks of the negative differences
    - $W^+$ is the sum of the ranks of the positive differences
- Test statistic:
    1. For a two-tailed test, use $W = \min(W^+, W^-)$.
    2. For a one-tailed test, use $W^-$ to detect that the X’s are shifted to the right of the Y’s, and $W^+$ to detect a shift to the left.
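
The ranking procedure can be sketched by hand as below. This is only an illustration: it assumes zero differences are discarded and tied absolute differences receive average ranks, and the helper name `signed_rank_sums` and the data are hypothetical.

```python
import numpy as np
from scipy import stats

def signed_rank_sums(x, y):
    """Return (W+, W-) for paired samples (illustrative sketch)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    d = d[d != 0]                        # discard zero differences
    ranks = stats.rankdata(np.abs(d))    # rank |D_i|; ties get average ranks
    w_plus = ranks[d > 0].sum()          # rank sum of positive differences
    w_minus = ranks[d < 0].sum()         # rank sum of negative differences
    return w_plus, w_minus

# Hypothetical paired data, for illustration only
x = [10.0, 12.5, 9.0, 14.0, 11.0, 13.5]
y = [9.0, 10.0, 9.5, 10.0, 8.0, 13.0]
w_plus, w_minus = signed_rank_sums(x, y)
W = min(w_plus, w_minus)                 # two-tailed test statistic
print("W+:", w_plus, "W-:", w_minus, "W:", W)
```

As a sanity check, $W^+ + W^-$ always equals $n(n+1)/2$ for the $n$ non-zero differences.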


In [33]:
pg.wilcoxon(
    x=df.loc[df['Species']=='setosa','Sepal.Length'], 
    y=df.loc[df['Species']=='versicolor','Sepal.Length'],
    tail='two-sided'  # newer pingouin releases name this argument `alternative`
)

| | W-val | tail | p-val | RBC | CLES |
|---|---|---|---|---|---|
| Wilcoxon | 19.0 | two-sided | 3.586548e-09 | -0.96898 | 0.0674 |


# Mann–Whitney U Test (Independent Samples)
Equivalent to the Wilcoxon rank-sum test.
- Procedure: Pool the two samples and rank all $n = n_1 + n_2$ observations by magnitude.
- Rationale: Under $H_0$, the expected rank sum of each sample is proportional to its sample size ($n_1$ or $n_2$). If the observations in one population tend to be larger, the rank sum of its sample exceeds the expected rank sum.

- $H_0$: The population distributions for the X’s and Y’s are identical.
- $H_a$: 
    - The two population distributions differ in location (two-tailed),
    - The population relative frequency distribution for the X’s is shifted to the right/left of that for the Y’s (one-tailed).
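
To make the rank-sum rationale concrete, here is a small sketch that pools two samples, computes the rank sum of the first sample, and compares it with its expected value under $H_0$, $n_1(n_1 + n_2 + 1)/2$. The helper name `rank_sum_and_u` and the data are hypothetical; the $U$ statistics use the standard relation $U_1 = R_1 - n_1(n_1+1)/2$.

```python
import numpy as np
from scipy import stats

def rank_sum_and_u(x, y):
    """Return (R1, E[R1] under H0, U1, U2) for two independent samples (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    ranks = stats.rankdata(np.concatenate([x, y]))  # rank the pooled sample
    r1 = ranks[:n1].sum()                           # rank sum of the first sample
    expected_r1 = n1 * (n1 + n2 + 1) / 2            # expected rank sum under H0
    u1 = r1 - n1 * (n1 + 1) / 2                     # number of pairs with x_i > y_j
    u2 = n1 * n2 - u1
    return r1, expected_r1, u1, u2

# Hypothetical independent samples, for illustration only
x = [1.1, 2.3, 3.0, 4.8]
y = [2.9, 5.2, 6.1, 7.4, 8.0]
r1, expected_r1, u1, u2 = rank_sum_and_u(x, y)
print("R1:", r1, "expected R1:", expected_r1, "U1:", u1, "U2:", u2)
```

Here the observed rank sum of `x` falls well below its expected value, which is the pattern the test looks for when the X’s tend to be smaller than the Y’s.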

In [81]:
pg.mwu(
    x=df.loc[df['Species']=='setosa','Sepal.Length'], 
    y=df.loc[df['Species']=='versicolor','Sepal.Length'],
    tail='two-sided'  # newer pingouin releases name this argument `alternative`
)

| | U-val | tail | p-val | RBC | CLES |
|---|---|---|---|---|---|
| MWU | 168.5 | two-sided | 8.345827e-14 | 0.8652 | 0.0674 |
