# Permutation Test

Permutation tests are a family of nonparametric statistical tests. Here we use a permutation test to test the null hypothesis that two groups come from the same distribution. The notation and examples shown here are borrowed from Efron and Tibshirani’s *An Introduction to the Bootstrap* [1].

---

Our specific problem: we observe two groups of data, `z` and `y`, and want to test the null hypothesis that both come from the same distribution, using the difference between the group means as the test statistic.
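For reference, the quantities computed in the cells that follow can be written out as below. This is a sketch of the standard one-sided definition; the symbols $B$ (number of random permutations) and $\hat\theta^*(b)$ (the statistic recomputed on the $b$-th permuted split) are notation assumed here, chosen to match `theta_hat` and `hat_asl_perm` in the code.

$$
\hat\theta = \bar z - \bar y,
\qquad
\widehat{\mathrm{ASL}}_{\mathrm{perm}} = \frac{\#\{\, b : \hat\theta^*(b) \ge \hat\theta \,\}}{B}
$$

A small value means that few random relabelings of the pooled data produce a difference as large as the observed one.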

* credits: http://www2.stat.duke.edu/~ar182/rr/examples-gallery/PermutationTest.html

In [1]:
import numpy as np
z = np.array([94,197,16,38,99,141,23])
y = np.array([52,104,146,10,51,30,40,27,46])

In [2]:
theta_hat = z.mean() - y.mean()
theta_hat

30.63492063492064

In [3]:
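def run_permutation_test(pooled, sizeZ, sizeY):
    # Shuffle the pooled sample in place (the caller's array is modified),
    # split it into two groups of the original sizes, and return the
    # difference between the two group means.
    np.random.shuffle(pooled)
    starZ = pooled[:sizeZ]
    starY = pooled[-sizeY:]
    return starZ.mean() - starY.mean()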


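In [13]:
pooled = np.hstack([z,y])
delta = z.mean() - y.mean()

numSamples = 100000
diffCount = 0

# Seed once, before the loop, so the run is reproducible. Reseeding inside
# the loop would apply the same shuffle on every iteration and revisit only
# a small cycle of arrangements instead of sampling them at random.
np.random.seed(1)

for _ in range(numSamples):
    permuted = np.random.permutation(pooled)

    gr1 = permuted[:len(z)]
    gr2 = permuted[len(z):]

    estimate = gr1.mean() - gr2.mean()
    # count permuted differences at least as large as the observed one
    if estimate >= delta:
        diffCount += 1

hat_asl_perm = diffCount / numSamples
hat_asl_perm

The estimate should come out at roughly 0.13 to 0.14; the exact digits depend on the seed and the NumPy version. This is in line with the value of about 0.13 that Efron and Tibshirani report for the same data [1]. Since the p-value is well above 0.05, we do not reject the null hypothesis.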
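As a cross-check on the Monte Carlo estimate, the permutation distribution can also be enumerated exactly, because a pooled sample of 16 values can be split into groups of 7 and 9 in only 11440 ways. The block below is a minimal sketch of that idea, not part of the original notebook; the names `observed_sum`, `n_splits`, `n_extreme` and `exact_asl` are illustrative. It relies on the fact that, for fixed group sizes, the difference in means increases with the sum of the first group, so integer sums can be compared instead of floating-point means.

```python
from itertools import combinations

import numpy as np

z = np.array([94, 197, 16, 38, 99, 141, 23])
y = np.array([52, 104, 146, 10, 51, 30, 40, 27, 46])

# Plain Python ints, so every sum below is exact.
pooled = np.concatenate([z, y]).tolist()
observed_sum = int(z.sum())

n_splits = 0   # number of ways to choose the first group
n_extreme = 0  # splits whose mean difference is >= the observed one

# combinations() treats each position as distinct, so repeated values
# in the data would still be counted once per position.
for group1 in combinations(pooled, len(z)):
    n_splits += 1
    # A larger first-group sum means a larger difference in means,
    # so comparing sums is equivalent to comparing the statistic.
    if sum(group1) >= observed_sum:
        n_extreme += 1

exact_asl = n_extreme / n_splits
print(n_splits, exact_asl)
```

With enough random permutations, the Monte Carlo estimate should agree with `exact_asl` to within sampling error. Full enumeration is only practical for small samples like this one.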