# Python A/B Testing

---

> SciPy, part of Python's data analysis stack, provides an interface for statistical hypothesis tests.

> The statistical functions live in SciPy's `stats` module, which is available after importing `scipy.stats`.

### One-sample and two-sample t-tests can be run with SciPy's ttest_1samp and ttest_ind functions
---
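The cells in this notebook only exercise the two-sample function, so here is a minimal sketch of the one-sample case using `stats.ttest_1samp`. The sample, the seed, and the hypothesised mean of 5 are assumptions chosen for illustration; a local generator is used so the global seed set elsewhere in the notebook is left alone.

```python
import numpy as np
from scipy import stats

# Local generator: does not touch NumPy's global random state.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5, scale=10, size=500)

# H0: the population mean equals 5 (illustrative value).
result = stats.ttest_1samp(sample, popmean=5)

# The same statistic by hand: (sample mean - hypothesised mean) / standard error.
standard_error = sample.std(ddof=1) / np.sqrt(sample.size)
manual_statistic = (sample.mean() - 5) / standard_error

print(result.statistic, result.pvalue)
print(np.isclose(result.statistic, manual_statistic))
```
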

In [1]:
# Import libraries
import numpy as np
from scipy import stats

In [2]:
np.random.seed(12345678)

---
Test on samples with identical means:

In [3]:
rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)
rvs2 = stats.norm.rvs(loc=5, scale=10, size=500)

In [4]:
stats.ttest_ind(rvs1, rvs2)

Ttest_indResult(statistic=0.26833823296238857, pvalue=0.788494433695651)

In [5]:
stats.ttest_ind(rvs1, rvs2, equal_var=False)

Ttest_indResult(statistic=0.26833823296238857, pvalue=0.7884945274950106)
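The two calls above return the same statistic because the groups have the same size: with equal n, the pooled (Student) standard error and the unpooled (Welch) standard error are algebraically identical. The sketch below checks this by hand; the arrays and the seed are assumptions and are not the `rvs1`/`rvs2` samples from the cells above.

```python
import numpy as np
from scipy import stats

# Local generator so the notebook's global seed is not disturbed.
rng = np.random.default_rng(1)
a = rng.normal(loc=5, scale=10, size=500)
b = rng.normal(loc=5, scale=10, size=500)
n = a.size  # both groups have the same size

var_a, var_b = a.var(ddof=1), b.var(ddof=1)

# Student (pooled) standard error: pooled variance times (1/n + 1/n).
pooled_var = ((n - 1) * var_a + (n - 1) * var_b) / (2 * n - 2)
se_pooled = np.sqrt(pooled_var * (2 / n))

# Welch standard error: each variance divided by its own sample size.
se_welch = np.sqrt(var_a / n + var_b / n)

t_pooled = stats.ttest_ind(a, b).statistic
t_welch = stats.ttest_ind(a, b, equal_var=False).statistic

print(np.isclose(se_pooled, se_welch))  # identical when the sizes match
print(np.isclose(t_pooled, t_welch))
```
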

---
With its default equal_var=True, ttest_ind underestimates p when the variances are unequal:

In [6]:
rvs3 = stats.norm.rvs(loc=5, scale=20, size=500)

In [7]:
stats.ttest_ind(rvs1, rvs3)

Ttest_indResult(statistic=-0.46580283298287956, pvalue=0.6414582741343561)

In [8]:
stats.ttest_ind(rvs1, rvs3, equal_var=False)

Ttest_indResult(statistic=-0.46580283298287956, pvalue=0.6414964624656874)
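With equal sample sizes the statistic is unchanged, so the small gap between the two p-values comes from the degrees of freedom alone. The pooled test uses n1 + n2 - 2, while the Welch test uses the smaller Welch-Satterthwaite approximation. The following is a sketch that reproduces both p-values by hand, assuming freshly drawn samples and an arbitrary seed rather than the `rvs1`/`rvs3` arrays above.

```python
import numpy as np
from scipy import stats

# Local generator so the notebook's global seed is not disturbed.
rng = np.random.default_rng(2)
a = rng.normal(loc=5, scale=10, size=500)
b = rng.normal(loc=5, scale=20, size=500)

# Squared standard error contributed by each group.
va = a.var(ddof=1) / a.size
vb = b.var(ddof=1) / b.size

df_student = a.size + b.size - 2
# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

# Equal group sizes, so this statistic is shared by both tests.
t_stat = (a.mean() - b.mean()) / np.sqrt(va + vb)

# Two-sided p-values from the t distribution's survival function.
p_student = 2 * stats.t.sf(abs(t_stat), df_student)
p_welch = 2 * stats.t.sf(abs(t_stat), df_welch)

print(df_student, df_welch)
print(p_student, p_welch)
```

Fewer degrees of freedom give heavier tails, so for the same statistic the Welch p-value is never smaller than the pooled one.
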

---
When n1 != n2, the equal-variance t-statistic no longer equals the unequal-variance (Welch) t-statistic:

In [9]:
rvs4 = stats.norm.rvs(loc=5, scale=20, size=100)

In [10]:
stats.ttest_ind(rvs1, rvs4)

Ttest_indResult(statistic=-0.9988253944278285, pvalue=0.3182832709103878)

In [11]:
stats.ttest_ind(rvs1, rvs4, equal_var=False)

Ttest_indResult(statistic=-0.6971257058465435, pvalue=0.4871692772540187)
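One way to see why this difference matters is to simulate many experiments in which the null hypothesis is true and count how often each test rejects at the 5% level. This is a rough sketch: the group sizes and scales mirror the cells above, but the seed, the number of repetitions, and the 0.05 threshold are assumptions.

```python
import numpy as np
from scipy import stats

# Local generator so the notebook's global seed is not disturbed.
rng = np.random.default_rng(3)
n_experiments = 2000
alpha = 0.05

# Same true mean in both groups; the smaller group has the larger spread.
a = rng.normal(loc=5, scale=10, size=(n_experiments, 500))
b = rng.normal(loc=5, scale=20, size=(n_experiments, 100))

# axis=1 runs one test per row, i.e. one test per simulated experiment.
p_pooled = stats.ttest_ind(a, b, axis=1).pvalue
p_welch = stats.ttest_ind(a, b, axis=1, equal_var=False).pvalue

# Fraction of experiments that (wrongly) reject the true null hypothesis.
pooled_rate = np.mean(p_pooled < alpha)
welch_rate = np.mean(p_welch < alpha)

print(f"pooled false-positive rate: {pooled_rate:.3f}")
print(f"Welch false-positive rate:  {welch_rate:.3f}")
```

Under these assumptions the pooled test should reject well above the nominal 5%, while the Welch test should stay close to it.
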

---
T-test with different means, variances, and sample sizes:

In [12]:
rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)

In [13]:
stats.ttest_ind(rvs1, rvs5)

Ttest_indResult(statistic=-1.467966985449067, pvalue=0.14263895620529113)

In [14]:
stats.ttest_ind(rvs1, rvs5, equal_var=False)

Ttest_indResult(statistic=-0.9436597361713308, pvalue=0.3474417033479409)
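In an A/B testing setting, the comparison above is usually turned into a decision at a chosen significance level. The helper below is a hypothetical sketch, not part of SciPy: the name `welch_ab_test`, its return format, and the default `alpha=0.05` are all assumptions. It uses the Welch variant because group sizes and variances often differ between arms.

```python
import numpy as np
from scipy import stats


def welch_ab_test(control, variant, alpha=0.05):
    """Hypothetical helper: Welch's t-test plus a yes/no decision at level alpha."""
    result = stats.ttest_ind(variant, control, equal_var=False)
    return {
        "mean_difference": float(np.mean(variant) - np.mean(control)),
        "statistic": float(result.statistic),
        "pvalue": float(result.pvalue),
        "significant": bool(result.pvalue < alpha),
    }


# Illustrative data drawn with a local generator (global seed untouched).
rng = np.random.default_rng(4)
control = rng.normal(loc=5, scale=10, size=500)
variant = rng.normal(loc=8, scale=20, size=100)

summary = welch_ab_test(control, variant)
print(summary)
```

A non-significant result here does not show that the means are equal; with a small, noisy variant group the test may simply lack the power to detect a difference of this size.
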