# Python A/B Testing

---

> SciPy, part of Python’s data analysis stack, provides an interface for running statistical hypothesis tests.

> The statistical functions live in SciPy's stats module, which becomes available after importing scipy.stats in Python.

### One- and two-sample t-tests can be conducted using the t-test functions built into SciPy
---

In [4]:
import numpy as np
from scipy import stats

In [6]:
np.random.seed(12345678)  # fix the seed so the random samples are reproducible
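
The heading above mentions one-sample tests, but the cells that follow only exercise the two-sample function. As a minimal sketch, a one-sample test can be run with stats.ttest_1samp, which compares a sample mean against a fixed value. The sample, the random_state seed, and the two hypothesized means below are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from scipy import stats

# Illustrative sample (assumption): 500 draws from a normal distribution with mean 5.
# random_state keeps this example from touching the global NumPy seed.
sample = stats.norm.rvs(loc=5, scale=10, size=500, random_state=0)

# H0: the population mean is 5. This is true by construction.
result_true = stats.ttest_1samp(sample, popmean=5)

# H0: the population mean is 0. This is false by construction.
result_false = stats.ttest_1samp(sample, popmean=0)

print(result_true.statistic, result_true.pvalue)
print(result_false.statistic, result_false.pvalue)
```

The second test should report a p-value very close to zero, because a true mean of 5 is many standard errors away from the hypothesized 0.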

---
Test with two samples that have identical means:

In [8]:
# Two independent samples from the same normal distribution (mean 5, std 10)
rvs1 = stats.norm.rvs(loc=5, scale=10, size=500)
rvs2 = stats.norm.rvs(loc=5, scale=10, size=500)

In [9]:
stats.ttest_ind(rvs1, rvs2)

Ttest_indResult(statistic=-0.5414590940020996, pvalue=0.588312033669383)

In [10]:
stats.ttest_ind(rvs1, rvs2, equal_var=False)

Ttest_indResult(statistic=-0.5414590940020996, pvalue=0.5883120845989677)

---
With the default equal_var=True, ttest_ind underestimates p when the variances are unequal:

In [11]:
rvs3 = stats.norm.rvs(loc=5, scale=20, size=500)  # same mean and n as rvs1, twice the std

In [12]:
stats.ttest_ind(rvs1, rvs3)

Ttest_indResult(statistic=0.923168000499728, pvalue=0.35614279463705023)

In [13]:
stats.ttest_ind(rvs1, rvs3, equal_var=False)

Ttest_indResult(statistic=0.923168000499728, pvalue=0.3562131513880533)
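
With equal group sizes, as in the two cells above, the p-values differ only slightly. The underestimate becomes easier to see when the smaller group has the larger variance. The following sketch simulates many experiments in which the null hypothesis is true and counts how often each variant reports p < 0.05. The number of simulations, the group sizes, and the seeds are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

n_sims = 2000  # number of simulated experiments (arbitrary choice)

# Both groups share the same mean, so the null hypothesis is true in every row.
# Assumed setup: the smaller group has the larger standard deviation.
large_group = stats.norm.rvs(loc=5, scale=10, size=(n_sims, 500), random_state=42)
small_group = stats.norm.rvs(loc=5, scale=20, size=(n_sims, 100), random_state=43)

# axis=1 runs one test per row, i.e. one test per simulated experiment.
p_pooled = stats.ttest_ind(large_group, small_group, axis=1).pvalue
p_welch = stats.ttest_ind(large_group, small_group, axis=1, equal_var=False).pvalue

alpha = 0.05
pooled_rate = np.mean(p_pooled < alpha)  # false positive rate with equal_var=True
welch_rate = np.mean(p_welch < alpha)    # false positive rate with equal_var=False

print(f"false positive rate, equal_var=True:  {pooled_rate:.3f}")
print(f"false positive rate, equal_var=False: {welch_rate:.3f}")
```

A well-calibrated test would flag roughly 5% of these experiments. Under these assumptions the equal-variance test should reject noticeably more often than that, while the equal_var=False variant should stay close to the nominal rate.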

---
When n1 != n2, the equal-variance t-statistic no longer equals the unequal-variance (Welch) t-statistic:

In [14]:
rvs4 = stats.norm.rvs(loc=5, scale=20, size=100)  # same mean as rvs1, larger std, smaller n

In [15]:
stats.ttest_ind(rvs1, rvs4)

Ttest_indResult(statistic=-0.16120297029859712, pvalue=0.8719879867564876)

In [16]:
stats.ttest_ind(rvs1, rvs4, equal_var=False)

Ttest_indResult(statistic=-0.10770725462812314, pvalue=0.9144241503079025)
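
To see why the two statistics diverge only when the sample sizes differ, it can help to compute them by hand. The sketch below uses the textbook formulas for the pooled (Student) and Welch t-statistics. The names pooled_t and welch_t are hypothetical helpers, and the samples are freshly generated with their own seeds rather than taken from the rvs arrays above.

```python
import numpy as np
from scipy import stats

def pooled_t(a, b):
    """Student's t-statistic: both groups are assumed to share one variance."""
    n1, n2 = len(a), len(b)
    v1, v2 = np.var(a, ddof=1), np.var(b, ddof=1)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))

def welch_t(a, b):
    """Welch's t-statistic: each group keeps its own variance estimate."""
    n1, n2 = len(a), len(b)
    v1, v2 = np.var(a, ddof=1), np.var(b, ddof=1)
    return (np.mean(a) - np.mean(b)) / np.sqrt(v1 / n1 + v2 / n2)

# Illustrative samples (assumptions, not the notebook's data)
a = stats.norm.rvs(loc=5, scale=10, size=500, random_state=1)
b_same_n = stats.norm.rvs(loc=5, scale=20, size=500, random_state=2)
b_small_n = stats.norm.rvs(loc=5, scale=20, size=100, random_state=3)

# Equal n: the two denominators reduce to the same expression.
print(pooled_t(a, b_same_n), welch_t(a, b_same_n))

# Unequal n: the denominators differ, and so do the statistics.
print(pooled_t(a, b_small_n), welch_t(a, b_small_n))
```

With n1 = n2 = n, the pooled variance is the plain average of the two sample variances, so both denominators become sqrt((v1 + v2) / n). Once the sizes differ, the pooled formula weights the larger group more heavily and the two results separate.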

---
T-test with different means, variances, and n:

In [17]:
rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)  # different mean, std, and n from rvs1

In [18]:
stats.ttest_ind(rvs1, rvs5)

Ttest_indResult(statistic=-2.0642520195013927, pvalue=0.0394245412304782)

In [19]:
stats.ttest_ind(rvs1, rvs5, equal_var=False)

Ttest_indResult(statistic=-1.3694014002992354, pvalue=0.17367119350397411)
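
In an A/B test the p-value is usually compared against a significance level chosen in advance. In the two cells above, p is about 0.039 with equal_var=True and about 0.174 with equal_var=False, so at a level of 0.05 the two variants lead to opposite conclusions. The helper below, compare_groups, is a hypothetical wrapper (not a SciPy function) that makes the decision rule explicit. The default alpha of 0.05, the default equal_var=False, and the sample data are all assumptions.

```python
import numpy as np
from scipy import stats

def compare_groups(control, variant, alpha=0.05, equal_var=False):
    """Hypothetical helper: run ttest_ind and report whether p < alpha."""
    result = stats.ttest_ind(control, variant, equal_var=equal_var)
    return {
        "statistic": float(result.statistic),
        "pvalue": float(result.pvalue),
        "reject_null": bool(result.pvalue < alpha),
    }

# Illustrative data (assumption): groups differ in mean, spread, and size.
control = stats.norm.rvs(loc=5, scale=10, size=500, random_state=10)
variant = stats.norm.rvs(loc=8, scale=20, size=100, random_state=11)

# Run the same comparison under both variance assumptions.
for equal_var in (True, False):
    print(f"equal_var={equal_var}:", compare_groups(control, variant, equal_var=equal_var))
```

Defaulting to equal_var=False is a design choice for this sketch: it avoids relying on an equal-variance assumption that the groups in an experiment may not satisfy.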