# A/B testing: 2 sample t-test calculator 
__Author: Laura Urbisci__   

---
## Overview

The goal is to automate the two-sample t-test in Python. While SciPy provides a t-test function (`scipy.stats.ttest_ind`), at the time of writing it did not let the user specify whether the test is one- or two-sided (an `alternative` argument was added to `ttest_ind` in SciPy 1.6.0). In addition, one still needs to run an F-test for equal variance before the two-sample t-test, to decide between the equal-variance and unequal-variance versions. Thus, I will roughly base my function on R's `t.test` (see the documentation [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html) for the Python function and [here](https://www.rdocumentation.org/packages/stats/versions/3.6.0/topics/t.test) for the R function). The end product is a Python function that takes the raw data for the two samples, the alternative hypothesis, and the significance level, and reports whether the average value differs between the two groups.

*Note: when using t-tests, we assume the data (i.e., the samples) come from a Normal distribution. If the data are not Normally distributed, a non-parametric test is preferred.*
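
One way to check this assumption before running the t-test is sketched below. It applies the Shapiro-Wilk test (`scipy.stats.shapiro`) to each sample and, as one common non-parametric alternative, runs the Mann-Whitney U test (`scipy.stats.mannwhitneyu`). The sample values are the ones used later in this notebook; the 0.05 cutoff and the choice of these two tests are assumptions made for illustration and are not part of the calculator itself.

```python
from scipy import stats

s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]

# Shapiro-Wilk: the null hypothesis is that the sample comes from a Normal distribution
shapiro_p1 = stats.shapiro(s1)[1]
shapiro_p2 = stats.shapiro(s2)[1]
print(shapiro_p1, shapiro_p2)

# Assumed cutoff of 0.05: a small p-value is evidence against Normality
looks_normal = bool(shapiro_p1 >= 0.05 and shapiro_p2 >= 0.05)
print(looks_normal)

# Non-parametric alternative (Mann-Whitney U), two-sided
mw_p = stats.mannwhitneyu(s1, s2, alternative="two-sided")[1]
print(mw_p)
```

With samples this small, a Normality test has little power, so a non-significant result is weak reassurance rather than proof that the assumption holds.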



## Building the function

In [54]:
import numpy as np
from scipy import stats

s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]

# f-test for equal variance, two-sided test
var_s1 = np.var(s1, ddof = 1) # need ddof so divisor is N-1
var_s2 = np.var(s2, ddof = 1) # need ddof so divisor is N-1
f_stat = var_s1/var_s2 
print(f_stat) # F statistic
m = len(s1)
n = len(s2)
print(m-1) # num df
print(n-1) # denom df
print( 2*(1-stats.f.cdf(f_stat, dfn=m-1, dfd=n-1)) ) # two-sided p-value from f_stat (valid here because f_stat > 1)
2*(1-stats.f.cdf(f_stat, dfn=m-1, dfd=n-1)) < 0.05 # True -> reject the null hypothesis of equal variances

5.598438418230809
7
9
0.020170487653488944


True
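
The F-test step can be wrapped in a small helper, sketched below. The name `f_test_equal_var` is hypothetical. The one change from the cell above is that the p-value doubles the smaller of the two tails: `2*(1-cdf)` on its own can exceed 1 when the F statistic is below 1, for example when the sample with the smaller variance is passed first.

```python
import numpy as np
from scipy import stats

def f_test_equal_var(s1, s2):
    """Two-sided F-test for equal variances (hypothetical helper name).

    Returns (f_stat, p_value).
    """
    var_s1 = np.var(s1, ddof=1)  # sample variance, divisor N-1
    var_s2 = np.var(s2, ddof=1)
    f_stat = var_s1 / var_s2
    cdf = stats.f.cdf(f_stat, dfn=len(s1) - 1, dfd=len(s2) - 1)
    # Double the smaller tail so the p-value stays in [0, 1] even when f_stat < 1
    p_value = 2 * min(cdf, 1 - cdf)
    return f_stat, p_value

s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]

print(f_test_equal_var(s1, s2))  # same F statistic and p-value as the cell above
print(f_test_equal_var(s2, s1))  # reciprocal F statistic, same p-value
```

Swapping the two samples inverts the F statistic but leaves the two-sided p-value unchanged, which is the behaviour one would expect from a test of equality.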

In [73]:
# two sample t-test, unequal variance

# two-sided test
t_stat_un = (np.mean(s1) - np.mean(s2))/np.sqrt( (var_s1/m) + (var_s2/n) )
print(t_stat_un) # t statistic
df_t_un = ( (var_s1/m) + (var_s2/n) )**2/( (var_s1/m)**2/(m-1) + (var_s2/n)**2/(n-1) ) 
print(df_t_un) # df
print( 2*(1-stats.t.cdf(t_stat_un, df=df_t_un)) ) # two-sided p-value from t_stat_un (t_stat_un is positive here)
2*(1-stats.t.cdf(t_stat_un, df=df_t_un)) < 0.05

# one-sided test, less
print(stats.t.cdf(t_stat_un, df=df_t_un)) # one-sided p-value from t_stat_un

# one-sided test, greater
print(1-stats.t.cdf(t_stat_un, df=df_t_un)) # one-sided p-value from t_stat_un


2.3102761451054765
9.000549141935105
0.046213934265318946
0.9768930328673405
0.023106967132659473


In [77]:
# two sample t-test, equal variance

# two-sided test
s_pool = (((m-1)*var_s1) + ((n-1)*var_s2))/(m+n-2)
t_stat = (np.mean(s1) - np.mean(s2))/(np.sqrt(s_pool)*np.sqrt( (1/m) + (1/n) ))
print(t_stat) # t statistic
df_t = m+n-2
print(df_t) # df
print( 2*(1-stats.t.cdf(t_stat, df=df_t)) ) # two-sided p-value from t_stat (t_stat is positive here)
2*(1-stats.t.cdf(t_stat, df=df_t)) < 0.05

# one-sided test, less
print(stats.t.cdf(t_stat, df=df_t)) # one-sided p-value from t_stat

# one-sided test, greater
print(1-stats.t.cdf(t_stat, df=df_t)) # one-sided p-value from t_stat


2.5098649986861177
16
0.023208765109723917
0.988395617445138
0.011604382554861958
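
As a sanity check, the hand-computed two-sided results for both variants can be compared against `scipy.stats.ttest_ind`, whose `equal_var` flag switches between the pooled test and the unequal-variance (Welch) test. This is a minimal sketch using the same two samples. It uses only the default two-sided test, so it does not depend on the SciPy version.

```python
from scipy import stats

s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]

# Unequal variance (Welch) and equal variance (pooled), both two-sided
welch = stats.ttest_ind(s1, s2, equal_var=False)
pooled = stats.ttest_ind(s1, s2, equal_var=True)

print(welch.statistic, welch.pvalue)    # compare with the unequal-variance cell
print(pooled.statistic, pooled.pvalue)  # compare with the equal-variance cell
```

Both pairs should agree with the t statistics and two-sided p-values printed in the two cells above, up to floating-point rounding.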


## Final product

A Python function that takes the following inputs:
- Sample 1 raw data (`s1`)
- Sample 2 raw data (`s2`)
- Alternative hypothesis (`alternative`): `"two_sided"` (default), `"less"`, or `"greater"`
- Significance level (`sig_level`): 5% is the default

and prints the verdict of the two-sample t-test together with its p-value.

*Note: In this function, we assume the data follow a Normal distribution. When this assumption is not met, this calculator should not be used.*


In [3]:
import numpy as np
from scipy import stats

def twosample_t_test(s1, s2, alternative="two_sided", sig_level=0.05):
    """
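    Prints the verdict and p-value of a two-sample t-test.

    An F-test for equal variance (at the same significance level) decides
    whether the equal-variance or the unequal-variance t-test is used.
    
    Inputs:
    s1(array): sample 1 raw data
    s2(array): sample 2 raw data
    alternative(string): specify the alternative hypothesis, must be "two_sided" (default),
    "greater", or "less"
    sig_level(float): significance level, default is 0.05 (5%)
    
    Output:
    None; the verdict with its p-value is printed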
    """
    # f-test for equal variance, two-sided test
    var_s1 = np.var(s1, ddof = 1) # need ddof so divisor is N-1
    var_s2 = np.var(s2, ddof = 1) # need ddof so divisor is N-1
    f_stat = var_s1/var_s2 
    m = len(s1)
    n = len(s2)
    f_cdf = stats.f.cdf(f_stat, dfn=m-1, dfd=n-1)
    f_pval = 2*min(f_cdf, 1-f_cdf)  # two-sided p-value from f_stat; min() keeps it valid when f_stat < 1
    
    if f_pval >= sig_level:
        # fail to reject null -> use two sample t-test with equal variance
        s_pool = (((m-1)*var_s1) + ((n-1)*var_s2))/(m+n-2)
        # t statistic
        t_stat = (np.mean(s1) - np.mean(s2))/(np.sqrt(s_pool)*np.sqrt( (1/m) + (1/n) ))
        # df 
        df_t = m+n-2
        
        # alternative hypothesis:
        if alternative == "two_sided":
            # two-sided test p-value
            p_2side = 2*(1 - stats.t.cdf(abs(t_stat), df=df_t))
            
            # verdict
            if p_2side >= sig_level:
                print("Verdict: No significant difference in means (p={})".format(round(p_2side,4)))
            else:
                print("Verdict: Significant difference in means (p={})".format(round(p_2side,4)))
            
        elif alternative == "less":
            # one-sided test (less) p-value: lower tail of the signed t statistic
            p_less = stats.t.cdf(t_stat, df=df_t)
            
            # verdict
            if p_less >= sig_level:
                print("Verdict: No significant difference (p={})".format(round(p_less,4)))
            else:
                print("Verdict: Significant difference (p={}). Sample 1's mean is less than sample 2's mean".format(round(p_less,4)))
         
        else:
            # one-sided test (greater) p-value: upper tail of the signed t statistic
            p_great = 1 - stats.t.cdf(t_stat, df=df_t)
            
            # verdict
            if p_great >= sig_level:
                print("Verdict: No significant difference (p={})".format(round(p_great,4)))
            else:
                print("Verdict: Significant difference (p={}). Sample 1's mean is greater than sample 2's mean".format(round(p_great,4)))
   
    else:
        # reject null that same -> use two sample t-test with unequal variance
        # t statistic
        t_stat_un = (np.mean(s1) - np.mean(s2))/np.sqrt( (var_s1/m) + (var_s2/n) )
        # df
        df_t_un = ( (var_s1/m) + (var_s2/n) )**2/( (var_s1/m)**2/(m-1) + (var_s2/n)**2/(n-1) ) 
        
        # alternative hypothesis:
        if alternative == "two_sided":
            # two-sided test p-value
            p_2side_un = 2*(1 - stats.t.cdf(abs(t_stat_un), df=df_t_un))
            
            # verdict
            if p_2side_un >= sig_level:
                print("Verdict: No significant difference in means (p={})".format(round(p_2side_un,4)))
            else:
                print("Verdict: Significant difference in means (p={})".format(round(p_2side_un,4)))
          
        elif alternative == "less":
            # one-sided test (less) p-value: lower tail of the signed t statistic
            p_less_un = stats.t.cdf(t_stat_un, df=df_t_un)
            
            # verdict
            if p_less_un >= sig_level:
                print("Verdict: No significant difference (p={})".format(round(p_less_un,4)))
            else:
                print("Verdict: Significant difference (p={}). Sample 1's mean is less than sample 2's mean".format(round(p_less_un,4)))
            
        else:
            # one-sided test (greater) p-value: upper tail of the signed t statistic
            p_great_un = 1 - stats.t.cdf(t_stat_un, df=df_t_un)
            
            # verdict
            if p_great_un >= sig_level:
                print("Verdict: No significant difference (p={})".format(round(p_great_un,4)))
            else:
                print("Verdict: Significant difference (p={}). Sample 1's mean is greater than sample 2's mean".format(round(p_great_un,4)))
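
A more compact variant of the same decision logic is sketched below for cases where the p-value is needed programmatically rather than printed. The name `twosample_t_test_result` and its return shape `(p_value, significant)` are hypothetical. Two choices here are my own assumptions rather than part of the function above: an unrecognised `alternative` raises a `ValueError` instead of being treated as `"greater"`, and the F-test p-value doubles the smaller tail so it stays valid when the F statistic is below 1.

```python
import numpy as np
from scipy import stats

def twosample_t_test_result(s1, s2, alternative="two_sided", sig_level=0.05):
    """Sketch: returns (p_value, significant) instead of printing a verdict."""
    if alternative not in ("two_sided", "less", "greater"):
        raise ValueError('alternative must be "two_sided", "less", or "greater"')

    m, n = len(s1), len(s2)
    var_s1, var_s2 = np.var(s1, ddof=1), np.var(s2, ddof=1)
    diff = np.mean(s1) - np.mean(s2)

    # F-test for equal variance, two-sided (smaller tail doubled)
    f_cdf = stats.f.cdf(var_s1 / var_s2, dfn=m - 1, dfd=n - 1)
    f_pval = 2 * min(f_cdf, 1 - f_cdf)

    if f_pval >= sig_level:
        # fail to reject equal variance -> pooled t-test
        s_pool = ((m - 1) * var_s1 + (n - 1) * var_s2) / (m + n - 2)
        t_stat = diff / np.sqrt(s_pool * (1 / m + 1 / n))
        df_t = m + n - 2
    else:
        # reject equal variance -> unequal-variance t-test
        se2 = var_s1 / m + var_s2 / n
        t_stat = diff / np.sqrt(se2)
        df_t = se2**2 / ((var_s1 / m)**2 / (m - 1) + (var_s2 / n)**2 / (n - 1))

    # stats.t.sf is the survival function, i.e. 1 - cdf
    if alternative == "two_sided":
        p_value = 2 * stats.t.sf(abs(t_stat), df=df_t)
    elif alternative == "less":
        p_value = stats.t.cdf(t_stat, df=df_t)
    else:
        p_value = stats.t.sf(t_stat, df=df_t)
    return float(p_value), bool(p_value < sig_level)

s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]
for alt in ("two_sided", "less", "greater"):
    print(alt, twosample_t_test_result(s1, s2, alternative=alt))
```

Because the t statistic and degrees of freedom are computed once per branch, the three alternatives share a single p-value step instead of being repeated for each variance case.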
                

In [4]:
s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]

twosample_t_test(s1, s2, alternative="two_sided", sig_level=0.05)
twosample_t_test(s1, s2, sig_level=0.05)
twosample_t_test(s1, s2, alternative="less", sig_level=0.05)
twosample_t_test(s1, s2, alternative="greater", sig_level=0.05)

Verdict: Significant difference in means (p=0.0462)
Verdict: Significant difference in means (p=0.0462)
Verdict: No significant difference (p=0.9769)
Verdict: Significant difference (p=0.0231). Sample 1's mean is greater than sample 2's mean
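
For comparison, SciPy 1.6.0 and later accept an `alternative` argument in `scipy.stats.ttest_ind`, so the one-sided p-values can be cross-checked directly. The sketch below assumes such a version is installed. Note that SciPy spells the default `"two-sided"` with a hyphen, and that `equal_var=False` is set by hand here because the F-test rejected equal variance for these samples.

```python
from scipy import stats

s1 = [64.2, 28.4, 85.3, 83.1, 13.4, 56.8, 44.2, 90]
s2 = [45, 29.5, 32.3, 49.3, 18.3, 34.2, 43.9, 13.8, 27.4, 43.4]

# Requires SciPy >= 1.6.0 for the `alternative` keyword
pvals = {}
for alt in ("two-sided", "less", "greater"):
    res = stats.ttest_ind(s1, s2, equal_var=False, alternative=alt)
    pvals[alt] = float(res.pvalue)
    print(alt, round(pvals[alt], 4))
```

The three p-values should match the ones in the verdicts above.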
