## AB Testing

An A/B test is run on a set of items, each assigned to a control or treatment group. We'll look at whether the assignment affects orders of those items.

- Find orders placed within 30 days after the test started (the window used in the query below)

In [274]:
import pandas as pd
from pandasql import sqldf 
df = pd.read_csv('./data/abtesting.csv')
df.head()

Unnamed: 0,item_id,test_assignment,test_number,test_start_date
0,2512.0,1.0,item_test_1,2013-01-05 00:00:00
1,482.0,0.0,item_test_1,2013-01-05 00:00:00
2,2446.0,0.0,item_test_1,2013-01-05 00:00:00
3,1312.0,0.0,item_test_1,2013-01-05 00:00:00
4,3556.0,1.0,item_test_1,2013-01-05 00:00:00


In [52]:
df.describe(include='all')

Unnamed: 0,item_id,test_assignment,test_number,test_start_date
count,6594.0,6594.0,6594,6594
unique,,,3,3
top,,,item_test_3,2016-01-07 00:00:00
freq,,,2198,2198
mean,1991.293904,0.496967,,
std,1163.787869,0.500029,,
min,0.0,0.0,,
25%,981.0,0.0,,
50%,2008.0,0.0,,
75%,2995.0,1.0,,


In [10]:
orders = pd.read_csv('./data/ab_orders.csv')
orders.head()

Unnamed: 0,invoice_id,line_item_id,user_id,item_id,item_name,item_category,price,created_at,paid_at
0,192320.0,83118.0,178481.0,3526.0,digital apparatus,apparatus,330.0,2017-06-28 21:14:25,2017-06-27 21:19:39
1,192320.0,207309.0,178481.0,1514.0,miniature apparatus cleaner,apparatus,99.0,2017-06-28 21:14:25,2017-06-27 21:19:39
2,192320.0,392027.0,178481.0,3712.0,miniature apparatus cleaner,apparatus,99.0,2017-06-28 21:14:25,2017-06-27 21:19:39
3,80902.0,243831.0,154133.0,3586.0,reflective instrument,instrument,57.2,2016-10-09 06:57:30,2016-10-07 10:08:10
4,80902.0,399806.0,154133.0,1061.0,extra-strength instrument charger,instrument,17.6,2016-10-09 06:57:30,2016-10-07 10:08:10


In [149]:
d1 = sqldf("""
    SELECT
        orders.item_id, test.test_start_date, orders.created_at, test.test_assignment,
        -- whole days between test start and order (integer division of seconds)
        (strftime('%s', orders.created_at) - strftime('%s', test.test_start_date)) / (3600 * 24) AS days
    FROM df AS test
    -- LEFT JOIN keeps test items without orders; orders.item_id is NULL for those rows
    LEFT JOIN orders AS orders
        ON orders.item_id = test.item_id
    WHERE test.test_number = 'item_test_3'
    ORDER BY orders.item_id
""")
d1

Unnamed: 0,item_id,test_start_date,created_at,test_assignment,days
0,0.0,2016-01-07 00:00:00,2014-03-21 06:08:48,1.0,-656
1,0.0,2016-01-07 00:00:00,2014-04-08 09:43:38,1.0,-638
2,0.0,2016-01-07 00:00:00,2015-01-21 15:51:14,1.0,-350
3,0.0,2016-01-07 00:00:00,2015-03-04 18:06:07,1.0,-308
4,0.0,2016-01-07 00:00:00,2015-04-08 00:42:57,1.0,-273
...,...,...,...,...,...
47397,3997.0,2016-01-07 00:00:00,2018-01-29 10:07:14,1.0,753
47398,3997.0,2016-01-07 00:00:00,2018-03-31 23:12:01,1.0,814
47399,3997.0,2016-01-07 00:00:00,2018-05-02 06:47:29,1.0,846
47400,3997.0,2016-01-07 00:00:00,2018-05-20 08:30:34,1.0,864


In [150]:
d2 = sqldf("""
    SELECT
        item_id, test_assignment,
        -- 1 if the item has at least one order 1 to 30 days after the test start
        MAX(CASE WHEN days > 0 AND days <= 30 THEN 1 ELSE 0 END) AS order_test
    FROM d1
    GROUP BY item_id, test_assignment
""")
d2

Unnamed: 0,item_id,test_assignment,order_test
0,0.0,1.0,0
1,1.0,0.0,0
2,2.0,0.0,1
3,3.0,1.0,1
4,4.0,1.0,0
...,...,...,...
2193,3992.0,1.0,1
2194,3993.0,0.0,0
2195,3994.0,1.0,0
2196,3996.0,0.0,0
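
The two SQL steps above (join, day difference, 30-day flag per item) can also be sketched directly in pandas. This is only an illustration on made-up toy tables: `tests`, `toy_orders`, `joined` and `order_flags` are names I've introduced, not part of the original analysis. Note that `.dt.days` floors negative differences while the SQL integer division truncates them, so the `days` values can differ by one for orders placed before the test; the window flag is unaffected.

```python
import pandas as pd

# Toy stand-ins for the test-assignment and orders tables (values are made up).
tests = pd.DataFrame({
    "item_id": [1, 2, 3],
    "test_assignment": [0, 1, 1],
    "test_number": ["item_test_3"] * 3,
    "test_start_date": pd.to_datetime(["2016-01-07"] * 3),
})
toy_orders = pd.DataFrame({
    "item_id": [1, 1, 2],
    "created_at": pd.to_datetime(
        ["2015-12-01 10:00:00", "2016-01-20 09:30:00", "2016-03-15 12:00:00"]
    ),
})

# A left join keeps items that were never ordered (item 3 here).
joined = tests[tests.test_number == "item_test_3"].merge(
    toy_orders, on="item_id", how="left"
)

# Whole days between test start and order (NaN when there is no order).
joined["days"] = (joined.created_at - joined.test_start_date).dt.days

# 1 if the order falls 1 to 30 days after the test start, else 0.
joined["in_window"] = joined.days.between(1, 30).astype(int)

# One row per item: did it get at least one order inside the window?
order_flags = (
    joined.groupby(["item_id", "test_assignment"], as_index=False)["in_window"]
    .max()
    .rename(columns={"in_window": "order_test"})
)
print(order_flags)
```

Using the test table's own `item_id` as the grouping key means items without any orders still appear, with `order_test` equal to 0.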


In [190]:
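from scipy.stats import chi2_contingency, chi2
acceptance_criteria = 0.05  # significance level (alpha)

# Rows: test_assignment (0 = control, 1 = treatment); columns: order_test (0, 1)
observed_values = pd.crosstab(d2.test_assignment, d2.order_test).values

# Sanity check: when makesame == 1, overwrite the treatment row with twice the
# control row so both groups have the same success rate (expect a p-value of 1).
# Set makesame = 0 to test the real counts.
makesame = 1
if makesame == 1:
    observed_values[1] = observed_values[0] * 2

ov = pd.DataFrame(observed_values)
ov['tots'] = ov.iloc[:, 0] + ov.iloc[:, 1]                # orders + no orders
ov['success_rate'] = 100 * ov.iloc[:, 1] / ov.iloc[:, 2]  # percent of items ordered
ov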

Unnamed: 0,0,1,tots,success_rate
0,715,360,1075,33.488372
1,1430,720,2150,33.488372


In [193]:
#hide
chi2_statistic, p_value, dof, expected_values = chi2_contingency(observed_values, correction = False)

print(f"The chi2_statistic is {chi2_statistic:.2f}, and the p value is  {p_value:.2f}")

# find the critical value for our test
critical_value = chi2.ppf(1 - acceptance_criteria, dof)
print(critical_value)

The chi2_statistic is 0.00, and the p value is  1.00
3.841458820694124


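- count = the number of successes observed in each sample
- nobs  = the number of trials (observations) in each sample
- value = the value under the null hypothesis; with two samples, the hypothesized difference between the proportions (0 if omitted)
- alternative = the type of test: `'two-sided'`, `'larger'` or `'smaller'`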
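
To show what these arguments feed into, here is a minimal hand-rolled sketch of a two-sided, two-proportion z-test with a pooled standard error. The function name `two_proportion_ztest` is my own, and the second set of counts is made up purely to show a non-zero statistic; only the first call uses the counts from the `ov` table above.

```python
from math import erfc, sqrt

def two_proportion_ztest(successes_1, n_1, successes_2, n_2):
    """Two-sided z-test for equal proportions, using the pooled estimate."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Under the null hypothesis both groups share one proportion.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / std_err
    # Two-sided tail area of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Counts from the `ov` table: both groups have the same success rate.
z_same, p_same = two_proportion_ztest(360, 1075, 720, 2150)
print(f"identical rates: z = {z_same:.2f}, p-value = {p_same:.2f}")

# Hypothetical counts with a gap between the groups (not from the data).
z_diff, p_diff = two_proportion_ztest(360, 1075, 300, 1075)
print(f"different rates: z = {z_diff:.2f}, p-value = {p_diff:.4f}")
```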


In [192]:
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

z_stat, pval= proportions_ztest(count=ov.iloc[:,1],nobs=ov['tots'].values, alternative='two-sided')

print(f"z test stat = {z_stat:.2f} and p-value = {pval:.2f}")

z test stat = 0.00 and p-value = 1.00


In [168]:
(lower_con, lower_treat), (upper_con, upper_treat) = proportion_confint(ov.iloc[:,1],nobs=ov['tots'].values,alpha=0.05)

print(f'conf ind 95% for control group: [{lower_con:.3f}, {upper_con:.3f}]')
print(f'conf ind 95% for treatment group: [{lower_treat:.3f}, {upper_treat:.3f}]')

conf ind 95% for control group: [0.307, 0.363]
conf ind 95% for treatment group: [0.279, 0.333]


- When your p-value is less than or equal to your significance level, you reject the null hypothesis: the data favor the alternative hypothesis, and the result is statistically significant.
- When your p-value is greater than your significance level, you fail to reject the null hypothesis: the result is not statistically significant.

Here the p-value is 1.00, which is above the 0.05 significance level, so we fail to reject the null hypothesis (that any difference between the groups is due to chance alone). This is expected, because `makesame = 1` forced both groups to have the same success rate.
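
To make the decision rule concrete, here is a small sketch that computes the Pearson chi-square statistic by hand for the 2x2 table shown earlier and compares it with the critical value printed above. The helper name `chi_square_statistic` is my own, and the critical value is copied from the earlier output rather than recomputed.

```python
import numpy as np

def chi_square_statistic(observed):
    """Pearson chi-square statistic (no continuity correction) and expected counts."""
    observed = np.asarray(observed, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    # Expected count in each cell if rows and columns were independent.
    expected = row_totals @ col_totals / observed.sum()
    statistic = ((observed - expected) ** 2 / expected).sum()
    return statistic, expected

# Observed counts from the `ov` table (rows: control, treatment).
statistic, expected = chi_square_statistic([[715, 360], [1430, 720]])

# Critical value for 1 degree of freedom at the 0.05 level, from the output above.
critical_value = 3.841458820694124
reject_null = statistic > critical_value

print(expected)
print(f"statistic = {statistic:.2f}, reject null: {reject_null}")
```

Because the second row is exactly twice the first, the expected counts equal the observed counts and the statistic is zero.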

## Laws of Probability

In the next few sections, we'll derive three relationships between conjunction and conditional probability:

* Theorem 1: Using a conjunction to compute a conditional probability.

* Theorem 2: Using a conditional probability to compute a conjunction.

* Theorem 3: Using `conditional(A, B)` to compute `conditional(B, A)`.

Theorem 3 is also known as Bayes's Theorem.

I'll write these theorems using mathematical notation for probability:

* $P(A)$ is the probability of proposition $A$.

* $P(A~\mathrm{and}~B)$ is the probability of the conjunction of $A$ and $B$, that is, the probability that both are true.

* $P(A | B)$ is the conditional probability of $A$ given that $B$ is true.  The vertical line between $A$ and $B$ is pronounced "given". 

With that, we are ready for Theorem 1.
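
Before loading the real data, here is a tiny sketch of how this notation maps onto Boolean Series in pandas. The six "people" are made up for illustration, and the variable names are my own.

```python
import pandas as pd

# Toy data: six hypothetical people, not drawn from any survey.
people = pd.DataFrame({
    "female": [True, True, True, False, False, False],
    "banker": [True, True, False, True, False, False],
})
A = people.female
B = people.banker

p_a = A.mean()              # P(A): fraction of rows where A is true
p_a_and_b = (A & B).mean()  # P(A and B): fraction where both are true
p_a_given_b = A[B].mean()   # P(A | B): fraction of A among rows where B is true

print(f"P(A) = {p_a:.3f}")
print(f"P(A and B) = {p_a_and_b:.3f}")
print(f"P(A | B) = {p_a_given_b:.3f}")
```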

In [194]:
#hide
# Load the data file

from os.path import basename, exists

def download(url):
    filename = basename(url)
    if not exists(filename):
        from urllib.request import urlretrieve
        local, _ = urlretrieve(url, filename)
        print('Downloaded ' + local)
    
download('https://github.com/AllenDowney/BiteSizeBayes/raw/master/gss_bayes.csv')
import pandas as pd

gss = pd.read_csv('gss_bayes.csv', index_col=0)


Downloaded gss_bayes.csv


In [201]:
gss.describe()

Unnamed: 0,year,age,sex,polviews,partyid,indus10
count,49290.0,49290.0,49290.0,49290.0,49290.0,49290.0
mean,1995.36405,46.143132,1.537858,4.105052,2.753905,5993.666504
std,12.336592,17.11142,0.49857,1.37716,2.048108,2796.295069
min,1974.0,18.0,1.0,1.0,0.0,170.0
25%,1985.0,32.0,1.0,3.0,1.0,3890.0
50%,1996.0,44.0,2.0,4.0,3.0,6990.0
75%,2006.0,59.0,2.0,5.0,5.0,8190.0
max,2016.0,89.0,2.0,7.0,7.0,9870.0


In [251]:
male = gss.sex==1
banker = gss.indus10==6870
year1990 = gss.year<1990   # True for respondents surveyed before 1990

# The mean of a Boolean Series is the fraction of True values
print(f"Fraction male = {male.mean():.3f}")
print(f"Fraction banker = {banker.mean():.3f}")
print(f"Fraction year1990 = {year1990.mean():.3f}")
print()
print(f"first few rows of male = {male.head()}")

Fraction male = 0.462
Fraction banker = 0.015
Fraction year1990 = 0.362

first few rows of male = caseid
1     True
2     True
5    False
6     True
7     True
Name: sex, dtype: bool


In [255]:
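def prob(x):
    """Fraction of True values in a Boolean Series."""
    return x.mean()

print(f'Conjunction: Probability person is male & year1990 = {prob(male & year1990):.2f}')
print()

def prob_cond(x, given):
    """Probability of x among the rows where `given` is True."""
    return prob(x[given])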
print(f'Conditional: Given person is a banker what is the probability they are female?\
      \n{prob_cond(~male,banker):.2f}')

print(f'Conditional: Given person is female what is the probability they are a banker?\
      \n{prob_cond(banker, ~male):.2f}')
print()
print('N.B. x[given] is a list of x where given==True, i.e. has the length of given.sum()')


Conjunction: Probability person is male & year1990 = 0.17

Conditional: Given person is a banker what is the probability they are female?      
0.77
Conditional: Given person is female what is the probability they are a banker?      
0.02

N.B. x[given] is a list of x where given==True, i.e. has the length of given.sum()


### Theorem 1


$$P(A|B) = \frac{P(A~\mathrm{and}~B)}{P(B)}$$

The conditional probability of $A$ given that $B$ is true, $P(A|B)$, equals the probability that both $A$ and $B$ occur divided by the probability that $B$ occurs.

In [259]:
print('A = female, B= banker')
print(f'P(B) = {prob(banker):.3f}')
print(f'P(A & B) = {prob( ~male & banker):.3f}')
print(f'P(A | B) = {prob_cond(~male, banker):.3f}')
print(f'P(A & B)/P(B) = {prob( ~male & banker)/prob(banker):.3f}')



A = female, B= banker
P(B) = 0.015
P(A & B) = 0.011
P(A | B) = 0.771
P(A & B)/P(B) = 0.771


### Theorem 2

If we start with Theorem 1 and multiply both sides by $P(B)$, we get Theorem 2.

$$P(A~\mathrm{and}~B) = P(B) ~ P(A|B)$$

This formula suggests a second way to compute a conjunction: instead of using the `&` operator, we can compute the product of two probabilities.


In [262]:
print(f'P(B)*P(A|B) = {prob(banker)*prob_cond(~male,banker):.3f}')
print(f'P(A & B) = {prob( ~male & banker):.3f}')

P(B)*P(A|B) = 0.011
P(A & B) = 0.011


### Theorem 3

Conjunction is commutative: the order of the two propositions does not matter.  In math notation, that means:

$$P(A~\mathrm{and}~B) = P(B~\mathrm{and}~A)$$

If we apply Theorem 2 to both sides, we have

$$P(B) P(A|B) = P(A) P(B|A)$$

Here's one way to interpret that: if you want to check $A$ and $B$, you can do it in either order:

1. You can check $B$ first, then $A$ conditioned on $B$, or

2. You can check $A$ first, then $B$ conditioned on $A$.

If we divide through by $P(B)$, we get Theorem 3:

$$P(A|B) = \frac{P(A) P(B|A)}{P(B)}$$

And that, my friends, is Bayes's Theorem.



In [264]:
print(f'P(A)*P(B|A)/P(B) = {prob(~male)*prob_cond(banker, ~male)/prob(banker):.3f}')
print(f'P(A | B) = {prob_cond( ~male , banker):.3f}')


P(A)*P(B|A)/P(B) = 0.771
P(A | B) = 0.771


### The Law of Total Probability

In addition to these three theorems, there's one more thing we'll need to do Bayesian statistics: the law of total probability.
Here's one form of the law, expressed in mathematical notation:

$$P(A) = P(B_1~\mathrm{and}~A) + P(B_2~\mathrm{and}~A)$$

In words, the total probability of $A$ is the sum of two possibilities: either $B_1$ and $A$ are true or $B_2$ and $A$ are true.
But this law applies only if $B_1$ and $B_2$ are:

* Mutually exclusive, which means that only one of them can be true, and

* Collectively exhaustive, which means that one of them must be true.
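
The same idea extends to any number of mutually exclusive, collectively exhaustive propositions, and Theorem 2 lets us write each term as $P(B_i) P(A|B_i)$. Here is a minimal sketch with three hypothetical groups; the priors and conditional probabilities are made up, and exact fractions are used so the arithmetic is easy to check.

```python
from fractions import Fraction

# Hypothetical partition into three groups with made-up probabilities.
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 3), "B3": Fraction(1, 6)}       # P(B_i)
likelihood = {"B1": Fraction(1, 10), "B2": Fraction(3, 10), "B3": Fraction(6, 10)}  # P(A | B_i)

# Collectively exhaustive and mutually exclusive: the priors must sum to 1.
assert sum(prior.values()) == 1

# Theorem 2 for each group: P(B_i and A) = P(B_i) * P(A | B_i)
joint = {name: prior[name] * likelihood[name] for name in prior}

# Law of total probability: add up the conjunctions.
p_a = sum(joint.values())
print(f"P(A) = {p_a} = {float(p_a):.3f}")
```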



In [265]:
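# Here banker (B) plays the role of A in the formula above, and
# female (A1) and male (A2) are the two mutually exclusive, exhaustive cases.
print(f'P(B) = {prob(banker):.3f}')
print(f'P(A1 & B) + P(A2 & B) = {prob(~male & banker) + prob(male & banker):.3f}')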


P(B) = 0.015
P(A1 & B) + P(A2 & B) = 0.015


### The Apple Problem

We'll start with a thinly disguised version of an [urn problem](https://en.wikipedia.org/wiki/Urn_problem):

> Suppose there are two bowls of apples.
>
> * Bowl 1 contains 30 red apples and 10 green apples. 
>
> * Bowl 2 contains 20 red apples and 20 green apples.
>
> Now suppose you choose one of the bowls at random and, without looking, choose an apple at random. If the apple is red, what is the probability that it came from Bowl 1?

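Let $A$ be the proposition that the apple is red and $B$ the proposition that it came from Bowl 1. We want $P(B|A)$, which is Bayes's Theorem with the roles of $A$ and $B$ swapped:

$$P(B|A) = \frac{P(B) P(A|B)}{P(A)}$$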

In [286]:
print('P(A) = probability apple is red, P(B) = probability bowl is bowl 1')

print('We want prob bowl 1 given apple is red or P(B|A)')
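# P(A): pooling the apples works here only because both bowls hold
# 40 apples and are equally likely to be chosen
PA = (30 + 20) / (30 + 20 + 10 + 20)
print(f'P(A)={PA:.3f}')
PB = 1 / 2
print(f'P(B)={PB:.3f}')
PAgivenB = 30 / (30 + 10)
print(f'P(A|B)={PAgivenB:.3f}')
# Bayes's Theorem: P(B|A) = P(B) P(A|B) / P(A)
print(f'P(B|A)={PB*PAgivenB/PA:.3f}')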

P(A) = probability apple is red, P(B) = probability bowl is bowl 1
We want prob bowl 1 given apple is red or P(B|A)
P(A)=0.625
P(B)=0.500
P(A|B)=0.750
P(B|A)=0.600
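
The same answer can be reached without pooling the apples, by computing $P(A)$ with the law of total probability. The sketch below is one way to lay that out; the dictionary names are my own, and exact fractions are used so the result can be checked by hand.

```python
from fractions import Fraction

# Apple counts from the problem statement.
bowls = {
    "Bowl 1": {"red": 30, "green": 10},
    "Bowl 2": {"red": 20, "green": 20},
}

# Each bowl is equally likely to be chosen.
prior = {name: Fraction(1, len(bowls)) for name in bowls}

# P(red | bowl) for each bowl.
likelihood = {
    name: Fraction(counts["red"], sum(counts.values()))
    for name, counts in bowls.items()
}

# Law of total probability: P(red) = sum of P(bowl) * P(red | bowl).
p_red = sum(prior[name] * likelihood[name] for name in bowls)

# Bayes's Theorem: P(bowl | red) = P(bowl) * P(red | bowl) / P(red).
posterior = {name: prior[name] * likelihood[name] / p_red for name in bowls}

print(f"P(red) = {p_red}")
for name, value in posterior.items():
    print(f"P({name} | red) = {value}")
```

This version still works if the bowls hold different numbers of apples or are not equally likely, which the pooled calculation of $P(A)$ does not.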
