# Analyzing Farmburg's A/B Test

In [1]:
import pandas as pd
import numpy as np

In [2]:
# Read in the `clicks.csv` file as `abdata`
abdata = pd.read_csv('clicks.csv')

# Inspect the dataframe
print(abdata.head())

    user_id group is_purchase
0  8e27bf9a     A          No
1  eb89e6f0     A          No
2  7119106a     A          No
3  e53781ff     A          No
4  02d48cf1     A         Yes


We are interested in whether visitors are more likely to make a purchase if they are in any one group compared to the others.

In [3]:
# Create a contingency table with pd.crosstab
Xtab = pd.crosstab(abdata.group, abdata.is_purchase)
print(Xtab)

is_purchase    No  Yes
group                 
A            1350  316
B            1483  183
C            1583   83


In [4]:
# Import the chi2_contingency function and calculate the p-value
from scipy.stats import chi2_contingency
chi2, pval, dof, expected = chi2_contingency(Xtab)
print(pval)

2.4126213546684264e-35


**The p-value is far below 0.05, so we can conclude that there is a significant difference in the purchase rate among groups A, B, and C.**
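
The chi-square test says the groups differ but not in which direction. As a rough sketch, the counts from the contingency table above can be turned into per-group purchase rates and compared with the counts expected if all groups had the same rate. The table is rebuilt by hand here (an assumption made so the snippet runs without `clicks.csv`).

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Counts copied from the contingency table printed above
table = pd.DataFrame(
    {"No": [1350, 1483, 1583], "Yes": [316, 183, 83]},
    index=["A", "B", "C"],
)

# Purchase rate per group: purchases divided by visitors in that group
purchase_rate = table["Yes"] / table.sum(axis=1)
print(purchase_rate.round(3))

# Expected counts under the null hypothesis of equal purchase rates
chi2, pval, dof, expected = chi2_contingency(table)
print(dof)
print(pd.DataFrame(expected, index=table.index, columns=table.columns).round(1))
```

Group A's observed purchases sit well above the expected count and Group C's well below, which is what drives the very small p-value.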

Now we want to know whether each price point would bring in enough revenue to meet a target.  
It’s true that more people purchased the upgrade at $0.99, but we need to generate a minimum of $1000 in revenue per week in order to justify this project.  
To check this, we calculate the purchase rate needed at each price point to reach that revenue.

In [12]:
# Calculate and print the number of visits
num_visits = len(abdata)
print(num_visits)

# Calculate the purchase rate needed at $0.99
num_sales_needed_099 = 1000/0.99
p_sales_needed_099 = num_sales_needed_099/num_visits
print('Purchase rate needed at $0.99: {psales}'.format(psales = p_sales_needed_099))

# Calculate the purchase rate needed at $1.99
num_sales_needed_199 = 1000/1.99
p_sales_needed_199 = num_sales_needed_199/num_visits
print('Purchase rate needed at $1.99: {psales}'.format(psales = p_sales_needed_199))

# Calculate the purchase rate needed at $4.99
num_sales_needed_499 = 1000/4.99
p_sales_needed_499 = num_sales_needed_499/num_visits
print('Purchase rate needed at $4.99: {psales}'.format(psales = p_sales_needed_499))

4998
Purchase rate needed at $0.99: 0.20210104243717691
Purchase rate needed at $1.99: 0.10054272965467594
Purchase rate needed at $4.99: 0.040096198800161346
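
The three calculations above follow one pattern: divide the revenue target by the price to get the sales needed, then divide by the number of visits. A minimal sketch of that pattern as a reusable function follows; the name `required_purchase_rate` is hypothetical and not part of the original notebook, and the visit count is hard-coded from the output above.

```python
def required_purchase_rate(weekly_target, price, weekly_visits):
    """Fraction of visitors who must buy at `price` to reach `weekly_target`."""
    sales_needed = weekly_target / price
    return sales_needed / weekly_visits

# $1000 weekly target and 4998 visits, as in the cell above
for price in (0.99, 1.99, 4.99):
    rate = required_purchase_rate(1000, price, 4998)
    print(f"${price:.2f}: {rate:.4f}")
```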


In [16]:
# Calculate samp size & sales for 0.99 price point
samp_size_099 = np.sum(abdata.group == 'A')
sales_099 = np.sum((abdata.group == 'A') & (abdata.is_purchase == 'Yes'))
print(samp_size_099)
print(sales_099)

# Calculate samp size & sales for 1.99 price point
samp_size_199 = np.sum(abdata.group == 'B')
sales_199 = np.sum((abdata.group == 'B') & (abdata.is_purchase == 'Yes'))
print(samp_size_199)
print(sales_199)

# Calculate samp size & sales for 4.99 price point
samp_size_499 = np.sum(abdata.group == 'C')
sales_499 = np.sum((abdata.group == 'C') & (abdata.is_purchase == 'Yes'))
print(samp_size_499)
print(sales_499)

1666
316
1666
183
1666
83
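
Before running a formal test, it can help to put the observed purchase rate for each group next to the rate needed at its price point. The sketch below hard-codes the counts printed above and the $1000 weekly target and 4998 visits from the earlier cells. It compares point estimates only and says nothing about statistical significance.

```python
weekly_target = 1000   # dollars per week, from the text
weekly_visits = 4998   # number of visits printed earlier

# Visitors and purchases per group, copied from the output above
groups = {
    "A": {"price": 0.99, "visitors": 1666, "sales": 316},
    "B": {"price": 1.99, "visitors": 1666, "sales": 183},
    "C": {"price": 4.99, "visitors": 1666, "sales": 83},
}

above_target = {}
for name, g in groups.items():
    observed = g["sales"] / g["visitors"]
    needed = weekly_target / g["price"] / weekly_visits
    above_target[name] = observed > needed
    print(f"Group {name} (${g['price']}): observed {observed:.4f}, needed {needed:.4f}")

print(above_target)
```

Groups B and C come out above their required rates and Group A below it; the binomial tests then check whether those margins are larger than sampling noise would explain.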


For each group, perform a binomial test using binomtest() to see if the observed purchase rate is significantly greater than the rate needed at that price point (for example, p_sales_needed_099 for Group A at $0.99).

In [19]:
# Import the binomtest function
# (binomtest replaces binom_test, which was removed in SciPy 1.12)
from scipy.stats import binomtest

# Calculate the p-value for Group A
pvalueA = binomtest(sales_099, n=samp_size_099, p=p_sales_needed_099, alternative='greater').pvalue
print(pvalueA)

# Calculate the p-value for Group B
pvalueB = binomtest(sales_199, n=samp_size_199, p=p_sales_needed_199, alternative='greater').pvalue
print(pvalueB)

# Calculate the p-value for Group C
pvalueC = binomtest(sales_499, n=samp_size_499, p=p_sales_needed_499, alternative='greater').pvalue
print(pvalueC)

0.9028081076188554
0.11184562623740614
0.02794482665983064


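**Alternative hypothesis: the observed purchase rate is greater than the purchase rate needed to reach the minimum revenue target.**

Only Group C has a p-value below 0.05 (about 0.028), so $4.99 is the only price point where the observed purchase rate is significantly greater than the rate needed. The chosen price point is $4.99.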
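
As a cross-check on this decision, the same one-sided test can be paired with a one-sided confidence bound on each group's purchase rate. This is a sketch under stated assumptions: a 0.05 significance threshold, counts hard-coded from the earlier outputs, and SciPy 1.7 or later for `binomtest` and its `proportion_ci` method.

```python
from scipy.stats import binomtest

alpha = 0.05                      # assumed significance threshold
weekly_target, weekly_visits = 1000, 4998

# price -> (purchases, visitors), copied from the earlier outputs
observed = {0.99: (316, 1666), 1.99: (183, 1666), 4.99: (83, 1666)}

results = {}
for price, (sales, visitors) in observed.items():
    needed = weekly_target / price / weekly_visits
    res = binomtest(sales, n=visitors, p=needed, alternative="greater")
    # One-sided exact (Clopper-Pearson) lower bound on the true purchase rate
    lower = res.proportion_ci(confidence_level=1 - alpha).low
    results[price] = (res.pvalue, lower, needed)
    print(f"${price}: p-value {res.pvalue:.4f}, lower bound {lower:.4f}, needed {needed:.4f}")
```

A price point clears the bar when the lower bound exceeds the needed rate, which for the exact interval coincides with a p-value below `alpha`.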