### Analyzing FarmBurg's A/B Test
We're working on an A/B test with three different groups: A, B, and C. The data has the following columns:
- `user_id`: a unique ID for each visitor to the FarmBurg site
- `group`: either 'A', 'B', or 'C', depending on which group the visitor was assigned to
- `is_purchase`: either 'Yes' if the visitor made a purchase or 'No' if they did not

In [7]:
# Import libraries
import pandas as pd
import numpy as np

from scipy.stats import chi2_contingency

# Load the dataset and print out a sample
abdata = pd.read_csv("clicks.csv")
print(abdata.head())

    user_id group is_purchase
0  8e27bf9a     A          No
1  eb89e6f0     A          No
2  7119106a     A          No
3  e53781ff     A          No
4  02d48cf1     A         Yes


We have two categorical variables: `group` and `is_purchase`. We are interested in whether visitors are more likely to make a purchase if they are in any one group compared to the others. Because we want to know if there is an association between two categorical variables, we’ll start by using a Chi-Square test to address our question.
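To make the idea behind the test concrete, here is a minimal sketch of how the expected counts are derived under the null hypothesis of no association: each cell's expected count is its row total times its column total, divided by the grand total. The counts below are made-up toy numbers, not the FarmBurg data, and the variable names are illustrative only.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical toy counts (NOT the FarmBurg data):
# rows are groups, columns are "No" / "Yes" purchases
observed = np.array([[30, 10],
                     [20, 20]])

row_totals = observed.sum(axis=1)   # visitors per group
col_totals = observed.sum(axis=0)   # total "No" and total "Yes"
grand_total = observed.sum()        # all visitors

# Expected count per cell if group and purchase were independent
expected_by_hand = np.outer(row_totals, col_totals) / grand_total

# chi2_contingency returns the same expected table as its fourth value
_, _, _, expected_scipy = chi2_contingency(observed)

print(expected_by_hand)
print(np.allclose(expected_by_hand, expected_scipy))
```

The test statistic then measures how far the observed counts fall from these expected counts; the larger the gap, the smaller the p-value.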

In [8]:
# Create a contingency table
Xtab = pd.crosstab(abdata.group, abdata.is_purchase)
# Print the result
print(Xtab)

# Run Chi-Square test and print the result
chi2, pval, dof, expected = chi2_contingency(Xtab)
print(pval)

is_purchase    No  Yes
group                 
A            1350  316
B            1483  183
C            1583   83
2.4126213546684264e-35


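The p-value is about 2.41 × 10^-35, far below the 0.05 significance threshold, so we reject the null hypothesis that purchasing is independent of group. In other words, there is a statistically significant difference in purchase rates among groups A, B, and C. The test alone does not tell us which groups differ from one another.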
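To see the direction of the difference, it can help to look at the purchase rate in each group. The sketch below rebuilds the contingency table from the counts printed above, so it runs without `clicks.csv`; the names `counts` and `purchase_rate` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Counts copied from the contingency table printed above
counts = pd.DataFrame(
    {"No": [1350, 1483, 1583], "Yes": [316, 183, 83]},
    index=pd.Index(["A", "B", "C"], name="group"),
)

# Purchase rate per group = purchases / visitors in that group
purchase_rate = counts["Yes"] / counts.sum(axis=1)
print(purchase_rate.round(3))
```

Each group has 1,666 visitors, and the purchase rates come out to roughly 19% for A, 11% for B, and 5% for C.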