### A/B Testing

A/B testing is a controlled experiment, widely used in data science and other fields, that compares two or more versions of a product or model on a chosen metric. It supports data-driven decision making by statistically evaluating how a change affects user behaviour.

### Demo

The synthetic dataset describes users who buy one of two main products (Mobile or Mobile Back Cover) and are shown a recommended product (Mobile Screen Guard). For each user it records whether the recommended product was purchased together with the main product (1 if purchased, 0 if not).


In [29]:
# importing libraries
import numpy as np
import pandas as pd

In [30]:
# generating dataset for the demo
def create_data(N):
  recommended_product = "Mobile Screen Guard"
  product_names = ["Mobile Back Cover", "Mobile"]
  product_values = np.random.rand(N)   # one uniform draw in [0, 1) per user
  purchase_values = np.random.rand(N)  # one uniform draw in [0, 1) per user
  # a draw > 0.7 maps to Mobile Back Cover, otherwise to Mobile
  products = [product_names[0] if product_value > 0.7 else product_names[1] for product_value in product_values]
  # a draw > 0.5 maps to not purchased (0), otherwise to purchased (1)
  purchases = [0 if purchase_value > 0.5 else 1 for purchase_value in purchase_values]
  data = {'Recommended_Product': recommended_product, 'Products': products, 'Purchased': purchases}

  data = pd.DataFrame(data = data)
  return data

data = create_data(100)
data.head(10)

,Recommended_Product,Products,Purchased
0,Mobile Screen Guard,Mobile,1
1,Mobile Screen Guard,Mobile,1
2,Mobile Screen Guard,Mobile Back Cover,0
3,Mobile Screen Guard,Mobile Back Cover,1
4,Mobile Screen Guard,Mobile Back Cover,1
5,Mobile Screen Guard,Mobile,1
6,Mobile Screen Guard,Mobile Back Cover,0
7,Mobile Screen Guard,Mobile,0
8,Mobile Screen Guard,Mobile,0
9,Mobile Screen Guard,Mobile,1


In [31]:
# purchase counts (0 = not purchased, 1 = purchased) for each main product
product_purchased_matrix = data.groupby('Products')['Purchased'].value_counts()
product_purchased_matrix

Products,Purchased,count
Mobile,0,39
Mobile,1,29
Mobile Back Cover,0,18
Mobile Back Cover,1,14


In [32]:
# contingency table (rows: Mobile, Mobile Back Cover; columns: Purchased = 0, 1)
contingency_table = pd.crosstab(data['Products'], data['Purchased']).values
print("Contingency table: \n")
contingency_table

Contingency table: 



array([[39, 29],
       [18, 14]])
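
The raw counts are easier to compare as per-product purchase rates. The following is a minimal sketch, assuming the counts printed above for this particular run (rows: Mobile, Mobile Back Cover; columns: Purchased = 0, 1). The names `observed` and `purchase_rates` are illustrative and not part of the original notebook.

```python
# Counts copied from the contingency table above (they change on every run,
# because the data is generated randomly).
# Each list is [not purchased (0), purchased (1)].
observed = {
    "Mobile": [39, 29],
    "Mobile Back Cover": [18, 14],
}

# Purchase rate = purchased / all users who bought that main product.
purchase_rates = {
    product: counts[1] / sum(counts) for product, counts in observed.items()
}

for product, rate in purchase_rates.items():
    print(f"{product}: {rate:.3f}")
```

The two rates are close (roughly 43% and 44%), which already hints that the main product makes little difference in this sample.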

In [33]:
# Null hypothesis (H0) and alternative hypothesis (H1)
null_hypothesis = "There is no relationship between the main product bought and whether the recommended product is purchased"
alternative_hypothesis = "There is a relationship between the main product bought and whether the recommended product is purchased"

In [34]:
# calculating chi_squared and p-value
from scipy.stats import chi2_contingency
chi2_statistic, p_value, dof, expected_values = chi2_contingency(contingency_table, correction = False)
print(f"Chi-squared: {chi2_statistic}, p-value: {p_value}")

Chi-squared: 0.01079991360069116, p-value: 0.91723074600557
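
To see where this statistic comes from, the calculation can be reproduced by hand. Under independence, the expected count of each cell is (row total × column total) / N, and the statistic sums (observed − expected)² / expected over all cells. The sketch below assumes the 2×2 table from this run. It uses the standard-library `math.erfc` for the p-value instead of SciPy, which is valid only for one degree of freedom. The variable names are illustrative.

```python
import math

# Observed counts from this run: rows = Mobile, Mobile Back Cover;
# columns = Purchased 0, 1.
observed = [[39, 29], [18, 14]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if product and purchase were independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic, without continuity correction.
chi2_manual = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
dof_manual = (len(row_totals) - 1) * (len(col_totals) - 1)

# For exactly one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_manual = math.erfc(math.sqrt(chi2_manual / 2))

print(f"Expected counts: {expected}")
print(f"Chi-squared: {chi2_manual}, dof: {dof_manual}, p-value: {p_manual}")
```

The result should agree with `chi2_contingency(..., correction=False)` up to floating-point rounding.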


In [35]:
# comparing the p-value with the significance level (alpha) to evaluate H0
confidence_level = 0.95
alpha = 1 - confidence_level
print("Significance: %.3f, p: %.3f" % (alpha, p_value))
if p_value <= alpha:
  print("Dependent: sufficient evidence to reject H0")
else:
  print("Independent: insufficient evidence to reject H0")

Significance: 0.050, p: 0.917
Independent: insufficient evidence to reject H0
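
For a 2×2 table, the uncorrected chi-squared test is equivalent to a two-sided two-proportion z-test with a pooled standard error: the squared z statistic equals the chi-squared statistic. The sketch below is a cross-check using the counts from this run. It is not part of the original notebook, and all names are illustrative.

```python
import math

# Purchases of the recommended product and group sizes from this run.
purchased_mobile, total_mobile = 29, 68
purchased_cover, total_cover = 14, 32

rate_mobile = purchased_mobile / total_mobile
rate_cover = purchased_cover / total_cover

# Pooled purchase rate under H0 (same rate in both groups).
pooled_rate = (purchased_mobile + purchased_cover) / (total_mobile + total_cover)
standard_error = math.sqrt(
    pooled_rate * (1 - pooled_rate) * (1 / total_mobile + 1 / total_cover)
)

z_statistic = (rate_mobile - rate_cover) / standard_error

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z_statistic) / math.sqrt(2))

print(f"z: {z_statistic:.4f}, z squared: {z_statistic ** 2:.6f}, p-value: {p_two_sided:.3f}")
```

Framing the comparison as a difference in purchase rates also gives a direct effect size (`rate_mobile - rate_cover`), which the chi-squared statistic alone does not convey.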
