Blog Title A/B Test
Metrics:
- Pageviews: how many users viewed the article
- Bounce rate: percentage of users who left the page without interacting

I am testing two versions of the title:
- Version A: "10 Tips for Better Sleep"
- Version B: "Improve your Sleep with these 10 simple habits"

I also want to analyse:
- which title got more pageviews
- which title had a lower bounce rate

In [1]:
# import libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import ttest_ind, chi2_contingency

In [2]:
# Simulate data: randomly assign each of n users to title version A or B
np.random.seed(42)
n = 1000
data = pd.DataFrame({
    'user_id': range(1, n + 1),
    'title_version': np.random.choice(['A', 'B'], size=n)
})

In [3]:
data['pageview'] = 1  # every simulated user counts as exactly one pageview

In [4]:
# Simulate bounces (1 = bounced, 0 = stayed):
# version A bounces with probability 0.55, version B with probability 0.45
def simulate_bounce(row):
    if row['title_version'] == 'A':
        return np.random.rand() < 0.55
    else:
        return np.random.rand() < 0.45

data['bounced'] = data.apply(simulate_bounce, axis=1).astype(int)

In [5]:
# Simulate time on page: bounced users ~ N(20, 5), users who stayed ~ N(80, 15)
data['time_on_page'] = data['bounced'].apply(
    lambda x: np.random.normal(20, 5) if x else np.random.normal(80, 15)
)


In [6]:
# Analyze Data
summary = data.groupby('title_version').agg({
    'user_id': 'count',
    'bounced': ['mean', 'sum'],
    'time_on_page': 'mean'
})
summary.columns = ['Users', 'Bounce Rate', 'Total Bounces', 'Avg Time on Page']
print("Summary:\n", summary)

Summary:
                Users  Bounce Rate  Total Bounces  Avg Time on Page
title_version                                                     
A                490     0.548980            269         46.804553
B                510     0.439216            224         54.381959
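
To read the summary as an effect size rather than two separate rates, the gap between the versions can be computed directly from the table. This is a small sketch that hard-codes the printed values; the variable names are illustrative and not part of the notebook.

```python
# Values copied from the summary table above (variable names are illustrative).
bounce_a, bounce_b = 0.548980, 0.439216
time_a, time_b = 46.804553, 54.381959

# Absolute difference in bounce rate, in percentage points.
bounce_diff_pp = (bounce_a - bounce_b) * 100

# Relative reduction in bounce rate when moving from A to B.
bounce_rel_drop = (bounce_a - bounce_b) / bounce_a

# Difference in mean time on page (same units as time_on_page).
time_diff = time_b - time_a

print(f"Bounce rate: B is {bounce_diff_pp:.1f} points lower than A "
      f"({bounce_rel_drop:.1%} relative reduction)")
print(f"Avg time on page: B is {time_diff:.1f} higher than A")
```

On these numbers, version B bounces roughly 11 percentage points less than version A, which is about a 20% relative reduction.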


In [7]:
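# Chi-square test of independence for bounce rate
# (chi2_contingency applies Yates' continuity correction to 2x2 tables by default)
contingency = pd.crosstab(data['title_version'], data['bounced'])
chi2, p_chi, _, _ = chi2_contingency(contingency)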
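
To make the chi-square step less of a black box, the sketch below recomputes the statistic by hand for a 2x2 table, using only the standard library. It assumes the counts from the summary table (269 of 490 bounced for A, 224 of 510 for B) and applies Yates' continuity correction, which `chi2_contingency` uses by default when there is one degree of freedom. The variable names are illustrative.

```python
import math

# Observed counts derived from the summary table: rows are versions A and B,
# columns are stayed (bounced=0) and bounced (bounced=1).
observed = [[490 - 269, 269],
            [510 - 224, 224]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)

chi2_yates = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count under independence of version and bouncing.
        expected = row_totals[i] * col_totals[j] / n_total
        # Yates' continuity correction: shrink |observed - expected| by 0.5,
        # but never by more than the gap itself.
        gap = abs(obs - expected)
        adjusted = gap - min(0.5, gap)
        chi2_yates += adjusted ** 2 / expected

# With 1 degree of freedom the chi-square tail probability equals
# erfc(sqrt(x / 2)), so no SciPy call is needed for the p-value.
p_value = math.erfc(math.sqrt(chi2_yates / 2))

print(f"Chi2 = {chi2_yates:.4f}, p = {p_value:.4f}")
```

Because the counts are the same, this should agree with the chi-square result printed by the notebook. Passing `correction=False` to `chi2_contingency` would drop the 0.5 adjustment and give a slightly larger statistic.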

In [8]:
# Two-sample t-test for time on page (ttest_ind assumes equal variances by default)
group_A_time = data[data['title_version'] == "A"]['time_on_page']
group_B_time = data[data['title_version'] == "B"]['time_on_page']
t_stat, p_ttest = ttest_ind(group_A_time, group_B_time)
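
`ttest_ind` pools the two group variances by default (Student's t-test). If that assumption is doubtful, Welch's variant is the usual alternative; in SciPy it is the same call with `equal_var=False`. The sketch below spells out what that variant computes using only the standard library. `welch_t` is a hypothetical helper and the sample values are made up, not taken from the notebook.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    # Sample variances (n - 1 denominator), divided by n to get the
    # squared standard error contributed by each group.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b

    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Tiny made-up samples, only to show the call; not data from the notebook.
t, df = welch_t([20.0, 25.0, 80.0, 75.0, 18.0], [85.0, 78.0, 22.0, 90.0, 70.0])
print(f"t = {t:.3f}, df = {df:.1f}")
```

The helper does not guard against both samples having zero variance, and it returns only the statistic and degrees of freedom; the p-value would still come from a t distribution, for example via SciPy.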

In [9]:
print(f"\nChi-square Test for Bounce Rate:\nChi2 = {chi2:.4f}, p = {p_chi:.4f}")
print("Bounce Rate Significant?", "Yes" if p_chi < 0.05 else "No")

print(f"\nT-Test for Time on Page:\nt = {t_stat:.4f}, p = {p_ttest:.4f}")
print("Time on Page Difference Significant?", "Yes" if p_ttest < 0.05 else "No")


Chi-square Test for Bounce Rate:
Chi2 = 11.6105, p = 0.0007
Bounce Rate Significant? Yes

T-Test for Time on Page:
t = -3.7604, p = 0.0002
Time on Page Difference Significant? Yes
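
A significance flag says that the bounce rates differ, but not by how much. One common complement is an approximate 95% confidence interval for the difference in proportions. The sketch below uses the normal-approximation (Wald) interval with the counts from the summary table; this is one of several possible interval methods, and the variable names are illustrative.

```python
import math

# Counts taken from the summary table (bounces / users per version).
bounces_a, users_a = 269, 490
bounces_b, users_b = 224, 510

p_a = bounces_a / users_a
p_b = bounces_b / users_b
diff = p_a - p_b

# Unpooled standard error of the difference between two proportions.
se = math.sqrt(p_a * (1 - p_a) / users_a + p_b * (1 - p_b) / users_b)

# 1.96 is the two-sided 95% critical value of the standard normal.
z_crit = 1.96
ci_low, ci_high = diff - z_crit * se, diff + z_crit * se

print(f"Bounce rate difference (A - B): {diff:.3f}")
print(f"Approx. 95% CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

On these counts the interval runs from roughly 5 to 17 percentage points and stays above zero, which is consistent with the significant chi-square result.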
