# A/B Testing: Web Analytics Signup Experiment

This notebook uses a **synthetic dataset** to simulate a real-world web analytics A/B test comparing two website designs (Control vs Variant) with the goal of improving user signup conversion.

## Experiment Assumptions

- One row per user (user-level analysis)
- Users are randomly assigned to a single variant
- Users are exposed to **only one** version of the website
- Conversion is binary (signed up or not)
- Variant B has a *slightly* higher true conversion rate than Control
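
The user-level assumptions above can be checked mechanically before any analysis. The following is a minimal sketch, assuming a DataFrame with the `user_id`, `variant`, and `converted` columns described in this notebook. The helper name `check_assumptions` and the toy rows are hypothetical and exist only for illustration.

```python
import pandas as pd

def check_assumptions(df: pd.DataFrame) -> dict:
    """Return a pass/fail flag per user-level assumption (hypothetical helper)."""
    return {
        # One row per user
        "one_row_per_user": bool(df["user_id"].is_unique),
        # Each user was exposed to exactly one variant
        "single_variant_per_user": bool(
            df.groupby("user_id")["variant"].nunique().le(1).all()
        ),
        # Only the two expected group labels are present
        "valid_variants": bool(df["variant"].isin(["A", "B"]).all()),
        # Conversion is binary
        "binary_conversion": bool(df["converted"].isin([0, 1]).all()),
    }

# Toy data for illustration only, not the notebook's dataset
toy = pd.DataFrame({
    "user_id": ["U-0001", "U-0002", "U-0003"],
    "variant": ["A", "B", "A"],
    "converted": [0, 1, 0],
})
print(check_assumptions(toy))
```

On the real data the same function could be called as `check_assumptions(df)` once the CSV has been loaded.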


## Dataset Schema & Rationale

- **user_id**: unique identifier and unit of analysis
- **variant**: experimental group (A = Control, B = Variant)
- **converted**: primary metric (signup event, 0/1)
- **session_duration_sec**: secondary behavioral metric (session length in seconds)
- **device**: used to validate randomization across platforms
- **traffic_source**: used to check for acquisition bias
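
One way to use `device` and `traffic_source` for a randomization check is a chi-square test of independence between the covariate and `variant`. The sketch below assumes that approach; the notebook does not state which method was intended. The helper `independence_p_value` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data for illustration only: assignment is independent of device by construction
rng = np.random.default_rng(0)
n = 2_000
toy = pd.DataFrame({
    "variant": rng.choice(["A", "B"], size=n),
    "device": rng.choice(["Mobile", "Web"], size=n, p=[0.6, 0.4]),
})

def independence_p_value(df: pd.DataFrame, covariate: str) -> float:
    """p-value of a chi-square independence test between variant and a covariate."""
    table = pd.crosstab(df["variant"], df[covariate])
    chi2, p_value, dof, expected = chi2_contingency(table.to_numpy())
    return float(p_value)

p = independence_p_value(toy, "device")
print(f"variant vs device: p = {p:.3f}")
```

A very small p-value would suggest that the covariate is distributed differently across the two groups, which is worth investigating before trusting the conversion comparison.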


## Data Generation Notes

Conversion probabilities are intentionally close between variants to reflect realistic product experiments, where effects are typically small and noisy. Randomness is preserved to avoid deterministic outcomes.
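
The generation code itself is not part of this notebook. The sketch below shows one way such a dataset could be produced. Every rate, distribution, and category weight in it is an illustrative assumption rather than the value used for the actual CSV, and `make_synthetic_ab` is a hypothetical name.

```python
import numpy as np
import pandas as pd

def make_synthetic_ab(n_users=10_000, p_control=0.14, p_variant=0.155, seed=42):
    """Generate a user-level A/B dataset (all rates and distributions are assumed)."""
    rng = np.random.default_rng(seed)
    variant = rng.choice(["A", "B"], size=n_users)        # random 50/50 assignment
    p = np.where(variant == "B", p_variant, p_control)    # per-user conversion probability
    return pd.DataFrame({
        "user_id": [f"U-{i:04d}" for i in range(1, n_users + 1)],
        "variant": variant,
        "converted": rng.binomial(1, p),                   # Bernoulli draws keep outcomes noisy
        "session_duration_sec": rng.gamma(4.0, 45.0, size=n_users).round().astype(int),
        "device": rng.choice(["Mobile", "Web"], size=n_users, p=[0.6, 0.4]),
        "traffic_source": rng.choice(["Paid", "Organic", "Referral"], size=n_users),
    })

synthetic = make_synthetic_ab()
print(synthetic.groupby("variant")["converted"].mean())
```

Because conversions are drawn at random, the observed rates differ from `p_control` and `p_variant` from one seed to the next, which is the noise the note above refers to.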

## Initial Data Validation

The checks in the data quality section verify that:
- Dataset size is sufficient
- Variants are reasonably balanced
- Schema and data types match expectations
- No obvious data integrity issues are present
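
"Reasonably balanced" can be made concrete with a sample ratio mismatch check. The sketch below is one possible approach, not something the notebook runs: it assumes the intended split was 50/50 and applies a chi-square goodness-of-fit test to the group sizes reported in this notebook's output (4990 Control, 5010 Variant).

```python
from scipy.stats import chisquare

# Group sizes as reported in this notebook's output (Control, Variant)
observed = [4990, 5010]
total = sum(observed)
expected = [total / 2, total / 2]   # assumption: the intended split was 50/50

chi2_stat, srm_p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2_stat:.3f}, p = {srm_p_value:.3f}")
```

A large p-value here means the observed split is compatible with the assumed 50/50 allocation.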


## Next Steps

- Compute signup rates per variant
- Estimate absolute and relative lift
- Perform a two-proportions z-test
- Quantify uncertainty using confidence intervals
- Translate results into a business decision


In [8]:
import pandas as pd
import numpy as np

In [9]:
# Read dataset
df = pd.read_csv("../data/synthetic_dataset.csv", sep=";")

### Basic data quality checks

In [11]:
# Display shape and first 5 rows of the dataset
print(df.shape)
df.head()

(10000, 6)


,user_id,variant,converted,session_duration_sec,device,traffic_source
0,U-0001,A,0,130,Mobile,Paid
1,U-0002,B,0,209,Mobile,Organic
2,U-0003,A,0,245,Mobile,Referral
3,U-0004,A,0,124,Web,Paid
4,U-0005,A,0,191,Mobile,Organic


In [12]:
# Display column names, their count of non-null values and data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   user_id               10000 non-null  object
 1   variant               10000 non-null  object
 2   converted             10000 non-null  int64 
 3   session_duration_sec  10000 non-null  int64 
 4   device                10000 non-null  object
 5   traffic_source        10000 non-null  object
dtypes: int64(2), object(4)
memory usage: 468.9+ KB


In [13]:
# Check for null values
df.isna().sum()

user_id                 0
variant                 0
converted               0
session_duration_sec    0
device                  0
traffic_source          0
dtype: int64

In [14]:
# Check for duplicated rows
df.duplicated().sum()

0

### Compute Statistics

In [16]:
# Count users and conversions per variant, then derive the signup rate
grouped_by_variant = df.groupby("variant", as_index=False)[["user_id", "converted"]].agg({"user_id": "count", "converted": "sum"})
grouped_by_variant["signup_rate"] = grouped_by_variant["converted"] / grouped_by_variant["user_id"]

# Count of users per variant
n_A = grouped_by_variant.loc[grouped_by_variant["variant"] == "A", "user_id"].iloc[0]
n_B = grouped_by_variant.loc[grouped_by_variant["variant"] == "B", "user_id"].iloc[0]

# Count of conversions per variant
c_A = grouped_by_variant.loc[grouped_by_variant["variant"] == "A", "converted"].iloc[0]
c_B = grouped_by_variant.loc[grouped_by_variant["variant"] == "B", "converted"].iloc[0]

# Control and Variant signup rates
sr_A = grouped_by_variant.loc[grouped_by_variant["variant"] == "A", "signup_rate"].iloc[0]
sr_B = grouped_by_variant.loc[grouped_by_variant["variant"] == "B", "signup_rate"].iloc[0]

# Absolute lift: difference in signup rates (Variant minus Control)
abs_lift = sr_B - sr_A

# Relative lift: absolute lift as a fraction of the Control signup rate
rel_lift = abs_lift / sr_A

print(f"Control group user count: {n_A} | Control group conversions: {c_A}")
print(f"Variant group user count: {n_B} | Variant group conversions: {c_B}")
print(f"Signup Rate for Control: {sr_A * 100:.2f}%")
print(f"Signup Rate for Variant: {sr_B * 100:.2f}%")
print(f"Absolute lift: {abs_lift * 100:.2f} percentage points")
print(f"Relative lift: {rel_lift * 100:.2f}%")

Control group user count: 4990 | Control group conversions: 696
Variant group user count: 5010 | Variant group conversions: 790
Signup Rate for Control: 13.95%
Signup Rate for Variant: 15.77%
Absolute lift: 1.82 percentage points
Relative lift: 13.05%


### Run Test and compute Z-Stat and p-value

In [18]:
from statsmodels.stats.proportion import proportions_ztest

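# One-sided test of H0: p_B <= p_A against H1: p_B > p_A.
# Variant counts are listed first, so alternative="larger" tests Variant > Control.
z_stat, p_value = proportions_ztest(
    count=[c_B, c_A],
    nobs=[n_B, n_A],
    alternative="larger"
)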
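
To make the test less of a black box, the statistic can be computed by hand from the counts reported in this notebook's output. This sketch assumes the standard pooled two-proportions z-test, which to my understanding is what `proportions_ztest` computes by default. The names `z_manual` and `p_one_sided` are illustrative and chosen so they do not overwrite `z_stat` or `p_value`.

```python
import math
from scipy.stats import norm

# Counts as reported in this notebook's output
n_A, c_A = 4990, 696   # Control
n_B, c_B = 5010, 790   # Variant

p_A, p_B = c_A / n_A, c_B / n_B
p_pool = (c_A + c_B) / (n_A + n_B)   # pooled rate under H0: p_A == p_B
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_A + 1 / n_B))

z_manual = (p_B - p_A) / se_pool
p_one_sided = norm.sf(z_manual)      # P(Z > z), the one-sided "larger" alternative

print(f"z = {z_manual:.3f}, one-sided p = {p_one_sided:.4f}")
```

If the assumption about pooling holds, these values should agree with `z_stat` and `p_value` up to floating-point error.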

In [19]:
alpha = 0.05

if p_value < alpha:
    print("Result is statistically significant. Reject the null hypothesis.")
else:
    print("Result is not statistically significant. Fail to reject the null hypothesis.")

Result is statistically significant. Reject the null hypothesis.


### Compute Confidence Interval (CI)

In [21]:
from statsmodels.stats.proportion import confint_proportions_2indep

In [22]:
# Two-sided 95% Wald interval for the difference in signup rates (Variant - Control)
ci_low, ci_high = confint_proportions_2indep(
    count1=c_B, nobs1=n_B,   # Variant
    count2=c_A, nobs2=n_A,   # Control
    method="wald",
    alpha=0.05
)

In [23]:
print(f"95% CI for absolute uplift: [{ci_low*100:.2f}%, {ci_high*100:.2f}%]")


95% CI for absolute uplift: [0.43%, 3.21%]
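
The Wald interval can also be reproduced by hand from the reported counts, which makes the formula explicit: the difference in rates plus or minus a normal critical value times the unpooled standard error. This is a sketch of the textbook formula; the variable names are illustrative.

```python
import math
from scipy.stats import norm

# Counts as reported in this notebook's output
n_A, c_A = 4990, 696   # Control
n_B, c_B = 5010, 790   # Variant

p_A, p_B = c_A / n_A, c_B / n_B
diff = p_B - p_A

# Unpooled standard error: each group contributes its own variance
se_unpooled = math.sqrt(p_A * (1 - p_A) / n_A + p_B * (1 - p_B) / n_B)
z_crit = norm.ppf(1 - 0.05 / 2)      # two-sided 95% critical value

wald_low = diff - z_crit * se_unpooled
wald_high = diff + z_crit * se_unpooled
print(f"95% Wald CI: [{wald_low * 100:.2f}%, {wald_high * 100:.2f}%]")
```

Note that this interval is two-sided, while the z-test above is one-sided, so the two are related but not exact duals of each other.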


### Interpretation

In [25]:
print(
    f"The new design increases signup rate by {abs_lift*100:.2f} percentage points on average. "
    f"We are 95% confident the true uplift lies between "
    f"{ci_low*100:.2f} and {ci_high*100:.2f} percentage points."
)

The new design increases signup rate by 1.82 percentage points on average. We are 95% confident the true uplift lies between 0.43 and 3.21 percentage points.


## Final Recommendation

Based on the results of the A/B test, the new website design (Variant B) shows an increase in signup rate compared to the current design (Control).

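The observed uplift of 1.82 percentage points (13.05% relative) is statistically significant at the 5% level under a one-sided two-proportions z-test, and the estimated effect size is meaningful from a business perspective. While there is natural uncertainty around the exact magnitude of the improvement, the 95% confidence interval of 0.43 to 3.21 percentage points excludes zero, so the true impact is unlikely to be negative.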

### Recommendation
We recommend **shipping the new design** as the default experience for users.

### Rationale
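- The one-sided z-test rejects the null hypothesis at α = 0.05, which indicates a real improvement in user signups rather than random variation.
- Even at the lower bound of the 95% confidence interval (0.43 percentage points), the estimated lift represents a positive outcome for the business.
- No evidence of downside risk was observed on the primary metric. Secondary behavior such as session duration was not analyzed in this notebook and should be monitored during rollout.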

### Next Steps
- Roll out the new design gradually and monitor signup rate post-deployment.
- Track secondary metrics (e.g. session duration, bounce rate) to ensure no unintended side effects.
- Consider follow-up experiments to further optimize conversion (e.g. CTA copy, form length, or page layout).
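
As a starting point for the secondary-metric check, session duration could be compared between groups with Welch's t-test. This is a sketch under assumptions: the choice of test is mine, not the notebook's, and the durations below are toy values generated for illustration rather than the real `session_duration_sec` column.

```python
import numpy as np
from scipy.stats import ttest_ind

# Toy session durations in seconds, for illustration only
rng = np.random.default_rng(1)
dur_A = rng.gamma(shape=4.0, scale=45.0, size=5_000)   # stand-in for Control
dur_B = rng.gamma(shape=4.0, scale=45.0, size=5_000)   # stand-in for Variant

# Welch's t-test: does not assume equal variances in the two groups
t_stat, p_dur = ttest_ind(dur_B, dur_A, equal_var=False)
print(f"mean A = {dur_A.mean():.1f}s, mean B = {dur_B.mean():.1f}s, p = {p_dur:.3f}")
```

On the real data, the two arrays would come from `df.loc[df["variant"] == "A", "session_duration_sec"]` and the equivalent selection for `"B"`.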