# 🧪 A/B Test: Conversion Rate Comparison
This notebook analyzes binary conversion data for a control group and a variant group. It uses a two-tailed, two-proportion z-test to check whether the conversion rates of the two groups differ significantly.

In [None]:
import pandas as pd
import numpy as np
from scipy import stats
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
# Load the data
df = pd.read_csv('ab_test_binary.csv')
df.head()

In [None]:
# Check conversion rates by group
summary = df.groupby('group')['converted'].agg(['count', 'sum', 'mean'])
summary.columns = ['n', 'conversions', 'conversion_rate']
summary

In [None]:
# Extract values for z-test
control = df[df['group'] == 'control']['converted']
variant = df[df['group'] == 'variant']['converted']

# Conversion rates
p1 = control.mean()
p2 = variant.mean()

# Sample sizes
n1 = control.count()
n2 = variant.count()

# Pooled proportion and standard error
p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
se = np.sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))

# Z-score and two-tailed p-value
# (norm.sf is the survival function, 1 - cdf, and stays accurate for large |z|)
z_score = (p2 - p1) / se
p_value = 2 * stats.norm.sf(abs(z_score))

print(f"Z-Score: {z_score:.4f}, P-Value: {p_value:.4f}")

In [None]:
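# Plot conversion rates with 95% confidence intervals
# (seaborn >= 0.12 uses `errorbar`; the older `ci` parameter is deprecated)
sns.barplot(data=df, x='group', y='converted', errorbar=('ci', 95))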
plt.title('Conversion Rate by Group')
plt.ylabel('Conversion Rate')
plt.show()

## 📊 Interpretation
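- If the p-value is below 0.05, the difference in conversion rates is **statistically significant** at the 5% level. Because the test is two-tailed, check the sign of `z_score` (or of `p2 - p1`) to see which group converts better.
- If not, we **fail to reject** the null hypothesis of equal conversion rates. The data do not show a statistically significant difference, which is not the same as showing that no difference exists.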
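
The decision rule above can be sketched as a small helper that also reports a confidence interval for the difference in rates, which shows how large the effect might plausibly be. This is a minimal sketch, not part of the original analysis. The function name `interpret_ab_test` and the counts passed to it are hypothetical. It uses the standard library's `statistics.NormalDist` in place of `scipy.stats.norm` so that it runs on its own. The interval uses the unpooled standard error, a common choice when estimating the difference rather than testing it.

```python
from math import sqrt
from statistics import NormalDist


def interpret_ab_test(conv_control, n_control, conv_variant, n_variant, alpha=0.05):
    """Two-proportion z-test plus a confidence interval for the rate difference."""
    p1 = conv_control / n_control
    p2 = conv_variant / n_variant
    diff = p2 - p1

    # Pooled standard error for the test (rates are assumed equal under the null)
    p_pool = (conv_control + conv_variant) / (n_control + n_variant)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_variant))
    z_score = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))

    # Unpooled standard error for the interval (rates are allowed to differ)
    se_unpooled = sqrt(p1 * (1 - p1) / n_control + p2 * (1 - p2) / n_variant)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    # Two-tailed test: significance alone does not say which group is better
    if p_value >= alpha:
        decision = "fail to reject the null hypothesis"
    elif diff > 0:
        decision = "significant: variant converts better"
    else:
        decision = "significant: control converts better"
    return {"z_score": z_score, "p_value": p_value, "diff": diff, "ci": ci, "decision": decision}


# Hypothetical counts, for illustration only (not taken from ab_test_binary.csv)
result = interpret_ab_test(conv_control=200, n_control=2000, conv_variant=260, n_variant=2000)
print(result["decision"])
```

With this notebook's data, the four inputs would come from the `n` and `conversions` columns of `summary` for the `control` and `variant` rows.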

👔 Business Action: If the variant's conversion rate is significantly higher than the control's, consider rolling out the new version to all users. If not, investigate further or run additional tests, for example with a larger sample.
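
When planning an additional test, it can help to estimate how many users each group needs before the test can reliably detect a given lift. The sketch below uses the usual normal-approximation sample size formula for a two-sided two-proportion test. The function name `sample_size_per_group`, the 10% baseline, the 12% target, and the 80% power are illustrative assumptions, not values from this analysis.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_variant) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_variant - p_control) ** 2)


# Hypothetical planning numbers: 10% baseline, aiming to detect a lift to 12%
n_needed = sample_size_per_group(0.10, 0.12)
print(f"Users needed per group: {n_needed}")
```

Smaller expected lifts push the required sample size up quickly, so a non-significant result from a small test may reflect low power as much as the absence of an effect.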