# 📚 Private A/B Testing

Built by **Stu** 🚀

## Section 1: Basics of A/B Testing + Privacy

### Exercise 1: Define A/B Testing

In [1]:
ab_testing_definition = "Compare two versions (A and B) to evaluate which performs better statistically."

### Exercise 2: Sketch Privacy Risks in A/B Testing

In [2]:
privacy_risks_sketch = "Raw counts and conversion rates could leak individual behavior if datasets are small."
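To make the risk concrete, here is a minimal sketch of a differencing attack on exact counts. The data and the names `group_before`, `new_user_outcome`, and `exact_report` are hypothetical, chosen only for illustration: if exact results are published before and after one user joins a small group, subtracting the two releases reveals that user's outcome.

```python
# Hypothetical toy data: 1 = converted, 0 = did not convert.
group_before = [1, 0, 0, 1, 0]   # five users already in the experiment
new_user_outcome = 1             # the value an attacker wants to learn
group_after = group_before + [new_user_outcome]

def exact_report(outcomes):
    """Return the exact (non-private) conversion count and rate."""
    n = len(outcomes)
    conversions = sum(outcomes)
    return conversions, conversions / n

count_before, rate_before = exact_report(group_before)
count_after, rate_after = exact_report(group_after)

# Subtracting two exact releases isolates the newcomer's outcome.
inferred_outcome = count_after - count_before
print(f"before: {count_before} conversions, rate {rate_before:.2f}")
print(f"after:  {count_after} conversions, rate {rate_after:.2f}")
print(f"inferred outcome of the new user: {inferred_outcome}")
```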

## Section 2: Simulate A/B Data

### Exercise 3: Simulate Control and Treatment Group Outcomes

In [3]:
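import numpy as np

np.random.seed(42)  # fix the seed so the simulated outcomes are reproducible
# Binary conversion outcomes: 40% baseline rate vs. 45% under treatment
control = np.random.binomial(1, 0.4, size=500)
treatment = np.random.binomial(1, 0.45, size=500)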

### Exercise 4: Compute True Conversion Rates

In [4]:
control_rate = np.mean(control)
treatment_rate = np.mean(treatment)
control_rate, treatment_rate

### Exercise 5: Add Laplace Noise to Rates

In [5]:
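def add_laplace_noise(value, sensitivity=1.0, epsilon=1.0):
    # Laplace mechanism: noise scale b = sensitivity / epsilon
    scale = sensitivity / epsilon
    return value + np.random.laplace(0, scale)

# A mean of n binary outcomes changes by at most 1/n when one record changes,
# so the rate's sensitivity is 1/n, not the default 1.0.
noisy_control_rate = add_laplace_noise(control_rate, sensitivity=1/len(control), epsilon=1.0)
noisy_treatment_rate = add_laplace_noise(treatment_rate, sensitivity=1/len(treatment), epsilon=1.0)
noisy_control_rate, noisy_treatment_rate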
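Another way to privatize a rate is to add noise to the conversion count, which has sensitivity 1 for binary outcomes, and then divide by the group size. The sketch below assumes the group size is public and that one user contributes exactly one record; the function name `noisy_rate_from_count` and the seed are hypothetical. Clipping the result to [0, 1] is post-processing, so it does not weaken the guarantee.

```python
import numpy as np

def noisy_rate_from_count(outcomes, epsilon, rng):
    """Release a conversion rate by adding Laplace noise to the count.

    Assumes binary outcomes and a public group size n, so changing one
    user's record moves the count by at most 1 (sensitivity 1).
    """
    n = len(outcomes)
    noisy_count = np.sum(outcomes) + rng.laplace(0.0, 1.0 / epsilon)
    # Clipping is post-processing: it does not weaken the DP guarantee.
    return float(np.clip(noisy_count / n, 0.0, 1.0))

rng = np.random.default_rng(0)  # hypothetical seed for reproducibility
outcomes = rng.binomial(1, 0.4, size=500)
true_rate = outcomes.mean()
private_rate = noisy_rate_from_count(outcomes, epsilon=1.0, rng=rng)
print(f"true rate {true_rate:.3f}, private rate {private_rate:.3f}")
```

Dividing a count with Laplace scale 1/ε by n gives the same noise scale, 1/(nε), as adding noise directly to a rate with sensitivity 1/n.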

### Exercise 6: Conduct Noisy Z-Test

In [6]:
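import scipy.stats as stats

# Noisy rates can fall outside [0, 1]; clip so the variance term stays valid.
p1 = np.clip(noisy_control_rate, 0, 1)
p2 = np.clip(noisy_treatment_rate, 0, 1)
n1 = len(control)
n2 = len(treatment)

# Pooled two-proportion z-test (treats the noisy rates as if they were exact)
pooled = (p1*n1 + p2*n2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(pooled*(1-pooled)*(1/n1 + 1/n2))
p_value = 2*(1 - stats.norm.cdf(abs(z)))  # two-sided
z, p_value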

### Exercise 7: Reflect on Noisy Hypothesis Testing

In [7]:
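hypothesis_testing_reflection = "Noise adds variance to the estimated rates, so a single noisy test can land above or below the non-private p-value; on average it reduces power to detect small true effects, and a test that ignores the noise variance is miscalibrated."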
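One possible way to account for the mechanism in the test itself is to add the Laplace variance to the sampling variance before computing z. The sketch below is an illustration, not a method taken from this notebook: it assumes each rate was released with Laplace scale b = 1/(nε), whose variance is 2b², uses the unpooled standard error, and still relies on a normal approximation. The function name `noise_aware_z_test` is hypothetical.

```python
import math

def noise_aware_z_test(p1, p2, n1, n2, epsilon):
    """Two-sided z-test on DP rates with Laplace variance added.

    Assumes each rate was released with Laplace scale b = 1 / (n * epsilon),
    whose variance is 2 * b**2. Uses the unpooled standard error.
    """
    # Clip so the binomial variance term p * (1 - p) stays non-negative.
    p1c = min(max(p1, 0.0), 1.0)
    p2c = min(max(p2, 0.0), 1.0)
    sampling_var = p1c * (1 - p1c) / n1 + p2c * (1 - p2c) / n2
    noise_var = 2 / (n1 * epsilon) ** 2 + 2 / (n2 * epsilon) ** 2
    z = (p1c - p2c) / math.sqrt(sampling_var + noise_var)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z_adj, p_adj = noise_aware_z_test(0.40, 0.45, 500, 500, epsilon=1.0)
z_strict, p_strict = noise_aware_z_test(0.40, 0.45, 500, 500, epsilon=0.01)
print(f"epsilon=1.0:  z={z_adj:.3f}, p={p_adj:.3f}")
print(f"epsilon=0.01: z={z_strict:.3f}, p={p_strict:.3f}")
```

With the extra variance term, a smaller ε widens the standard error, so the same observed difference yields a larger p-value instead of an overconfident one.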

## Section 3: Advanced

### Exercise 8: Plot p-value Distributions Under Different ε

In [8]:
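import matplotlib.pyplot as plt

epsilons = [0.1, 0.5, 1.0, 2.0]
pvals = []
for eps in epsilons:
    # One noisy draw per ε; sensitivity of a rate over n records is 1/n
    p1 = np.clip(add_laplace_noise(control_rate, sensitivity=1/n1, epsilon=eps), 0, 1)
    p2 = np.clip(add_laplace_noise(treatment_rate, sensitivity=1/n2, epsilon=eps), 0, 1)
    pooled = (p1*n1 + p2*n2) / (n1 + n2)
    z = (p1 - p2) / np.sqrt(pooled*(1-pooled)*(1/n1 + 1/n2))
    pval = 2*(1 - stats.norm.cdf(abs(z)))
    pvals.append(pval)

# Each point is a single random draw, so the curve changes from run to run
plt.plot(epsilons, pvals, marker='o')
plt.xlabel('ε')
plt.ylabel('p-value')
plt.title('Impact of ε on p-value')
plt.show()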
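A single draw per ε says little about the distribution of p-values. As a rough sketch, the block below repeats the noisy test many times for each ε and summarizes the spread with percentiles. The rates 0.40 and 0.45, the group size of 500, the 2000 repetitions, the seed, and the helper `naive_p_value` are all assumptions made for illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # hypothetical seed
n = 500                         # assumed group size
rate_a, rate_b = 0.40, 0.45     # assumed observed rates, for illustration

def naive_p_value(p1, p2, n1, n2):
    """Two-sided pooled z-test p-value, with rates clipped to [0, 1]."""
    p1, p2 = min(max(p1, 0.0), 1.0), min(max(p2, 0.0), 1.0)
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0.0:
        return 1.0  # both rates clipped to the same boundary: no evidence
    return math.erfc(abs(p1 - p2) / se / math.sqrt(2))

summary = {}
for eps in [0.1, 0.5, 1.0, 2.0]:
    scale = (1 / n) / eps  # Laplace scale for a rate with sensitivity 1/n
    draws = [
        naive_p_value(rate_a + rng.laplace(0, scale),
                      rate_b + rng.laplace(0, scale), n, n)
        for _ in range(2000)
    ]
    summary[eps] = np.percentile(draws, [5, 50, 95])
    q5, q50, q95 = summary[eps]
    print(f"eps={eps}: p-value 5%={q5:.3f}, median={q50:.3f}, 95%={q95:.3f}")
```

The gap between the 5th and 95th percentiles shows how much the test outcome depends on the noise draw at each privacy level.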

### Exercise 9: Sketch Tradeoffs in DP A/B Testing

In [9]:
tradeoffs_sketch = "More privacy (lower ε) → larger noise and less reliable decisions; less privacy (higher ε) → more statistical power but a weaker privacy guarantee."

### Exercise 10: Define Sensitivity for Rates

In [10]:
rate_sensitivity = "1/n for a group of n binary outcomes, because changing one user's record shifts the mean by at most 1/n."
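The 1/n bound can be checked by brute force: flip each record in turn and measure how far the mean moves. This is a small sketch with a hypothetical helper name, `empirical_rate_sensitivity`, and an arbitrary seed. It assumes the "replace one record" notion of neighboring datasets with a fixed group size.

```python
import numpy as np

def empirical_rate_sensitivity(outcomes):
    """Largest change in the mean when one binary record is flipped."""
    outcomes = np.asarray(outcomes)
    base = outcomes.mean()
    max_change = 0.0
    for i in range(len(outcomes)):
        neighbor = outcomes.copy()
        neighbor[i] = 1 - neighbor[i]  # replace one user's outcome
        max_change = max(max_change, abs(neighbor.mean() - base))
    return max_change

rng = np.random.default_rng(0)  # hypothetical seed
for n in [10, 100, 500]:
    data = rng.binomial(1, 0.4, size=n)
    observed = empirical_rate_sensitivity(data)
    print(f"n={n}: observed {observed:.4f}, 1/n = {1 / n:.4f}")
```

For binary data, every flip changes the sum by exactly 1, so the observed change matches 1/n for any dataset of that size.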

### Exercise 11: Reflect on Practical Settings

In [11]:
practical_reflection = "When sample sizes are small, DP may make detecting real effects very difficult; need larger n or smarter techniques."
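A back-of-the-envelope calculation can suggest how large n needs to be. Assuming a rate released with Laplace scale 1/(nε), the noise standard deviation is √2/(nε), while the sampling standard error of one rate is √(p(1−p)/n). Requiring the noise to be at most a fraction r of the sampling error gives n ≥ 2/(ε² p(1−p) r²). The function name, the baseline rate p = 0.4, and the threshold r = 0.5 below are illustrative assumptions.

```python
import math

def min_n_for_noise_ratio(epsilon, p=0.4, max_ratio=0.5):
    """Smallest n where Laplace noise std <= max_ratio * sampling SE.

    Assumes a rate released with Laplace scale 1 / (n * epsilon), so the
    noise std is sqrt(2) / (n * epsilon), and sampling SE sqrt(p*(1-p)/n).
    """
    return math.ceil(2 / (epsilon**2 * p * (1 - p) * max_ratio**2))

for eps in [0.1, 0.5, 1.0, 2.0]:
    print(f"eps={eps}: n >= {min_n_for_noise_ratio(eps)} per group")
```

Because the bound scales with 1/ε², cutting ε by a factor of ten raises the required group size by a factor of about one hundred.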