# Bayesian approach to A/B testing


__Problem Statement__: Let's say that, as part of an advertising campaign, the marketing team has come up with two versions of a flyer to promote sign-ups to our elite subscription program. We want to run an experiment on a small group of users to decide which version gives the best results. A/B testing is a method for formulating a hypothesis, testing it, and building statistical evidence to support our findings.

__Define Experiment__:

`Null Hypothesis`: Assume the two versions have the same impact. That is, the conversion rates of the two sample populations, each shown one variant of the flyer, are equal.
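
Stated formally, with $p_1$ and $p_2$ denoting the true conversion rates of versions 1 and 2 (symbols introduced here for illustration), the hypotheses can be written as below. The two-sided alternative is an assumption that matches the default of the z-test used later in this notebook.

$$
H_0: p_1 = p_2 \qquad \text{versus} \qquad H_1: p_1 \neq p_2
$$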

`Sample Size`: The number of users to whom each flyer is emailed. We will run the experiment with 500 users per version: each flyer is sent to 500 randomly selected customers, and no customer appears in both groups.
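
As a minimal sketch of how two non-overlapping groups could be drawn, the snippet below samples customers without replacement and splits the sample in half. The `customer_ids` array, the pool of 10,000 customers, and the seed are hypothetical placeholders, not part of the original experiment.

```python
import numpy as np

# Hypothetical pool of customer IDs; a real campaign would load these from a customer table.
customer_ids = np.arange(10_000)

rng = np.random.default_rng(seed=0)  # seed chosen only to make the sketch reproducible

# Draw 1,000 distinct customers and split them into two halves of 500.
# Sampling without replacement guarantees that the two groups are disjoint.
selected = rng.choice(customer_ids, size=1000, replace=False)
group_v1, group_v2 = selected[:500], selected[500:]

overlap = np.intersect1d(group_v1, group_v2).size
print(len(group_v1), len(group_v2), overlap)
```

Because `rng.choice` returns the draws in random order, taking the first and second halves is itself a random split.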

`Observations`: Let's tabulate the conversions in each group.

| Version | Sample size | Conversions |
| :--- | ---: | ---: |
| 1 | 500 | 23 |
| 2 | 500 | 27 |
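
To make the size of the observed effect concrete, the counts in the table can be turned into conversion rates. This is a small illustrative sketch; the variable names are not from the original notebook.

```python
# Counts taken from the table above
conversions = {"Version 1": 23, "Version 2": 27}
sample_size = 500

# Observed conversion rate per version
rates = {version: count / sample_size for version, count in conversions.items()}
for version, rate in rates.items():
    print(f"{version}: {rate:.1%}")

# Absolute difference and relative lift of version 2 over version 1
abs_diff = rates["Version 2"] - rates["Version 1"]
rel_lift = abs_diff / rates["Version 1"]
print(f"absolute difference: {abs_diff:.3f}")
print(f"relative lift: {rel_lift:.1%}")
```

The observed rates are 4.6% and 5.4%, a difference of 0.8 percentage points. The remaining question is whether a gap this small can be distinguished from sampling noise.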



## Frequentist Approach

In [16]:
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

In [7]:
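# Two-sided z-test for equality of the two conversion rates (500 users in each group)
z_stat, pval = proportions_ztest([23, 27], nobs=500)

# 95% normal-approximation confidence interval for each version's conversion rate
(lower_v1, lower_v2), (upper_v1, upper_v2) = proportion_confint([23, 27], nobs=500, alpha=0.05)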

print(f'z statistic: {z_stat:.2f}')
print(f'p-value: {pval:.3f}')
print(f'ci 95% for version 1: [{lower_v1:.3f}, {upper_v1:.3f}]')
print(f'ci 95% for version 2: [{lower_v2:.3f}, {upper_v2:.3f}]')

z statistic: -0.58
p-value: 0.562
ci 95% for version 1: [0.028, 0.064]
ci 95% for version 2: [0.034, 0.074]


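Since the `p-value` of `0.562` is well above the conventional threshold of `0.05`, we cannot reject the null hypothesis. Consistent with this, the 95% confidence intervals of the two versions overlap substantially.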
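
To show where these numbers come from, here is a minimal sketch that reproduces the z statistic and p-value by hand. It assumes the pooled-proportion form of the two-sample z-test, which is what `proportions_ztest` uses by default. The variable names are illustrative.

```python
import math
from scipy.stats import norm

x1, x2 = 23, 27    # conversions for version 1 and version 2
n1, n2 = 500, 500  # users per group

p1, p2 = x1 / n1, x2 / n2

# Under the null hypothesis both groups share one rate, so pool the data to estimate it.
p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))  # two-sided tail probability

print(f"z statistic: {z:.2f}")
print(f"p-value: {p_value:.3f}")
```

The pooled standard error is about 0.0138, so the observed difference of 0.008 is only about 0.58 standard errors away from zero.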

## Bayesian Approach

In [12]:
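import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
np.random.seed(42)  # fix the seed so the posterior draws are reproducible

NumofSamples = 10000  # number of draws from each posterior

In [15]:
# Posterior of each conversion rate, modeled as Beta(conversions, non-conversions)
v1_samples = scipy.stats.beta.rvs(23, 477, size=NumofSamples)
v2_samples = scipy.stats.beta.rvs(27, 473, size=NumofSamples)

# Fraction of draws in which version 2 has the higher conversion rate
np.mean(v2_samples > v1_samples)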

0.7186

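Given the data, the posterior probability that version 2 has a higher conversion rate than version 1 is about 72%. Note that this is a direct statement about which version is more likely to be better, not a rejection of the null hypothesis: the Bayesian analysis does not test that hypothesis at all. If we have to pick one flyer for the campaign, version 2 is the better bet, although 72% is only moderate evidence.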
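
The Beta parameters used above are the raw counts of conversions and non-conversions. A common variant makes the prior explicit: with a Beta(a, b) prior and a binomial likelihood, the posterior is Beta(a + conversions, b + non-conversions). The sketch below assumes a uniform Beta(1, 1) prior, which is a choice made here for illustration and not the setup used in the cell above. It also summarizes the posterior of the difference with a 95% credible interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_draws = 100_000

prior_a, prior_b = 1, 1  # assumed uniform prior (illustrative choice, not from the original cell)
n = 500
conversions = {"v1": 23, "v2": 27}

# Conjugate update: Beta(a + successes, b + failures)
posteriors = {
    name: stats.beta(prior_a + c, prior_b + n - c)
    for name, c in conversions.items()
}

v1_draws = posteriors["v1"].rvs(size=n_draws, random_state=rng)
v2_draws = posteriors["v2"].rvs(size=n_draws, random_state=rng)

# Posterior of the difference in conversion rates (version 2 minus version 1)
diff = v2_draws - v1_draws
prob_v2_better = np.mean(diff > 0)
ci_low, ci_high = np.percentile(diff, [2.5, 97.5])

print(f"P(v2 > v1): {prob_v2_better:.3f}")
print(f"95% credible interval for the difference: [{ci_low:.3f}, {ci_high:.3f}]")
```

With 500 observations per group the prior has little influence, so the probability should land close to the figure reported above. If the credible interval for the difference includes zero, that is another sign that the evidence for version 2 is modest.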