# Bonus Exercise

<div class="alert alert-warning">
<h3>Objective:</h3>

The following tasks are designed to give you deeper insight into implementing an A/B test. In particular, they will help you understand the importance of the following aspects of an A/B test:

- The probability of making an error and its impact on the conversion rate
- The number of samples required for an A/B test or, equivalently, how long the test should run
 
</div>

Consider a scenario where you have implemented an A/B test with the following specifications:

- You assumed a $\beta(1,1)$ prior for both variants.
- You have 10 samples with an observed conversion rate of 0.2 for variant A.
- You have 10 samples with an observed conversion rate of 0.3 for variant B.
- You have obtained the posterior distributions of conversion rates for both variants.
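
With a Beta prior and binomial data, the posterior is again a Beta distribution, so the posteriors in the last bullet follow from a simple parameter update. A minimal sketch, assuming the observed rates correspond to 2 and 3 conversions out of 10 samples (the helper name `beta_posterior` is illustrative and not part of the exercise):

```python
def beta_posterior(prior_a, prior_b, n_samples, n_conversions):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior."""
    return prior_a + n_conversions, prior_b + (n_samples - n_conversions)

# Assumed counts: 0.2 * 10 = 2 conversions for A, 0.3 * 10 = 3 for B.
post_A = beta_posterior(1, 1, n_samples=10, n_conversions=2)
post_B = beta_posterior(1, 1, n_samples=10, n_conversions=3)
print(post_A, post_B)  # (3, 9) (4, 8)
```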

You decide to stop the test and declare B as the winner. However, you are not sure if you have made the right decision. So, you decide to run a few checks. You find out that:

- Variant B is better than variant A about two thirds of the time.
- On average, there is a loss of about 4% in conversion rate from choosing variant B over variant A.
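
One way to make these two checks precise, stated here as an assumption because the exercise does not define them explicitly, is to treat the two posteriors as independent and write:

\begin{align*}
P(\theta_B > \theta_A) &= E\big[\mathbb{1}(\theta_B > \theta_A)\big] \\
\text{loss}(B) &= E\big[\max(\theta_A - \theta_B,\, 0)\big]
\end{align*}

Here both expectations are taken over the joint posterior of $(\theta_A, \theta_B)$, and $\text{loss}(B)$ is the conversion rate given up, on average, when B is chosen but A is actually better.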

<div class="alert alert-info">
<h4>Task 1</h4>

Your task is to develop solutions that yield the quantities mentioned above (a probability of about two thirds and a loss of about 4%) using three different methods:

1. Closed form solution
2. Numerical approximation
3. Simulation, i.e. based on the output of PyMC
</div>

<div class="alert alert-info">
<h4>Task 2</h4>

You realize that a 4% loss in conversion rate is not acceptable, so you decide to collect more samples. At some point, however, you have to stop the test. You decide to stop when the loss in conversion rate drops below 1%. How many more samples do you need to collect to bring the loss below 1%? For simplicity, assume that $N_A = N_B$.

</div>


<div class="alert alert-success">

Hint: given $\theta_A \sim \beta(a_1,b_1)$ and $\theta_B \sim \beta(a_2,b_2)$ as the posterior distributions of the conversion rates of variants A and B, it can be shown that:

\begin{align*}
P(\theta_B > \theta_A) &= 1- H(a_1,b_1,a_2,b_2)
\end{align*}

where $H(a_1,b_1,a_2,b_2)$, the probability that $\theta_A \ge \theta_B$, can be calculated using the following function (which requires $a_2$ to be a positive integer):



```
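import numpy as np
from scipy.special import betaln

def compute_H(a_1, b_1, a_2, b_2):
    """Return H = P(theta_A >= theta_B); a_2 must be a positive integer."""
    prob = 0.0  # accumulates P(theta_B > theta_A)
    for i in range(a_2):
        # Work in log space: gamma() overflows for arguments above ~171,
        # which is easily reached once more samples are collected.
        log_term = (betaln(a_1 + i, b_1 + b_2) - np.log(b_2 + i)
                    - betaln(1 + i, b_2) - betaln(a_1, b_1))
        prob += np.exp(log_term)
    return 1 - prob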
```
</div>
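
As a quick sanity check of the hint, the closed-form expression can be evaluated for the scenario above. The sketch below assumes the posteriors are $\beta(3,9)$ for A and $\beta(4,8)$ for B (a $\beta(1,1)$ prior updated with 2 and 3 conversions out of 10). To keep the snippet runnable on its own, it uses `compute_H_logspace`, a hypothetical log-space variant intended to be equivalent to the hint's function:

```python
import numpy as np
from scipy.special import betaln

def compute_H_logspace(a_1, b_1, a_2, b_2):
    """H = P(theta_A >= theta_B) for independent Beta posteriors (integer a_2)."""
    i = np.arange(a_2)  # summation index 0, 1, ..., a_2 - 1
    log_terms = (betaln(a_1 + i, b_1 + b_2) - np.log(b_2 + i)
                 - betaln(1 + i, b_2) - betaln(a_1, b_1))
    return 1 - np.exp(log_terms).sum()

# Assumed posteriors: A ~ Beta(3, 9), B ~ Beta(4, 8).
H = compute_H_logspace(3, 9, 4, 8)
p_B_better = 1 - H
print(f"P(theta_B > theta_A) = {p_B_better:.3f}")  # roughly 0.68, about two thirds
```

The result is consistent with the "about two thirds" figure quoted in the scenario; the expected loss still has to be derived separately.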
