<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 50px">

# Bayesian A/B Testing

<div style="clear: both;"></div>

---

### Learning Objectives
1. Create a Bayesian A/B test
1. Understand the explore/exploit dilemma and how to optimize your experiments

In [None]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

## Background
---

You've just graduated from DSI (congrats!). Your first job is at a startup. 

- Your role: Chief Data Scientist in Charge
- Your salary: Money

Your first task is to design an experiment, testing a new landing page. The current version has the following call to action (CTA):

<a href="https://youtu.be/dQw4w9WgXcQ?t=42" class="btn btn-success">Act Today!</a>

Marketing wants to test another version:

<a href="https://youtu.be/dQw4w9WgXcQ?t=42" class="btn btn-success">Buy now!</a>

Treatment A has been around a while and has roughly a 3% conversion rate.

## Setup
---

All the data for this lecture is simulated, so we need to randomly generate the true conversion rates for A and B. In a real experiment, these would never be known.

In [None]:
# Remember, you're not supposed to know these!
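
One possible way to fill in this cell, as a minimal sketch: draw each true conversion rate from a narrow range around the 3% baseline. The names `A_true` and `B_true`, the seed, and the ranges are assumptions made for illustration and are not values from the lesson.

```python
import numpy as np

# Assumption: the seed and the ranges below are illustrative choices only
rng = np.random.default_rng(42)

# Hidden "true" conversion rates; the experiment never gets to see these
A_true = rng.uniform(0.02, 0.04)  # near the "roughly 3%" baseline for A
B_true = rng.uniform(0.01, 0.05)  # B could be better or worse than A
```

Keeping the ranges close to 3% means the two treatments are hard to tell apart, which is the realistic case for a button-text change.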

Since we're working exclusively with Beta distributions today, whose support is the interval from 0 to 1, let's create a NumPy array of x values over that interval for plotting.
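
A minimal sketch of such an array, assuming the name `x` and a resolution of 1,001 points (both are illustrative choices):

```python
import numpy as np

# 1,001 evenly spaced points from 0 to 1, the range of a conversion rate
x = np.linspace(0, 1, 1001)
```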

## Priors
---


### Treatment A

As mentioned before, treatment A has roughly a 3% conversion rate, meaning about 3% of visitors to this page click the button. We want a Beta distribution whose mean reflects this rate. A Beta(3, 97) distribution has mean 3 / (3 + 97) = 0.03, so this will be our prior:

```python
A_alpha = 3
A_beta = 97
A_distn = stats.beta(A_alpha, A_beta)
```

### Treatment B

A is the champ, B is the challenger. Since we have never released B into the wild, we'll start with a uniform prior, Beta(1, 1), which treats every conversion rate between 0 and 1 as equally likely:

```python
B_alpha = 1
B_beta = 1
B_distn = stats.beta(B_alpha, B_beta)
```

## Challenge
---
Plot the two beta distributions in the cell below.
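
One way this challenge might be approached, as a sketch rather than the official solution: evaluate each prior's density on a grid and draw both curves on the same axes. The grid `x`, the figure size, and the labels are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

x = np.linspace(0, 1, 1001)   # assumed plotting grid over [0, 1]
A_distn = stats.beta(3, 97)   # prior for treatment A
B_distn = stats.beta(1, 1)    # uniform prior for treatment B

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, A_distn.pdf(x), label="A prior: Beta(3, 97)")
ax.plot(x, B_distn.pdf(x), label="B prior: Beta(1, 1)")
ax.set_xlabel("Conversion rate")
ax.set_ylabel("Density")
ax.legend()
```

The prior for A shows up as a sharp peak near 0.03, while the prior for B is a flat line at a density of 1.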

## Challenge
---

Because this is fake data, we need to simulate a prospective customer converting with a given probability.

Write a function that takes a probability `p` as a parameter. The function should randomly return 1 with probability `p` and 0 with probability `1 - p`.

```python
fake_conversion(.03) # returns a 1 roughly 3% of the time
```
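
A possible implementation, offered as a sketch: a binomial draw with a single trial is exactly a Bernoulli draw with success probability `p`. It uses NumPy's global random state so that a later call to `np.random.seed` makes the results reproducible.

```python
import numpy as np

def fake_conversion(p):
    """Return 1 with probability p and 0 with probability 1 - p."""
    # One binomial trial with success probability p is a Bernoulli(p) draw
    return int(np.random.binomial(1, p))
```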

## Simulation

---

Now we will simulate our experiment. Because the entire experiment needs to be run in a `for` loop, the cell below is heavily commented as it will be rather large once we're done.

In [None]:
# Set a random seed of 11

# We're going to run this simulation a lot, so we'll need to
# reset our Beta distributions

# For each treatment we'll collect the results from our experiment
# Create four counter variables: A_trials, A_conversions, B_trials, B_conversions


iterations = 10
for i in range(iterations):
    # Split between A and B
        # Create fake_conversion
        # Increment A_trials
        # Increment A_conversions
        # Create new beta dist'n
    # Conditional for B
        # Create fake_conversion
        # Increment B_trials
        # Increment B_conversions
        # Create new beta dist'n
    
    # Plot ~3 charts
    continue # get rid of this!
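
The skeleton above could be filled in roughly as follows. This is a sketch under stated assumptions, not the reference solution: the true rates `A_true` and `B_true` are made-up stand-ins for the hidden parameters, visitors are split 50/50 at random, the number of iterations is raised to 1,000, and the plotting step is left out. Each posterior comes from the standard Beta-Binomial update, which adds conversions to alpha and non-conversions to beta.

```python
import numpy as np
from scipy import stats

np.random.seed(11)

# Assumption: illustrative "true" rates; in the lesson these stay hidden
A_true, B_true = 0.03, 0.04

# Reset the Beta distributions to the priors from the Priors section
A_alpha, A_beta = 3, 97
B_alpha, B_beta = 1, 1
A_distn = stats.beta(A_alpha, A_beta)
B_distn = stats.beta(B_alpha, B_beta)

A_trials = A_conversions = B_trials = B_conversions = 0

iterations = 1000
for i in range(iterations):
    if np.random.rand() < 0.5:
        # Visitor sees treatment A
        A_trials += 1
        A_conversions += int(np.random.binomial(1, A_true))
        A_distn = stats.beta(A_alpha + A_conversions,
                             A_beta + A_trials - A_conversions)
    else:
        # Visitor sees treatment B
        B_trials += 1
        B_conversions += int(np.random.binomial(1, B_true))
        B_distn = stats.beta(B_alpha + B_conversions,
                             B_beta + B_trials - B_conversions)
```

After the loop, `A_distn` and `B_distn` hold the posteriors given everything observed so far.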

## Sampling from our posteriors

---

Now that we have our posterior distributions, we can sample from each to estimate the probability that B's conversion rate is higher than A's.
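
A minimal sketch of that estimate: draw many samples from each posterior and count how often the B sample exceeds the A sample. The posteriors below are hypothetical stand-ins (the conversion counts are invented for illustration); in the notebook you would use the distributions produced by the simulation.

```python
import numpy as np
from scipy import stats

# Assumption: made-up posteriors, e.g. A had 15 conversions in 500 trials
# and B had 22 conversions in 500 trials
A_distn = stats.beta(3 + 15, 97 + 485)
B_distn = stats.beta(1 + 22, 1 + 478)

n_samples = 100_000
A_samples = A_distn.rvs(n_samples, random_state=0)
B_samples = B_distn.rvs(n_samples, random_state=1)

# Fraction of paired draws in which B's rate beats A's rate
prob_B_beats_A = np.mean(B_samples > A_samples)
print(f"P(B > A) is approximately {prob_B_beats_A:.3f}")
```

The estimate is a Monte Carlo approximation, so it gets more stable as `n_samples` grows.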

## Explore/Exploit dilemma

---

> Suppose you are faced with N slot machines (colourfully called multi-armed bandits). Each bandit has an unknown probability of distributing a prize (assume for now the prizes are the same for each bandit, only the probabilities differ). Some bandits are very generous, others not so much. Of course, you don't know what these probabilities are. By only choosing one bandit per round, our task is devise a strategy to maximize our winnings. - [Bayesian Methods for Hackers](https://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter6_Priorities/Ch6_Priors_PyMC3.ipynb#Example:-Bayesian-Multi-Armed-Bandits)

In our experiment, we're actually trying to balance two objectives: we want to find out which treatment is better, **while at the same time maximizing revenue**. 
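
One well-known strategy for this trade-off, which this lesson does not name, is Thompson sampling: for each visitor, draw one sample from each treatment's current posterior and show the treatment whose sample is highest. The sketch below assumes made-up true rates and reuses the priors from earlier; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)

# Assumption: illustrative hidden rates, unknown to the algorithm
true_rates = {"A": 0.03, "B": 0.04}
# (alpha, beta) priors from the Priors section
priors = {"A": (3, 97), "B": (1, 1)}
trials = {"A": 0, "B": 0}
conversions = {"A": 0, "B": 0}

n_visitors = 5000
for _ in range(n_visitors):
    # Draw one plausible conversion rate per treatment from its posterior
    draws = {
        name: rng.beta(a + conversions[name],
                       b + trials[name] - conversions[name])
        for name, (a, b) in priors.items()
    }
    # Show the treatment whose sampled rate is highest
    chosen = max(draws, key=draws.get)
    trials[chosen] += 1
    conversions[chosen] += int(rng.random() < true_rates[chosen])

print(trials, conversions)
```

While the posteriors are wide, both treatments get shown often (explore); as evidence accumulates, traffic tends to shift toward the treatment that looks better (exploit).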
