# Contents

1. Bayes Rule
2. An Example A/B Testing With Naive Bayes
3. BSTS
4. Naive Bayes
5. MCMC

# Bayes Rule and the canonical example

## Deriving Bayes Rule

Bayes rule is easy to derive from the following identity: $P(A,B) = P(A)P(B\mid A)$. In words, the probability of A and B both occurring equals the probability of A occurring times the probability of B given that A has occurred. Because $P(A,B) = P(B,A)$, we can also say that $P(B,A) = P(B)P(A\mid B)$. Setting both right-hand sides equal and dividing by $P(B)$ (assuming $P(B) > 0$), we get

$$ P(A \mid B) = \frac{P(A)P(B \mid A)}{P(B)}, $$

which is Bayes rule.
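
As a quick numerical sanity check, the identity can be verified on a small joint distribution. The sketch below uses a made-up 2x2 table of joint probabilities (the numbers and variable names are illustrative assumptions, not data from this notebook) and confirms that $P(A)P(B \mid A)/P(B)$ matches $P(A \mid B)$ computed directly from the table.

```python
import numpy as np

# Hypothetical joint distribution P(A, B); rows index A, columns index B.
joint = np.array([[0.10, 0.30],
                  [0.20, 0.40]])

p_a = joint.sum(axis=1)  # marginal P(A)
p_b = joint.sum(axis=0)  # marginal P(B)

p_b_given_a = joint / p_a[:, None]  # P(B | A): each row divided by P(A)
p_a_given_b = joint / p_b[None, :]  # P(A | B): each column divided by P(B)

# Bayes rule: P(A | B) = P(A) P(B | A) / P(B)
bayes = p_a[:, None] * p_b_given_a / p_b[None, :]

print(np.allclose(bayes, p_a_given_b))  # True
```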

## A canonical example

# A/B Testing 

Say that we are responsible for improving the usage rate of a feature on a media sharing website. Currently 50% of the users use the feature. We want to see if we can improve that.

## Choosing Prior Parameters

Before we get started, we need to select a prior mean and prior standard deviation for each of our experiments. 

In [1]:
mu_prior = [0.5, 0.5] # Prior for the A group (control) and B (Treatment Group)
sigma_prior = [0.1, 0.1]
delta = 0.02 # We want there to be at least a 2% increase before we change to the treatment condition
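
The notebook does not say how the prior mean and standard deviation map onto a distribution. One common option, assumed here purely for illustration, is to match them to the mean and variance of a Beta distribution. The helper name `beta_params_from_moments` is hypothetical.

```python
def beta_params_from_moments(mu, sigma):
    """Return (alpha, beta) of the Beta distribution with mean mu and std sigma."""
    var = sigma ** 2
    if not 0 < mu < 1 or var >= mu * (1 - mu):
        raise ValueError("need 0 < mu < 1 and sigma**2 < mu * (1 - mu)")
    # For Beta(alpha, beta): mean = alpha / nu and var = mu * (1 - mu) / (nu + 1),
    # where nu = alpha + beta. Solve the variance equation for nu.
    nu = mu * (1 - mu) / var - 1
    return mu * nu, (1 - mu) * nu


mu_prior = [0.5, 0.5]
sigma_prior = [0.1, 0.1]
prior_params = [beta_params_from_moments(m, s)
                for m, s in zip(mu_prior, sigma_prior)]
print(prior_params)
```

With a mean of 0.5 and a standard deviation of 0.1 this gives roughly Beta(12, 12) for each group, a prior worth about 24 pseudo-observations.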

We run our experiment and we end up getting the following results:

## Running the Experiment

In [20]:
uses = [9876, 5230]
visitors = [19400, 9580]

In [21]:
from scipy.special import betaln
from math import exp, log

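def probability_b_better(a_success, a_failure, b_success, b_failure):
    """Computes the probability that B's true rate is higher than A's.

    The arguments are the Beta posterior parameters for each group
    (successes + 1 and failures + 1 under a uniform prior). We use the beta
    distribution here because it is a good prior distribution for modeling
    the binomial parameter p. b_success must be a positive integer.
    """
    total = 0.0
    # Sum over i = 0 .. b_success - 1
    for i in range(b_success):
        total += exp(betaln(a_success + i, b_failure + a_failure) -
                     log(b_failure + i) - betaln(1 + i, b_failure) -
                     betaln(a_success, a_failure))
    return total

In [22]:
probability_b_better(uses[0] + 1, visitors[0] - uses[0] + 1, 
                    uses[1] + 1, visitors[1] - uses[1] + 1)

The result is very close to 1: the observed rates (about 50.9% for A and 54.6% for B) are far enough apart that B is almost certainly better.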
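
A closed-form function like this can be sanity-checked on small counts, where a Monte Carlo estimate is cheap and the answer is not pinned at 0 or 1. The sketch below is an assumption-laden illustration: it uses the standard closed-form sum for $P(p_B > p_A)$ under independent Beta posteriors with an integer `alpha_b`, and the function name, the Beta(3, 5) and Beta(6, 4) parameters, and the seed are arbitrary choices.

```python
from math import exp, log

import numpy as np
from scipy.special import betaln


def prob_b_beats_a(alpha_a, beta_a, alpha_b, beta_b):
    """Closed-form P(p_B > p_A) for p_A ~ Beta(alpha_a, beta_a) and
    p_B ~ Beta(alpha_b, beta_b); alpha_b must be a positive integer."""
    return sum(
        exp(betaln(alpha_a + i, beta_a + beta_b)
            - log(beta_b + i)
            - betaln(1 + i, beta_b)
            - betaln(alpha_a, beta_a))
        for i in range(alpha_b)
    )


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
a_draws = rng.beta(3, 5, size=200_000)
b_draws = rng.beta(6, 4, size=200_000)

exact = prob_b_beats_a(3, 5, 6, 4)
estimate = np.mean(b_draws > a_draws)
print(exact, estimate)  # the two values should agree to about two decimals
```

Two identical posteriors make a handy extra check, since the probability must then be exactly one half.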

## Another Approach: Simulation

In [32]:
from numpy.random import beta as beta_dist
import numpy as np
N_samp = 10000 # number of samples to draw
clicks_A = uses[0]
views_A = visitors[0]
clicks_B = uses[1]
views_B = visitors[1]
alpha = 1.1 # just for the example - set your own!
beta = 1
A_samples = beta_dist(clicks_A + alpha, views_A - clicks_A + beta, N_samp)
B_samples = beta_dist(clicks_B + alpha, views_B - clicks_B + beta, N_samp)

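Each array holds 10,000 draws from one group's posterior. The fraction of draws in which B exceeds A estimates the posterior probability that B is greater than A.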

In [37]:
len(A_samples)

10000

In [35]:
np.mean(B_samples > A_samples)

1.0

We can also estimate the probability that B's rate is at least 7% greater than A's in relative terms

In [34]:
np.mean((B_samples - A_samples)/ A_samples > .07) 

0.57620000000000005
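
The same samples can also be checked against the `delta` threshold chosen earlier. The sketch below assumes `delta = 0.02` means an absolute lift of 2 percentage points (the notebook does not say whether it is absolute or relative), redraws the posterior samples with the same illustrative prior, and reports a 95% credible interval for the difference. The seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

uses = [9876, 5230]
visitors = [19400, 9580]
alpha, beta = 1.1, 1   # same illustrative prior as above
delta = 0.02           # assumed to be an absolute difference in rates

a = rng.beta(uses[0] + alpha, visitors[0] - uses[0] + beta, 10_000)
b = rng.beta(uses[1] + alpha, visitors[1] - uses[1] + beta, 10_000)

diff = b - a
prob_exceeds_delta = np.mean(diff > delta)   # P(B - A > delta | data)
lo, hi = np.percentile(diff, [2.5, 97.5])    # 95% credible interval for B - A
print(prob_exceeds_delta, lo, hi)
```

A simple decision rule would be to switch to the treatment only when `prob_exceeds_delta` clears a threshold chosen in advance, such as 0.95.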

As we can see, simulation is quite a lot easier than evaluating the closed-form sum above.

## Advantages of Bayesian A/B Testing

# Further Reading

1. How programming makes [stats](https://speakerdeck.com/jakevdp/statistics-for-hackers) easy
2. How Bayesian stats doesn't avoid the [peeking rule](http://varianceexplained.org/r/bayesian-ab-testing/)