<h3> A/B Test </h3>
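<p>Show ad A or ad B to each customer at random, for N_a and N_b impressions respectively, and observe the click counts n_a and n_b. Each impression is then a Bernoulli trial in which the ad is clicked with probability p_a or p_b. Thus, for large N, the observed click-through rates n_a/N_a and n_b/N_b are approximately normal random variables with means p_a and p_b and standard deviations sigma = sqrt(p (1 - p) / N).</p>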

In [10]:
from __future__ import division
from scipy import stats
import math

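def estimated_parameters(N, n):
    """Estimate the click probability and the standard error of n / N."""
    p = n / N  # observed click-through rate (true division via __future__)
    sigma = math.sqrt(p * (1 - p) / N)  # standard deviation of n / N, using the estimated p
    return p, sigma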
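<p>As a rough sanity check on the normal approximation, the sketch below simulates many batches of impressions and compares the spread of the simulated click-through rates with the sqrt(p (1 - p) / N) formula. The helper names simulate_ctrs and sample_std, the true rate of 0.2, the number of batches, and the seed are all assumptions chosen only for illustration.</p>

```python
from __future__ import division, print_function
import math
import random

def simulate_ctrs(N, p, experiments, seed=0):
    """Simulate `experiments` batches of N Bernoulli(p) impressions; return each batch's CTR."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    ctrs = []
    for _ in range(experiments):
        clicks = sum(1 for _ in range(N) if rng.random() < p)
        ctrs.append(clicks / N)
    return ctrs

def sample_std(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))

N, p = 1000, 0.2  # illustrative values, not taken from real data
ctrs = simulate_ctrs(N, p, experiments=1000)
predicted_sigma = math.sqrt(p * (1 - p) / N)
print("simulated std:", sample_std(ctrs))
print("predicted std:", predicted_sigma)
```

<p>The two printed values should be close, which is what justifies treating n/N as approximately normal with that standard deviation.</p>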

<p>If n_a/N_a and n_b/N_b are independent, their difference is also approximately normal, with mean p_b - p_a and standard deviation sqrt(sigma_a^2 + sigma_b^2). (This is itself an approximation, good for large data sets, because we do not know the true standard deviations and must estimate them from the data.)</p>
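<p>The standard deviation of the difference follows from the fact that variances of independent random variables add. Spelled out, with \hat{p}_a = n_a/N_a and \hat{p}_b = n_b/N_b (notation introduced here only for brevity):</p>

```latex
\begin{align*}
\operatorname{Var}(\hat{p}_b - \hat{p}_a)
  &= \operatorname{Var}(\hat{p}_b) + \operatorname{Var}(\hat{p}_a)
   = \sigma_b^2 + \sigma_a^2, \\
\operatorname{sd}(\hat{p}_b - \hat{p}_a)
  &= \sqrt{\sigma_a^2 + \sigma_b^2},
  \qquad \text{where } \sigma^2 = \frac{p\,(1 - p)}{N} \text{ for each ad.}
\end{align*}
```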

<p>Let H_0 be the null hypothesis that p_a and p_b are the same, that is, that p_b - p_a = 0. Under H_0, the following statistic is approximately standard normal.</p>

In [8]:
def a_b_test_statistic(N_A, n_A, N_B, n_B):
    """z statistic for the difference in click-through rates (B minus A)."""
    p_A, sigma_A = estimated_parameters(N_A, n_A)
    p_B, sigma_B = estimated_parameters(N_B, n_B)
    return (p_B - p_A) / math.sqrt(sigma_A ** 2 + sigma_B ** 2)

<p>If A gets 200 clicks out of 1000 views and B gets 180 clicks out of 1000 views: </p>

In [9]:
z = a_b_test_statistic(1000, 200, 1000, 180)
print(z)

-1.1403464899


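<p>The two-sided p-value, that is, the probability under H_0 of seeing a value at least as extreme as the observed z, is computed below. Because z is negative here, the two tails together have probability 2 * cdf(z):</p>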

In [12]:
2 * stats.norm.cdf(z)

0.25414197654223603

<p>That is, if the two click-through rates were equal, there would be a ~25% chance of seeing a value at least as extreme as the observed one. So we fail to reject the null hypothesis that the ads have the same CTR.</p>
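<p>The expression 2 * stats.norm.cdf(z) is a two-sided p-value only when z is negative, as it is in this example; for a positive z it would return a number greater than 1. A minimal sketch of a sign-independent version follows. The helper name two_sided_p_value is hypothetical, and it uses the standard library's math.erfc rather than SciPy; in SciPy the equivalent would be 2 * stats.norm.sf(abs(z)).</p>

```python
from __future__ import division, print_function
import math

def two_sided_p_value(z):
    """Two-sided p-value for a standard normal statistic z, for either sign of z."""
    # P(|Z| >= |z|) = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# The observed statistic from the example above, and its mirror image.
print(two_sided_p_value(-1.1403464899))
print(two_sided_p_value(1.1403464899))
```

<p>Both calls give the same p-value, about 0.254, so the conclusion does not depend on which ad is labelled A and which B.</p>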

<p> If instead we'd seen ad B get 150 clicks out of 1000 impressions, we'd have: </p>

In [14]:
z = a_b_test_statistic(1000, 200, 1000, 150)
print(z)

-2.9488391231


In [15]:
2 * stats.norm.cdf(z)

0.0031896997062168583

<p>Since 0.003 &lt; 0.05, we would reject the null hypothesis that the ads are equally effective at the 5% significance level; ad A is likely more effective.</p>
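<p>The whole procedure can be collected into a single function. The sketch below is one possible arrangement, not the only one: the name ab_test, the alpha parameter, and the returned tuple are assumptions made for illustration, and the p-value is computed with math.erfc so that the block needs only the standard library.</p>

```python
from __future__ import division, print_function
import math

def ab_test(N_A, n_A, N_B, n_B, alpha=0.05):
    """Return (z, p_value, reject_null) for a two-sided z-test of equal CTRs."""
    p_A, p_B = n_A / N_A, n_B / N_B
    var_A = p_A * (1 - p_A) / N_A  # estimated variance of n_A / N_A
    var_B = p_B * (1 - p_B) / N_B  # estimated variance of n_B / N_B
    z = (p_B - p_A) / math.sqrt(var_A + var_B)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value, p_value < alpha

# The two scenarios discussed above: ad B gets 180 or 150 clicks out of 1000.
for clicks_B in (180, 150):
    z, p_value, reject = ab_test(1000, 200, 1000, clicks_B)
    print(clicks_B, z, p_value, reject)
```

<p>With 180 clicks for ad B the null is not rejected at the 5% level; with 150 clicks it is, matching the results above.</p>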