# Bayesian A/B Testing (with Synthetic Data)

In this notebook, we will use the PyMC3 library to do some Bayesian A/B testing. We use synthetic data so that we know the true conversion rates and can check that the analysis recovers the expected answer.

In the other notebook in this mini-project, we will use real data for a more complex analysis.

A/B testing is the bread and butter of traditional statistics; it is a large part of what statisticians do. Traditionally, an experiment is designed so that no tested factors (or as few as possible, depending on the complexity of the experiment) are confounded, a concept we will talk about a bit more later. A linear regression is then fit and conclusions are drawn from an ANalysis Of VAriance (ANOVA). Experimental design is a huge, complicated, interesting, and sometimes tedious subject, so we won't talk about it too much here. We might need to discuss it in more depth in the real-data notebook.

Introducing Bayesian statistics into the A/B testing framework lets us draw conclusions similar to those from traditional methods, but phrase them differently and more intuitively. We'll see that later.

## Storyline

Here is the story: say we are hired by a fitness blog to help drive subscriptions to its email newsletter. Improving online engagement and conversion is a pretty common use of A/B testing. A pop-up box on the website asks each visitor for their email address, and the web designer creates two versions:

1. The pop-up box shows a two-second gif of the blogger doing squats
2. The pop-up box shows a two-second gif of the blogger flexing and smiling

That is the only difference between the two versions of the pop-up box. We want to know which one gets more people to type in their email addresses.

### Experimental Design

This experiment is pretty easy to design because we are testing only one factor: the content of the gif.

In [21]:
import numpy as np
from scipy.stats import ttest_ind
import pymc3 as pm



In [2]:
np.random.seed(0)
conversions1 = np.random.binomial(1, 0.01, size=5400)
conversions2 = np.random.binomial(1, 0.015, size=4600)

In [19]:
# Conversion rates
print(f'Conversion Rate 1: {conversions1.mean():.1%}')
print(f'Conversion Rate 2: {conversions2.mean():.1%}')

Conversion Rate 1: 0.9%
Conversion Rate 2: 1.5%


In [20]:
# Traditional hypothesis test: one-sided Welch's t-test (H1: rate 1 < rate 2)
result = ttest_ind(conversions1, conversions2, equal_var=False, alternative="less")
print(f'p-value: {result.pvalue:.1%}')

p-value: 0.5%
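
The cell above runs a one-sided Welch t-test on the 0/1 outcomes. As a rough cross-check, a pooled two-proportion z-test can be sketched with `scipy.stats.norm`. This is a different, swapped-in technique, not the method used in this notebook. The helper name `two_proportion_ztest_less` is hypothetical, and the data are regenerated with the same seed and rates as above so that the block runs on its own.

```python
import numpy as np
from scipy.stats import norm

# Regenerate the synthetic data (same seed and rates as the notebook cell).
np.random.seed(0)
conversions1 = np.random.binomial(1, 0.01, size=5400)
conversions2 = np.random.binomial(1, 0.015, size=4600)


def two_proportion_ztest_less(x1, x2):
    """One-sided pooled z-test of H0: p1 >= p2 against H1: p1 < p2.

    Hypothetical helper for illustration; x1 and x2 are arrays of 0/1 outcomes.
    """
    n1, n2 = x1.size, x2.size
    p1, p2 = x1.mean(), x2.mean()
    # Pooled conversion rate under the null hypothesis of equal rates.
    pooled = (x1.sum() + x2.sum()) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Lower-tail probability, because the alternative is p1 < p2.
    return z, norm.cdf(z)


z_stat, p_value = two_proportion_ztest_less(conversions1, conversions2)
print(f'z = {z_stat:.2f}, one-sided p-value = {p_value:.1%}')
```

With samples this large, the z-test and the Welch t-test should give similar p-values; a large disagreement would suggest a mistake in one of the two calculations.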


In [22]:
# Bayesian Method
with pm.Model():
    # priors
    rate1 = pm.Beta('conversions1', 1, 99)
    rate2 = pm.Beta('conversions2', 1, 99)
    
    # likelihoods (the observed conversions, conditioned on each rate)
    obs1 = pm.Bernoulli('obs1', rate1, observed=conversions1)
    obs2 = pm.Bernoulli('obs2', rate2, observed=conversions2)
    
    # sample
    trace = pm.sample(return_inferencedata=True)


You can find the C code in this temporary file: C:\Users\johnr\AppData\Local\Temp\theano_compilation_error_l300n7_u


Exception: ("Compilation failed (return status=1): C:\\Users\\johnr\\AppData\\Local\\Temp\\cczblgZ8.o: In function `run':\r. C:/Users/johnr/AppData/Local/Theano/compiledir_Windows-10-10.0.19041-SP0-Intel64_Family_6_Model_142_Stepping_9_GenuineIntel-3.9.1-64/tmp5q3qq05h/mod.cpp:99: undefined reference to `__imp__Py_NoneStruct'\r. [... repeated linker errors omitted: further undefined references to `__imp__Py_NoneStruct', `__imp_PyExc_ValueError', `__imp_PyExc_RuntimeError', `__imp_PyExc_NotImplementedError', `__imp_PyExc_TypeError', `__imp_PyCapsule_Type', `__imp_PyExc_ImportError', and `__imp_PyExc_AttributeError' in mod.cpp, object.h, and numpy/__multiarray_api.h ...] collect2.exe: error: ld returned 1 exit status\r. ", 'FunctionGraph(Elemwise{add,no_inplace}(TensorConstant{1.0}, TensorConstant{99.0}))')
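
Because each rate has a Beta(1, 99) prior and a Bernoulli likelihood, the posterior is available in closed form: with $s$ conversions out of $n$ visitors, it is Beta($1 + s$, $99 + n - s$). The comparison between the two versions can therefore be sketched with NumPy alone, without MCMC. This is an alternative route to the same posterior, not the PyMC3 workflow itself. The helper name `beta_posterior`, the generator seed, and the number of draws are assumptions chosen for illustration, and the data are regenerated so that the block runs on its own.

```python
import numpy as np

# Regenerate the synthetic data (same seed and rates as the notebook cell).
np.random.seed(0)
conversions1 = np.random.binomial(1, 0.01, size=5400)
conversions2 = np.random.binomial(1, 0.015, size=4600)


def beta_posterior(data, a_prior=1.0, b_prior=99.0):
    """Return the (a, b) parameters of the Beta posterior for a Bernoulli rate.

    Hypothetical helper; the default prior matches the Beta(1, 99) used above.
    """
    successes = int(data.sum())
    failures = int(data.size) - successes
    return a_prior + successes, b_prior + failures


a1, b1 = beta_posterior(conversions1)
a2, b2 = beta_posterior(conversions2)

# Draw from each posterior and compare the draws pairwise.
rng = np.random.default_rng(1)  # seed chosen arbitrarily for reproducibility
draws1 = rng.beta(a1, b1, size=100_000)
draws2 = rng.beta(a2, b2, size=100_000)

prob_2_beats_1 = (draws2 > draws1).mean()
print(f'P(rate2 > rate1 | data) = {prob_2_beats_1:.1%}')
```

The printed quantity reads as "given the data and the priors, the probability that version 2 converts better than version 1", which is the kind of more intuitive phrasing that the Bayesian approach allows.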