## Targeting an experiment

In [18]:
import numpy as np
from collections import Counter

In [24]:
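# N = number of users, X = number of conversions; cont = control group, exp = experiment group.
new_zealand = {'Ncont': 6021, 'Xcont': 302, 'Nexp': 5979, 'Xexp': 374}
other = {'Ncont': 50000, 'Xcont': 2500, 'Nexp': 50000, 'Xexp': 2500}

# Adding two Counters sums the values key by key (all counts here are positive, so none are dropped).
global_ = dict(Counter(new_zealand) + Counter(other))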

In [15]:
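def p(cont_or_exp, data):
    # Observed rate X/N for one group: 'cont' (control) or 'exp' (experiment).
    return data['X{}'.format(cont_or_exp)] / data['N{}'.format(cont_or_exp)]

def p_pool(data):
    # Pooled rate across both groups.
    return (data['Xcont'] + data['Xexp']) / (data['Ncont'] + data['Nexp'])

def se_pool(data):
    # Pooled standard error of the difference between the two rates.
    return np.sqrt(p_pool(data) * (1-p_pool(data)) * ((1 / data['Ncont']) + (1 / data['Nexp'])))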

In [37]:
print('The pooled global p is {}'.format(p_pool(global_)))

The pooled global p is 0.05067857142857143


In [38]:
print('The pooled global SE is {}'.format(se_pool(global_)))

The pooled global SE is 0.0013108102809227253


In [39]:
print('The estimated difference globally is {}'.format(p('exp', global_) - p('cont', global_)))

The estimated difference globally is 0.0013237234004343165


In [40]:
print('The margin of error is {}'.format(1.96 * se_pool(global_)))

The margin of error is 0.0025691881506085417


Since the margin of error (about 0.0026) is larger than the estimated difference (about 0.0013), the 95% confidence interval includes 0, so the observed global difference is not statistically significant.
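To make the interval explicit, here is a minimal sketch that computes the 95% confidence interval for the global difference and checks whether it contains 0. The totals are the sums of the New Zealand and other-country counts from the cells above; the helper name `diff_confidence_interval` and the variable `global_totals` are introduced here for illustration only.

```python
import numpy as np

# Global totals: New Zealand counts plus other-country counts from the cells above.
global_totals = {'Ncont': 56021, 'Xcont': 2802, 'Nexp': 55979, 'Xexp': 2874}

def diff_confidence_interval(data, z=1.96):
    """Return (lower, upper) bounds for p_exp - p_cont using the pooled SE.

    Hypothetical helper for illustration; z=1.96 corresponds to a 95% interval.
    """
    p_cont = data['Xcont'] / data['Ncont']
    p_exp = data['Xexp'] / data['Nexp']
    pooled = (data['Xcont'] + data['Xexp']) / (data['Ncont'] + data['Nexp'])
    se = np.sqrt(pooled * (1 - pooled) * (1 / data['Ncont'] + 1 / data['Nexp']))
    diff = p_exp - p_cont
    return diff - z * se, diff + z * se

lower, upper = diff_confidence_interval(global_totals)
print('95% CI for the global difference: [{:.5f}, {:.5f}]'.format(lower, upper))
print('Interval contains 0: {}'.format(bool(lower <= 0 <= upper)))
```

If the lower bound is negative and the upper bound is positive, the data are consistent with no effect at the 5% level.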

In [44]:
print('The pooled New Zealand p is {}'.format(p_pool(new_zealand)))
print('The pooled New Zealand SE is {}'.format(se_pool(new_zealand)))
print('The estimated difference in New Zealand is {}'.format(p('exp', new_zealand) - p('cont', new_zealand)))
print('The margin of error is {}'.format(1.96 * se_pool(new_zealand)))

The pooled New Zealand p is 0.05633333333333333
The pooled New Zealand SE is 0.00420953442023799
The estimated difference in New Zealand is 0.012394485165776618
The margin of error is 0.00825068746366646


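There is a statistically significant effect of the experiment in New Zealand, where the estimated difference (about 0.0124) exceeds the margin of error (about 0.0083), but not in the other countries, whose control and experiment rates are identical (2500/50000 in each group).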
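The per-segment comparison can be sketched as a single loop that applies the same test to each group of countries. This is an illustrative sketch, not part of the original analysis: the `segments` dictionary and the `is_significant` helper are hypothetical names, and the 1.96 threshold assumes the same 95% level used above.

```python
import numpy as np

# Counts copied from the cells above.
segments = {
    'new_zealand': {'Ncont': 6021, 'Xcont': 302, 'Nexp': 5979, 'Xexp': 374},
    'other': {'Ncont': 50000, 'Xcont': 2500, 'Nexp': 50000, 'Xexp': 2500},
}

def is_significant(data, z_crit=1.96):
    """True if |p_exp - p_cont| exceeds the margin of error z_crit * pooled SE."""
    diff = data['Xexp'] / data['Nexp'] - data['Xcont'] / data['Ncont']
    pooled = (data['Xcont'] + data['Xexp']) / (data['Ncont'] + data['Nexp'])
    se = np.sqrt(pooled * (1 - pooled) * (1 / data['Ncont'] + 1 / data['Nexp']))
    return bool(abs(diff) > z_crit * se)

for name, data in segments.items():
    print('{}: significant = {}'.format(name, is_significant(data)))
```

Running the test on each segment separately, rather than only on the pooled global data, is what reveals that the effect is concentrated in New Zealand.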