## Determine if the difference between target and control proportions is significant

### Formulas
<div>
<img src="formula1.png" width="200"/>
</div>
<div>
<img src="formula2.png" width="200"/>
</div>
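
For readers who cannot load the images, the two formulas can be written out as below. This is a reconstruction from the `prop_diff_sig` function used in this notebook, so the notation in the images may differ slightly.

$$
P = \frac{n_t\,p_t + n_c\,p_c}{n_t + n_c}
$$

$$
Z = \frac{\lvert p_t - p_c \rvert}{\sqrt{P\,(1 - P)\left(\dfrac{1}{n_t} + \dfrac{1}{n_c}\right)}}
$$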

### Where
	-P	Pooled estimator of proportions
	-pt	Target cell proportion
	-pc	Control cell proportion
	-nt	Target cell size
	-nc	Control cell size
	-Z	Test metric
	-Zcrit	Critical value for test metric

    Significance Level         Zcrit
    99%                          2.58
    95%                          1.96
    90%                          1.64
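
The critical values in the table are consistent with two-sided quantiles of the standard normal distribution. As a minimal sketch, they can be reproduced with the standard library; the helper name `z_crit` is an assumption made for this illustration and is not part of the original notebook.

```python
from statistics import NormalDist

def z_crit(confidence):
    """Two-sided critical value of the standard normal for a given confidence level."""
    alpha = 1 - confidence
    # Split alpha evenly between both tails and take the upper quantile
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.99, 0.95, 0.90):
    print(f"{confidence:.0%}  {z_crit(confidence):.2f}")
```

Rounded to two decimals, this gives the 2.58, 1.96 and 1.64 used in the table.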

In [9]:
import math

### Example:
We targeted 15,000 customers and sent them a message; 727 of them bought the product.
The control group has 970 customers who did not receive the message; 37 of them bought the product.
Did the message have a real impact?

In [4]:
pt = 727/15000
pc = 37/970
print('Target cell proportion:',"{:.2%}".format(pt),'\nControl cell proportion:',"{:.2%}".format(pc))

Target cell proportion: 4.85% 
Control cell proportion: 3.81%


### Question: is this difference significant?

In [19]:
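def prop_diff_sig(nt,pt,nc,pc):
    """Two-proportion Z-test: print the highest confidence level reached and return Z."""
    print('are ',"{:.2%}".format(pt),' and ',"{:.2%}".format(pc), 'significantly different?' )
    # Pooled proportion across the target and control cells
    P=(nt*pt+nc*pc)/(nt+nc)
    # Test metric: absolute difference divided by its pooled standard error
    Z=abs(pt-pc)/math.sqrt(P*(1-P)*((1/nt)+(1/nc)))
    # Compare Z against the critical values, strictest level first
    if Z>2.58:
        print("sig 0.99")
    elif Z>1.96:
        print("sig 0.95")
    elif Z>1.64:
        print("sig 0.90")
    else:
        print("not significant")
    return Z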

In [25]:
nt = 15000
nc = 970
st = 727 # successful sales in the target cell
sc = 37 # successful sales in the control cell
pt = st/nt
pc = sc/nc
prop_diff_sig(nt,pt,nc,pc)

are  4.85%  and  3.81% significantly different?
not significant


1.4598485617852262
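
The fixed thresholds only say which band Z falls into. As a hedged sketch, the same Z can be turned into a two-sided p-value using the normal approximation; the helper `two_sided_p_value` is hypothetical and not part of the original notebook.

```python
import math
from statistics import NormalDist

def two_sided_p_value(z):
    """Two-sided p-value of a Z statistic under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Recompute Z for the example: 727 of 15,000 targeted vs 37 of 970 control customers
nt, nc = 15000, 970
pt, pc = 727 / nt, 37 / nc
P = (nt * pt + nc * pc) / (nt + nc)
Z = abs(pt - pc) / math.sqrt(P * (1 - P) * (1 / nt + 1 / nc))

p_value = two_sided_p_value(Z)
print(f"Z = {Z:.4f}, p-value = {p_value:.4f}")
```

The p-value comes out above 0.10, which agrees with the "not significant" verdict at the 90% level.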

## Determine if the difference between target and control averages is significant

### Formulas
<div>
<img src="formula3.png" width="200"/>
</div>
<div>
<img src="formula4.png" width="200"/>
</div>
<div>
<img src="formula5.png" width="200"/>
</div>
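
For readers who cannot load the images, the pooled standard deviation and the test metric can be written out as below. This is a reconstruction from the `mean_diff_sig` function used in this notebook; it covers only the two quantities that function computes, and the notation in the images may differ.

$$
S = \sqrt{\frac{(n_t - 1)\,s_t^2 + (n_c - 1)\,s_c^2}{n_t + n_c - 2}}
$$

$$
t = \frac{\lvert \bar{X}_t - \bar{X}_c \rvert}{S\,\sqrt{\dfrac{1}{n_t} + \dfrac{1}{n_c}}}
$$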

### Where

	-S	Pooled estimator of standard deviation
	-st	Target cell standard deviation
	-sc	Control cell standard deviation
	-nt	Target cell size
	-nc	Control cell size
	-Zcrit	Critical value for test metric
	-Xt	Target cell average
	-Xc	Control cell average
	-t	Test metric

    If nt + nc > 28 the Zcrit value can be used to determine if t is significant

    Significance Level         Zcrit
    99%                          2.58
    95%                          1.96
    90%                          1.64

### Example:
    -Group A spent on average £10 (std = £8). The size of the group is 400.
    -Group B spent on average £12 (std = £7). The size of the group is 200.
     Are these average spends significantly different?

In [29]:
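def mean_diff_sig(nt,Xt,st, nc, Xc, sc):
    """Pooled two-sample test on means: print the confidence level reached and return t."""
    print('are ',Xt,' and ',Xc, 'significantly different?' )
    # Pooled standard deviation (assumes both cells share a similar variance)
    S = math.sqrt(((nt-1)*(st**2)+(nc-1)*(sc**2))/(nt+nc-2))
    # Test metric: absolute difference of averages divided by its standard error
    t=abs(Xt-Xc)/(S*math.sqrt((1/nt)+(1/nc)))
    if nt+nc<=28:
        # Too few observations to compare t against the Zcrit values
        print('Not enough sample')
        return t
    if t>2.58:
        print("sig 0.99")
    elif t>1.96:
        print("sig 0.95")
    elif t>1.64:
        print("sig 0.90")
    else:
        print("not significant")
    return t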

In [30]:
nt = 400
nc = 200
st = 8 # standard deviation
sc = 7 # standard deviation
Xt = 10
Xc = 12

mean_diff_sig(nt,Xt,st, nc, Xc, sc)

are  10  and  12 significantly different?
sig 0.99


3.006371095133696
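
As a check on the result, the arithmetic for the values passed to the function (nt = 400, nc = 200, st = 8, sc = 7, Xt = 10, Xc = 12) can be worked through by hand. Intermediate values are rounded.

$$
S = \sqrt{\frac{399 \cdot 8^2 + 199 \cdot 7^2}{598}} = \sqrt{\frac{35287}{598}} \approx 7.68
$$

$$
t = \frac{\lvert 10 - 12 \rvert}{7.68 \cdot \sqrt{\dfrac{1}{400} + \dfrac{1}{200}}} \approx \frac{2}{0.665} \approx 3.01
$$

Since 3.01 > 2.58, the difference is significant at the 99% level, which matches the printed output.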