# Question 85 - Testing user conversion

Given the following [data set](https://docs.google.com/spreadsheets/d/1WnKwSW--x835Uokeq6xInuHg3xxdOyoKcuVfFVcV870/edit#gid=1401812744), is there a significant difference in conversion rate between the test and control groups? The relevant columns in the table are conversion and test. The conversion column takes the values 0 and 1, indicating whether the user converted (1) or not (0). The test column also takes the values 0 and 1: 0 for the control group and 1 for the test group.

The solution for premium users is written in Python using pandas.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df = pd.read_csv('q085_data.csv')
df.head()

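| Unnamed: 0 | user_id | date | source | device | browser_language | ads_channel | browser | conversion | test |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 315281 | 2015-12-03 | Direct | Web | ES | | IE | 1 | 0 |
| 1 | 497851 | 2015-12-04 | Ads | Web | ES | Google | IE | 0 | 1 |
| 2 | 848402 | 2015-12-04 | Ads | Web | ES | Facebook | Chrome | 0 | 0 |
| 3 | 290051 | 2015-12-03 | Ads | Mobile | Other | Facebook | Android_App | 0 | 1 |
| 4 | 548435 | 2015-11-30 | Ads | Web | ES | Google | FireFox | 0 | 1 |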


In [2]:
# Conversion rate per group: converted users / total users (0 = control, 1 = test)
group_conv = dict(df.groupby('test').apply(lambda s: s['conversion'].sum() / len(s)))
conv0, conv1 = group_conv[0], group_conv[1]
print(conv0, conv1)

0.053033658104517274 0.04455658809772909


In [3]:
from statsmodels.stats.weightstats import ztest

# Two-sample z-test on the 0/1 conversion indicator (control vs. test).
# H0: the difference in mean conversion between the groups is 0.
ztest(
    df[df['test']==0]['conversion'],
    x2=df[df['test']==1]['conversion'],
    value=0,
    alternative='two-sided'
)

# References:
# https://towardsdatascience.com/hypothesis-testing-in-machine-learning-using-python-a0dc89e169ce
# https://towardsdatascience.com/spotting-conversion-rate-drop-with-two-sample-hypothesis-testing-using-e-commerce-monitoring-24542ada6122


(3.631934592190223, 0.0002813044301045215)
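
As a cross-check on the result above, a comparison of this kind can also be sketched by hand as a pooled two-proportion z-test using only the standard library. The function name `two_proportion_ztest` and the counts in the usage lines are hypothetical illustrations, not values from the data set. The statistic from this formula may differ slightly from the statsmodels output, since `ztest` estimates the variance from the samples rather than from the pooled proportion.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = conv_a / n_a                      # conversion rate, group A
    p_b = conv_b / n_b                      # conversion rate, group B
    # Under H0 both groups share one rate, estimated from the pooled data.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only (not taken from q085_data.csv).
z, p = two_proportion_ztest(conv_a=60, n_a=1000, conv_b=40, n_b=1000)
print(z, p)
```

With the real data, the four inputs could be taken from something like `df.groupby('test')['conversion'].agg(['sum', 'count'])`, passing the control row as group A and the test row as group B.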

# Conclusion

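- Conversion is 5.30% for the control group (0) and 4.46% for the test group (1).
- The difference is statistically significant (z ≈ 3.63, two-sided p ≈ 0.0003), with the test group converting at the lower rate.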
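
A p-value says that the groups differ but not by how much. One way to gauge the size of the gap is a Wald confidence interval for the difference in conversion rates, sketched below with the standard library only. The helper name `diff_confidence_interval` and the counts in the usage lines are hypothetical and are not taken from the data set.

```python
import math
from statistics import NormalDist

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Wald interval for p_a - p_b; returns (diff, lower, upper)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_a - p_b
    # Unpooled standard error: each group keeps its own variance estimate.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Critical value of the standard normal, e.g. about 1.96 for 95%.
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z_crit * se, diff + z_crit * se

# Hypothetical counts, for illustration only (not taken from q085_data.csv).
diff, lower, upper = diff_confidence_interval(60, 1000, 40, 1000)
print(diff, lower, upper)
```

If the interval excludes 0, that agrees with a significant two-sided test at the matching level, and its width shows how precisely the difference is estimated.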