### 0. Intro

#### Dataset was downloaded from https://www.kaggle.com/zhangluyuan/ab-testing

In this notebook I test a hypothesis about the conversion rate of two groups: control and treatment. The groups have already been split and are labeled in the initial dataset. 

The plan for the task is as follows: 
* Designing of the test
* Data preparation
* Testing of hypothesis

In [1]:
import pandas as pd
import statsmodels.stats.power as smp
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

df = pd.read_csv('ab_data.csv')
df.head()



| | user_id | timestamp | group | landing_page | converted |
|---|---|---|---|---|---|
| 0 | 851104 | 2017-01-21 22:11:48.556739 | control | old_page | 0 |
| 1 | 804228 | 2017-01-12 08:01:45.159739 | control | old_page | 0 |
| 2 | 661590 | 2017-01-11 16:55:06.154213 | treatment | new_page | 0 |
| 3 | 853541 | 2017-01-08 18:28:03.143765 | treatment | new_page | 0 |
| 4 | 864975 | 2017-01-21 01:52:26.210827 | control | old_page | 1 |


### 1. Designing of the test

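I will use a one-tailed z-test with a significance level (alpha) of 0.05, where p<sub>0</sub> and p<sub>1</sub> are the conversion probabilities of the control and treatment groups: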

H<sub>0</sub>: p<sub>0</sub> = p<sub>1</sub>  
H<sub>1</sub>: p<sub>0</sub> < p<sub>1</sub>  

The **"converted"** column holds the target variable, from which we can estimate the conversion probability for each group.  
For each user, "converted" is the outcome of a Bernoulli trial with the group's conversion probability, so the number of conversions in a group follows a binomial distribution.   
Since ***p*** is the expected value of a Bernoulli variable, the sample conversion rate is a sample mean. By the CLT it is approximately normal for large samples, so we can compare the conversion probabilities of the two groups with a z-test.
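As a rough illustration of the statistic behind this test, here is a minimal sketch of a pooled two-proportion z-test written with the Python standard library only. The helper name `two_proportion_ztest` and the counts in the usage lines are hypothetical and chosen for illustration; they are not taken from the dataset, and the notebook itself relies on `proportions_ztest` from statsmodels.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(successes_0, n_0, successes_1, n_1):
    """One-sided pooled z-test for H1: p0 < p1 (hypothetical helper)."""
    p_0 = successes_0 / n_0
    p_1 = successes_1 / n_1
    # Under H0 both groups share one conversion probability, so pool them
    p_pooled = (successes_0 + successes_1) / (n_0 + n_1)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_0 + 1 / n_1))
    z_stat = (p_0 - p_1) / std_error
    # Left-tail probability: a small value supports p0 < p1
    p_value = NormalDist().cdf(z_stat)
    return z_stat, p_value


# Illustrative counts only (not from ab_data.csv)
z_stat, p_value = two_proportion_ztest(100, 1000, 120, 1000)
print(f'z = {z_stat:.3f}, p-value = {p_value:.3f}')
```

A negative z statistic means the first group converted less often than the second, which is the direction of the alternative hypothesis here.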

### 2. Data preparation

In [2]:
# Remove every user who appears more than once in the dataset,
# then check that each group was shown exactly one landing page

duplicated_mask = df['user_id'].duplicated(keep=False)
df = df[~duplicated_mask]
pd.crosstab(df['group'], df['landing_page'])

| group | new_page | old_page |
|---|---|---|
| control | 0 | 143293 |
| treatment | 143397 | 0 |


#### Calculating MDE and a sample size  

Ideally this step belongs in the part "Designing of the test", but we have no prior data on the conversion rate or on the expected effect of the experiment.  
For simplicity, let's assume that the expected conversion rate is the conversion rate of the control group + 0.02. 

In [3]:
# Calculating a conversion rate of control group

control_conversion_prob = round(df[df['group'] == 'control'].converted.mean(), 3)
expected_conversion_prob = control_conversion_prob + 0.02


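# Calculating the effect size for the MDE as |p_expected - p_control| / sqrt(p_control)
# (a standardized effect; note that Cohen's h would use an arcsine transform instead)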

mde = ((expected_conversion_prob - control_conversion_prob) ** 2 / control_conversion_prob) ** 0.5


# Calculating the sample size per group (solve_power defaults to a two-sided alternative)

effect_size = mde
power = 0.8
alpha = 0.05
sample_size = round(smp.NormalIndPower().solve_power(effect_size, nobs1=None, alpha=alpha, power=power))
print(f'Sample size is {sample_size}, conversion rate for control group is {control_conversion_prob}')

Sample size is 4709, conversion rate for control group is 0.12
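The reported sample size can be cross-checked by hand. The sketch below assumes the usual normal-approximation formula for two equal-sized groups and a two-sided test, n = 2 (z<sub>1-alpha/2</sub> + z<sub>power</sub>)<sup>2</sup> / d<sup>2</sup>, which ignores the negligible contribution of the opposite tail. The helper name `sample_size_per_group` is hypothetical, and Cohen's h is included only for comparison with the effect size used in the notebook.

```python
from math import asin, sqrt
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided z-test, equal group sizes."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2


p_control, p_expected = 0.12, 0.14

# Effect size as computed in the notebook: |p1 - p0| / sqrt(p0)
notebook_effect = abs(p_expected - p_control) / sqrt(p_control)

# Cohen's h (arcsine-transformed difference), shown for comparison only
cohens_h = 2 * (asin(sqrt(p_expected)) - asin(sqrt(p_control)))

n_notebook = round(sample_size_per_group(notebook_effect))
n_cohens_h = round(sample_size_per_group(cohens_h))
print(f'notebook effect size: n = {n_notebook}')
print(f"Cohen's h: n = {n_cohens_h}")
```

With the notebook's effect size this approximation gives the same 4709 users per group as `solve_power`; Cohen's h is slightly larger here, so it leads to a somewhat smaller sample.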


#### Sampling

In [4]:
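# Draw equal-sized random samples from each group
# (no random_state is set, so the results vary between runs)
control_sample = df[df['group'] == 'control'].sample(n=sample_size)
test_sample = df[df['group'] == 'treatment'].sample(n=sample_size)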

control_conv_rate = round(control_sample.converted.mean(), 3)
test_conv_rate = round(test_sample.converted.mean(), 3)

print(f'Conversion rate for control group after sampling is {control_conv_rate}, for test group is {test_conv_rate}')

Conversion rate for control group after sampling is 0.117, for test group is 0.126


### 3. Testing of hypothesis

In [5]:
control_successes = control_sample.converted.sum()
test_successes = test_sample.converted.sum()
control_count = control_sample.converted.count()
test_count = test_sample.converted.count()
successes = [control_successes, test_successes]
nobs = [control_count, test_count]

z_stat, p_value = proportions_ztest(count=successes, nobs=nobs, alternative='smaller')
(lower_control, lower_test), (upper_control, upper_test) = proportion_confint(count=successes, nobs=nobs, alpha=0.05, 
                                                                       method='normal')

print(f'Z statistic of the current test is {z_stat:.3f}, with p-value equal to {p_value:.3f}')
print(f'Confidence interval of conversion rate for control group is: from {lower_control:.3f} to {upper_control:.3f}')
print(f'Confidence interval of conversion rate for test group is: from {lower_test:.3f} to {upper_test:.3f}')

Z statistic of the current test is -1.262, with p-value equal to 0.104
Confidence interval of conversion rate for control group is: from 0.108 to 0.126
Confidence interval of conversion rate for test group is: from 0.116 to 0.135
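For reference, `method='normal'` corresponds to the normal-approximation (Wald) interval, rate ± z · sqrt(rate · (1 − rate) / n). The sketch below recomputes the control group interval from the rounded conversion rate (0.117) and the sample size (4709) printed above. The helper name `wald_confint` is hypothetical, and because the rate is rounded the result is only an approximation of the notebook's output.

```python
from math import sqrt
from statistics import NormalDist


def wald_confint(rate, n, alpha=0.05):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(rate * (1 - rate) / n)
    return rate - half_width, rate + half_width


# Rounded control conversion rate and sample size reported in the notebook
lower, upper = wald_confint(0.117, 4709)
print(f'from {lower:.3f} to {upper:.3f}')
```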


### Conclusion

Since the p-value is greater than alpha, we fail to reject the null hypothesis: the experiment gives no evidence of a significant difference between the control and test groups during the experiment period.  
But there are two points to note:   
first, the p-value (0.104) is not far above alpha (0.05),   
second, the upper bound of the confidence interval for the test group (0.135) is close to the target value (conversion rate of the control group + 0.02).  
This suggests that the recent improvements, which were shown only to users in the test group, brought the result close to significance. Perhaps after a technical redesign the experiment will show a significant result. 