### Does the new landing page increase conversion rate compared with the old page?

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

In [2]:
df = pd.read_csv('ab_data.csv')
df.head()

,user_id,timestamp,group,landing_page,converted
0,851104,11:48.6,control,old_page,0
1,804228,01:45.2,control,old_page,0
2,661590,55:06.2,treatment,new_page,0
3,853541,28:03.1,treatment,new_page,0
4,864975,52:26.2,control,old_page,1


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 294480 entries, 0 to 294479
Data columns (total 5 columns):
 #   Column        Non-Null Count   Dtype 
---  ------        --------------   ----- 
 0   user_id       294480 non-null  int64 
 1   timestamp     294480 non-null  object
 2   group         294480 non-null  object
 3   landing_page  294480 non-null  object
 4   converted     294480 non-null  int64 
dtypes: int64(2), object(3)
memory usage: 11.2+ MB


In [4]:
df.isna().sum()

user_id         0
timestamp       0
group           0
landing_page    0
converted       0
dtype: int64

In [13]:
total_control_number = df[df.group == 'control'].shape[0]
total_treatment_number = df[df.group == 'treatment'].shape[0]
print(f"Total Number of Control: {total_control_number}")
print(f"Total Number of Treatment: {total_treatment_number}")

Total Number of Control: 147202
Total Number of Treatment: 147278


In [17]:
total_control_converted = df[(df.group == 'control') & (df.converted == 1)].shape[0]
total_treatment_converted = df[(df.group == 'treatment') & (df.converted == 1)].shape[0]
print(f"Conversion Rate (Control): {total_control_converted/total_control_number}")
print(f"Conversion Rate (Treatment): {total_treatment_converted/total_treatment_number}")

Conversion Rate (Control): 0.12039917935897611
Conversion Rate (Treatment): 0.11891796466546259


#### Formulating the Hypotheses:
 - Null Hypothesis: Conversion Rate (Control) = Conversion Rate (Treatment)
 - Alternative Hypothesis: Conversion Rate (Treatment) > Conversion Rate (Control)
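The two hypotheses can be written more formally, together with the test statistic that a pooled two-proportion z-test would use. The symbols are notation introduced here, not taken from the notebook: $p_t$ and $p_c$ are the true conversion rates of the treatment and control groups, $x_t, x_c$ are the observed conversion counts, and $n_t, n_c$ are the group sizes. This is a sketch that assumes the pooled-variance form of the test.

$$
H_0: p_t = p_c \qquad \text{vs.} \qquad H_1: p_t > p_c
$$

$$
\hat{p}_t = \frac{x_t}{n_t}, \qquad
\hat{p}_c = \frac{x_c}{n_c}, \qquad
\hat{p} = \frac{x_t + x_c}{n_t + n_c}
$$

$$
z = \frac{\hat{p}_t - \hat{p}_c}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_t} + \dfrac{1}{n_c}\right)}}, \qquad
\text{p-value} = 1 - \Phi(z)
$$

Because the alternative is one-sided ("greater"), only large positive values of $z$ count as evidence against $H_0$.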

In [19]:
from statsmodels.stats.proportion import proportions_ztest

In [21]:
# Treatment goes first, so alternative='larger' tests treatment rate > control rate
conversions = [total_treatment_converted, total_control_converted]
total_size = [total_treatment_number, total_control_number]
z_stat, p_value = proportions_ztest(conversions, total_size, alternative='larger')
print("z-statistic:", z_stat)
print("p-value:", p_value)


z-statistic: -1.2382796210706355
p-value: 0.8921938012768196
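
As a sanity check, the same one-sided test can be reproduced by hand with only the standard library. This is a minimal sketch, not part of the original notebook. The function name `one_sided_two_proportion_ztest` is hypothetical, and the conversion counts (17514 and 17723) are not printed anywhere above: they are derived here by multiplying each reported conversion rate by its group size. The sketch assumes the pooled-variance form of the z-test.

```python
import math


def one_sided_two_proportion_ztest(x_t, n_t, x_c, n_c):
    """Pooled z-test of H0: p_t = p_c against H1: p_t > p_c.

    x_t, x_c: conversion counts (treatment, control)
    n_t, n_c: group sizes (treatment, control)
    """
    p_t = x_t / n_t                          # observed treatment rate
    p_c = x_c / n_c                          # observed control rate
    p_pool = (x_t + x_c) / (n_t + n_c)       # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Upper-tail probability of the standard normal: 1 - Phi(z)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Counts derived from the reported rates and group sizes (an assumption,
# since the notebook prints rates rather than raw counts).
z_manual, p_manual = one_sided_two_proportion_ztest(
    x_t=17514, n_t=147278, x_c=17723, n_c=147202
)
print("manual z-statistic:", z_manual)
print("manual p-value:", p_manual)
```

If the derived counts are right, these values should agree closely with the `proportions_ztest` output above. A negative z-statistic means the treatment group's observed rate is below the control group's, which is why the upper-tail p-value is so large.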


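**With a p-value of about 0.89, far above the 0.05 significance level, we fail to reject the null hypothesis: the data provide no evidence that the new landing page converts better than the old one. Failing to reject the null does not show that the two pages convert at the same rate; in this sample the new page's observed rate (11.89%) is in fact slightly lower than the old page's (12.04%).**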