

























# Introduction

- For this project, we will be working to understand the results of an A/B test run by an e-commerce website.

- The company has developed a new web page in order to try and increase the number of users who "convert," meaning the number of users who decide to pay for the company's product.

- Your goal is to work through this notebook to help the company understand if they should implement this new page, keep the old page, or perhaps run the experiment longer to make their decision.

# Import Libraries

In [None]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

# Import Data

In [None]:
df = pd.read_csv("
/main/ab_test.csv")
df.head()

Unnamed: 0,id,time,con_treat,page,converted
0,851104,11:48.6,control,old_page,0
1,804228,01:45.2,control,old_page,0
2,661590,55:06.2,treatment,new_page,0
3,853541,28:03.1,treatment,new_page,0
4,864975,52:26.2,control,old_page,1


# Part 01: EDA

In [None]:
# Rename the columns to more descriptive names
df.columns = ["user_id", "timestamp", "group", "landing_page", "converted"]
df.head()

Unnamed: 0,user_id,timestamp,group,landing_page,converted
0,851104,11:48.6,control,old_page,0
1,804228,01:45.2,control,old_page,0
2,661590,55:06.2,treatment,new_page,0
3,853541,28:03.1,treatment,new_page,0
4,864975,52:26.2,control,old_page,1


In [None]:
# Number of rows and unique users
print(f'Number of rows: {df.shape[0]}')
print(f'Number of unique users: {df.user_id.nunique()}')

Number of rows: 294478
Number of unique users: 290584


In [None]:
#general info
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 294478 entries, 0 to 294477
Data columns (total 5 columns):
 #   Column        Non-Null Count   Dtype 
---  ------        --------------   ----- 
 0   user_id       294478 non-null  int64 
 1   timestamp     294478 non-null  object
 2   group         294478 non-null  object
 3   landing_page  294478 non-null  object
 4   converted     294478 non-null  int64 
dtypes: int64(2), object(3)
memory usage: 11.2+ MB


In [None]:
# Check for missing values
df.isna().sum()

user_id         0
timestamp       0
group           0
landing_page    0
converted       0
dtype: int64

In [None]:
# Check duplicated rows
df.duplicated().sum()

0

In [None]:
# Check duplicated rows by user_id
df.duplicated(subset="user_id").sum()

3894

In [None]:
# Get duplicated data by user_id
df[df.duplicated(subset="user_id")]

Unnamed: 0,user_id,timestamp,group,landing_page,converted
2656,698120,13:42.6,control,old_page,0
2893,773192,55:59.6,treatment,new_page,0
7500,899953,06:54.1,control,new_page,0
8036,790934,32:20.3,treatment,new_page,0
10218,633793,16:00.7,treatment,old_page,0
...,...,...,...,...,...
294308,905197,56:47.5,treatment,new_page,0
294309,787083,15:21.0,control,old_page,0
294328,641570,59:27.7,control,old_page,0
294331,689637,34:28.3,control,new_page,0


In [None]:
# Inspect one of the duplicated users
df[df["user_id"]==899953]

Unnamed: 0,user_id,timestamp,group,landing_page,converted
3489,899953,36:02.1,treatment,new_page,0
7500,899953,06:54.1,control,new_page,0


- Using groupby, check whether there are any mismatches between group and landing page.

- The control group should see old_page only; the treatment group should see new_page only.

In [None]:
df.groupby(["group", "landing_page"]).size()

group      landing_page
control    new_page          1928
           old_page        145274
treatment  new_page        145311
           old_page          1965
dtype: int64

- Check percentage

In [None]:
(1928 + 1965) / len(df)

0.013220002852505111

- The mismatched rows are only 1.32% of the whole data.

- Deleting these mismatched rows does not seem likely to affect the overall conclusion.

- What other exploration can we do?
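The mismatch share above was computed from hard-coded counts. A minimal sketch of deriving it directly from the data is shown below; it uses a small made-up DataFrame (`toy`) in place of the real `df`, so the numbers it prints are illustrative only.

```python
import pandas as pd

# Toy data standing in for the real df (values are made up for illustration)
toy = pd.DataFrame({
    "group": ["control", "control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page", "old_page"],
})

# The page each group is supposed to see
expected_page = toy["group"].map({"control": "old_page", "treatment": "new_page"})

# A row is mismatched when the page shown differs from the expected page
is_mismatch = toy["landing_page"] != expected_page

mismatch_share = is_mismatch.mean()  # fraction of mismatched rows
print(f"Mismatched rows: {is_mismatch.sum()} ({mismatch_share:.2%})")
```

Applied to the real data, the same boolean mask could also be reused to filter out the mismatched rows, which avoids retyping the counts if the data changes.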

# Part 02: Data Cleaning

- Get clean data only.

In [None]:
df_clean = df[((df["group"] == "treatment") & (df["landing_page"] == "new_page"))
            | ((df["group"] == "control") & (df["landing_page"] == "old_page"))]

len(df_clean)

290585

- Check duplicates.

In [None]:
df_clean[df_clean.duplicated(subset="user_id")]

Unnamed: 0,user_id,timestamp,group,landing_page,converted
2893,773192,55:59.6,treatment,new_page,0


In [None]:
df_clean[df_clean['user_id'] == 773192]

Unnamed: 0,user_id,timestamp,group,landing_page,converted
1899,773192,37:58.8,treatment,new_page,0
2893,773192,55:59.6,treatment,new_page,0


- The duplicated user ID most likely belongs to a single user who landed on the new page twice and did not convert either time.

- In this case, we can simply delete one of the entries and treat the user as a non-converted user.

In [None]:
df_clean = df_clean.drop_duplicates("user_id", keep="first")

df_clean[df_clean.duplicated(subset="user_id")]

Unnamed: 0,user_id,timestamp,group,landing_page,converted


# Part 03: Designing the Hypothesis to Test

- We want to see whether the new page results in a conversion rate that is better than, worse than, or the same as the old page's.

- To be able to do that, we are going to use a two-tailed test.

- Null hypothesis: The new page's conversion rate is the same as the old page's.

  - $H_0 : p = p_0$

- Alternative hypothesis: The new page's conversion rate is different from the old page's.

  - $H_1 : p \neq p_0$

- Here $p$ is the conversion rate of the new page and $p_0$ is that of the old page. We set the confidence level to 95%, which gives us α = 0.05.

## Checking if we have enough sample

- We need to check whether we do, in fact, have a large enough sample to draw any conclusion about the difference in conversion rate.

- To do this, we need to determine the sample size needed. Remember that the required sample size is determined by several factors:

  - Power (1 - β): the probability of finding a statistically significant difference between the groups when a difference is actually present.
  - Significance level (α): the probability of finding a difference when none is actually present.
  - Effect size: how big a difference in conversion rate we expect there to be.


- Let's assume that we would be happy to detect a difference of 2 percentage points in conversion rate.

In [None]:
# Check the conversion rate of each group

df_clean.groupby("group")["converted"].mean()

group
control      0.120386
treatment    0.118808
Name: converted, dtype: float64

In [None]:
control_convertrate = df_clean[df_clean['group'] == 'control']['converted'].mean()
control_convertrate

0.1203863045004612

- Conversion rate of the control group: 12.039%

- Based on this, we can use 12.039% and 14.039% (the control rate plus 2 percentage points) to calculate the effect size we expect.


In [None]:
import scipy.stats as stats
import statsmodels.stats.api as sms

In [None]:
effect_size = sms.proportion_effectsize(control_convertrate, control_convertrate+0.02)

required_n = sms.NormalIndPower().solve_power(
    effect_size,
    power=0.8,
    alpha=0.05,
    ratio=1
    )


print(f"Required observations per group: {int(np.ceil(required_n))}")

Required observations per group: 4444


- For this experiment, we need at least 4444 observations for each group.

- Let's find out how many observations we have for each group!

In [None]:
df_clean["group"].value_counts()

treatment    145310
control      145274
Name: group, dtype: int64

- We have around 145,000 observations for each group.

- This is far larger than the required 4444 observations per group.

- We can safely say that our sample is large enough to detect whether the new page changes the conversion rate by 2 percentage points.
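As a cross-check on the required sample size, the sketch below uses the standard closed-form normal approximation $n = 2\,(z_{1-\alpha/2} + z_{1-\beta})^2 / h^2$, where $h$ is Cohen's effect size for two proportions. The helper names `cohen_h` and `required_n_per_group` are introduced here for illustration and are not part of statsmodels; the formula is an approximation to what `solve_power` computes, not a reimplementation of it.

```python
from math import asin, sqrt, ceil
from statistics import NormalDist


def cohen_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def required_n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided test, equal groups."""
    h = cohen_h(p1, p2)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power
    return 2 * (z_alpha + z_beta) ** 2 / h ** 2


# Control conversion rate from the notebook, plus 2 percentage points
baseline = 0.1203863045
n = required_n_per_group(baseline, baseline + 0.02)
print(ceil(n))
```

With these inputs the result rounds up to 4444, in line with the per-group requirement used in the text.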

In [None]:
df_clean.groupby('group')['converted'].agg(["mean", "std"])

group,mean,std
control,0.120386,0.325414
treatment,0.118808,0.323564


- Based on the statistics above:

  - It seems like the new page performs slightly worse than the old page.
  - However, the gap is small (about 12.0% vs 11.9%), so the new page does not appear to change the conversion rate much.

# Part 04: Hypothesis Testing

- Since we have a large sample size, we can use a z-test to test our hypothesis.

In [None]:
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

In [None]:
control_results = df_clean[df_clean['group'] == 'control']['converted']
treatment_results = df_clean[df_clean['group'] == 'treatment']['converted']

In [None]:
n_con = control_results.count()
n_treat = treatment_results.count()

In [None]:
successes = [control_results.sum(), treatment_results.sum()]
nobs = [n_con, n_treat]

In [None]:
z_stat, pval = proportions_ztest(successes, nobs=nobs, alternative='two-sided')

(lower_con, lower_treat), (upper_con, upper_treat) = proportion_confint(successes, nobs=nobs, alpha=0.05)


print(f'z statistic: {z_stat:.2f}')
print(f'p-value: {pval:.3f}')
print(f'ci 95% for control group: [{lower_con:.3f}, {upper_con:.3f}]')
print(f'ci 95% for treatment group: [{lower_treat:.3f}, {upper_treat:.3f}]')

z statistic: 1.31
p-value: 0.190
ci 95% for control group: [0.119, 0.122]
ci 95% for treatment group: [0.117, 0.120]
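The z statistic and p-value reported above can be reproduced by hand with the pooled two-proportion formula, which helps show what the test is doing. In the sketch below, the group sizes come from the `value_counts` output, while the conversion counts are an assumption: they are reconstructed as `round(rate * n)` from the printed conversion rates rather than read from the data.

```python
from math import sqrt
from statistics import NormalDist

# Group sizes from the notebook; conversion counts reconstructed from the
# printed rates (an assumption, not read from the data itself)
n_con, n_treat = 145274, 145310
conv_con = round(0.120386 * n_con)
conv_treat = round(0.118808 * n_treat)

p_con = conv_con / n_con
p_treat = conv_treat / n_treat

# Pooled conversion rate under the null hypothesis of equal rates
p_pool = (conv_con + conv_treat) / (n_con + n_treat)

# Standard error of the difference under the null hypothesis
se = sqrt(p_pool * (1 - p_pool) * (1 / n_con + 1 / n_treat))

z = (p_con - p_treat) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

The sign of z only reflects the order of subtraction (control minus treatment here); the two-sided p-value is unaffected by it.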


# Part 05: Conclusion

- The p-value from the test is 0.19, which is above our threshold of α = 0.05.


- The statistical conclusion is that we fail to reject the null hypothesis.

- We do not have enough statistical evidence to conclude that the new page yields a better or worse conversion rate than the old page.

- The 95% confidence intervals of the two groups also overlap, and our sample is far larger than the required size, so running the experiment longer is unlikely to change the picture. Based on this result, we recommend keeping the old page rather than implementing the new design for now.
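One way to make the conclusion more concrete is to look at a confidence interval for the difference in conversion rates rather than at the two groups separately. The sketch below computes a 95% normal-approximation (Wald) interval for control minus treatment. As an assumption, the conversion counts are reconstructed as `round(rate * n)` from the rates printed earlier, so treat the output as approximate.

```python
from math import sqrt
from statistics import NormalDist

# Group sizes from the notebook; counts reconstructed from the printed rates
# (an assumption, not read from the data itself)
n_con, n_treat = 145274, 145310
p_con = round(0.120386 * n_con) / n_con
p_treat = round(0.118808 * n_treat) / n_treat

diff = p_con - p_treat

# Unpooled standard error of the difference between two proportions
se_diff = sqrt(p_con * (1 - p_con) / n_con + p_treat * (1 - p_treat) / n_treat)

z_crit = NormalDist().inv_cdf(0.975)  # critical value for a 95% interval
lower = diff - z_crit * se_diff
upper = diff + z_crit * se_diff

print(f"difference: {diff:.4f}, 95% CI: [{lower:.4f}, {upper:.4f}]")
```

The interval contains zero, which is consistent with failing to reject the null hypothesis, and its width of well under one percentage point suggests that any true difference is much smaller than the 2 percentage points we set out to detect.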


# End of Notebook