# E-Commerce Website A/B Testing

by: Irfan Chairur Rachman

## Introduction

An A/B test is

- an experiment designed to test which version is better
- based on one or more metrics, for example signup rate, average sales per user, etc.
- using random assignment and analysis of the results

Good use of A/B testing:

- Optimizing conversion rates
- Releasing new app features
- Evaluating incremental effects of ads
- Assessing the impact of drug trials

Cases unsuited to A/B testing:

- Insufficient traffic / "small" sample size
- No clear logical hypothesis
- Ethical considerations
- High opportunity cost
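
To make the "small sample size" point concrete, here is a rough sketch of how many users per group a test might need. It uses the standard normal-approximation formula for comparing two proportions; the 12% baseline, the 13% target, the 5% significance level, and the 80% power are all assumptions chosen for illustration, and `required_sample_size` is a hypothetical helper, not part of any library.

```python
import math
from statistics import NormalDist


def required_sample_size(p_baseline, p_target, alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    variance_sum = p_baseline * (1 - p_baseline) + p_target * (1 - p_target)
    effect = p_target - p_baseline
    return math.ceil((z_alpha + z_power) ** 2 * variance_sum / effect ** 2)


# Hypothetical scenario: 12% baseline conversion, hoping to detect a lift to 13%.
n_per_group = required_sample_size(0.12, 0.13)
print(f"Users needed per group: {n_per_group}")
```

The smaller the lift you want to detect, the faster the required sample grows, which is why low-traffic pages are often a poor fit for A/B testing.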

## A/B Testing fundamental steps

1. Specify the goal and designs/experiences
2. Randomly sample users for enrollment
3. Randomly assign users to:
    - Control variant: current state
    - Treatment/test variant(s): new design
4. Log user actions and compute metrics
5. Test for statistically significant differences

![](assets/fundamental_steps.png)
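
The random assignment in step 3 can be sketched as below. This is only an illustration: the user IDs are made up, `assign_variants` is a hypothetical helper, and the fixed seed is there purely to make the sketch reproducible.

```python
import random


def assign_variants(user_ids, seed=42):
    """Randomly assign each enrolled user to 'control' or 'treatment'."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return {user_id: rng.choice(["control", "treatment"]) for user_id in user_ids}


# Hypothetical enrolled users (IDs are made up for illustration).
enrolled_users = range(1, 10_001)
assignment = assign_variants(enrolled_users)

# Count how many users landed in each variant.
counts = {"control": 0, "treatment": 0}
for variant in assignment.values():
    counts[variant] += 1
print(counts)
```

With a fair coin flip per user, the two groups end up roughly (not exactly) equal in size, which matches the near-even group counts seen in the dataset below.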

## The Value of A/B Testing

- Reduce uncertainty around the impact of new designs and features
- Decision-making becomes scientific and evidence-based rather than intuition-driven
- High return on investment: simple changes can lead to major wins
- Continuous optimization at the mature stage of the business
- Randomized assignment supports causal conclusions, since correlation alone does not imply causation

# Load Dataset

In [1]:
import pandas as pd

data = pd.read_csv("ab_data.csv")

data

Unnamed: 0,user_id,timestamp,group,landing_page,converted
0,851104,2017-01-21 22:11:48.556739,control,old_page,0
1,804228,2017-01-12 08:01:45.159739,control,old_page,0
2,661590,2017-01-11 16:55:06.154213,treatment,new_page,0
3,853541,2017-01-08 18:28:03.143765,treatment,new_page,0
4,864975,2017-01-21 01:52:26.210827,control,old_page,1
...,...,...,...,...,...
294473,751197,2017-01-03 22:28:38.630509,control,old_page,0
294474,945152,2017-01-12 00:51:57.078372,control,old_page,0
294475,734608,2017-01-22 11:45:03.439544,control,old_page,0
294476,697314,2017-01-15 01:20:28.957438,control,old_page,0


Data Description:

- `user_id`: Unique ID
- `timestamp`: Time stamp when the user visited the webpage
- `group`: [control, treatment]
    - control: users expected to be served the old_page.
    - treatment: users expected to be served the new_page.
- `landing_page`: [new_page, old_page] Which page the user actually visited.
- `converted`: [0, 1] Whether the user decided to pay for the company's product. 1 means yes, 0 means no.

## Exploratory Analysis

In [2]:
data['group'].value_counts()

group
treatment    147276
control      147202
Name: count, dtype: int64

In [3]:
data['landing_page'].value_counts()

landing_page
old_page    147239
new_page    147239
Name: count, dtype: int64

In [4]:
data[data.duplicated(subset = ['user_id', 'landing_page', 'group'], keep=False)]

Unnamed: 0,user_id,timestamp,group,landing_page,converted
1899,773192,2017-01-09 05:37:58.781806,treatment,new_page,0
2893,773192,2017-01-14 02:55:59.590927,treatment,new_page,0


In [5]:
data.pivot_table(index = 'group',
                 columns = 'landing_page',
                 aggfunc = 'count')

Unnamed: 0_level_0,converted,converted,timestamp,timestamp,user_id,user_id
landing_page,new_page,old_page,new_page,old_page,new_page,old_page
group,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
control,1928,145274,1928,145274,1928,145274
treatment,145311,1965,145311,1965,145311,1965


In [None]:
pd.crosstab(index = data['group'],
            columns = data['landing_page'])

landing_page,new_page,old_page
group,Unnamed: 1_level_1,Unnamed: 2_level_1
control,1928,145274
treatment,145311,1965


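The treatment group should have received the new_page and the control group the old_page, but the tables above show 1,928 control rows with new_page and 1,965 treatment rows with old_page. These mismatched rows should be removed.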

In [9]:
cond_control = (data['group'] == 'control') & (data['landing_page'] == 'old_page')
cond_treatment = (data['group'] == 'treatment') & (data['landing_page'] == 'new_page')

clean_data = data[(cond_control) | (cond_treatment)].copy()

In [10]:
pd.crosstab(index = clean_data['group'],
            columns = clean_data['landing_page'])

landing_page,new_page,old_page
group,Unnamed: 1_level_1,Unnamed: 2_level_1
control,0,145274
treatment,145311,0
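
The duplicate check earlier found one user with two rows in the treatment group. The cells below handle this by counting unique users and taking each user's maximum `converted` value. An alternative, sketched here on toy data (the rows are illustrative, not taken from `ab_data.csv`), is to collapse the table to one row per user up front.

```python
import pandas as pd

# Toy rows mirroring the structure of clean_data; the values are illustrative only.
toy = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "group": ["control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page"],
    "converted": [0, 0, 1, 1],
})

# Users that appear more than once.
repeat_users = toy.loc[toy["user_id"].duplicated(keep=False), "user_id"].unique()
print(f"Users with repeated rows: {list(repeat_users)}")

# Collapse to one row per user, counting the user as converted if any visit converted.
per_user = (
    toy.groupby(["user_id", "group", "landing_page"], as_index=False)["converted"].max()
)
print(per_user)
```

Either approach gives the same per-user counts; collapsing first simply makes later `mean()` calls operate on users rather than on visits.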


Calculate the mean conversion rate for each group

In [11]:
mean_global = clean_data['converted'].mean()
mean_control = clean_data[clean_data['group'] == 'control']['converted'].mean()
mean_treatment = clean_data[clean_data['group'] == 'treatment']['converted'].mean()

print(f"Global mean: {mean_global}")
print(f"Control mean: {mean_control}")
print(f"Treatment mean: {mean_treatment}")

Global mean: 0.11959667567149027
Control mean: 0.1203863045004612
Treatment mean: 0.11880724790277405


The conversion rates of the control and treatment groups are similar. To assess whether the difference is statistically significant and conclude whether new_page improves the conversion rate, we test it with an A/B test.

## A/B Testing

Alternative hypothesis:
- Based on experience research, we believe that if we update the landing page design, then the percentage of purchasing customers will increase, as measured by the conversion rate.

To test this hypothesis, we will use the `proportions_ztest` and `proportion_confint` functions from the statsmodels library.
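
For reference, the statistic behind a pooled two-proportion z-test can be written as follows. The symbols are introduced here for illustration: $x_C, x_T$ are the numbers of converted users and $n_C, n_T$ the numbers of users in the control and treatment groups. This is the textbook form of the test; to my understanding it matches the default behaviour of `proportions_ztest`, but treat that as an assumption rather than a guarantee.

$$
\hat{p}_C = \frac{x_C}{n_C}, \qquad
\hat{p}_T = \frac{x_T}{n_T}, \qquad
\hat{p} = \frac{x_C + x_T}{n_C + n_T}
$$

$$
z = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_C} + \dfrac{1}{n_T}\right)}}
$$

Under the null hypothesis that both groups share the same conversion rate, $z$ is approximately standard normal, and the two-sided p-value is $2\,\bigl(1 - \Phi(|z|)\bigr)$.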

In [12]:
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

In [14]:
n_C = clean_data[clean_data['group'] == 'control']['user_id'].nunique()
n_T = clean_data[clean_data['group'] == 'treatment']['user_id'].nunique()

print(f"Control user: {n_C}")
print(f"Treatment user: {n_T}")

Control user: 145274
Treatment user: 145310


In [16]:
converted_C = clean_data[clean_data['group'] == 'control'].groupby('user_id')['converted'].max().sum()
converted_T = clean_data[clean_data['group'] == 'treatment'].groupby('user_id')['converted'].max().sum()

print(f"Converted User Control: {converted_C}")
print(f"Converted User Treatment: {converted_T}")

Converted User Control: 17489
Converted User Treatment: 17264


In [17]:
converted_abtest = [converted_C, converted_T]
n_abtest = [n_C, n_T]

In [19]:
z_stat, pvalue = proportions_ztest(converted_abtest, nobs = n_abtest)

(C_lo95, T_lo95), (C_up95, T_up95) = proportion_confint(converted_abtest, 
                                                        nobs = n_abtest,
                                                        alpha = 0.05)

print(f"p-value {pvalue:.4f}")
print(f"Group Control 95% CI : [{C_lo95:.4f}, {C_up95:.4f}]")
print(f"Group Treatment 95% CI : [{T_lo95:.4f}, {T_up95:.4f}]")

p-value 0.1899
Group Control 95% CI : [0.1187, 0.1221]
Group Treatment 95% CI : [0.1171, 0.1205]


The p-value (0.1899) is greater than 0.05, so the control and treatment groups do not differ significantly in conversion rate. We therefore cannot conclude that the new landing page increases the conversion rate.
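
As a sanity check, the test can be reproduced by hand from the counts printed above, using only the standard library. The sketch assumes the pooled two-proportion z-test. It also computes a one-sided p-value, because the alternative hypothesis is directional (the new page increases conversion) while `proportions_ztest` is two-sided unless its `alternative` argument is set.

```python
import math
from statistics import NormalDist

# Counts reported earlier in this notebook.
converted_C, n_C = 17489, 145274
converted_T, n_T = 17264, 145310

p_C = converted_C / n_C
p_T = converted_T / n_T
p_pooled = (converted_C + converted_T) / (n_C + n_T)

# Standard error of the difference under the null hypothesis of equal rates.
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_C + 1 / n_T))
z = (p_C - p_T) / se

# Two-sided: are the rates different in either direction?
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
# One-sided: is the treatment rate higher than the control rate?
p_treatment_better = NormalDist().cdf(z)

print(f"z = {z:.4f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
print(f"one-sided p-value (treatment > control) = {p_treatment_better:.4f}")
```

Because the observed treatment rate is slightly lower than the control rate, the one-sided p-value is large, so the directional hypothesis finds even less support than the two-sided result suggests.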

### Adding countries

For customers overall, the conversion rates of the new and old landing pages do not differ significantly.

We will now test whether customers in a specific country show a significant difference in conversion rate between the new and old landing pages. In this case, we test UK customers.

In [20]:
clean_data

Unnamed: 0,user_id,timestamp,group,landing_page,converted
0,851104,2017-01-21 22:11:48.556739,control,old_page,0
1,804228,2017-01-12 08:01:45.159739,control,old_page,0
2,661590,2017-01-11 16:55:06.154213,treatment,new_page,0
3,853541,2017-01-08 18:28:03.143765,treatment,new_page,0
4,864975,2017-01-21 01:52:26.210827,control,old_page,1
...,...,...,...,...,...
294473,751197,2017-01-03 22:28:38.630509,control,old_page,0
294474,945152,2017-01-12 00:51:57.078372,control,old_page,0
294475,734608,2017-01-22 11:45:03.439544,control,old_page,0
294476,697314,2017-01-15 01:20:28.957438,control,old_page,0


Read `countries.csv` to merge with `clean_data`

In [21]:
countries = pd.read_csv('countries.csv')

countries.head()

Unnamed: 0,user_id,country
0,834778,UK
1,928468,US
2,822059,UK
3,711597,UK
4,710616,UK


In [23]:
merged_data = pd.merge(clean_data, countries, how='left', on = 'user_id')

merged_data.head()

Unnamed: 0,user_id,timestamp,group,landing_page,converted,country
0,851104,2017-01-21 22:11:48.556739,control,old_page,0,US
1,804228,2017-01-12 08:01:45.159739,control,old_page,0,US
2,661590,2017-01-11 16:55:06.154213,treatment,new_page,0,US
3,853541,2017-01-08 18:28:03.143765,treatment,new_page,0,US
4,864975,2017-01-21 01:52:26.210827,control,old_page,1,US
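
A left merge can silently add rows (if a `user_id` repeats in the lookup table) or leave rows without a country. The sketch below shows one way to guard against both with the `validate` and `indicator` parameters of `pd.merge`; the two small frames are toy stand-ins, not the real `clean_data` and `countries`.

```python
import pandas as pd

# Toy stand-ins for clean_data and countries; the values are illustrative only.
visits = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "converted": [0, 0, 1, 1],
})
country_lookup = pd.DataFrame({
    "user_id": [1, 2],
    "country": ["UK", "US"],
})

# validate="many_to_one" raises MergeError if a user_id repeats in the lookup table;
# indicator=True adds a _merge column showing which rows found a match.
merged = pd.merge(
    visits, country_lookup, how="left", on="user_id",
    validate="many_to_one", indicator=True,
)

unmatched = merged[merged["_merge"] == "left_only"]
print(f"Rows before merge: {len(visits)}, after merge: {len(merged)}")
print(f"Rows without a country: {len(unmatched)}")
```

If the row count is unchanged and no rows are unmatched, the per-country filtering that follows works on complete data.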


In [33]:
merged_data['country'].unique()

array(['US', 'CA', 'UK'], dtype=object)

In [None]:
UK_data = merged_data[merged_data['country'] == 'UK']

UK_data.shape

(203620, 6)

In [43]:
meanUK_global = UK_data['converted'].mean()
meanUK_control = UK_data[UK_data['group'] == 'control']['converted'].mean()
meanUK_treatment = UK_data[UK_data['group'] == 'treatment']['converted'].mean()

print(f"Global UK mean: {meanUK_global}")
print(f"Control UK mean: {meanUK_control}")
print(f"Treatment UK mean: {meanUK_treatment}")

Global UK mean: 0.11954621353501621
Control UK mean: 0.12062998938220143
Treatment UK mean: 0.11846443711728685


In [44]:
nUK_C = UK_data[UK_data['group'] == 'control']['user_id'].nunique()
nUK_T = UK_data[UK_data['group'] == 'treatment']['user_id'].nunique()

print(f"Control user: {nUK_C}")
print(f"Treatment user: {nUK_T}")

Control user: 101716
Treatment user: 101903


In [45]:
convertedUK_C = UK_data[UK_data['group'] == 'control'].groupby('user_id')['converted'].max().sum()
convertedUK_T = UK_data[UK_data['group'] == 'treatment'].groupby('user_id')['converted'].max().sum()

print(f"Converted User Control: {convertedUK_C}")
print(f"Converted User Treatment: {convertedUK_T}")

Converted User Control: 12270
Converted User Treatment: 12072


In [46]:
convertedUK_abtest = [convertedUK_C, convertedUK_T]
nUK_abtest = [nUK_C, nUK_T]

In [47]:
z_stat, pvalue = proportions_ztest(convertedUK_abtest, nobs = nUK_abtest)

(C_lo95, T_lo95), (C_up95, T_up95) = proportion_confint(convertedUK_abtest, 
                                                        nobs = nUK_abtest,
                                                        alpha = 0.05)

print(f"p-value {pvalue:.4f}")
print(f"Group Control 95% CI : [{C_lo95:.4f}, {C_up95:.4f}]")
print(f"Group Treatment 95% CI : [{T_lo95:.4f}, {T_up95:.4f}]")

p-value 0.1323
Group Control 95% CI : [0.1186, 0.1226]
Group Treatment 95% CI : [0.1165, 0.1204]


Based on the result, UK customers behave much like customers overall: the p-value (0.1323) is greater than alpha (0.05), so there is no significant difference in conversion rate between the new and old landing pages.
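
The UK analysis repeats the same steps as the overall test, so it could be wrapped in a function and run for every country. The sketch below is one possible shape for that, assuming a pooled two-proportion z-test; `two_sided_pvalue` and `pvalue_by_country` are hypothetical helpers and the data frame is toy data, not `merged_data`. Keep in mind that testing several subgroups raises the chance of a false positive in at least one of them.

```python
import math
from statistics import NormalDist

import pandas as pd


def two_sided_pvalue(conv_c, n_c, conv_t, n_t):
    """Pooled two-proportion z-test p-value (two-sided)."""
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (conv_c / n_c - conv_t / n_t) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def pvalue_by_country(df):
    """Run the same control-vs-treatment test separately for each country."""
    # One value per user: converted if any of the user's visits converted.
    per_user = df.groupby(["country", "group", "user_id"])["converted"].max()
    # Conversions ("sum") and users ("count") per (country, group) cell.
    stats = per_user.groupby(level=["country", "group"]).agg(["sum", "count"])
    results = {}
    for country in stats.index.get_level_values("country").unique():
        # Assumes every country has both a control and a treatment group.
        c = stats.loc[(country, "control")]
        t = stats.loc[(country, "treatment")]
        results[country] = two_sided_pvalue(c["sum"], c["count"], t["sum"], t["count"])
    return pd.Series(results, name="p_value")


# Toy data in the shape of merged_data; the values are illustrative only.
toy = pd.DataFrame({
    "user_id": range(1, 17),
    "country": ["UK"] * 8 + ["US"] * 8,
    "group": (["control"] * 4 + ["treatment"] * 4) * 2,
    "converted": [0, 1, 0, 0, 1, 1, 0, 0,   # UK
                  1, 0, 1, 0, 1, 0, 1, 0],  # US: identical rates in both groups
})
p_values = pvalue_by_country(toy)
print(p_values)
```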

# Conclusion

Based on the results of the A/B test, the new landing page does not show a significant increase in conversion rate compared to the old landing page. This was tested for all customers and for UK customers. Therefore, a new approach should be prepared to increase the conversion rate.