<h1 style="color:green; margin-bottom:15px;font-size:30px">Online Shop Conversion Rates - A/B Test Analysis 2</h1>

# Project Description

**A/B Test goal: testing changes associated with the introduction of an improved recommender system**
- groups: A - control, B - new payment funnel;
- launch date: 2020-12-07;
- date of stopping the recruitment of new users: 2020-12-21;
- test stop date: 2021-01-04;
- audience: 15% of new users from the EU region;
- expected number of test participants: 6000.
- expected effect: in 14 days from the moment of registration, users will show an improvement in each metric by at least 10%.

**Your task is to evaluate the results of the A/B test.** You have a dataset with user actions, a technical task and several additional datasets at your disposal.

- Evaluate the correctness of the test
- Analyze test results

**To evaluate the correctness of the test, check:**
- intersection of the test audience with a competing test,
- coincidence of the test and marketing events, other problems of time limits of the test.

# Contents

1. Libraries Import
2. Data Import
3. Data Description
4. Test Correctness Analysis and Data Preprocessing
5. A/B Test Analysis
6. Conclusion

## 1. Libraries Import

In [1]:
import pandas as pd
import math as mth
import numpy as np
import datetime as dt
from scipy import stats as st
import warnings
warnings.filterwarnings('ignore')
import matplotlib.pyplot as plt
import seaborn as sns
from io import BytesIO
import requests
import statistics
from plotly import graph_objects as go
import plotly.express as px

## 2. Data Import

In [2]:
#importing marketing campaigns data and removing all campaigns launched before December 2020
marketing=pd.read_csv('ab_project_marketing_events.csv')
marketing=marketing.sort_values(by='start_dt', ascending=True)
marketing['start_dt'] = pd.to_datetime(marketing['start_dt'])
marketing['finish_dt'] = pd.to_datetime(marketing['finish_dt'])
marketing.drop(marketing[marketing['start_dt'] < '2020-12-01'].index, inplace = True) 
display(marketing)
display(marketing.info())

FileNotFoundError: [Errno 2] No such file or directory: 'ab_project_marketing_events.csv'

In [None]:
#importing users data, converting first_date into datetime
new_users=pd.read_csv('final_ab_new_users.csv')
new_users['first_date'] = pd.to_datetime(new_users['first_date'])

#removing the data on devices, we won't need it further
new_users.drop('device', axis=1, inplace=True)

display(new_users)
display(new_users.info())

In [None]:
#checking the range of dates
display(new_users['first_date'].describe())

In [None]:
display(new_users['user_id'].nunique())

In [None]:
#importing events data, converting event_dt into datetime
events=pd.read_csv('final_ab_events.csv')
events['event_dt'] = pd.to_datetime(events['event_dt'])
display(events)
display(events.info())

In [None]:
events['event_name'].value_counts()

In [None]:
#checking for duplicates
events.duplicated().sum()

In [None]:
#checking the range of dates
display(events['event_dt'].describe())

In [None]:
#importing test participants data
participants=pd.read_csv('final_ab_participants.csv')
display(participants)
display(participants.info())

In [None]:
#checking for duplicates
participants.duplicated().sum()

## 3. Data Description

The dataset contains 4 tables:

**ab_project_marketing_events.csv** - marketing events for 2020, now contains 2 events intersecting with the test period.

File structure:

- name - name of the marketing event;
- regions - regions in which the advertising campaign is carried out;
- start_dt - campaign start date;
- finish_dt - campaign end date.

**final_ab_new_users.csv** - users who registered from December 7 to December 23, 2020.

File structure:

- user_id - user ID;
- first_date - registration date;
- region - user's region.

**final_ab_events.csv** - 4 targeted actions for new users from December 7, 2020 to December 30, 2020.

File structure:

- user_id - user ID;
- event_dt - date and time of the event;
- event_name - event type;
- details - additional data about the event. For example, for purchase events this field stores the cost of the purchase in dollars.

**final_ab_participants.csv** - table of test participants, currently contains 18268 participants from two different tests.

File structure:

- user_id - user ID;
- ab_test - test name;
- group - user group.

None of the 4 datasets contains outliers, gaps (except for the cost in events, which is recorded only for purchases) or duplicates. Data types have been converted.

## 4. Test Correctness Analysis and Data Preprocessing

### 4.1 Checking for test intersections

In [None]:
participants['ab_test'].value_counts()

In [None]:
interface_eu_test = participants[participants['ab_test']=='interface_eu_test']['user_id'].unique()
recommender_system_test = participants[participants['ab_test']=='recommender_system_test']['user_id'].unique()

participants_to_delete = pd.Series(list(set(interface_eu_test).intersection(set(recommender_system_test))))
print(participants_to_delete)

The dataset contains 2 tests: **recommender_system_test** and **interface_eu_test**.

826 users take part in both tests. Let's remove them from the dataset, together with all **interface_eu_test** users.
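
The same overlap can also be found without building Python sets. Below is a minimal sketch on a made-up participants table (the user IDs and rows are invented for illustration and are not taken from the dataset). It counts how many distinct tests each user appears in and keeps only the users who belong to **recommender_system_test** alone:

```python
import pandas as pd

# Toy stand-in for the participants table; all values are invented for illustration
toy_participants = pd.DataFrame({
    'user_id': ['u1', 'u2', 'u2', 'u3', 'u4', 'u4'],
    'group':   ['A',  'B',  'A',  'B',  'A',  'B'],
    'ab_test': ['recommender_system_test', 'recommender_system_test', 'interface_eu_test',
                'interface_eu_test', 'recommender_system_test', 'interface_eu_test'],
})

# Number of distinct tests per user; more than one means the user is in both tests
tests_per_user = toy_participants.groupby('user_id')['ab_test'].nunique()
overlapping_users = tests_per_user[tests_per_user > 1].index

# Keep only recommender_system_test rows of users who are not in the competing test
clean = toy_participants[
    (toy_participants['ab_test'] == 'recommender_system_test')
    & ~toy_participants['user_id'].isin(overlapping_users)
]
print(sorted(overlapping_users))
print(clean['user_id'].tolist())
```

Counting tests per user scales to any number of competing tests, whereas the set intersection has to be repeated for every pair of tests.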

In [None]:
participants.drop(participants[participants['ab_test'] == 'interface_eu_test'].index, inplace=True)

#dropping every user that also took part in the competing test
participants = participants[~participants['user_id'].isin(participants_to_delete)]

In [None]:
participants['ab_test'].value_counts()

### 4.2 Checking the division of traffic

In [None]:
print('Unique users:', participants['user_id'].nunique())

In [None]:
mixed_participants = participants.groupby(['user_id']).agg({'group': 'nunique'}).sort_values(by='user_id', ascending=False).reset_index()
mixed_participants = mixed_participants[mixed_participants['group']>1]
participants_to_delete = mixed_participants['user_id']
print('Users appearing in both test groups:', mixed_participants['user_id'].count())

### 4.3 Merging the users and analysing the regions they come from

In [None]:
participants = participants.merge(new_users, on=['user_id'], how='inner')
participants.info()

In [None]:
participants['region'].value_counts()

According to the test terms, only EU users should take part in the test, so let's remove users from other regions.

In [None]:
participants = participants[participants['region']=='EU']
print('Unique users:', participants['user_id'].nunique())

In [None]:
display(participants['first_date'].describe())

Let's check whether our sample includes 15% of all new users from Europe who registered between December 7 and 21.

In [None]:
europe = new_users[new_users['region'] == 'EU']
europe = europe[europe['first_date']>'2020-12-06']
europe = europe[europe['first_date']<'2020-12-22']
print('New users from Europe, registered in the period from 7 to 21 December:',europe['user_id'].nunique())

In [None]:
print('The percentage of new users from Europe, registered in the period from 7 to 21 December participating in the test:',round(100*participants['user_id'].nunique()/europe['user_id'].nunique(),2))

The dataset currently contains 4749 users (11% of new European users registered between December 7th and 21st) who only participated in the **recommender_system_test** and registered from December 7th to 21st, 2020.

In [None]:
#removing the ab_test and region columns - we won't need them further (first_date is kept for the 14-day filter)
participants.drop('ab_test', axis=1, inplace=True)
participants.drop('region', axis=1, inplace=True)
participants.info()

### 4.4 Merging the data on users and events

In [None]:
events = participants.merge(events, on=['user_id'], how='inner')
events['event_dt'] = events['event_dt'].dt.date
events
events.info()

### 4.5 Removing all events that happened more than 14 days after the user's registration

In [None]:
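#event_dt holds plain dates after the previous step, so it is converted back to datetime for the comparison
events = events[pd.to_datetime(events['event_dt']) <= events['first_date'] + dt.timedelta(days=14)]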
events.info()
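
The 14-day window can also be expressed as an explicit "lifetime" column, which makes the boundary easy to inspect. This is a minimal sketch on invented rows, not the notebook's data; the column name `lifetime_days` is introduced here only for illustration:

```python
import pandas as pd

# Toy events with registration dates; all values are invented for illustration
toy_events = pd.DataFrame({
    'user_id': ['u1', 'u1', 'u2'],
    'first_date': pd.to_datetime(['2020-12-07', '2020-12-07', '2020-12-20']),
    'event_dt': pd.to_datetime(['2020-12-10 09:30:00', '2020-12-25 18:00:00',
                                '2020-12-29 12:00:00']),
})

# Whole days between registration and the event (the time of day is dropped first)
toy_events['lifetime_days'] = (
    toy_events['event_dt'].dt.normalize() - toy_events['first_date']
).dt.days

# Keep events from the registration day up to and including day 14
in_window = toy_events[toy_events['lifetime_days'].between(0, 14)]
print(in_window[['user_id', 'lifetime_days']])
```

Keeping both columns as `datetime64` avoids comparing timestamps with plain `datetime.date` objects, which recent pandas versions no longer allow.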

### 4.6 Events that overlap with the advertising campaigns

The test overlaps with the Christmas&New Year Promo marketing campaign, which ran from 2020-12-25 to 2021-01-03. In this task, we will not delete data on events that occurred before 2020-12-25 from the dataset, for two reasons: there would only be 2340 events for 1530 participants (against an initially expected 6000 participants), and this advertising campaign should affect users from group A and group B equally. In future tests, we will try to avoid such overlaps.
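
The size of the overlap follows directly from the dates. Here is a small sketch using the test window from the project description and the campaign dates quoted above; the helper `overlap_days` is a hypothetical name introduced for this example:

```python
from datetime import date

def overlap_days(start_a, end_a, start_b, end_b):
    """Number of calendar days shared by two inclusive date ranges (0 if none)."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    return max(0, (earliest_end - latest_start).days + 1)

# Test window from the project description, campaign dates from this section
test_start, test_end = date(2020, 12, 7), date(2021, 1, 4)
promo_start, promo_end = date(2020, 12, 25), date(2021, 1, 3)

shared = overlap_days(test_start, test_end, promo_start, promo_end)
test_length = (test_end - test_start).days + 1
print(shared, test_length)
```

With these dates the campaign covers 10 of the 29 calendar days of the test.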

### 4.7 Events Analysis 

- How is the number of events distributed over the days?
- How does the conversion in the funnel change at different stages?
- Is the number of events per user distributed equally in the samples?

In [None]:
events_A = events[events['group']=='A']
events_B = events[events['group']=='B']

In [None]:
sample = events_A.groupby('event_dt').agg({'user_id': 'count'}).sort_values(by='event_dt', ascending=True).reset_index()

fig, ax = plt.subplots(figsize=(20,10))
rc={'axes.labelsize': 15, 'font.size': 15, 'legend.fontsize': 12, 'axes.titlesize': 18}
plt.rcParams.update(**rc)

splot=sns.barplot(x='event_dt', y='user_id', data=sample)

for p in splot.patches:
    splot.annotate(format(p.get_height(), '.0f'), 
                   (p.get_x() + p.get_width() / 2., p.get_height()), 
                   ha = 'center', va = 'center', 
                   xytext = (0, 9), 
                   textcoords = 'offset points')
    
plt.xticks(rotation=37)          
plt.xlabel(None)
plt.title('Events distribution by date in group A')
plt.ylabel(None)
plt.show()

In group A, the number of events fluctuates around 200-300 per day during the first week of observations and jumps to 755 on December 14. It then grows steadily, peaks at 1442 events per day on December 21, and falls to 301 per day by the end of the observation period (December 29).

In [None]:
sample = events_B.groupby('event_dt').agg({'user_id': 'count'}).sort_values(by='event_dt', ascending=True).reset_index()

fig, ax = plt.subplots(figsize=(20,10))
rc={'axes.labelsize': 15, 'font.size': 15, 'legend.fontsize': 12, 'axes.titlesize': 18}
plt.rcParams.update(**rc)

splot=sns.barplot(x='event_dt', y='user_id', data=sample)

for p in splot.patches:
    splot.annotate(format(p.get_height(), '.0f'), 
                   (p.get_x() + p.get_width() / 2., p.get_height()), 
                   ha = 'center', va = 'center', 
                   xytext = (0, 9), 
                   textcoords = 'offset points')
    
plt.xticks(rotation=37)           
plt.xlabel(None)
plt.title('Events distribution by date in group B')
plt.ylabel(None)
plt.show()

In group B, the number of events fluctuates around 100-300 per day from December 7 to 23 and then falls to 51 per day by December 29. Only 2 events happened on December 30.

In [None]:
users_count_A = events_A.groupby('event_name').agg({'user_id': 'nunique'}).sort_values(by='user_id', ascending=False).reset_index()
users_count_A['%'] = round(100 * users_count_A['user_id'] / events_A['user_id'].nunique(),2)

display(users_count_A)

In [None]:
users_count_B = events_B.groupby('event_name').agg({'user_id': 'nunique'}).sort_values(by='user_id', ascending=False).reset_index()
users_count_B['%'] = round(100 * users_count_B['user_id'] / events_B['user_id'].nunique(),2)

display(users_count_B)

100% of all users in Group A of this test log in, 65% view the offer screen, 30% reach the shopping cart, and 32% complete the purchase.

100% of all Group B users in this test log in, 56% view the offer screen, 28% reach the shopping cart, and 29% complete the purchase.

In a correct funnel, the percentage of users who paid for a product should not exceed the percentage of users who reached the cart.
One possible explanation is that users return to pay for the product some time later (for example, after clicking a link in an SMS or email newsletter), bypassing the cart.
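
Whether some buyers really never produce a cart event can be checked directly. This is a minimal sketch on invented data; the event names `product_cart` and `purchase` are assumptions about how these steps are labelled in `event_name`:

```python
import pandas as pd

# Toy event log; user IDs and event names are assumptions made for illustration
toy_events = pd.DataFrame({
    'user_id':    ['u1', 'u1', 'u2', 'u3', 'u3', 'u4'],
    'event_name': ['product_cart', 'purchase', 'purchase', 'product_cart', 'login', 'purchase'],
})

# Sets of users seen at each of the two steps
buyers = set(toy_events.loc[toy_events['event_name'] == 'purchase', 'user_id'])
cart_users = set(toy_events.loc[toy_events['event_name'] == 'product_cart', 'user_id'])

# Buyers who never produced a cart event
skipped_cart = buyers - cart_users
share = len(skipped_cart) / len(buyers)
print(sorted(skipped_cart), round(share, 2))
```

A large share of such buyers would support the idea that the cart is an optional step rather than a mandatory stage of the funnel.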

In [None]:
fig = go.Figure(
    go.Funnel(
        y=[
            'registration',
            'offer screen',
            'product cart',
            'successful purchase',
        ],
        x=[users_count_A['user_id'][0], users_count_A['user_id'][1], users_count_A['user_id'][3],users_count_A['user_id'][2]],
    )
)
fig.update_layout(title={'text': 'Funnel for group A'})           
fig.show() 

In [None]:
fig = go.Figure(
    go.Funnel(
        y=[
            'registration',
            'offer screen',
            'product cart',
            'successful purchase',
        ],
        x=[users_count_B['user_id'][0], users_count_B['user_id'][1], users_count_B['user_id'][3],users_count_B['user_id'][2]],
    )
) 
fig.update_layout(title={'text': 'Funnel for group B'})
fig.show() 

In [None]:
types_count = events_A.groupby('event_name').agg({'user_id': 'count'}).sort_values(by='user_id', ascending=False).reset_index()
types_count['%'] = types_count['user_id'] / types_count['user_id'].sum()

fig, ax = plt.subplots(figsize=(15,7))
rc={'axes.labelsize': 15, 'font.size': 12, 'legend.fontsize': 15, 'axes.titlesize': 15}
plt.rcParams.update(**rc)

splot=sns.barplot(x='event_name', y='%', data=types_count)

for p in splot.patches:
    splot.annotate(format(p.get_height(), '.3f'), 
                   (p.get_x() + p.get_width() / 2., p.get_height()), 
                   ha = 'center', va = 'center', 
                   xytext = (0, 9), 
                   textcoords = 'offset points')
            
plt.xlabel(None)
plt.title('Events Share by Type in Group A')
plt.ylabel(None)
plt.show()

In [None]:
types_count = events_B.groupby('event_name').agg({'user_id': 'count'}).sort_values(by='user_id', ascending=False).reset_index()
types_count['%'] = types_count['user_id'] / types_count['user_id'].sum()

fig, ax = plt.subplots(figsize=(15,7))
rc={'axes.labelsize': 15, 'font.size': 12, 'legend.fontsize': 15, 'axes.titlesize': 15}
plt.rcParams.update(**rc)

splot=sns.barplot(x='event_name', y='%', data=types_count)

for p in splot.patches:
    splot.annotate(format(p.get_height(), '.3f'), 
                   (p.get_x() + p.get_width() / 2., p.get_height()), 
                   ha = 'center', va = 'center', 
                   xytext = (0, 9), 
                   textcoords = 'offset points')
            
plt.xlabel(None)
plt.title('Events Share by Type in Group B')
plt.ylabel(None)
plt.show()

The data has been processed. Now the users and events fit the description of the test terms.

There are no users belonging to two groups at the same time.

The shares of events in groups A and B are approximately the same, so we can proceed to the analysis of the A/B test.
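
One of the questions posed at the start of this section, whether the number of events per user is distributed similarly in both groups, can be approached with a sketch like the one below. The data is invented, and the choice of summary statistics (mean, median, max) is only one reasonable option:

```python
import pandas as pd

# Toy event log with one row per event; all values are invented for illustration
toy_events = pd.DataFrame({
    'user_id': ['u1', 'u1', 'u1', 'u2', 'u3', 'u3', 'u4'],
    'group':   ['A',  'A',  'A',  'A',  'B',  'B',  'B'],
})

# Events per user, then summary statistics of that count inside each group
events_per_user = toy_events.groupby(['group', 'user_id']).size().rename('events')
summary = events_per_user.groupby(level='group').agg(['mean', 'median', 'max'])
print(summary)
```

Comparing these rows for A and B shows whether one group is noticeably more active per user than the other.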

## 5. A/B Test Analysis

### 5.1 Creating a dataframe with conversion rates for both groups (we'll start from login->product page conversion)

In [None]:
sample = pd.pivot_table(events, index ='event_name', columns ='group', values='user_id',aggfunc=pd.Series.nunique)
sample = sample.rename_axis(None).reset_index()
sample = sample.rename(columns={'index': 'event'})
print(sample)

In [None]:
sample['A'] = sample['A'].astype(float)
sample['B'] = sample['B'].astype(float)

#dividing the number of users at each later step by the number of logged-in users (row 0)
for group in ['A', 'B']:
    sample.loc[1:, group] = (sample.loc[1:, group] / sample.loc[0, group]).round(4)

sample=sample.drop(0)

In [None]:
sample

### 5.2 Choosing a threshold value for testing hypotheses

We'll run multiple comparisons (4 tests) at a significance level of 0.01 to check whether the introduction of an improved recommender system affects user conversion and the share of paying users.

The probability of getting at least one false positive in 4 independent comparisons is 1-(1-0.01)^4 = 0.039, or 3.9%.
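
The arithmetic above can be reproduced in a few lines. The sketch also shows two standard corrections, Bonferroni and Šidák, which are not used in this notebook and are included only as a reference for choosing a per-test threshold; the function names are hypothetical:

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive in n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(target_fwer, n_tests):
    """Per-test level that keeps the family-wise rate at or below target_fwer."""
    return target_fwer / n_tests

def sidak_alpha(target_fwer, n_tests):
    """Per-test level that gives exactly target_fwer for independent tests."""
    return 1 - (1 - target_fwer) ** (1 / n_tests)

# 4 comparisons, each run at a significance level of 0.01
fwer = family_wise_error_rate(0.01, 4)
print(round(fwer, 4))

# Per-test levels that would hold the family-wise rate at 0.01 instead
print(bonferroni_alpha(0.01, 4), round(sidak_alpha(0.01, 4), 6))
```

The first function answers "how large is the overall error if each test uses alpha", while the two corrections answer the reverse question.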

### 5.3 Defining the function for testing the hypothesis of equality of shares using the z test

Let's define the function for checking the hypothesis about the equality of shares **check**:

- k1 and k2 - the number of unique users in groups 1 and 2 who had a certain event
- n1 and n2 - number of unique users in groups 1 and 2
- alpha - level of statistical significance

**Null hypothesis** - the proportions of users who performed a certain action in two samples are equal

**Alternative hypothesis** - the proportions of users who have completed a certain action differ in the two samples.

**Criterion:** if p_value < alpha, we reject the null hypothesis: the difference between the shares is statistically significant at the significance level alpha.

In [None]:
def check(k1,k2,n1,n2,alpha):
    #k1, k2 - users with the event in groups 1 and 2; n1, n2 - total users in groups 1 and 2
    trials = np.array([n1, n2])
    successes = np.array([k1, k2])

    #shares in each group and in the combined sample
    p1 = successes[0]/trials[0]
    p2 = successes[1]/trials[1]
    p_combined = (successes[0] + successes[1]) / (trials[0] + trials[1])
    difference = p1 - p2

    #z statistic; under the null hypothesis it follows the standard normal distribution
    z_value = difference / mth.sqrt(
        p_combined * (1 - p_combined) * (1 / trials[0] + 1 / trials[1]))

    distr = st.norm(0, 1)
    #two-sided p-value
    p_value = (1 - distr.cdf(abs(z_value))) * 2

    print(trials, successes)

    if p_value < alpha:
        print('p-value: ', round(p_value,4), '\nRejecting the null hypothesis: the shares are different')
    else:
        print('p-value: ', round(p_value,4), '\nNo reason to reject the null hypothesis: no significant difference between the shares')
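
To make the expected inputs concrete, here is a standard-library sketch of the same pooled two-proportion z-test. The function name `two_proportion_p_value` and the counts are invented for illustration. Note that `k1` and `k2` are numbers of users with the event and `n1` and `n2` are group sizes, that is, counts rather than conversion rates:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(k1, k2, n1, n2):
    """Two-sided p-value of the pooled two-proportion z-test."""
    p1, p2 = k1 / n1, k2 / n2
    p_combined = (k1 + k2) / (n1 + n2)
    z_value = (p1 - p2) / sqrt(p_combined * (1 - p_combined) * (1 / n1 + 1 / n2))
    return 2 * (1 - NormalDist().cdf(abs(z_value)))

# Invented counts: 100 of 1000 users converted in one group, 120 of 1000 in the other
p_value = two_proportion_p_value(100, 120, 1000, 1000)
print(round(p_value, 3))
```

Passing rates instead of counts would silently shrink the sample sizes to values near 1 and push every p-value towards 1, so it is worth checking the printed `trials` and `successes` arrays whenever `check` is called.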

In [None]:
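#the z-test needs user counts, not conversion rates, so we recompute the counts from events
counts = pd.pivot_table(events, index='event_name', columns='group', values='user_id', aggfunc=pd.Series.nunique)
n_A = events[events['group']=='A']['user_id'].nunique()
n_B = events[events['group']=='B']['user_id'].nunique()
#the first row (login) is the funnel entry, so we test the 3 remaining events
for event in counts.index[1:]:
    print('For event', event)
    check(counts.loc[event, 'A'], counts.loc[event, 'B'], n_A, n_B, 0.004)
    print()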

*Conclusion:* The z-test found no statistically significant difference between groups A and B in the share of logged-in users who reach the product view, the shopping cart or the purchase. The improved recommender system therefore did not produce the expected 10% increase in conversion.
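
The 10% criterion can also be checked directly on the rounded conversion rates reported for both groups in section 4.7. In this sketch "improvement by at least 10%" is read as a relative lift of group B over group A, which is an assumption about the test terms; `relative_lift` is a hypothetical helper:

```python
def relative_lift(rate_control, rate_test):
    """Relative change of the test group's conversion rate versus control."""
    return (rate_test - rate_control) / rate_control

# Rounded conversion rates in % for (group A, group B), as reported in section 4.7
reported_rates = {
    'offer screen': (65, 56),
    'shopping cart': (30, 28),
    'purchase': (32, 29),
}

lifts = {step: relative_lift(a, b) for step, (a, b) in reported_rates.items()}
for step, lift in lifts.items():
    verdict = 'meets' if lift >= 0.10 else 'does not meet'
    print(f'{step}: {lift:+.1%} ({verdict} the +10% target)')
```

On these rounded figures every lift is negative, so group B falls short of the target at each step regardless of the outcome of the significance test.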

## 6. Conclusion

The test period overlaps with the Christmas&New Year Promo marketing campaign, which ran from 2020-12-25 to 2021-01-03, and with a competing A/B test.

1. In this task, we did not delete data on events that occurred before 2020-12-25 from the dataset, for two reasons: there would be only 2340 events for 1530 participants (against an initially expected 6000 participants), and this advertising campaign should affect users from group A and group B equally. In future real tests we will try to avoid such overlaps.
2. All users that overlap with the competing test have been removed.
3. All non-European users have been removed.

There are 4749 unique users left that match the description of the test terms. There are no users belonging to two groups at the same time. The test covers 11% of new users from Europe registered between December 7 and 21.
This is less than stated in the test terms (6000 users and 15% of all registered).

Out of 4749 new unique users, only 1939 from group A and 654 from group B logged in during the test period.

100% of all users in Group A of this test log in, 65% view the offer screen, 30% reach the shopping cart, and 32% complete the purchase.

100% of all Group B users in this test log in, 56% view the offer screen, 28% reach the shopping cart, and 29% complete the purchase.

In a correct funnel, the percentage of users who paid for a product should not exceed the percentage of users who reached the cart. One possible explanation is that users return to pay for the product some time later (for example, after clicking a link in an SMS or email newsletter), bypassing the cart.

**If we assume that the test is valid and move on to the analysis, then according to the z-test the improved recommender system produces no statistically significant change in the share of users who reach the product page, the shopping cart or the purchase, and hence there is no 10% increase in conversion.**