---

# Goals of the research

<div class="alert alert-info"> <b>

    
The goal of this research is to analyze the A/B test results for an online store's new payment funnel. The test introduced an improved recommendation system, and the expected result is at least a 10% increase in conversion for product page views, product card views, and purchases within 14 days of signing up. The test was launched on December 7, 2020, and ended on January 1, 2021. The audience is 15% of new users from the EU region, with an expected 6,000 test participants. The research first explores the data to check whether the test was carried out correctly: it studies conversion at different funnel stages, checks whether the number of events per user is distributed equally in the samples, and looks for details in the data that need to be taken into account before starting the A/B test. Finally, the research evaluates the A/B test results, using the z-criterion to check the statistical difference between the proportions, and draws conclusions from the EDA stage and from the evaluation of the A/B test results.
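
The 10% target can be made concrete with a small helper. This is only a sketch: it assumes the 10% refers to a relative increase over the control group's conversion rate, and the rates below are made-up numbers, not results from the test.

```python
def relative_lift(control_rate, treatment_rate):
    """Relative change of the treatment conversion rate over the control rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate


# Hypothetical conversion rates, for illustration only
control_rate = 0.30
treatment_rate = 0.32

lift = relative_lift(control_rate, treatment_rate)
meets_target = lift >= 0.10  # assumed reading of "at least a 10% increase"
print(f"Relative lift: {lift:.1%}; meets the 10% target: {meets_target}")
```

If the target is meant as an absolute difference in percentage points instead, the comparison would be `treatment_rate - control_rate >= 0.10`.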

In [None]:
# importing libraries
import pandas as pd
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

# 1. Opening the data file and studying the general information

In [None]:
#reading the files
marketing_events = pd.read_csv('/datasets/ab_project_marketing_events_us.csv')
ab_events = pd.read_csv('/datasets/final_ab_events_upd_us.csv')
ab_users = pd.read_csv('/datasets/final_ab_new_users_upd_us.csv')
participants = pd.read_csv('/datasets/final_ab_participants_upd_us.csv')

# Explore the data

**data types, missing values, duplicates**

**Marketing events**

In [None]:
#marketing events
marketing_events.sample(3)

In [None]:
#converting columns types
marketing_events['start_dt'] = pd.to_datetime(marketing_events['start_dt'])
marketing_events['finish_dt'] = pd.to_datetime(marketing_events['finish_dt'])
marketing_events.info()

In [None]:
#duplicates in marketing events
duplicates = marketing_events.duplicated()
num_duplicates = duplicates.sum()
dup_percentages = (duplicates.mean() * 100).round(2)

# print results
print(f"Number of duplicates: {num_duplicates}")
print(f"Percentage of duplicate rows: {dup_percentages}%")

**A/B events**

In [None]:
#ab events
ab_events.sample(3)
print()
ab_events['event_dt'] = pd.to_datetime(ab_events['event_dt'])
ab_events.info()

In [None]:
ab_events['event_name'].value_counts()

In [None]:
#missing values
missing_values = ab_events.isnull().sum()
total_cells = np.prod(ab_events.shape)  # np.product is deprecated and was removed in NumPy 2.0
total_missing = missing_values.sum()

percent_missing = (total_missing/total_cells) * 100
print(f"Total Percentage of Missing Values: {percent_missing:.2f}%\n")

percent_missing_by_column = (ab_events.isnull().sum() / ab_events.isnull().count()) * 100
print(f"Percentage of Missing Values by Column:\n{percent_missing_by_column}")

In [None]:
#fill in missing values
ab_events['details'] = ab_events['details'].fillna('No additional information')

<div class="alert alert-info"> <b>

    Here I filled in the missing values in the 'details' column with 'No additional information', because I saw that this column contains additional data about the order/purchase.
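
One way to back up that observation is to look at the share of missing `details` per event type. The sketch below uses a tiny made-up table with the same column names as `ab_events`; the values are hypothetical.

```python
import pandas as pd

# Hypothetical miniature of ab_events; values are made up for illustration
events = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u3"],
    "event_name": ["login", "purchase", "product_page", "purchase"],
    "details": [None, 4.99, None, 99.99],
})

# Share of rows with missing `details`, computed separately for each event type
missing_share = events["details"].isna().groupby(events["event_name"]).mean()
print(missing_share)
```

Keep in mind that filling a numeric column with a string turns it into `object` dtype, so sums or averages over `details` need the placeholder filtered out first.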

In [None]:
#duplicates in A/B events
duplicates1 = ab_events.duplicated()
num_duplicates1 = duplicates1.sum()
dup_percentages1 = (duplicates1.mean() * 100).round(2)

# print results
print(f"Number of duplicates: {num_duplicates1}")
print(f"Percentage of duplicate rows: {dup_percentages1}%")

**A/B users**

In [None]:
#ab users
ab_users.sample(3)
print()
ab_users['first_date'] = pd.to_datetime(ab_users['first_date'])
ab_users.info()

In [None]:
#duplicates in A/B users
duplicates2 = ab_users.duplicated()
num_duplicates2 = duplicates2.sum()
dup_percentages2 = (duplicates2.mean() * 100).round(2)

# print results
print(f"Number of duplicates: {num_duplicates2}")
print(f"Percentage of duplicate rows: {dup_percentages2}%")

**Participants**

In [None]:
#participants
participants.sample(3)
print()
participants.info()

In [None]:
#duplicates in participants
duplicates3 = participants.duplicated()  # check participants here, not ab_users
num_duplicates3 = duplicates3.sum()
dup_percentages3 = (duplicates3.mean() * 100).round(2)

# print results
print(f"Number of duplicates: {num_duplicates3}")
print(f"Percentage of duplicate rows: {dup_percentages3}%")

**Groups**

In [None]:
participants['group'].value_counts()

In [None]:
# users who appear in both groups
users_a = set(participants[participants['group'] == 'A']['user_id'])
users_b = set(participants[participants['group'] == 'B']['user_id'])
users_intersection = users_a.intersection(users_b)

In [None]:
# percentage of unique users who appear in both groups
percent_intersection = len(users_intersection) / participants['user_id'].nunique() * 100
print(f"{percent_intersection:.2f}% of users appeared in both groups A and B.")

In [None]:
# the user ids who appear in both groups
if users_intersection:
    print("The following users appeared in both groups:")
    for user in users_intersection:
        print(user)
else:
    print("No users appeared in both groups.")

In [None]:
my_list = list(users_intersection)
print("Length of the list:", len(my_list))

In [None]:
users_a = set(participants[participants['group'] == 'A']['user_id'])
users_b = set(participants[participants['group'] == 'B']['user_id'])
users_intersection = users_a.intersection(users_b)
participants = participants[~participants['user_id'].isin(users_intersection)]

In [None]:
participants.info()

In [None]:
# spot-check a user who appeared in both groups; after the filter above this should return no rows
participants.query('user_id == "23F10CDFF7372B06"')

In [None]:
#unique tests running
unique_tests = participants['ab_test'].nunique()
print("Number of unique tests running: ", unique_tests)
participants['ab_test'].value_counts()

<div class="alert alert-info"> <b>

    So far I have prepared the data in the four datasets. I filled in the missing values in the 'details' column, which holds more information about the purchase or the event, and checked how many tests were launched. There are two: interface_eu_test and recommender_system_test. It looks like some users appear in both groups in the tests; I think those users should be removed so that the test gives results that are as correct as possible. I converted columns to datetime where needed for calculations later on. I will check again for users who appear in both groups and delete them.
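
The overlap check can also be expressed as one aggregation per user, which flags users in more than one group and users in more than one test at the same time. This is a sketch on a made-up table with the same column names as `participants`; the user ids are hypothetical.

```python
import pandas as pd

# Hypothetical miniature of the participants table
participants_demo = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u3", "u3", "u4"],
    "group": ["A", "B", "A", "B", "B", "A"],
    "ab_test": [
        "recommender_system_test", "interface_eu_test",
        "recommender_system_test", "recommender_system_test",
        "interface_eu_test", "interface_eu_test",
    ],
})

# Count distinct groups and distinct tests for every user
per_user = participants_demo.groupby("user_id").agg(
    n_groups=("group", "nunique"),
    n_tests=("ab_test", "nunique"),
)

in_both_groups = per_user.index[per_user["n_groups"] > 1].tolist()
in_both_tests = per_user.index[per_user["n_tests"] > 1].tolist()
print("In both groups:", in_both_groups)
print("In both tests:", in_both_tests)
```

The two lists answer different questions: a user can sit in group B of both tests, which the group-based check alone would not flag.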

# EDA 

**Studying conversion at different funnel stages**

In [None]:
# separating users by tests
recommender_system_participants = participants[participants['ab_test'] == 'recommender_system_test']
interface_eu_participants = participants[participants['ab_test'] == 'interface_eu_test']
recommender_system_users = pd.merge(recommender_system_participants, ab_users, on='user_id')
interface_eu_users = pd.merge(interface_eu_participants, ab_users, on='user_id')

# all info about users for funnels
recommender_system_events = pd.merge(recommender_system_users, ab_events, on='user_id')
interface_eu_events = pd.merge(interface_eu_users, ab_events, on='user_id')

# participants in each test
print("participants in recommender_system_test test:", len(recommender_system_participants))
print("participants in interface_eu_test test:", len(interface_eu_participants))

**interface eu test participants funnel**

In [None]:
# funnel stages
interface_funnel = ['product_page', 'product_cart', 'purchase']
# unique users at each stage
for stage in interface_funnel:
    users_stage = interface_eu_events[interface_eu_events['event_name'] == stage]['user_id'].nunique()
    print(f"number of unique users at {stage}: {users_stage}")

In [None]:
# conversion
users_at_product_page = interface_eu_events[interface_eu_events['event_name'] == 'product_page']['user_id'].nunique()
users_at_purchase = interface_eu_events[interface_eu_events['event_name'] == 'purchase']['user_id'].nunique()
conversion_rates = users_at_purchase / users_at_product_page * 100
print(f"percent of users who purchased out of the total users in the product page stage: {conversion_rates:.2f}%")

In [None]:
interface_eu_events_grouped = interface_eu_events.groupby(['group', 'event_name'])['user_id'].nunique().reset_index(name='count')
# funnel stages and groups A and B
interface_eu_events_filtered = interface_eu_events_grouped[(interface_eu_events_grouped['event_name'].isin(interface_funnel)) &
                                                           (interface_eu_events_grouped['group'].isin(['A', 'B']))]
interface_eu_events_pivot = pd.pivot_table(interface_eu_events_filtered, values='count', index='event_name', columns='group')
interface_eu_events_pivot = interface_eu_events_pivot.reset_index()

In [None]:
# plot
fig = px.bar(interface_eu_events_pivot, x='event_name', y=['A', 'B'], barmode='group', title='Interface Sales Funnel by Test Group')
fig.update_layout(xaxis_title='Funnel Stages', yaxis_title='Number of Users')
fig.show();

**recommender system test participants funnel**

In [None]:
# funnel stages
recommender_system_funnel = ['product_page', 'product_cart', 'purchase']
# unique users at each stage
for stage1 in recommender_system_funnel:
    users_stage1 = recommender_system_events[recommender_system_events['event_name'] == stage1]['user_id'].nunique()
    print(f"number of unique users at {stage1}: {users_stage1}")

In [None]:
# conversion
users_at_product_page1 = recommender_system_events[recommender_system_events['event_name'] == 'product_page']['user_id'].nunique()
users_at_purchase1 = recommender_system_events[recommender_system_events['event_name'] == 'purchase']['user_id'].nunique()
conversion_rate1 = users_at_purchase1 / users_at_product_page1 * 100
print(f"percent of users who purchased out of the total users in the product page stage: {conversion_rate1:.2f}%")

In [None]:
recommender_system_events_grouped = recommender_system_events.groupby(['group', 'event_name'])['user_id'].nunique().reset_index(name='count')
# funnel stages and groups A and B
recommender_system_events_filtered = recommender_system_events_grouped[(recommender_system_events_grouped['event_name'].isin(recommender_system_funnel)) &
                                                           (recommender_system_events_grouped['group'].isin(['A', 'B']))]
recommender_system_events_pivot = pd.pivot_table(recommender_system_events_filtered, values='count', index='event_name', columns='group')
recommender_system_events_pivot = recommender_system_events_pivot.reset_index()

In [None]:
# plot
fig = px.bar(recommender_system_events_pivot, x='event_name', y=['A', 'B'], barmode='group', title='Recommender System Sales Funnel by Test Group')
fig.update_layout(xaxis_title='Funnel Stages', yaxis_title='Number of Users')
fig.show();

<div class="alert alert-info"> <b>

    The chart shows three funnel stages in the product. We can see clearly that almost 50% of the users at the product_page stage reach the purchase stage. Customizing the campaigns and exposing the product page to more users might increase the number of users who reach the purchase stage.
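
The share of users reaching each stage can also be computed per group in one table. The sketch below works on a made-up event table with the same column names as the merged event frames; it counts unique users per stage and divides by the users at `product_page`.

```python
import pandas as pd

funnel = ["product_page", "product_cart", "purchase"]

# Hypothetical events; a user may purchase without a product_cart event
events_demo = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2", "u3", "u4", "u4", "u5"],
    "group": ["A", "A", "A", "A", "A", "B", "B", "B", "B"],
    "event_name": [
        "product_page", "product_cart", "purchase",
        "product_page", "product_page",
        "product_page",
        "product_page", "purchase",
        "login",
    ],
})

# Unique users per funnel stage and group, in funnel order
users_per_stage = (
    events_demo[events_demo["event_name"].isin(funnel)]
    .groupby(["event_name", "group"])["user_id"].nunique()
    .unstack("group", fill_value=0)
    .reindex(funnel, fill_value=0)
)

# Share of product_page users who reached each stage, per group
share_of_first_stage = users_per_stage / users_per_stage.loc["product_page"]
print(users_per_stage)
print(share_of_first_stage)
```

Counting unique users rather than events keeps a user with many page views from inflating a stage.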

In [None]:
# adding date column
recommender_system_events['event_date'] = recommender_system_events['event_dt'].dt.date
interface_eu_events['event_date'] = interface_eu_events['event_dt'].dt.date

**recommender system test**

In [None]:
recommender_system_events.sample(3)

In [None]:
# number of events per user
recommender_event_counts = recommender_system_events.groupby('user_id').size().reset_index(name='event_count')
# mean and standard deviation of event counts
mean_count = recommender_event_counts['event_count'].mean()
std_count = recommender_event_counts['event_count'].std()
# rough spread heuristic (range below 3 standard deviations); this is not a formal normality test
is_normal = (recommender_event_counts['event_count'].max() - recommender_event_counts['event_count'].min()) < (3 * std_count)
recommender_event_counts.sample(3)

In [None]:
# results
if is_normal:
    print('the number of events is distributed equally per user')
else:
    print('the number of events is not distributed equally per user')
print()
print(f"average number of events per user: {mean_count:.2f}")
print()
print(f"maximum number of events: {recommender_event_counts['event_count'].max()}")
print()
print(f"minimum number of events: {recommender_event_counts['event_count'].min()}")

In [None]:
# plot: attach each user's group once, so every user contributes a single row
recommender_system__event_counts = recommender_event_counts.merge(
    recommender_system_events[['user_id', 'group']].drop_duplicates(), on='user_id')
fig = px.histogram(recommender_system__event_counts, x='event_count', color='group', barmode='group',
                   nbins=28, color_discrete_sequence=['blue', 'orange'],
                   labels={'event_count': 'Number of Events', 'group': 'Test Group'},
                   title='Events per User by Test Group')

fig.update_layout(xaxis_tickangle=-45)

**interface eu test**

In [None]:
interface_eu_events.sample(3)

In [None]:
# number of events per user
interface_event_counts = interface_eu_events.groupby('user_id').size().reset_index(name='event_count')
# mean and standard deviation of event counts
mean_count = interface_event_counts['event_count'].mean()
std_count = interface_event_counts['event_count'].std()
# rough spread heuristic (range below 3 standard deviations); this is not a formal normality test
is_normal1 = (interface_event_counts['event_count'].max() - interface_event_counts['event_count'].min()) < (3 * std_count)
interface_event_counts.sample(3)

In [None]:
# results
if is_normal1:
    print('the number of events is distributed equally per user')
else:
    print('the number of events is not distributed equally per user')
print()
print(f"average number of events per user: {mean_count:.2f}")
print()
print(f"maximum number of events: {interface_event_counts['event_count'].max()}")
print()
print(f"minimum number of events: {interface_event_counts['event_count'].min()}")

In [None]:
# plot: attach each user's group once, so every user contributes a single row
interr_event_counts = interface_event_counts.merge(
    interface_eu_events[['user_id', 'group']].drop_duplicates(), on='user_id')
fig = px.histogram(interr_event_counts, x='event_count', color='group', barmode='group',
                   nbins=28, color_discrete_sequence=['blue', 'orange'],
                   labels={'event_count': 'Number of Events', 'group': 'Test Group'},
                   title='Events per User by Test Group')

fig.update_layout(xaxis_tickangle=-45)

<div class="alert alert-info"> <b>The charts above show the distribution of the number of events per user in each test. After correcting the results, it seems that events are not distributed equally across users. In the recommender_system_test the average is around 6 events per user, but many users have a number of events that differs from the average. The same holds for the interface_eu_test: the average is around 7.3 events per user, but there are many users with a different number of events.
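
Another reading of "distributed equally in the samples" is a comparison between groups A and B rather than across individual users. The sketch below summarizes events per user by group on made-up data; the frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical events: one row per event, with the user's test group
events_demo = pd.DataFrame({
    "user_id": ["u1"] * 3 + ["u2"] * 5 + ["u3"] * 2 + ["u4"] * 6,
    "group": ["A"] * 3 + ["A"] * 5 + ["B"] * 2 + ["B"] * 6,
})

# One row per user with the number of events
events_per_user = (
    events_demo.groupby(["group", "user_id"]).size()
    .rename("event_count")
    .reset_index()
)

# Summary statistics of events per user, separately for each group
summary = events_per_user.groupby("group")["event_count"].agg(
    ["count", "mean", "median", "min", "max"]
)
print(summary)
```

If the per-group means and medians are close, the samples are comparable in activity even when individual users differ a lot.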

<div class="alert alert-info"> <b>

    Below I added histograms, one for each test, showing the lifetime in days of each event, counted from the user's first day (first_date).
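
Because the expected effect is defined within 14 days of signing up, the same lifetime value can be used to keep only events inside that window. This is a sketch on made-up rows; treating days 0 to 13 as the window is an assumption about how the 14 days are counted.

```python
import pandas as pd

# Hypothetical events with the user's sign-up date
events_demo = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "first_date": pd.to_datetime(["2020-12-07", "2020-12-07", "2020-12-10"]),
    "event_dt": pd.to_datetime([
        "2020-12-07 10:00:00", "2020-12-25 09:30:00", "2020-12-20 18:45:00",
    ]),
})

# Whole days between the sign-up date and the event date
events_demo["lifetime"] = (
    events_demo["event_dt"].dt.normalize() - events_demo["first_date"]
).dt.days

# Keep only events from the first 14 days (assumed window: lifetime 0..13)
within_14_days = events_demo[events_demo["lifetime"] < 14]
print(within_14_days)
```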

**recommender system test**

In [None]:
# datetime
recommender_system_events['event_date'] = pd.to_datetime(recommender_system_events['event_date'])
recommender_system_events['lifetime'] = (recommender_system_events['event_date'] - recommender_system_events['first_date']).dt.days
recommender_system_events.sample()

In [None]:
# histogram
fig, ax = plt.subplots(figsize=(10, 7))
ax.hist(recommender_system_events['lifetime'], bins=32, color='skyblue', edgecolor='black')
ax.set_title('Lifetime of Event since Registration')
ax.set_xlabel('Days since Registration')
ax.set_ylabel('Number of Events')
plt.show();

**interface eu test**

In [None]:
# datetime
interface_eu_events['event_date'] = pd.to_datetime(interface_eu_events['event_date'])
interface_eu_events['lifetime'] = (interface_eu_events['event_date'] - interface_eu_events['first_date']).dt.days
interface_eu_events.sample()

In [None]:
# histogram
fig, ax = plt.subplots(figsize=(10, 7))
ax.hist(interface_eu_events['lifetime'], bins=32, color='skyblue', edgecolor='black')
ax.set_title('Lifetime of Event since Registration')
ax.set_xlabel('Days since Registration')
ax.set_ylabel('Number of Events')
plt.show();

**Are there users who enter both samples?**

In [None]:
participants['ab_test'].unique()
participants = participants.drop_duplicates(subset='user_id', keep=False)

In [None]:
# Check the number of participants remaining in each test
test_counts = participants['ab_test'].value_counts()
print(f"Number of participants in each test:\n{test_counts}")

In [None]:
group_counts = participants.groupby(['ab_test', 'group']).size()
print("Number of users in each group for each test:")
print(group_counts)

**How is the number of events distributed by days?**

**recommender system test**

In [None]:
recommender_system_events.sample()
# Get the day of the week for each event and count the number of events in each day
recommender_events_by_day = recommender_system_events.groupby(recommender_system_events['event_dt'].dt.day_name()).size()
recommender_events_by_day

In [None]:
# bar chart
recommender_events_by_day = recommender_events_by_day.reindex(["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"])
fig = px.bar(x=recommender_events_by_day.index, y=recommender_events_by_day.values, 
             labels={'x': 'Day of the Week', 'y': 'Count of Events'})
fig.update_layout(title='Number of Events by Day of the Week')

**interface eu test**

In [None]:
interface_eu_events.sample()
# Get the day of the week for each event and count the number of events in each day
interface_events_by_day = interface_eu_events.groupby(interface_eu_events['event_dt'].dt.day_name()).size()
interface_events_by_day

In [None]:
# bar chart
interface_events_by_day = interface_events_by_day.reindex(["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"])
fig = px.bar(x=interface_events_by_day.index, y=interface_events_by_day.values, 
             labels={'x': 'Day of the Week', 'y': 'Count of Events'})
fig.update_layout(title='Number of Events by Day of the Week')

<div class="alert alert-info"> <b>

    The highest count of events is on Monday and Tuesday, and the lowest count is on Fridays, maybe because it's right before the weekend.

**Think of the possible details in the data that you have to take into account before starting the A/B test?**

**recommender system test**

In [None]:
# new user
new_users = recommender_system_events[recommender_system_events['lifetime'] == 0]
new_users_by_group = new_users.groupby(['ab_test', 'group'])['user_id'].nunique()
new_users_by_group

In [None]:
# unique users per sign-up date and group
recommender_new_users = recommender_system_events.groupby(['group', 'first_date'])['user_id'].nunique().reset_index()
# pivot: one column per group; dates with no sign-ups in a group count as 0
new_users_pivot = recommender_new_users.pivot(index='first_date', columns='group', values='user_id').fillna(0)
# cumsum
cumulative_new_users = new_users_pivot.cumsum()
cumulative_new_users = cumulative_new_users.reset_index().melt(id_vars='first_date', var_name='group', value_name='cumulative_new_users')

In [None]:
# plot 
fig = px.line(cumulative_new_users, x='first_date', y='cumulative_new_users', color='group',
              title='Recommender Cumulative New Users Over Time')
fig.show()

**interface eu test**

In [None]:
# new user
new_users1 = interface_eu_events[interface_eu_events['lifetime'] == 0]
new_users_by_group1 = new_users1.groupby(['ab_test', 'group'])['user_id'].nunique()
new_users_by_group1

In [None]:
# unique users per sign-up date and group
interface_eu_new_users = interface_eu_events.groupby(['group', 'first_date'])['user_id'].nunique().reset_index()
# pivot: one column per group; dates with no sign-ups in a group count as 0
interface_new_users_pivot = interface_eu_new_users.pivot(index='first_date', columns='group', values='user_id').fillna(0)
# cumsum
interface_cumulative_new_users = interface_new_users_pivot.cumsum()
interface_cumulative_new_users = interface_cumulative_new_users.reset_index().melt(id_vars='first_date', var_name='group', value_name='interface_cumulative_new_users')

In [None]:
# plot 
fig = px.line(interface_cumulative_new_users, x='first_date', y='interface_cumulative_new_users', color='group',
              title='Interface EU Cumulative New Users Over Time')
fig.show()

<div class="alert alert-info"> <b>
    
    Before starting the A/B test, it is worth checking that user demographics are evenly distributed across regions and devices, checking for any seasonality effects on user behavior, and making sure that the users in the A/B test are randomly selected from the entire user population so that they are representative.
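
Seasonality can be checked against the marketing calendar: any campaign that overlaps the test period may affect user behavior in both groups. The sketch below uses the test dates from the goals section and the `start_dt` / `finish_dt` columns of `marketing_events`; the campaign names and dates, and the `name` column itself, are hypothetical.

```python
import pandas as pd

# Test period as stated in the goals of the research
test_start = pd.Timestamp("2020-12-07")
test_end = pd.Timestamp("2021-01-01")

# Hypothetical marketing calendar, for illustration only
marketing_demo = pd.DataFrame({
    "name": ["Hypothetical Winter Promo", "Hypothetical Spring Sale"],
    "start_dt": pd.to_datetime(["2020-12-25", "2020-03-01"]),
    "finish_dt": pd.to_datetime(["2021-01-03", "2020-03-10"]),
})

# Two intervals overlap when each one starts before the other one ends
overlaps = marketing_demo[
    (marketing_demo["start_dt"] <= test_end)
    & (marketing_demo["finish_dt"] >= test_start)
]
print(overlaps)
```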

<div class="alert alert-info"> <b>

    The two plots above show the cumulative number of new users in each test since its start. In the recommender test the number of new users was fairly stable until the 14th of the month; after that it started to go up in group A, while group B kept growing at a steady pace. In the EU test the count grew from the beginning, and we can also see clearly that after the 14th of the month it started to increase faster in both groups A and B.

<div class="alert alert-info"> <b>

    According to the technical description of the project, the groups are A (control) and B (new payment funnel). The plot shows that about 7 days after the launch, the cumulative number of new users in group A (control) grows faster and faster. Group B is supposed to contain the users trying the new change; its cumulative number of new users stayed lower than in the control group and remained stable until the end of the test.

**Evaluate the A/B test results**

**recommender system test**

In [None]:
# dates of the test: events are grouped by the users' sign-up date (first_date), not by the event date
recommender_test_dates = sorted(recommender_system_events['first_date'].unique())
recommender_events_by_date = recommender_system_events.groupby('first_date')['event_name'].count().reset_index()
recommender_events_by_date.columns = ['date', 'events']
recommender_events_by_date

In [None]:
# plot 
fig = px.bar(recommender_events_by_date, x='date', y='events', title='Recommender events by dates')
fig.update_layout(xaxis=dict(type='category'))

<div class="alert alert-info"> <b>
    
    I added a title for the project. Comparing the dates in the plot with the technical description of the project, the time period aligns: the test was launched on December 7 and kept running until January 1. Since conversion is measured within 14 days of signing up, the test had to run for at least two weeks, so the plot above matches the project description.
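
The date check can also be done directly on the sign-up dates instead of reading it off the plot. The sketch below uses the launch and end dates from the goals section; the users are made up, and requiring 14 full days before the end of the test is an assumption about how the window is applied.

```python
import pandas as pd

launch = pd.Timestamp("2020-12-07")
end = pd.Timestamp("2021-01-01")

# Hypothetical users with their sign-up dates
users_demo = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "first_date": pd.to_datetime(["2020-12-07", "2020-12-21", "2020-12-23"]),
})

# Did the user sign up while the test was running?
in_window = users_demo["first_date"].between(launch, end)

# Did the user have 14 full days before the test ended? (assumed requirement)
full_window = users_demo["first_date"] + pd.Timedelta(days=14) <= end

print(users_demo.assign(in_window=in_window, full_window=full_window))
```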

In [None]:
# regions of the test
recommender_total_users = len(recommender_system_events['user_id'].unique())
# percentage of users in each region
recommender_region_percentages = recommender_system_events.groupby("region")["user_id"].nunique() / recommender_total_users * 100
recommender_region_percentages = recommender_region_percentages.reset_index()
recommender_region_percentages.columns = ["region", "percentage"]
recommender_region_percentages
recommender_total_users

**interface eu test**

In [None]:
# dates of the test: events are grouped by the users' sign-up date (first_date), not by the event date
interface_test_dates = sorted(interface_eu_events['first_date'].unique())
interface_events_by_date = interface_eu_events.groupby('first_date')['event_name'].count().reset_index()
interface_events_by_date.columns = ['date', 'events']
interface_events_by_date

In [None]:
# plot 
fig = px.bar(interface_events_by_date, x='date', y='events', title='INTERFACE events by dates')
fig.update_layout(xaxis=dict(type='category'))

<div class="alert alert-info"> <b>

    Here too the dates in the plot align with the technical description of the project, and they align better than in the other test, because we know that the test kept running until January 1.

In [None]:
# regions
interface_total_users = len(interface_eu_events['user_id'].unique())
# percentage of users in each region
interface_region_percentages = interface_eu_events.groupby('region')['user_id'].nunique() / interface_total_users * 100
interface_region_percentages = interface_region_percentages.reset_index()
interface_region_percentages.columns = ['region', 'percentage']
interface_region_percentages
interface_total_users

<div class="alert alert-info"> <b>
    The plots above show the number of events on the website by date. In the Recommender system test, the number of events increased sharply two weeks after the launch. In both tests I noticed a peak in the number of events every 7 days; maybe it is a weekend day on which users are more active. The number of events per day in the EU Interface test looks more normally distributed than in the other test. In general, the EU Interface test fits the description better, and it runs for a longer period than the other test. However, if users are expected to show more events after 14 days, the increase in events after 14 days in the recommender test may also be a good sign.
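
The region cells above report the share of each region among test users, but the goals state the audience the other way around: 15% of new users from the EU region. A sketch of that second calculation follows, on made-up tables with the same column names as `ab_users` and `participants`; the region label other than EU is hypothetical.

```python
import pandas as pd

# Hypothetical new users and test participants
users_demo = pd.DataFrame({
    "user_id": [f"u{i}" for i in range(1, 11)],
    "region": ["EU"] * 8 + ["N.America"] * 2,
})
test_participants_demo = pd.DataFrame({"user_id": ["u1", "u2", "u9"]})

# Share of EU new users who ended up in the test
eu_users = users_demo[users_demo["region"] == "EU"]
eu_in_test = eu_users["user_id"].isin(test_participants_demo["user_id"])
eu_share_in_test = eu_in_test.mean()
print(f"Share of EU new users in the test: {eu_share_in_test:.1%}")
```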

---

<div class="alert alert-info"> <b>

    I used two statistical tests for each A/B test in the project to check which one fits the description better. The first is a Mann-Whitney U test; its test statistic is used to compare the conversion rates of the control and treatment groups in the recommender system experiment. The second is a z-test; its test statistic is calculated from the difference between the sample proportions and the pooled standard error of the two groups A and B.
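
Written out, the pooled z statistic described above takes the following form, where $p_1, p_2$ are the sample proportions and $n_1, n_2$ the sample sizes of groups A and B (the symbols are notation chosen here), and $\Phi$ is the standard normal CDF:

$$
\hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad
z = \frac{p_1 - p_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad
p\text{-value} = 2\,\bigl(1 - \Phi(|z|)\bigr)
$$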

**recommender system test**

<div class="alert alert-info"> <b>
    
    H0: there is no significant difference in conversion rates between the two groups A and B in the Recommender test
    H1: there is a significant difference in conversion rates between the two groups A and B in the Recommender test
    Statistical significance: 0.05

In [None]:
from scipy.stats import mannwhitneyu
# control and treatment groups
recommender_control_group = recommender_system_events[recommender_system_events['group'] == 'A']
recommender_treatment_group = recommender_system_events[recommender_system_events['group'] == 'B']
# share of product_cart events among all events (event-level, not per-user conversion)
recommender_overall_conversion_rate = recommender_system_events[recommender_system_events['event_name'] == 'product_cart'].shape[0] / recommender_system_events.shape[0]
recommender_control_conversion_rate = recommender_control_group[recommender_control_group['event_name'] == 'product_cart'].shape[0] / recommender_control_group.shape[0]
recommender_treatment_conversion_rate = recommender_treatment_group[recommender_treatment_group['event_name'] == 'product_cart'].shape[0] / recommender_treatment_group.shape[0]
# mann-whitney U test
statistic, recommender_p_value = mannwhitneyu(recommender_control_group['event_name'] == 'product_cart', recommender_treatment_group['event_name'] == 'product_cart', alternative='two-sided')

In [None]:
# result
if recommender_p_value < 0.05:
    print("The difference between the control and treatment groups is statistically significant.")
    if recommender_treatment_conversion_rate > recommender_control_conversion_rate:
        print("The new recommendation system has a higher conversion rate than the old system.")
    else:
        print("The new recommendation system does not have a higher conversion rate than the old system.")
else:
    print("The difference between the control and treatment groups is not statistically significant.")
    print("We recommend further testing or staying with the old recommendation system.")

<div class="alert alert-info"> <b>

    The test showed that the change to the website will not increase the conversion rate: the new recommendation system is not a significant improvement over the old system.

**interface eu test**

<div class="alert alert-info"> <b>
    
    H0: there is no significant difference in conversion rates between the two groups A and B in the Interface EU test
    H1: there is a significant difference in conversion rates between the two groups A and B in the Interface EU test
    Statistical significance: 0.05

In [None]:
# control and treatment groups
inter_control_group = interface_eu_events[interface_eu_events['group'] == 'A']
inter_treatment_group = interface_eu_events[interface_eu_events['group'] == 'B']
# share of product_cart events among all events (event-level, not per-user conversion)
inter_overall_conversion_rate = interface_eu_events[interface_eu_events['event_name'] == 'product_cart'].shape[0] / interface_eu_events.shape[0]
inter_control_conversion_rate = inter_control_group[inter_control_group['event_name'] == 'product_cart'].shape[0] / inter_control_group.shape[0]
inter_treatment_conversion_rate = inter_treatment_group[inter_treatment_group['event_name'] == 'product_cart'].shape[0] / inter_treatment_group.shape[0]
# mann-whitney U test
statistic, inter_p_value = mannwhitneyu(inter_control_group['event_name'] == 'product_cart', inter_treatment_group['event_name'] == 'product_cart', alternative='two-sided')

In [None]:
# result
if inter_p_value < 0.05:
    print("The difference between the control and treatment groups is statistically significant.")
    if inter_treatment_conversion_rate > inter_control_conversion_rate:
        print("The treatment group (B) has a higher conversion rate than the control group (A).")
    else:
        print("The treatment group (B) does not have a higher conversion rate than the control group (A).")
else:
    print("The difference between the control and treatment groups is not statistically significant.")
    print("We recommend further testing or staying with the current version.")

<div class="alert alert-info"> <b>

    For the Interface EU test, the test shows that the difference between the two groups is statistically significant. The treatment group has a higher conversion rate than the control group, so based on these test results the change tested in group B is recommended.

**What can you tell about the A/B test results?**

<div class="alert alert-info"> <b>
    
    For the Interface EU test, the results indicate that the treatment group has a statistically significant higher conversion rate than the control group, as indicated by a p-value below 0.05. Therefore, the recommendation is to implement the change tested in group B of the Interface EU test.

**Use the z-criterion to check the statistical difference between the proportions**

**recommender system test**

<div class="alert alert-info"> <b>
    
    H0: there is no significant difference in conversion rates between the two groups A and B in the Recommender test
    H1: there is a significant difference in conversion rates between the two groups A and B in the Recommender test
    Statistical significance: 0.05

In [None]:
# funnel stages: event rows that belong to each stage
recommender_product_page = recommender_system_events[recommender_system_events['event_name'] == 'product_page']
recommender_product_cart = recommender_system_events[recommender_system_events['event_name'] == 'product_cart']
recommender_purchase = recommender_system_events[recommender_system_events['event_name'] == 'purchase']
# overall share of each stage among all event rows (both groups combined)
recommender_product_page_conv = len(recommender_product_page) / len(recommender_system_events)
recommender_product_cart_conv = len(recommender_product_cart) / len(recommender_system_events)
recommender_purchase_conv = len(recommender_purchase) / len(recommender_system_events)
# control (A) and treatment (B) groups
recommender_control_group = recommender_system_events[recommender_system_events['group'] == 'A']
recommender_treatment_group = recommender_system_events[recommender_system_events['group'] == 'B']
# per-group share of each stage
# note: the denominator is the number of event rows in the group, not the number of unique users
recommender_control_product_page_conv = len(recommender_control_group[recommender_control_group['event_name'] == 'product_page']) / len(recommender_control_group)
recommender_control_product_cart_conv = len(recommender_control_group[recommender_control_group['event_name'] == 'product_cart']) / len(recommender_control_group)
recommender_control_purchase_conv = len(recommender_control_group[recommender_control_group['event_name'] == 'purchase']) / len(recommender_control_group)
recommender_treatment_product_page_conv = len(recommender_treatment_group[recommender_treatment_group['event_name'] == 'product_page']) / len(recommender_treatment_group)
recommender_treatment_product_cart_conv = len(recommender_treatment_group[recommender_treatment_group['event_name'] == 'product_cart']) / len(recommender_treatment_group)
recommender_treatment_purchase_conv = len(recommender_treatment_group[recommender_treatment_group['event_name'] == 'purchase']) / len(recommender_treatment_group)

In [None]:
# Perform z-test for each funnel stage
def z_test(control_conv, treatment_conv, control_group, treatment_group):
    """Pooled two-proportion z-test; returns (z_score, two-sided p_value)."""
    p1 = control_conv
    p2 = treatment_conv
    # sample sizes are the row counts of each group's events table
    n1 = len(control_group)
    n2 = len(treatment_group)
    # pooled proportion under H0 (no difference between the groups)
    p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)
    # control minus treatment: z_score is negative when the treatment rate is higher
    z_score = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    # two-sided p-value from the standard normal distribution
    p_value = stats.norm.sf(abs(z_score)) * 2
    return z_score, p_value

recommender_z_score_product_page, recommender_p_value_product_page = z_test(recommender_control_product_page_conv, recommender_treatment_product_page_conv, recommender_control_group, recommender_treatment_group)
recommender_z_score_product_cart, recommender_p_value_product_cart = z_test(recommender_control_product_cart_conv, recommender_treatment_product_cart_conv, recommender_control_group, recommender_treatment_group)
recommender_z_score_purchase, recommender_p_value_purchase = z_test(recommender_control_purchase_conv, recommender_treatment_purchase_conv, recommender_control_group, recommender_treatment_group)
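
The rates above divide event rows by event rows, so a user who triggers many events is counted many times, while a z-test for proportions usually treats each user as one independent trial. The following is a minimal sketch of a user-level variant. It assumes the events table has a `user_id` column (an assumption; the column is not shown in this section), and both the helper name `user_level_z_test` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats


def user_level_z_test(events, event_name, user_col="user_id"):
    """Pooled two-proportion z-test on unique users (control A vs treatment B).

    `user_col` is an assumed column name; adjust it to the real schema.
    """
    # unique users per group: the number of independent trials
    users = events.groupby("group")[user_col].nunique()
    # unique users per group who triggered the event at least once: the successes
    hits = (events[events["event_name"] == event_name]
            .groupby("group")[user_col].nunique()
            .reindex(users.index, fill_value=0))
    n1, n2 = users["A"], users["B"]
    p1, p2 = hits["A"] / n1, hits["B"] / n2
    # pooled proportion under H0
    p_pool = (hits["A"] + hits["B"]) / (n1 + n2)
    z_score = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    # two-sided p-value from the standard normal distribution
    p_value = stats.norm.sf(abs(z_score)) * 2
    return p1, p2, z_score, p_value


# toy data, made up for illustration only
toy_events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 4, 4, 5, 6],
    "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "event_name": ["login", "purchase", "login", "login",
                   "login", "purchase", "purchase", "login"],
})
p1, p2, z_score, p_value = user_level_z_test(toy_events, "purchase")
print(f"Control {p1:.4f}, Treatment {p2:.4f}, z = {z_score:.3f}, p-value = {p_value:.3f}")
```

With the real data, the call would take the test's events table in place of `toy_events`. Whether the user-level rates lead to the same conclusion as the event-level rates has to be checked on the actual data.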

In [None]:
# results
print(f"Product page conversion rates: Control {recommender_control_product_page_conv:.4f}, Treatment {recommender_treatment_product_page_conv:.4f}")
print(f"Product cart conversion rates: Control {recommender_control_product_cart_conv:.4f}, Treatment {recommender_treatment_product_cart_conv:.4f}")
print(f"Purchase conversion rates: Control {recommender_control_purchase_conv:.4f}, Treatment {recommender_treatment_purchase_conv:.4f}")

# compare each stage's p-value with the 0.05 significance level and report it
recommender_p_values = {
    'product page': recommender_p_value_product_page,
    'product cart': recommender_p_value_product_cart,
    'purchase': recommender_p_value_purchase,
}
for stage, p_value in recommender_p_values.items():
    verdict = "statistically significant" if p_value < 0.05 else "not statistically significant"
    print(f"The difference between the control and treatment groups for {stage} is {verdict} (p-value: {p_value:.4f}).")

In [None]:
# conversion rates
if recommender_treatment_product_page_conv > recommender_control_product_page_conv:
    print("Treatment group has a higher conversion rate for product page.")
else:
    print("Control group has a higher conversion rate for product page.")

if recommender_treatment_product_cart_conv > recommender_control_product_cart_conv:
    print("Treatment group has a higher conversion rate for product cart.")
else:
    print("Control group has a higher conversion rate for product cart.")

if recommender_treatment_purchase_conv > recommender_control_purchase_conv:
    print("Treatment group has a higher conversion rate for purchase.")
else:
    print("Control group has a higher conversion rate for purchase.")
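
The stated goal of the test is at least a 10% increase in conversion at each stage, so the direction of the difference alone does not show whether the target was met. Below is a minimal sketch of a relative-lift check. The helper name `relative_lift` and the example rates are hypothetical, and reading the 10% target as a relative increase over the control rate is an assumption about the test specification.

```python
def relative_lift(control_conv, treatment_conv):
    """Relative change of the treatment rate over the control rate."""
    if control_conv == 0:
        raise ValueError("control conversion is zero; relative lift is undefined")
    return (treatment_conv - control_conv) / control_conv


TARGET_LIFT = 0.10  # expected minimum increase from the test description

# made-up rates for illustration; substitute the computed conversion rates
example_rates = {
    "product page": (0.30, 0.34),
    "product cart": (0.15, 0.16),
    "purchase": (0.20, 0.19),
}
for stage, (control_conv, treatment_conv) in example_rates.items():
    lift = relative_lift(control_conv, treatment_conv)
    verdict = "meets" if lift >= TARGET_LIFT else "does not meet"
    print(f"{stage}: lift {lift:+.1%}, {verdict} the 10% target")
```

A stage can be statistically significant and still fall short of the 10% target, so both checks are worth reporting side by side.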

<div class="alert alert-info"> <b>
    
    The p-value is less than 0.05, so the null hypothesis is rejected in favor of the alternative hypothesis. This indicates a statistically significant difference in conversion rates between the control and treatment groups.

**Interface EU test**

<div class="alert alert-info"> <b>

    H0: there is no difference in conversion rates between groups A and B in the Interface EU test
    H1: there is a difference in conversion rates between groups A and B in the Interface EU test
    Significance level (alpha): 0.05

In [None]:
# funnel stages: event rows that belong to each stage
eu_product_page = interface_eu_events[interface_eu_events['event_name'] == 'product_page']
eu_product_cart = interface_eu_events[interface_eu_events['event_name'] == 'product_cart']
eu_purchase = interface_eu_events[interface_eu_events['event_name'] == 'purchase']
# overall share of each stage among all event rows (both groups combined)
eu_product_page_conv = len(eu_product_page) / len(interface_eu_events)
eu_product_cart_conv = len(eu_product_cart) / len(interface_eu_events)
eu_purchase_conv = len(eu_purchase) / len(interface_eu_events)
# control (A) and treatment (B) groups
eu_control_group = interface_eu_events[interface_eu_events['group'] == 'A']
eu_treatment_group = interface_eu_events[interface_eu_events['group'] == 'B']
# per-group share of each stage
# note: the denominator is the number of event rows in the group, not the number of unique users
eu_control_product_page_conv = len(eu_control_group[eu_control_group['event_name'] == 'product_page']) / len(eu_control_group)
eu_control_product_cart_conv = len(eu_control_group[eu_control_group['event_name'] == 'product_cart']) / len(eu_control_group)
eu_control_purchase_conv = len(eu_control_group[eu_control_group['event_name'] == 'purchase']) / len(eu_control_group)
eu_treatment_product_page_conv = len(eu_treatment_group[eu_treatment_group['event_name'] == 'product_page']) / len(eu_treatment_group)
eu_treatment_product_cart_conv = len(eu_treatment_group[eu_treatment_group['event_name'] == 'product_cart']) / len(eu_treatment_group)
eu_treatment_purchase_conv = len(eu_treatment_group[eu_treatment_group['event_name'] == 'purchase']) / len(eu_treatment_group)

In [None]:
# Perform z-test for each funnel stage
def z_test(control_conv, treatment_conv, control_group, treatment_group):
    """Pooled two-proportion z-test; returns (z_score, two-sided p_value)."""
    p1 = control_conv
    p2 = treatment_conv
    # sample sizes are the row counts of each group's events table
    n1 = len(control_group)
    n2 = len(treatment_group)
    # pooled proportion under H0 (no difference between the groups)
    p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)
    # control minus treatment: z_score is negative when the treatment rate is higher
    z_score = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    # two-sided p-value from the standard normal distribution
    p_value = stats.norm.sf(abs(z_score)) * 2
    return z_score, p_value

eu_z_score_product_page, eu_p_value_product_page = z_test(eu_control_product_page_conv, eu_treatment_product_page_conv, eu_control_group, eu_treatment_group)
eu_z_score_product_cart, eu_p_value_product_cart = z_test(eu_control_product_cart_conv, eu_treatment_product_cart_conv, eu_control_group, eu_treatment_group)
eu_z_score_purchase, eu_p_value_purchase = z_test(eu_control_purchase_conv, eu_treatment_purchase_conv, eu_control_group, eu_treatment_group)

In [None]:
# results
print(f"Product page conversion rates: Control {eu_control_product_page_conv:.4f}, Treatment {eu_treatment_product_page_conv:.4f}")
print(f"Product cart conversion rates: Control {eu_control_product_cart_conv:.4f}, Treatment {eu_treatment_product_cart_conv:.4f}")
print(f"Purchase conversion rates: Control {eu_control_purchase_conv:.4f}, Treatment {eu_treatment_purchase_conv:.4f}")

# compare each stage's p-value with the 0.05 significance level and report it
eu_p_values = {
    'product page': eu_p_value_product_page,
    'product cart': eu_p_value_product_cart,
    'purchase': eu_p_value_purchase,
}
for stage, p_value in eu_p_values.items():
    verdict = "statistically significant" if p_value < 0.05 else "not statistically significant"
    print(f"The difference between the control and treatment groups for {stage} is {verdict} (p-value: {p_value:.4f}).")
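
Each test compares three funnel stages against the same 0.05 threshold, which raises the chance of at least one false positive. One common option, not applied in the analysis above, is a Bonferroni adjustment that divides alpha by the number of comparisons. The sketch below uses made-up p-values, and `bonferroni_decisions` is a hypothetical helper.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Compare each p-value with alpha divided by the number of tests."""
    adjusted_alpha = alpha / len(p_values)
    decisions = {stage: p < adjusted_alpha for stage, p in p_values.items()}
    return decisions, adjusted_alpha


# made-up p-values for illustration; substitute the computed ones
example_p_values = {"product page": 0.004, "product cart": 0.03, "purchase": 0.20}
decisions, adjusted_alpha = bonferroni_decisions(example_p_values)
print(f"adjusted alpha: {adjusted_alpha:.4f}")
for stage, significant in decisions.items():
    print(f"{stage}: {'significant' if significant else 'not significant'}")
```

The adjustment is conservative: a stage that clears 0.05 may no longer clear the adjusted threshold, as the `product cart` value in the example illustrates.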

In [None]:
# conversion rates
if eu_treatment_product_page_conv > eu_control_product_page_conv:
    print("Treatment group has a higher conversion rate for product page.")
else:
    print("Control group has a higher conversion rate for product page.")

if eu_treatment_product_cart_conv > eu_control_product_cart_conv:
    print("Treatment group has a higher conversion rate for product cart.")
else:
    print("Control group has a higher conversion rate for product cart.")

if eu_treatment_purchase_conv > eu_control_purchase_conv:
    print("Treatment group has a higher conversion rate for purchase.")
else:
    print("Control group has a higher conversion rate for purchase.")

**Describe the results**

<div class="alert alert-info"> <b>

    It is recommended that the store implement the new system evaluated in the interface test: both tests showed a statistically significantly higher conversion rate for the new system than for the old one. This indicates that customers are more likely to add products to their cart when the new recommendations are shown. Overall, the A/B testing provided valuable insight into the effectiveness of the store's recommendation system and demonstrates the importance of experimentation for improving customer engagement and driving business growth.