# Introduction
The core business challenge for the e-commerce company is the perceived inefficiency of the current product page layout, which uses a horizontal media rail. This analysis assesses the impact of switching to a vertical media rail and aims to verify whether the change improves the key performance indicators (KPIs) that matter most to the company.

The primary metric is the order ratio (also called conversion). It directly measures completed transactions originating from the product page and aligns with the business goal of increasing revenue.

Two secondary metrics complement the primary metric: the add-to-cart rate and clicks on media. The add-to-cart rate indicates users' intent to make a purchase, while clicks on media gauge engagement with visual elements. Together they give a fuller picture of user behavior and interaction.

The findings will inform decisions on optimizing the product page layout for higher user engagement and conversion, measured primarily by the order ratio.


## Potential benefits and drawbacks of the vertical media rail
Switching to a vertical media rail could make media more visible and thereby increase user engagement. Better visibility could also make navigation easier, resulting in a more enjoyable user experience and potentially higher conversion rates.

However, the change risks disrupting established user habits, since users would need to adapt to a new interface. This could have a negative impact on both user experience and conversion rates.


## Hypothesis
For each metric (order ratio, add-to-cart rate, clicks on media), the hypotheses are phrased as follows:


Null Hypothesis (H0): Changing the media rail from horizontal to vertical has no effect on the mean [metric].

Alternative Hypothesis (H1): Changing the media rail from horizontal to vertical has an effect on the mean [metric] (two-sided).


## Hypothesis Testing
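To test the hypotheses, an A/B test was chosen because it provides direct, quantitative insight into how a change affects user behavior. The statistical method is a two-sample t-test for independent samples (Welch's variant, which does not assume equal variances), which compares the mean of each metric between the two variants. Conversion and add-to-cart rate are binary per-user indicators, so their means are proportions; for sufficiently large samples the t-test is a reasonable approximation for these as well.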
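
As a rough illustration of how this test works, the sketch below computes Welch's t-statistic by hand and compares it with `scipy.stats.ttest_ind(..., equal_var=False)`, the call used in the notebook. The two samples are synthetic binary outcomes with arbitrarily chosen rates and sizes, not the experiment's data, and the helper name `welch_t` is hypothetical.

```python
import numpy as np
from scipy import stats


def welch_t(a, b):
    """Return Welch's t-statistic and degrees of freedom for two independent samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Variance of each sample mean (unbiased sample variance divided by n)
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    return t, dof


# Synthetic binary outcomes (assumed rates and sizes, NOT the experiment's data)
rng = np.random.default_rng(0)
sample_a = rng.binomial(1, 0.08, size=2000)
sample_b = rng.binomial(1, 0.10, size=2000)

t_manual, dof = welch_t(sample_a, sample_b)
p_manual = 2 * stats.t.sf(abs(t_manual), dof)  # two-sided p-value

t_scipy, p_scipy = stats.ttest_ind(sample_a, sample_b, equal_var=False)

print(f"manual: t={t_manual:.3f}, p={p_manual:.3f}")
print(f"scipy:  t={t_scipy:.3f}, p={p_scipy:.3f}")
```

Both lines should report the same values, which shows that the library call is a direct implementation of the formula rather than a different test.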


## Results
There is insufficient evidence to reject the Null Hypothesis (H0) for order ratio (conversion) at the 5% significance level (t-statistic: -1.338, p-value: 0.181).

We can reject the Null Hypothesis (H0) for add-to-cart rate at the 5% significance level (t-statistic: -2.654, p-value: 0.008).

We can reject the Null Hypothesis (H0) for clicks on media at the 5% significance level (t-statistic: 3.171, p-value: 0.002).
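
A p-value says whether a difference is detectable, not how large it is. A minimal sketch of a 95% confidence interval for the difference in means (Variant B minus Variant A), using the same Welch approximation as the test, might look as follows. The function name `diff_ci` is hypothetical and the inputs are synthetic Poisson counts chosen only for illustration; in the notebook the inputs would be the per-variant metric columns.

```python
import numpy as np
from scipy import stats


def diff_ci(a, b, alpha=0.05):
    """Confidence interval for mean(b) - mean(a) under Welch's approximation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)  # standard error of the difference in means
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    margin = stats.t.ppf(1 - alpha / 2, dof) * se
    diff = b.mean() - a.mean()
    return diff - margin, diff + margin


# Synthetic click counts (assumed distributions, NOT the experiment's data)
rng = np.random.default_rng(1)
clicks_a = rng.poisson(1.5, size=1000)
clicks_b = rng.poisson(1.3, size=1000)

low, high = diff_ci(clicks_a, clicks_b)
print(f"95% CI for mean(B) - mean(A): [{low:.3f}, {high:.3f}]")
```

An interval that excludes zero corresponds to rejecting H0 at the 5% level; its width shows how precisely the effect has been estimated.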


## Summary & Recommendations
The order ratio in Variant B is higher (9.7% vs. 8.0%), but the observed uplift is not statistically significant, while the uplift in the add-to-cart rate in Variant B (89.6% vs. 85.7%) is statistically significant. Interestingly, average clicks on media are higher in Variant A (1.495 vs. 1.324), and that difference is also statistically significant. Because the metrics point in different directions and the result for the primary metric is inconclusive, further testing is needed before making an informed decision about the tested change on the product pages.

Prolonging the test and thereby increasing the sample size would increase the power of the test, making it more likely to detect a true effect on the order ratio, if one exists. It could also be worthwhile to run complementary experiments in the checkout process, as the significant uplift in the add-to-cart rate did not translate into a significant uplift in conversion.
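
To get a feel for how much more data might be needed, the sketch below applies the standard normal-approximation formula for comparing two proportions. It treats the observed conversion rates from the notebook output (0.08 and 0.097) as if they were the true rates, which is an assumption, and uses a conventional 80% power target that is not stated in the analysis. The function name `sample_size_per_variant` is hypothetical.

```python
import math

from scipy.stats import norm


def sample_size_per_variant(p_a, p_b, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test."""
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the significance level
    z_power = norm.ppf(power)          # quantile corresponding to the desired power
    variance_sum = p_a * (1 - p_a) + p_b * (1 - p_b)
    n = (z_alpha + z_power) ** 2 * variance_sum / (p_a - p_b) ** 2
    return math.ceil(n)


# Observed rates from the notebook output, assumed here to be the true rates
n_required = sample_size_per_variant(0.08, 0.097)
print(f"Approximate users needed per variant: {n_required}")
```

The required sample size grows quickly as the assumed difference shrinks, so the estimate is only as reliable as the effect size plugged into it.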

In [1]:
# import packages, load and preview data

import pandas as pd
import numpy as np
from scipy import stats

df = pd.read_csv("teaching/data/assessment_da25.csv")
df.head()

Unnamed: 0,Variant,Number of page views,GMV (in $),Number of add to cart,Clicks on media,Time on Page (sec),user_id
0,A,5,0.0,0,2,74,0
1,A,4,0.0,4,1,21,1
2,A,4,0.0,2,0,1,2
3,A,5,0.0,0,1,26,3
4,A,5,0.0,3,3,46,4


In [2]:
# add new column "conversion" to the DataFrame based on the "GMV (in $)" column
# "conversion" is binary: 0 = not converted, 1 = converted (GMV > 0)

df['conversion'] = (df['GMV (in $)'] > 0).astype(int)


# add new column "add to cart rate" to the DataFrame based on the "Number of add to cart" column
# "add to cart rate" is binary: 0 = no items added to cart, 1 = at least 1 item added to cart

df['add to cart rate'] = (df['Number of add to cart'] > 0).astype(int)

In [3]:
df.head()

Unnamed: 0,Variant,Number of page views,GMV (in $),Number of add to cart,Clicks on media,Time on Page (sec),user_id,conversion,add to cart rate
0,A,5,0.0,0,2,74,0,0,0
1,A,4,0.0,4,1,21,1,0,1
2,A,4,0.0,2,0,1,2,0,1
3,A,5,0.0,0,1,26,3,0,0
4,A,5,0.0,3,3,46,4,0,1


In [4]:
# variable to filter on Variant A
control = df[df["Variant"] == "A"]

# variable to filter on Variant B
treatment = df[df["Variant"] == "B"]

In [5]:
# Welch's t-test (unequal variances) for each metric

for metric in ['conversion', 'add to cart rate', 'Clicks on media']:
    metric_mean_A = control[metric].mean()
    metric_mean_B = treatment[metric].mean()
    t_statistic, p_value = stats.ttest_ind(control[metric], treatment[metric], equal_var=False)
    
    print(f"\033[94mResults for {metric}:\033[0m")
    print(f"Average {metric} Variant A: {metric_mean_A}")
    print(f"Average {metric} Variant B: {metric_mean_B}")
    print(f"t-statistic: {t_statistic:.3f}, p_value: {p_value:.3f}")

    print("")
    print(f"Interpretation for {metric} results")

    if p_value < 0.05:
        print(f"The p-value of {p_value:.3f} indicates a statistically significant difference between the variants at the 5% significance level")
        print(f"-> We can reject the Null Hypothesis (H0) for {metric}")
    else:
        print(f"The p-value of {p_value:.3f} indicates no statistically significant difference between the variants at the 5% significance level")
        print(f"-> There is no evidence to reject the Null Hypothesis (H0) for {metric}")
    print("")
    print("")

Results for conversion:
Average conversion Variant A: 0.08
Average conversion Variant B: 0.097
t-statistic: -1.338, p_value: 0.181

Interpretation for conversion results
The p-value of 0.181 indicates no statistically significant difference between the variants at the 5% significance level
-> There is no evidence to reject the Null Hypothesis (H0) for conversion


Results for add to cart rate:
Average add to cart rate Variant A: 0.857
Average add to cart rate Variant B: 0.896
t-statistic: -2.654, p_value: 0.008

Interpretation for add to cart rate results
The p-value of 0.008 indicates a statistically significant difference between the variants at the 5% significance level
-> We can reject the Null Hypothesis (H0) for add to cart rate


Results for Clicks on media:
Average Clicks on media Variant A: 1.495
Average Clicks on media Variant B: 1.324
t-statistic: 3.171, p_value: 0.002

Interpretation for Clicks on media results
The p-value of 0.002 indicates a stat