# Hypothesis Testing: A Case Study of Light Theme Vs Dark Theme

### **Step 1**: Summarizing the data

In [97]:
import pandas as pd
from scipy.stats import ttest_ind, chi2_contingency

In [5]:
# import data
data = pd.read_csv('./hypothesis_testing/website_ab_test.csv')
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Theme               1000 non-null   object 
 1   Click Through Rate  1000 non-null   float64
 2   Conversion Rate     1000 non-null   float64
 3   Bounce Rate         1000 non-null   float64
 4   Scroll_Depth        1000 non-null   float64
 5   Age                 1000 non-null   int64  
 6   Location            1000 non-null   object 
 7   Session_Duration    1000 non-null   int64  
 8   Purchases           1000 non-null   object 
 9   Added_to_Cart       1000 non-null   object 
dtypes: float64(4), int64(2), object(4)
memory usage: 78.2+ KB


In [6]:
data.head()

| | Theme | Click Through Rate | Conversion Rate | Bounce Rate | Scroll_Depth | Age | Location | Session_Duration | Purchases | Added_to_Cart |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Light Theme | 0.05492 | 0.282367 | 0.405085 | 72.489458 | 25 | Chennai | 1535 | No | Yes |
| 1 | Light Theme | 0.113932 | 0.032973 | 0.732759 | 61.858568 | 19 | Pune | 303 | No | Yes |
| 2 | Dark Theme | 0.323352 | 0.178763 | 0.296543 | 45.737376 | 47 | Chennai | 563 | Yes | Yes |
| 3 | Light Theme | 0.485836 | 0.325225 | 0.245001 | 76.305298 | 58 | Pune | 385 | Yes | No |
| 4 | Light Theme | 0.034783 | 0.196766 | 0.7651 | 48.927407 | 25 | New Delhi | 1437 | No | No |


In [None]:
# function to print characteristics of the data
def data_characteristics(data: pd.DataFrame) -> None:
    """Print the shape, column names, missing-value counts and unique-value counts."""
    print(f'Rows: {data.shape[0]}')
    print(f'Columns: {data.shape[1]}')
    print(f'\nFeatures: {data.columns.tolist()}')
    print(f'\nMissing Values:\n {data.isnull().sum()}')
    print(f'\nUnique Values:\n {data.nunique()}')

In [15]:
data_characteristics(data)

Rows: 1000
Columns: 10

Features: ['Theme', 'Click Through Rate', 'Conversion Rate', 'Bounce Rate', 'Scroll_Depth', 'Age', 'Location', 'Session_Duration', 'Purchases', 'Added_to_Cart']

Missing Values:
 Theme                 0
Click Through Rate    0
Conversion Rate       0
Bounce Rate           0
Scroll_Depth          0
Age                   0
Location              0
Session_Duration      0
Purchases             0
Added_to_Cart         0
dtype: int64

Unique Values:
 Theme                    2
Click Through Rate    1000
Conversion Rate       1000
Bounce Rate           1000
Scroll_Depth          1000
Age                     48
Location                 5
Session_Duration       770
Purchases                2
Added_to_Cart            2
dtype: int64


In [16]:
# return the frequency of each unique row
data.value_counts()  # pass normalize=True to get proportions instead of counts

Theme        Click Through Rate  Conversion Rate  Bounce Rate  Scroll_Depth  Age  Location   Session_Duration  Purchases  Added_to_Cart
Light Theme  0.499328            0.073991         0.559971     63.070712     26   New Delhi  1728              No         Yes              1
Dark Theme   0.016901            0.183432         0.487972     66.572883     37   Bangalore  1580              No         Yes              1
             0.019312            0.483299         0.670856     78.182973     45   New Delhi  635               Yes        No               1
             0.019964            0.313511         0.711349     34.829731     47   Pune       403               No         No               1
             0.020668            0.176972         0.783604     37.326832     33   Kolkata    1487              Yes        No               1
                                                                                                                                          ..
             0.029

In [17]:
# return summary of numeric columns
data.describe()

| | Click Through Rate | Conversion Rate | Bounce Rate | Scroll_Depth | Age | Session_Duration |
|---|---|---|---|---|---|---|
| count | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 |
| mean | 0.256048 | 0.253312 | 0.505758 | 50.319494 | 41.528 | 924.999 |
| std | 0.139265 | 0.139092 | 0.172195 | 16.895269 | 14.114334 | 508.231723 |
| min | 0.010767 | 0.010881 | 0.20072 | 20.011738 | 18.0 | 38.0 |
| 25% | 0.140794 | 0.131564 | 0.353609 | 35.655167 | 29.0 | 466.5 |
| 50% | 0.253715 | 0.252823 | 0.514049 | 51.130712 | 42.0 | 931.0 |
| 75% | 0.370674 | 0.37304 | 0.648557 | 64.666258 | 54.0 | 1375.25 |
| max | 0.499989 | 0.498916 | 0.799658 | 79.997108 | 65.0 | 1797.0 |


### **Step 2**: Analysis of the dataset

In [None]:
# Group data by theme & return mean values for numeric columns
theme_performance_numeric = (
    data.groupby('Theme')
        .mean(numeric_only=True)
        .sort_values(ascending=False, by='Conversion Rate')
)

In [None]:
theme_performance_numeric

| Theme | Click Through Rate | Conversion Rate | Bounce Rate | Scroll_Depth | Age | Session_Duration |
|---|---|---|---|---|---|---|
| Light Theme | 0.247109 | 0.255459 | 0.499035 | 50.735232 | 41.734568 | 930.833333 |
| Dark Theme | 0.264501 | 0.251282 | 0.512115 | 49.926404 | 41.332685 | 919.48249 |


The group means reveal the following:  

- Click Through Rate: Dark theme has a higher average CTR (26.5%) than Light theme (24.7%)  
- Conversion Rate: Light theme leads slightly, with a conversion rate of 25.5% against 25.1% for Dark theme  
- Bounce Rate: Dark theme has a slightly higher bounce rate (51.2%) than Light theme (49.9%)  
- Scroll_Depth: Users on Light theme scroll slightly further on average (50.74%) than users on Dark theme (49.93%)  
- Age: The average age of users is similar across themes (41.7 vs 41.3 years)  
- Session_Duration: On average, users spend slightly more time on Light theme (930.8 seconds) than on Dark theme (919.5 seconds)
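
To put the size of these gaps in perspective, the sketch below computes the absolute and relative difference between the two themes. It is only an illustration: the means are copied by hand from the table above, and the names `means`, `gap` and `relative_gap_pct` are not part of the original notebook.

```python
import pandas as pd

# Group means copied from the table above (as displayed).
means = pd.DataFrame(
    {
        "Click Through Rate": [0.247109, 0.264501],
        "Conversion Rate": [0.255459, 0.251282],
        "Bounce Rate": [0.499035, 0.512115],
        "Scroll_Depth": [50.735232, 49.926404],
        "Session_Duration": [930.833333, 919.482490],
    },
    index=["Light Theme", "Dark Theme"],
)

# Absolute gap (Dark minus Light) and the gap relative to the Light theme mean.
gap = means.loc["Dark Theme"] - means.loc["Light Theme"]
relative_gap_pct = 100 * gap / means.loc["Light Theme"]

comparison = pd.DataFrame(
    {"dark_minus_light": gap, "relative_gap_pct": relative_gap_pct}
)
print(comparison.round(3))
```

A positive value means the Dark theme mean is higher. Whether any of these gaps is more than noise is a separate question that descriptive means alone cannot answer.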

In [53]:
# Group by themes & analyze by categoric columns
theme_performance_categoric = (
    data.groupby('Theme')
        .agg({
            'Location': lambda x: x.mode()[0],
            # share of 'Yes' rows (iloc[0] of value_counts would give the most frequent category instead)
            'Purchases': lambda x: (x == 'Yes').mean(),
            'Added_to_Cart': lambda x: (x == 'Yes').mean(),
        })
        .rename(columns={
            'Location': 'Most Common Location',
            'Purchases': 'Proportion of Purchases (Yes)',
            'Added_to_Cart': 'Proportion of Added to Cart (Yes)'
        })
)

In [54]:
theme_performance_categoric

| Theme | Most Common Location | Proportion of Purchases (Yes) | Proportion of Added to Cart (Yes) |
|---|---|---|---|
| Dark Theme | Bangalore | 0.503891 | 0.519455 |
| Light Theme | Chennai | 0.530864 | 0.532922 |
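
The "Yes" proportions reported here can be computed as the mean of a boolean mask. Taking the first entry of `value_counts(normalize=True)` instead returns the share of whichever category is most frequent, which equals the "Yes" share only when "Yes" is the majority in every group (as it happens to be in this dataset). A small sketch on made-up data shows the difference; the `toy` frame and its values are invented for illustration.

```python
import pandas as pd

# Made-up data: 'Yes' is the majority for Light Theme but the minority for Dark Theme.
toy = pd.DataFrame(
    {
        "Theme": ["Light Theme"] * 4 + ["Dark Theme"] * 4,
        "Purchases": ["Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No"],
    }
)

grouped = toy.groupby("Theme")["Purchases"]

# Share of rows equal to 'Yes' in each group.
yes_share = grouped.agg(lambda s: (s == "Yes").mean())

# Share of the most frequent category in each group, whatever that category is.
top_share = grouped.agg(lambda s: s.value_counts(normalize=True).iloc[0])

print(pd.DataFrame({"yes_share": yes_share, "top_share": top_share}))
```

For the Dark Theme group the two columns disagree, because the most frequent category there is "No".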


Further insights:  

- The most common location for Dark theme users is Bangalore, while for Light theme users it is Chennai  
- A larger share of Light theme users added items to cart (53.3%) than Dark theme users (51.9%)  
- A larger share of Light theme users made a purchase (53.1%) than Dark theme users (50.4%)  


Overall, Light theme appears to outperform Dark theme in conversion rate, additions to cart and purchases. However, the differences are minor.

### **Step 3**: Test of Hypothesis  

The aim of this step is to test whether the two independent treatments, Light theme and Dark theme, differ significantly on each variable.

**a.** _Two Sample T-test for Numeric Variables_

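A two-sample t-test is used for each numeric variable. The code calls `ttest_ind` with `equal_var=False`, which is Welch's version of the test and does not assume equal variances in the two groups.  
The conventional significance level **α = 0.05** is used.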
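
To make the test less of a black box, here is a minimal sketch of the Welch t statistic computed by hand and checked against `scipy`. The two samples are synthetic (drawn from normal distributions with made-up parameters, using the group sizes 486 and 514 that appear in the contingency tables later on), and the helper `welch_t` is hypothetical, not part of the notebook.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for one metric under each theme (made-up parameters).
rng = np.random.default_rng(0)
light = rng.normal(loc=0.25, scale=0.14, size=486)
dark = rng.normal(loc=0.26, scale=0.14, size=514)


def welch_t(a, b):
    """Return (t, degrees of freedom, two-sided p-value) for Welch's t-test."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared standard error of each sample mean.
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (a.size - 1) + vb**2 / (b.size - 1))
    p = 2 * stats.t.sf(abs(t), df)
    return t, df, p


t_manual, df_manual, p_manual = welch_t(light, dark)
t_scipy, p_scipy = stats.ttest_ind(light, dark, equal_var=False)

print(f"manual: t={t_manual:.4f}, df={df_manual:.1f}, p={p_manual:.4f}")
print(f"scipy:  t={t_scipy:.4f}, p={p_scipy:.4f}")
```

The sign of `t` follows the order of the arguments: with the Light theme sample first, a negative statistic means the Dark theme mean is higher.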

In [79]:
# store the test statistic and p-value of each variable
t_statistics = {}
p_values = {}

In [None]:
# function for performing two sample t-test on the numeric columns
def perform_t_test(data, variable, alpha=.05):
    """
    Performs a two-sample t-test (Welch's, since equal_var=False) on the specified variable
    for the two treatments, Light theme and Dark theme.

    Parameters:
    - data: the dataset to run the test on.
    - variable: the column to compare between the two treatments.
    - alpha: the level of significance for the test.

    Returns:
    None. Prints the conclusion and stores the test statistic and p-value
    in the module-level dictionaries t_statistics and p_values.
    """
    # get the values of a specified variable for both themes
    light_theme = data[data['Theme'] == 'Light Theme'][variable]
    dark_theme = data[data['Theme'] == 'Dark Theme'][variable]
    
    # perform two sample t-test
    test_statistic, p_value = ttest_ind(light_theme, dark_theme, equal_var=False)

    # save output
    t_statistics[variable] = test_statistic
    p_values[variable] = p_value

    # compare p-value with level of significance(α)
    if p_value <= alpha:
        print(
            f"Reject the null hypothesis.\n"
            f"With a p-value of {p_value:.4f} (less than {alpha}), there is convincing evidence\n"
            f"that a statistically significant difference exists in '{variable}'\n"
            f"between the Light Theme and Dark Theme.\n"
        )
    else:
        print(
            f"Fail to reject the null hypothesis.\n"
            f"Since the p-value of {p_value:.4f} is greater than the alpha value of {alpha},\n"
            f"there is not enough evidence to conclude that a statistically significant\n"
            f"difference exists in '{variable}' between the Light Theme and Dark Theme.\n"
        )

In [72]:
#return variables for the test
numeric_variables = [column for column in data.select_dtypes('number')]
numeric_variables

['Click Through Rate',
 'Conversion Rate',
 'Bounce Rate',
 'Scroll_Depth',
 'Age',
 'Session_Duration']

**Click Through Rate**
- Null Hypothesis: There is no difference in click through rates between the Light theme and Dark theme.  
- Alternate Hypothesis: There is a significant difference

In [None]:
# test for click through rate
perform_t_test(data, 'Click Through Rate')

Reject the null hypothesis.
With a p-value of 0.0482 (less than 0.05), there is convincing evidence
that a statistically significant difference exists in 'Click Through Rate'
between the Light Theme and Dark Theme.



**Conversion Rate**
- Null Hypothesis: There is no difference in conversion rates between the Light theme and Dark theme.  
- Alternate Hypothesis: There is a significant difference

In [None]:
# test for conversion rate
perform_t_test(data, 'Conversion Rate')

Fail to reject the null hypothesis.
Since the p-value of 0.6350 is greater than the alpha value of 0.05,
there is not enough evidence to conclude that a statistically significant
difference exists in 'Conversion Rate' between the Light Theme and Dark Theme.



**Bounce Rate**
- Null Hypothesis: There is no difference in bounce rate between the Light theme and Dark theme.  
- Alternate Hypothesis: There is a significant difference

In [None]:
# test for bounce rate
perform_t_test(data, 'Bounce Rate')

Fail to reject the null hypothesis.
Since the p-value of 0.2297 is greater than the alpha value of 0.05,
there is not enough evidence to conclude that a statistically significant
difference exists in 'Bounce Rate' between the Light Theme and Dark Theme.



**Scroll_Depth**
- Null Hypothesis: There is no difference in Scroll_Depth between the Light theme and Dark theme.  
- Alternate Hypothesis: There is a significant difference

In [None]:
# test for scroll depth
perform_t_test(data, 'Scroll_Depth')

Fail to reject the null hypothesis.
Since the p-value of 0.4497 is greater than the alpha value of 0.05,
there is not enough evidence to conclude that a statistically significant
difference exists in 'Scroll_Depth' between the Light Theme and Dark Theme.



In [91]:
# summary
statistics = pd.DataFrame({'metric': list(t_statistics.keys()),
                           'test_statistic': list(t_statistics.values()),
                           'p_value': list(p_values.values())})
statistics

| | metric | test_statistic | p_value |
|---|---|---|---|
| 0 | Click Through Rate | -1.978171 | 0.048184 |
| 1 | Conversion Rate | 0.474849 | 0.634998 |
| 2 | Bounce Rate | -1.201888 | 0.229692 |
| 3 | Scroll_Depth | 0.756228 | 0.449692 |


In conclusion, of the four metrics tested, only Click Through Rate differs significantly between the themes (p = 0.048, just under α = 0.05).  
From the earlier analysis, Dark Theme has the higher Click Through Rate (26.5% against 24.7% for Light Theme).
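
The summary table above is assembled from two module-level dictionaries that `perform_t_test` fills as a side effect, so its contents depend on which cells have been run. As an alternative, here is a minimal sketch of a helper that loops over the variables and returns the summary directly. The function `t_test_summary`, its parameters and the `toy` data are all hypothetical and shown only to illustrate the pattern.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind


def t_test_summary(data, variables, group_col="Theme",
                   groups=("Light Theme", "Dark Theme"), alpha=0.05):
    """Run Welch's t-test for each variable and return one row per variable."""
    first = data[data[group_col] == groups[0]]
    second = data[data[group_col] == groups[1]]
    rows = []
    for variable in variables:
        stat, p_value = ttest_ind(first[variable], second[variable], equal_var=False)
        rows.append(
            {
                "metric": variable,
                "test_statistic": stat,
                "p_value": p_value,
                "reject_null": bool(p_value <= alpha),
            }
        )
    return pd.DataFrame(rows)


# Synthetic data with the same column names as the real dataset (values are made up).
rng = np.random.default_rng(42)
toy = pd.DataFrame(
    {
        "Theme": rng.choice(["Light Theme", "Dark Theme"], size=200),
        "Click Through Rate": rng.uniform(0.01, 0.5, size=200),
        "Bounce Rate": rng.uniform(0.2, 0.8, size=200),
    }
)

summary = t_test_summary(toy, ["Click Through Rate", "Bounce Rate"])
print(summary)
```

On the real data the call would presumably be `t_test_summary(data, ['Click Through Rate', 'Conversion Rate', 'Bounce Rate', 'Scroll_Depth'])`.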

**b.** _Chi-Square Test of Independence for Categorical Variables_

A Chi-Square Test of Independence will be used to assess whether each categorical variable is related to Theme.  
The conventional significance level **α = 0.05** is used.

In [105]:
# store the chi-square statistic and p-value of each categorical variable
chi2_statistics = {}
chi2_p_values = {}

In [110]:
def perform_chi2_test(data, variable, alpha=.05):
    """
    Runs a Chi-Square Test of Independence between 'Theme' and the given column.

    Prints the results, stores the statistic and p-value in the module-level
    dictionaries, and returns the contingency table.
    """
    # Create a contingency table for 'Theme' and the given column
    contingency_table = pd.crosstab(data['Theme'], data[variable])

    # Perform the Chi-Square Test of Independence
    # (scipy applies Yates' continuity correction by default when dof == 1)
    chi2_stat, chi2_p_val, dof, expected = chi2_contingency(contingency_table)

    # save output
    chi2_statistics[variable] = chi2_stat
    chi2_p_values[variable] = chi2_p_val

    # Print results
    print("Chi-Square Test of Independence Results:")
    print(f"Chi-Square Statistic: {chi2_stat:.4f}")
    print(f"Degrees of Freedom: {dof}")
    print(f"P-Value: {chi2_p_val:.4f}\n")

    if chi2_p_val <= alpha:
        print(
            f"Reject the null hypothesis.\n"
            f"With a p-value of {chi2_p_val:.4f} (less than {alpha}), there is convincing evidence\n"
            f"that {variable} is statistically significant to 'Theme'.\n"
        )
    else:
        print(
            f"Fail to reject the null hypothesis.\n"
            f"Since the p-value of {chi2_p_val:.4f} is greater than the alpha value of {alpha},\n"
            f"there is not enough evidence to conclude that {variable} is\n"
            f"statistically significant to 'Theme'.\n"
        )

    return contingency_table

In [100]:
#return variables for the test
categoric_variables = [column for column in data.select_dtypes('object') if column != 'Theme']
categoric_variables

['Location', 'Purchases', 'Added_to_Cart']

**Location**
- Null Hypothesis: There is no association between Themes and Locations.    
- Alternate Hypothesis: There is an association

In [111]:
perform_chi2_test(data, 'Location')

Chi-Square Test of Independence Results:
Chi-Square Statistic: 1.6588
Degrees of Freedom: 4
P-Value: 0.7982

Fail to reject the null hypothesis.
Since the p-value of 0.7982 is greater than the alpha value of 0.05,
there is not enough evidence to conclude that Location is
statistically significant to 'Theme'.



| Theme (rows) / Location (columns) | Bangalore | Chennai | Kolkata | New Delhi | Pune |
|---|---|---|---|---|---|
| Dark Theme | 112 | 106 | 98 | 104 | 94 |
| Light Theme | 98 | 110 | 90 | 90 | 98 |


**Purchases**
- Null Hypothesis: There is no association between the Themes and Purchases.    
- Alternate Hypothesis: There is an association

In [113]:
perform_chi2_test(data, 'Purchases')

Chi-Square Test of Independence Results:
Chi-Square Statistic: 0.6238
Degrees of Freedom: 1
P-Value: 0.4296

Fail to reject the null hypothesis.
Since the p-value of 0.4296 is greater than the alpha value of 0.05,
there is not enough evidence to conclude that Purchases is
statistically significant to 'Theme'.



| Theme (rows) / Purchases (columns) | No | Yes |
|---|---|---|
| Dark Theme | 255 | 259 |
| Light Theme | 228 | 258 |
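
The statistic reported for Purchases can be reproduced by hand from this contingency table. The sketch below uses only the standard library. It assumes Yates' continuity correction, which `chi2_contingency` applies by default to tables with one degree of freedom, and it uses the closed form of the chi-square survival function that holds for one degree of freedom only.

```python
import math

# Observed counts copied from the Purchases contingency table above.
observed = [
    [255, 259],  # Dark Theme:  No, Yes
    [228, 258],  # Light Theme: No, Yes
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under independence of Theme and Purchases.
        expected = row_totals[i] * col_totals[j] / n
        # Yates' correction: shrink |observed - expected| by 0.5 (never below 0).
        diff = max(abs(count - expected) - 0.5, 0.0)
        chi2 += diff**2 / expected

# With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

The result should agree, up to rounding, with the statistic and p-value printed by `perform_chi2_test` above.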


**Added_to_Cart**
- Null Hypothesis: There is no association between the Themes and Added_to_Cart.    
- Alternate Hypothesis: There is an association

In [114]:
perform_chi2_test(data, 'Added_to_Cart')

Chi-Square Test of Independence Results:
Chi-Square Statistic: 0.1317
Degrees of Freedom: 1
P-Value: 0.7167

Fail to reject the null hypothesis.
Since the p-value of 0.7167 is greater than the alpha value of 0.05,
there is not enough evidence to conclude that Added_to_Cart is
statistically significant to 'Theme'.



| Theme (rows) / Added_to_Cart (columns) | No | Yes |
|---|---|---|
| Dark Theme | 247 | 267 |
| Light Theme | 227 | 259 |


In [109]:
# summary
statistics = pd.DataFrame({'metric': list(chi2_statistics.keys()),
                           'test_statistic': list(chi2_statistics.values()),
                           'p_value': list(chi2_p_values.values())})
statistics

| | metric | test_statistic | p_value |
|---|---|---|---|
| 0 | Location | 1.658776 | 0.798192 |
| 1 | Purchases | 0.623812 | 0.429634 |
| 2 | Added_to_Cart | 0.131699 | 0.716677 |


In conclusion, the tests give no evidence that the choice of Theme (Light Theme or Dark Theme)  
is associated with whether users add items to cart, whether they make a purchase, or where they are located.  

The small differences observed between the themes for these variables are therefore likely due to random chance rather than a meaningful relationship.