### Light Theme vs Dark Theme Case Study

An online bookstore is looking to optimize its website design to improve user engagement and ultimately increase book purchases.  
The website currently offers two themes for its users: “Light Theme” and “Dark Theme.”  
The bookstore’s data science team wants to conduct an A/B testing experiment to determine which theme leads to better user engagement and higher conversion rates for book purchases.  

The data collected by the bookstore contains user interactions and engagement metrics for both the Light Theme and Dark Theme.  
The dataset includes the following key features:  

- **Theme:** dark or light  
- **Click Through Rate:** The proportion of the users who click on links or buttons on the website.  
- **Conversion Rate:** The percentage of users who signed up on the platform after visiting for the first time.  
- **Bounce Rate:** The percentage of users who leave the website without further interaction after visiting a single page.  
- **Scroll Depth:** The depth to which users scroll through the website pages.  
- **Age:** The age of the user.  
- **Location:** The location of the user.  
- **Session Duration:** The duration of the user’s session on the website.  
- **Purchases:** Whether the user purchased the book (Yes/No).  
- **Added_to_Cart:** Whether the user added books to the cart (Yes/No).  

Your task is to identify which theme, Light Theme or Dark Theme, yields better user engagement, purchases and conversion rates.  
You need to determine if there is a statistically significant difference in the key metrics between the two themes.

**Hypothesis Testing Process**

1. **Gather Data:** Collect the necessary data required for the hypothesis test.
2. **Define Hypotheses:**  
    - **Null Hypothesis (H₀):** The default assumption (e.g., no difference between themes).
    - **Alternative Hypothesis (H₁ or Ha):** What you want to prove (e.g., a difference exists).
3. **Choose Significance Level (α):**  
    - Common choices are 0.05 or 0.01.
    - This is the probability of rejecting the null hypothesis when it is actually true.
4. **Select Statistical Test:**  
    - Use t-tests for comparing means.
    - Use chi-square tests for categorical data.
    - Use ANOVA for comparing means across more than two groups.
5. **Perform the Test:** Apply the chosen statistical test to your data.
6. **Interpret Results:**  
    - Determine the p-value.
    - Compare the p-value to α and interpret whether to reject or fail to reject the null hypothesis.
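
Step 6 reduces to a single comparison between the p-value and α. As a minimal sketch (the function name `decide` is a hypothetical helper, not part of this notebook), the rule can be written as:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the hypothesis-test decision for a given p-value.

    `decide` is an illustrative helper, not a function from the notebook.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # Reject H0 only when the p-value is strictly below the significance level.
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.03))  # p-value below alpha
print(decide(0.20))  # p-value above alpha
```

Note that "fail to reject H₀" is not the same as showing that H₀ is true; it only means the data do not provide enough evidence against it at the chosen α.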

- **Step 1: Dataset:** `website_ab_test.csv`
- **Step 2: Define hypotheses:**
    - **Null Hypothesis (H₀):** There is no difference between the themes.
    - **Alternative Hypothesis (Hₐ):** A difference exists between the themes.
- **Step 3: Significance level (α):** 0.05


In [2]:
!pip install pandas
!pip install numpy
!pip install scipy


In [3]:
import pandas as pd
from scipy.stats import ttest_ind

df = pd.read_csv("./website_ab_test.csv")

print(df.head())

         Theme  Click Through Rate  Conversion Rate  Bounce Rate  \
0  Light Theme            0.054920         0.282367     0.405085   
1  Light Theme            0.113932         0.032973     0.732759   
2   Dark Theme            0.323352         0.178763     0.296543   
3  Light Theme            0.485836         0.325225     0.245001   
4  Light Theme            0.034783         0.196766     0.765100   

   Scroll_Depth  Age   Location  Session_Duration Purchases Added_to_Cart  
0     72.489458   25    Chennai              1535        No           Yes  
1     61.858568   19       Pune               303        No           Yes  
2     45.737376   47    Chennai               563       Yes           Yes  
3     76.305298   58       Pune               385       Yes            No  
4     48.927407   25  New Delhi              1437        No            No  


In [5]:
df.describe()

|       | Click Through Rate | Conversion Rate | Bounce Rate | Scroll_Depth | Age | Session_Duration |
|-------|--------------------|-----------------|-------------|--------------|-----|------------------|
| count | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 |
| mean | 0.256048 | 0.253312 | 0.505758 | 50.319494 | 41.528 | 924.999 |
| std | 0.139265 | 0.139092 | 0.172195 | 16.895269 | 14.114334 | 508.231723 |
| min | 0.010767 | 0.010881 | 0.20072 | 20.011738 | 18.0 | 38.0 |
| 25% | 0.140794 | 0.131564 | 0.353609 | 35.655167 | 29.0 | 466.5 |
| 50% | 0.253715 | 0.252823 | 0.514049 | 51.130712 | 42.0 | 931.0 |
| 75% | 0.370674 | 0.37304 | 0.648557 | 64.666258 | 54.0 | 1375.25 |
| max | 0.499989 | 0.498916 | 0.799658 | 79.997108 | 65.0 | 1797.0 |


In [6]:
# dataset summary
summary = {
    'Number of Records': df.shape[0],
    'Number of Columns': df.shape[1],
    'Missing Values': df.isnull().sum()
}

summary

{'Number of Records': 1000,
 'Number of Columns': 10,
 'Missing Values': Theme                 0
 Click Through Rate    0
 Conversion Rate       0
 Bounce Rate           0
 Scroll_Depth          0
 Age                   0
 Location              0
 Session_Duration      0
 Purchases             0
 Added_to_Cart         0
 dtype: int64}

In [24]:
# grouping data by theme and calculating mean values for the metrics
theme_performance = df.groupby('Theme')[['Conversion Rate', 'Bounce Rate', 'Session_Duration', 'Age', 'Scroll_Depth', 'Click Through Rate' ]].mean()

In [25]:
theme_performance_conversion_rate = theme_performance.sort_values(by='Conversion Rate', ascending=False)
theme_performance_conversion_rate

| Theme | Conversion Rate | Bounce Rate | Session_Duration | Age | Scroll_Depth | Click Through Rate |
|-------|-----------------|-------------|------------------|-----|--------------|--------------------|
| Light Theme | 0.255459 | 0.499035 | 930.833333 | 41.734568 | 50.735232 | 0.247109 |
| Dark Theme | 0.251282 | 0.512115 | 919.48249 | 41.332685 | 49.926404 | 0.264501 |


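Comparing the group means:

- **Dark Theme higher than Light Theme:** Click Through Rate, Bounce Rate (a higher bounce rate is the worse outcome)
- **Light Theme higher than Dark Theme:** Conversion Rate, Session_Duration, Age, Scroll_Depth

These are raw differences in means. The t-tests below check whether any of them is statistically significant.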
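
To put the raw gap in Click Through Rate in context, a relative difference can be computed directly from the two group means in the table above. This is a small sketch; the variable names are illustrative, and using the Light Theme as the baseline is an arbitrary choice.

```python
# Group means for Click Through Rate, copied from the table above.
ctr_mean_light = 0.247109
ctr_mean_dark = 0.264501

# Absolute and relative difference, with the Light Theme as the baseline.
abs_diff = ctr_mean_dark - ctr_mean_light
rel_diff = abs_diff / ctr_mean_light

print(f"absolute difference: {abs_diff:.6f}")
print(f"relative difference: {rel_diff:.1%}")
```

The gap works out to roughly 7% of the Light Theme mean. Whether a gap of that size could plausibly arise by chance is a separate question, which is what the hypothesis test addresses.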


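### Our hypotheses are as follows:

For each metric compared between the themes (Click Through Rate, Conversion Rate, Bounce Rate, Session Duration, Scroll Depth):

- **Null Hypothesis (H₀):** There is no difference in the mean value of the metric between the Light Theme and Dark Theme.
- **Alternative Hypothesis (Hₐ):** There is a difference in the mean value of the metric between the Light Theme and Dark Theme.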

In [30]:
# extracting click through rates for both themes
ctr_light = df[df['Theme'] == 'Light Theme']['Click Through Rate']
ctr_dark = df[df['Theme'] == 'Dark Theme']['Click Through Rate']

print(ctr_dark, ctr_light)

# performing a two-sample t-test
t_stat_ctr, p_value_ctr = ttest_ind(ctr_light, ctr_dark, equal_var=False)

t_stat_ctr, p_value_ctr

2      0.323352
8      0.110551
16     0.302031
18     0.492174
19     0.493888
         ...   
992    0.265413
993    0.212645
995    0.282792
996    0.299917
999    0.342588
Name: Click Through Rate, Length: 514, dtype: float64 0      0.054920
1      0.113932
3      0.485836
4      0.034783
5      0.173419
         ...   
989    0.126083
991    0.492991
994    0.144825
997    0.370254
998    0.095815
Name: Click Through Rate, Length: 486, dtype: float64


(np.float64(-1.9781708664172253), np.float64(0.04818435371010704))
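
Passing `equal_var=False` to `ttest_ind` selects Welch's t-test, which does not assume the two groups share a common variance. To make explicit what is being computed, here is a minimal standard-library sketch of the Welch statistic and its approximate degrees of freedom (Welch-Satterthwaite). The function name `welch_t` and the toy samples are illustrative and not taken from the dataset. The sketch stops short of the p-value, which requires the t-distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test.

    Illustrative helper; in the notebook, scipy's ttest_ind does this work.
    """
    n_a, n_b = len(a), len(b)
    # Per-group squared standard errors, using the sample variance (n - 1).
    se2_a = variance(a) / n_a
    se2_b = variance(b) / n_b
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Toy samples (made up for illustration only).
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t, df)
```

The sign of t follows the argument order: a negative value means the first sample has the lower mean. That matches the result above, where the Light Theme is passed first and has the lower mean Click Through Rate.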

In [46]:

# extracting bounce rates for both themes
bounce_light_br = df[df['Theme'] == 'Light Theme']['Bounce Rate']
bounce_dark_br = df[df['Theme'] == 'Dark Theme']['Bounce Rate']

# t-test for Bounce Rate
t_stat_bounce, p_value_bounce = ttest_ind(bounce_light_br, bounce_dark_br, equal_var=False)

# extracting session durations for both themes
session_light_sd = df[df['Theme'] == 'Light Theme']['Session_Duration']
session_dark_sd = df[df['Theme'] == 'Dark Theme']['Session_Duration']

# t-test for Session Duration
t_stat_session, p_value_session = ttest_ind(session_light_sd, session_dark_sd, equal_var=False)

# extracting conversion rates for both themes
conversion_light = df[df['Theme'] == 'Light Theme']['Conversion Rate']
conversion_dark = df[df['Theme'] == 'Dark Theme']['Conversion Rate']

# t-test for Conversion Rate
t_stat_cr, p_value_cr = ttest_ind(conversion_light, conversion_dark, equal_var=False)

# extracting scroll depths for both themes
scroll_depth_light = df[df['Theme'] == 'Light Theme']['Scroll_Depth']
scroll_depth_dark = df[df['Theme'] == 'Dark Theme']['Scroll_Depth']

# performing a two-sample t-test for scroll depth
t_stat_scroll, p_value_scroll = ttest_ind(scroll_depth_light, scroll_depth_dark, equal_var=False)


# performing a two-sample t-test for Click Through Rate
t_stat_ctr, p_value_ctr = ttest_ind(ctr_light, ctr_dark, equal_var=False)

# comparison summary for all metrics
comparison = pd.DataFrame({
    'Metric': ['Bounce Rate', 'Session Duration', 'Conversion Rate', 'Click Through Rate', 'Scroll_Depth'],
    't-statistic': [t_stat_bounce, t_stat_session, t_stat_cr, t_stat_ctr, t_stat_scroll],
    'p-value': [p_value_bounce, p_value_session, p_value_cr, p_value_ctr, p_value_scroll]
})

print(comparison)

               Metric  t-statistic   p-value
0         Bounce Rate    -1.201888  0.229692
1    Session Duration     0.352912  0.724229
2     Conversion Rate     0.474849  0.634998
3  Click Through Rate    -1.978171  0.048184
4        Scroll_Depth     0.756228  0.449692


In [56]:
len(bounce_light_br)


486

In [57]:
len(bounce_dark_br)

514

In [37]:
# extracting conversion rates for both themes
conversion_light = df[df['Theme'] == 'Light Theme']['Conversion Rate']
conversion_dark = df[df['Theme'] == 'Dark Theme']['Conversion Rate']

# t-test for Conversion Rate
t_stat_cr, p_value_cr = ttest_ind(conversion_light, conversion_dark, equal_var=False)

In [52]:
# extracting bounce rates for both themes
bounce_rates_light = df[df['Theme'] == 'Light Theme']['Bounce Rate']
bounce_rates_dark = df[df['Theme'] == 'Dark Theme']['Bounce Rate']

# performing a two-sample t-test for bounce rate
t_stat_bounce, p_value_bounce = ttest_ind(bounce_rates_light, bounce_rates_dark, equal_var=False)

# extracting scroll depths for both themes
scroll_depth_light = df[df['Theme'] == 'Light Theme']['Scroll_Depth']
scroll_depth_dark = df[df['Theme'] == 'Dark Theme']['Scroll_Depth']

# performing a two-sample t-test for scroll depth
t_stat_scroll, p_value_scroll = ttest_ind(scroll_depth_light, scroll_depth_dark, equal_var=False)

# creating a table for comparison
comparison_table = pd.DataFrame({
    'Metric': ['Click Through Rate', 'Conversion Rate', 'Bounce Rate', 'Scroll Depth'],
    'T-Statistic': [t_stat_ctr, t_stat_cr, t_stat_bounce, t_stat_scroll],
    'P-Value': [p_value_ctr, p_value_cr, p_value_bounce, p_value_scroll]
})

comparison_table

|   | Metric | T-Statistic | P-Value |
|---|--------|-------------|---------|
| 0 | Click Through Rate | -1.978171 | 0.048184 |
| 1 | Conversion Rate | 0.474849 | 0.634998 |
| 2 | Bounce Rate | -1.201888 | 0.229692 |
| 3 | Scroll Depth | 0.756228 | 0.449692 |


In [51]:
# Analyze statistical significance for each metric using p-values

alpha = 0.05  # significance level

results = {
    'Click Through Rate': {
        't-statistic': t_stat_ctr,
        'p-value': p_value_ctr,
        'Significant': p_value_ctr < alpha
    },
    'Bounce Rate': {
        't-statistic': t_stat_bounce,
        'p-value': p_value_bounce,
        'Significant': p_value_bounce < alpha
    },
    'Session Duration': {
        't-statistic': t_stat_session,
        'p-value': p_value_session,
        'Significant': p_value_session < alpha
    },
    'Conversion Rate': {
        't-statistic': t_stat_cr,
        'p-value': p_value_cr,
        'Significant': p_value_cr < alpha
    },
    'Scroll Depth': {
        't-statistic': t_stat_scroll,
        'p-value': p_value_scroll,
        'Significant': p_value_scroll < alpha
    }
}

# Display results
pd.DataFrame(results).T

|   | t-statistic | p-value | Significant |
|---|-------------|---------|-------------|
| Click Through Rate | -1.978171 | 0.048184 | True |
| Bounce Rate | -1.201888 | 0.229692 | False |
| Session Duration | 0.352912 | 0.724229 | False |
| Conversion Rate | 0.474849 | 0.634998 | False |
| Scroll Depth | 0.756228 | 0.449692 | False |


- Click Through Rate: The difference is statistically significant at α = 0.05 (P-Value = 0.048). The Dark Theme has the higher mean (0.265 vs. 0.247), although the p-value is only just below the threshold.
- Conversion Rate: No statistically significant difference was found (P-Value = 0.635).
- Bounce Rate: No statistically significant difference was found (P-Value = 0.230).
- Session Duration: No statistically significant difference was found (P-Value = 0.724).
- Scroll Depth: No statistically significant difference was found (P-Value = 0.450).

In summary, the two themes perform similarly on most metrics. The only statistically significant difference is in Click Through Rate, where the Dark Theme has a slight edge.
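
The comparison above covers only the numeric metrics. `Purchases` and `Added_to_Cart` are Yes/No columns, for which the process outlined earlier suggests a chi-square test. The sketch below shows a Pearson chi-square test of independence on a 2x2 table using only the standard library. The counts are made-up placeholders, not values from `website_ab_test.csv`, and `chi_square_2x2` is a hypothetical helper. In the notebook, the table of counts could come from something like `pd.crosstab(df['Theme'], df['Purchases'])`.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (chi2, p_value). No continuity correction is applied.
    Hypothetical helper for illustration.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of theme and outcome.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Placeholder counts (NOT from the dataset): rows = theme, columns = [Yes, No].
purchases_by_theme = [
    [40, 60],  # Light Theme
    [50, 50],  # Dark Theme
]
chi2, p_value = chi_square_2x2(purchases_by_theme)
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

The resulting p-value is compared with α in the same way as for the t-tests. Be aware that `scipy.stats.chi2_contingency` applies Yates' continuity correction to 2x2 tables by default, so its output would differ slightly from this uncorrected sketch.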