In [2]:
import pandas as pd
from scipy.stats import ttest_ind

In [3]:
df=pd.read_csv('website_ab_test.csv')

In [4]:
df.head()

Unnamed: 0,Theme,Click Through Rate,Conversion Rate,Bounce Rate,Scroll_Depth,Age,Location,Session_Duration,Purchases,Added_to_Cart
0,Light Theme,0.05492,0.282367,0.405085,72.489458,25,Chennai,1535,No,Yes
1,Light Theme,0.113932,0.032973,0.732759,61.858568,19,Pune,303,No,Yes
2,Dark Theme,0.323352,0.178763,0.296543,45.737376,47,Chennai,563,Yes,Yes
3,Light Theme,0.485836,0.325225,0.245001,76.305298,58,Pune,385,Yes,No
4,Light Theme,0.034783,0.196766,0.7651,48.927407,25,New Delhi,1437,No,No


In [6]:
# Dataset Summary
summary = {
    'Number of Records': df.shape[0],
    'Number of Columns': df.shape[1],
    'Missing Values': df.isnull().sum(),
    'Numerical Columns Summary': df.describe()
}
summary

{'Number of Records': 1000,
 'Number of Columns': 10,
 'Missing Values': Theme                 0
 Click Through Rate    0
 Conversion Rate       0
 Bounce Rate           0
 Scroll_Depth          0
 Age                   0
 Location              0
 Session_Duration      0
 Purchases             0
 Added_to_Cart         0
 dtype: int64,
 'Numerical Columns Summary':        Click Through Rate  Conversion Rate  Bounce Rate  Scroll_Depth  \
 count         1000.000000      1000.000000  1000.000000   1000.000000   
 mean             0.256048         0.253312     0.505758     50.319494   
 std              0.139265         0.139092     0.172195     16.895269   
 min              0.010767         0.010881     0.200720     20.011738   
 25%              0.140794         0.131564     0.353609     35.655167   
 50%              0.253715         0.252823     0.514049     51.130712   
 75%              0.370674         0.373040     0.648557     64.666258   
 max              0.499989         0.4989

In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Theme               1000 non-null   object 
 1   Click Through Rate  1000 non-null   float64
 2   Conversion Rate     1000 non-null   float64
 3   Bounce Rate         1000 non-null   float64
 4   Scroll_Depth        1000 non-null   float64
 5   Age                 1000 non-null   int64  
 6   Location            1000 non-null   object 
 7   Session_Duration    1000 non-null   int64  
 8   Purchases           1000 non-null   object 
 9   Added_to_Cart       1000 non-null   object 
dtypes: float64(4), int64(2), object(4)
memory usage: 78.3+ KB


In [8]:
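# grouping data by theme and calculating mean values of the numeric columns
# (numeric_only=True skips text columns such as Location and Purchases)
theme_performance = df.groupby('Theme').mean(numeric_only=True)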
theme_performance



Theme,Click Through Rate,Conversion Rate,Bounce Rate,Scroll_Depth,Age,Session_Duration
Dark Theme,0.264501,0.251282,0.512115,49.926404,41.332685,919.48249
Light Theme,0.247109,0.255459,0.499035,50.735232,41.734568,930.833333
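The per-theme means above are easier to interpret next to the group sizes. The following is a minimal sketch, assuming a tiny made-up frame (`demo` and its values are hypothetical, not taken from `website_ab_test.csv`), of restricting the aggregation to numeric columns explicitly and counting the rows per theme.

```python
import pandas as pd

# Tiny illustrative frame; the values are invented for this sketch
demo = pd.DataFrame({
    'Theme': ['Light Theme', 'Dark Theme', 'Light Theme', 'Dark Theme'],
    'Conversion Rate': [0.10, 0.20, 0.30, 0.40],
    'Location': ['Pune', 'Chennai', 'Pune', 'Chennai'],  # non-numeric column
})

# Select the numeric columns explicitly so text columns never reach .mean()
numeric_cols = demo.select_dtypes(include='number').columns.tolist()

# Mean of each numeric column per theme
means = demo.groupby('Theme')[numeric_cols].mean()

# Number of rows (sessions) per theme
sizes = demo.groupby('Theme').size()

print(means)
print(sizes)
```

Selecting columns by dtype is one alternative to passing `numeric_only=True`; either keeps text columns out of the averages.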


In [9]:
# Sorting the data by conversion rate for a better comparison
theme_performance_sorted = theme_performance.sort_values(by='Conversion Rate', ascending=False)
theme_performance_sorted

Theme,Click Through Rate,Conversion Rate,Bounce Rate,Scroll_Depth,Age,Session_Duration
Light Theme,0.247109,0.255459,0.499035,50.735232,41.734568,930.833333
Dark Theme,0.264501,0.251282,0.512115,49.926404,41.332685,919.48249


# Hypothesis Testing
We’ll use a significance level (alpha) of 0.05 for all of our hypothesis tests.

## 1. Hypothesis testing based on the Conversion Rate between the Light Theme and Dark Theme.

### Null Hypothesis (H0): 
There is no difference in conversion rates between the Light theme and Dark theme
### Alternative Hypothesis (H1): 
There is a difference in conversion rates between the Light theme and Dark theme

In [10]:
# Extracting conversion rates for both themes
conversion_rate_light = df[df['Theme']=='Light Theme']['Conversion Rate']
conversion_rate_dark = df[df['Theme']=='Dark Theme']['Conversion Rate']

In [11]:
# performing a two-sample t-test
t_stat, p_value = ttest_ind(conversion_rate_light, conversion_rate_dark, equal_var=False)
print('t-stat value:',t_stat)
print('p value:', p_value)

t-stat value: 0.4748494462782632
p value: 0.6349982678451778


### Conclusion:
Since the p-value (0.635) is much greater than the significance level (0.05), we do not have enough evidence to reject the null hypothesis.
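Calling `ttest_ind` with `equal_var=False` runs Welch's t-test, which does not assume that both groups share the same variance. As a rough sketch of what that call computes, the statistic can be reproduced by hand. The helper name `welch_t_test` and the synthetic samples are assumptions made for illustration; they are not the notebook's data.

```python
import numpy as np
from scipy import stats

def welch_t_test(a, b):
    """Two-sided Welch's t-test computed from first principles."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Variance of each sample mean: sample variance divided by group size
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    # t-statistic: difference in means over its standard error
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    # Two-sided p-value from Student's t distribution
    p = 2 * stats.t.sf(abs(t), dof)
    return t, p, dof

# Synthetic stand-in samples (NOT the values from website_ab_test.csv)
rng = np.random.default_rng(0)
light = rng.uniform(0.01, 0.50, size=500)
dark = rng.uniform(0.01, 0.50, size=500)

t_manual, p_manual, dof = welch_t_test(light, dark)
t_scipy, p_scipy = stats.ttest_ind(light, dark, equal_var=False)

print('manual:', t_manual, p_manual)
print('scipy: ', t_scipy, p_scipy)
```

The two printed pairs should agree up to floating-point error, which is a quick way to confirm that the sign of the statistic follows the argument order (first sample minus second).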

## 2. Hypothesis testing based on the Click Through Rate (CTR) between the Light Theme and Dark Theme.

### Null Hypothesis (H0): 
There is no difference in Click Through Rates between the Light theme and Dark theme
### Alternative Hypothesis (H1): 
There is a difference in Click Through Rates between the Light theme and Dark theme

In [13]:
# Extracting click through rate for both themes
click_through_rate_light = df[df['Theme']=='Light Theme']['Click Through Rate']
click_through_rate_dark = df[df['Theme']=='Dark Theme']['Click Through Rate']

In [14]:
# performing a two-sample t-test for click through rate
t_test_ctr, p_value_ctr = ttest_ind(click_through_rate_light, click_through_rate_dark, equal_var=False)
print('t-test for CTR:', t_test_ctr)
print('p value for CTR:', p_value_ctr)

t-test for CTR: -1.9781708664172253
p value for CTR: 0.04818435371010704


### Conclusion:
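The p-value (0.048) is slightly below our significance level of 0.05, so we reject the null hypothesis: there is a statistically significant difference in Click Through Rate between the Light theme and Dark theme. The negative t-statistic (Light minus Dark) and the group means (0.2645 for Dark versus 0.2471 for Light) show that the Dark theme has the higher CTR.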
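A p-value this close to the threshold says little about how large the difference is. One way to see the size and direction together is a confidence interval for the difference in means. The sketch below assumes a hypothetical helper, `welch_mean_diff_ci`, and synthetic samples rather than the notebook's data.

```python
import numpy as np
from scipy import stats

def welch_mean_diff_ci(a, b, confidence=0.95):
    """Confidence interval for mean(a) - mean(b) without assuming equal variances."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a.mean() - b.mean()
    # Variance of each sample mean
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    # Critical value for a two-sided interval
    t_crit = stats.t.ppf(0.5 + confidence / 2, dof)
    return diff, diff - t_crit * se, diff + t_crit * se

# Synthetic stand-in samples (NOT the values from website_ab_test.csv)
rng = np.random.default_rng(1)
light = rng.uniform(0.01, 0.48, size=500)
dark = rng.uniform(0.03, 0.50, size=500)

diff, lower, upper = welch_mean_diff_ci(light, dark)
_, p_value = stats.ttest_ind(light, dark, equal_var=False)

print('difference (light - dark):', diff)
print('95% CI:', (lower, upper))
print('p value:', p_value)
```

A 95% interval that excludes zero corresponds to a two-sided p-value below 0.05, and the sign of the interval shows which theme is higher.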

## 3. Hypothesis testing based on the Bounce Rate between the Light Theme and Dark Theme.

### Null Hypothesis (H0): 
There is no difference in Bounce Rate between the Light theme and Dark theme
### Alternative Hypothesis (H1): 
There is a difference in Bounce Rate between the Light theme and Dark theme

In [15]:
# Extracting Bounce rate for both themes
bounce_rate_light = df[df['Theme']=='Light Theme']['Bounce Rate']
bounce_rate_dark = df[df['Theme']=='Dark Theme']['Bounce Rate']

In [16]:
# performing a two-sample t-test for bounce rate
t_test_bounce, p_value_bounce = ttest_ind(bounce_rate_light, bounce_rate_dark, equal_var=False)
print('t-test value for bounce rate:', t_test_bounce)
print('p value for bounce rate:', p_value_bounce)

t-test value for bounce rate: -1.2018883310494073
p value for bounce rate: 0.229692077505148


### Conclusion:
Since the p-value (0.230) is greater than the significance level (0.05), we do not have enough evidence to reject the null hypothesis.

## 4. Hypothesis testing based on the Scroll Depth between the Light Theme and Dark Theme.

### Null Hypothesis (H0): 
There is no difference in Scroll Depth between the Light theme and Dark theme
### Alternative Hypothesis (H1): 
There is a difference in Scroll Depth between the Light theme and Dark theme

In [19]:
# Extracting Scroll Depth for both themes
scroll_depth_light = df[df['Theme']=='Light Theme']['Scroll_Depth']
scroll_depth_dark = df[df['Theme']=='Dark Theme']['Scroll_Depth']

In [21]:
# performing a two-sample t-test for scroll depth
t_test_scroll, p_value_scroll = ttest_ind(scroll_depth_light, scroll_depth_dark, equal_var=False)
print('t-test value for scroll depth:', t_test_scroll)
print('p value for scroll depth:', p_value_scroll)

t-test value for scroll depth: 0.7562277864140986
p value for scroll depth: 0.4496919249484911


### Conclusion:
Since the p-value (0.450) is greater than the significance level (0.05), we do not have enough evidence to reject the null hypothesis.

In [24]:
# Creating a table for comparison
comparison_table = pd.DataFrame({
    'Metric': ['Click Through Rate', 'Conversion Rate', 'Bounce Rate', 'Scroll Depth'],
    'T-Statistic': [t_test_ctr, t_stat, t_test_bounce, t_test_scroll],
    'p value': [p_value_ctr, p_value, p_value_bounce, p_value_scroll]
})
comparison_table

Unnamed: 0,Metric,T-Statistic,p value
0,Click Through Rate,-1.978171,0.048184
1,Conversion Rate,0.474849,0.634998
2,Bounce Rate,-1.201888,0.229692
3,Scroll Depth,0.756228,0.449692
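The four tests follow the same extract-then-test pattern, so the comparison table could also be built in a single loop. This is a sketch only: `compare_themes` is a hypothetical helper, and `demo_df` is randomly generated data that merely reuses the notebook's column names.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

def compare_themes(data, metrics, group_col='Theme',
                   first='Light Theme', second='Dark Theme'):
    """Run Welch's t-test (first vs. second group) for each metric column."""
    rows = []
    for metric in metrics:
        sample_first = data.loc[data[group_col] == first, metric]
        sample_second = data.loc[data[group_col] == second, metric]
        t_stat, p_value = ttest_ind(sample_first, sample_second, equal_var=False)
        rows.append({'Metric': metric, 'T-Statistic': t_stat, 'p value': p_value})
    return pd.DataFrame(rows)

# Randomly generated stand-in data (NOT website_ab_test.csv)
rng = np.random.default_rng(42)
n = 200
demo_df = pd.DataFrame({
    'Theme': rng.choice(['Light Theme', 'Dark Theme'], size=n),
    'Click Through Rate': rng.uniform(0.01, 0.50, size=n),
    'Conversion Rate': rng.uniform(0.01, 0.50, size=n),
    'Bounce Rate': rng.uniform(0.20, 0.80, size=n),
    'Scroll_Depth': rng.uniform(20.0, 80.0, size=n),
})

metrics = ['Click Through Rate', 'Conversion Rate', 'Bounce Rate', 'Scroll_Depth']
comparison = compare_themes(demo_df, metrics)
print(comparison)
```

Keeping the group order fixed (Light first, Dark second) means a negative t-statistic always indicates a higher mean for the Dark theme.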


### Click Through Rate:
The test reveals a statistically significant difference (p = 0.048), with the Dark theme showing the higher mean Click Through Rate.
### Conversion Rate:
No statistically significant difference was found.
### Bounce Rate:
No statistically significant difference was found in Bounce Rate.
### Scroll Depth:
No statistically significant difference was observed in Scroll Depth.

## Summary:
While the two themes perform similarly across most metrics, the Dark theme has a slight edge in engaging users to click through. For the other key performance indicators (Conversion Rate, Bounce Rate, and Scroll Depth), the choice between a Light Theme and a Dark Theme does not significantly affect user behaviour according to the data provided.