### HYPOTHESIS TESTING

In [2]:
##### Hypothesis Testing is a statistical method used to make inferences or decisions about a population based on sample data. 
##### It starts with a null hypothesis (H0), which represents a default stance or no effect, and an alternative hypothesis (H1 or Ha), which represents what we aim to prove or expect to find. 
##### The process involves using sample data to determine whether to reject the null hypothesis in favor of the alternative hypothesis, based on the likelihood of observing the sample data under the null hypothesis. 

In [1]:
# Hypothesis Testing: Process We Can Follow

     # Hypothesis Testing is a fundamental process in data science for making data-driven decisions and inferences about populations based on sample data. Below is the process we can follow:
     # 1. Gather the data required for the hypothesis test.
     # 2. Define the Null (H0) and Alternative (H1 or Ha) Hypotheses.
     # 3. Choose the Significance Level (α), which is the probability of rejecting the null hypothesis when it is true.
     # 4. Select the appropriate statistical test. Examples include t-tests for comparing means, chi-square tests for categorical data, and ANOVA for comparing means across more than two groups.
     # 5. Perform the chosen statistical test on your data.
     # 6. Determine the p-value and interpret the results of the test.
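
The steps above can be sketched end to end on made-up data. This is only an illustration: the two groups, their sizes, and the seed are assumptions, and both groups are deliberately drawn from the same distribution.

```python
import numpy as np
from scipy.stats import ttest_ind

# Step 1: gather data (synthetic here; both groups share the same true mean).
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=0.25, scale=0.14, size=500)
group_b = rng.normal(loc=0.25, scale=0.14, size=500)

# Step 2: H0: mean(group_a) == mean(group_b); H1: the means differ.
# Step 3: choose the significance level.
alpha = 0.05

# Steps 4 and 5: select and run a two-sample t-test (Welch's variant).
t_stat, p_value = ttest_ind(group_a, group_b, equal_var=False)

# Step 6: compare the p-value with alpha and interpret.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.3f} -> {decision}")
```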


In [3]:
# Import the data with the necessary library.
# Get an overview by looking at the first 5 rows.

import pandas as pd

data = pd.read_csv("D:/Users/Admin/website_ab_test.csv")
data.head()

Unnamed: 0,Theme,Click Through Rate,Conversion Rate,Bounce Rate,Scroll_Depth,Age,Location,Session_Duration,Purchases,Added_to_Cart
0,Light Theme,0.05492,0.282367,0.405085,72.489458,25,Chennai,1535,No,Yes
1,Light Theme,0.113932,0.032973,0.732759,61.858568,19,Pune,303,No,Yes
2,Dark Theme,0.323352,0.178763,0.296543,45.737376,47,Chennai,563,Yes,Yes
3,Light Theme,0.485836,0.325225,0.245001,76.305298,58,Pune,385,Yes,No
4,Light Theme,0.034783,0.196766,0.7651,48.927407,25,New Delhi,1437,No,No


In [9]:
data.columns

Index(['Theme', 'Click Through Rate', 'Conversion Rate', 'Bounce Rate',
       'Scroll_Depth', 'Age', 'Location', 'Session_Duration', 'Purchases',
       'Added_to_Cart'],
      dtype='object')

In [4]:
data.shape

# So, our dataset has 1000 rows and 10 columns. 

(1000, 10)

In [6]:
data.describe()

Unnamed: 0,Click Through Rate,Conversion Rate,Bounce Rate,Scroll_Depth,Age,Session_Duration
count,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0
mean,0.256048,0.253312,0.505758,50.319494,41.528,924.999
std,0.139265,0.139092,0.172195,16.895269,14.114334,508.231723
min,0.010767,0.010881,0.20072,20.011738,18.0,38.0
25%,0.140794,0.131564,0.353609,35.655167,29.0,466.5
50%,0.253715,0.252823,0.514049,51.130712,42.0,931.0
75%,0.370674,0.37304,0.648557,64.666258,54.0,1375.25
max,0.499989,0.498916,0.799658,79.997108,65.0,1797.0


In [8]:
data.info()

# So, our dataset has no null values in any column.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Theme               1000 non-null   object 
 1   Click Through Rate  1000 non-null   float64
 2   Conversion Rate     1000 non-null   float64
 3   Bounce Rate         1000 non-null   float64
 4   Scroll_Depth        1000 non-null   float64
 5   Age                 1000 non-null   int64  
 6   Location            1000 non-null   object 
 7   Session_Duration    1000 non-null   int64  
 8   Purchases           1000 non-null   object 
 9   Added_to_Cart       1000 non-null   object 
dtypes: float64(4), int64(2), object(4)
memory usage: 78.3+ KB


In [None]:
# First we need to find the sample mean of every NUMERICAL COLUMN for each theme.

import numpy as np

numerical_columns = ['Click Through Rate', 'Conversion Rate', 'Bounce Rate','Scroll_Depth','Age','Session_Duration']

sample_mean = data.groupby('Theme')[numerical_columns].mean() 

print(sample_mean)

# See the results: the mean has been calculated for the Dark Theme and the Light Theme separately,
# because we passed 'Theme' to 'groupby'.

             Click Through Rate  Conversion Rate  Bounce Rate  Scroll_Depth  \
Theme                                                                         
Dark Theme             0.264501         0.251282     0.512115     49.926404   
Light Theme            0.247109         0.255459     0.499035     50.735232   

                   Age  Session_Duration  
Theme                                     
Dark Theme   41.332685        919.482490  
Light Theme  41.734568        930.833333  


##### Based on the above result, the Light Theme has a slightly higher mean than the Dark Theme for Conversion Rate, Scroll_Depth, Age and Session_Duration, while the Dark Theme has the higher mean 'Click Through Rate'.
##### Note that a lower bounce rate is preferable. Here the Light Theme's mean bounce rate (0.499) is lower than the Dark Theme's (0.512).
##### Hence the Light Theme is also preferred on this metric.
##### From these insights, it appears that the Light Theme slightly outperforms the Dark Theme in terms of Conversion Rate, Bounce Rate, Scroll Depth, and Session Duration, while the Dark Theme leads only in the Click Through Rate.
##### However, the differences are relatively minor across all metrics, and sample means alone cannot tell us whether any of them is statistically significant.

### Getting Started with Hypothesis Testing

###### Now, which hypothesis test fits this case: Z-test or T-test?
###### A Z-test is usually conducted when two conditions are satisfied: the sample size is large (>30) and the population variance is known.
###### Here the sample has 1000 rows (that is >30), so the first condition is satisfied. However, we do not know the population variance.
###### We have no information about the population from which to obtain its variance.
###### Hence we do not perform a Z-test in this case. What about a T-test?

###### A T-test compares means in the same way, but it estimates the variance from the sample, so the population variance (which we do not have) is not needed.
###### So we can go ahead with a T-test.
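
As a side note on the Z-test versus T-test choice, the short sketch below compares the two-sided 5% critical values of the t-distribution with the normal one. The degrees of freedom listed are illustrative assumptions (998 is roughly what 1000 observations in two groups would give); with a sample this large the two distributions are nearly identical, so little is lost by using the T-test.

```python
from scipy.stats import norm, t

alpha = 0.05

# Two-sided critical value of the standard normal distribution (Z-test).
z_crit = norm.ppf(1 - alpha / 2)

# Two-sided critical values of the t-distribution for a few degrees of freedom.
t_crits = {df: t.ppf(1 - alpha / 2, df) for df in (10, 30, 998)}

print(f"z critical value: {z_crit:.4f}")
for df, t_crit in t_crits.items():
    print(f"t critical value (df={df}): {t_crit:.4f}")
```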

###### Common use cases of the T-test:
###### One-sample T-test: Testing if the sample mean is significantly different from a known value.
###### Two-sample T-test: Comparing means from two independent samples. (Here we have two independent samples, Light Theme and Dark Theme, so we are going to use the two-sample T-test.)
###### Paired T-test: Comparing means from paired or related samples (e.g., before-and-after scenarios).
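
The three variants listed above map onto three SciPy functions. The sketch below runs each one on synthetic data; the arrays, their sizes, and the reference value 50.0 are assumptions made for illustration and have nothing to do with the A/B test dataset.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_ind, ttest_rel

rng = np.random.default_rng(seed=1)
before = rng.normal(loc=50.0, scale=10.0, size=40)        # hypothetical measurements
after = before + rng.normal(loc=1.0, scale=5.0, size=40)  # same subjects, measured again
other_group = rng.normal(loc=52.0, scale=10.0, size=60)   # an unrelated second sample

# One-sample: is the mean of `before` different from the known value 50.0?
one_sample = ttest_1samp(before, popmean=50.0)

# Two-sample: do two independent samples have different means?
two_sample = ttest_ind(before, other_group, equal_var=False)

# Paired: did the same subjects change between the two measurements?
paired = ttest_rel(before, after)

for name, res in [("one-sample", one_sample), ("two-sample", two_sample), ("paired", paired)]:
    print(f"{name}: t = {res.statistic:.3f}, p = {res.pvalue:.3f}")
```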

##### We can pick any feature to define the hypotheses. We start with the Conversion Rate.

#### DEFINING HYPOTHESIS

##### NULL HYPOTHESIS (H0): There is NO DIFFERENCE in the mean Conversion Rate between the Light Theme and the Dark Theme (the small difference seen in the sample means is attributed to random sampling variation).
##### ALTERNATE HYPOTHESIS (H1): There is a DIFFERENCE in the mean Conversion Rate between the Light Theme and the Dark Theme.

In [None]:
from scipy.stats import ttest_ind

# Extracting Conversion rates for both themes:

Conversion_rate_light = data[data.Theme == 'Light Theme']['Conversion Rate']
Conversion_rate_dark = data[data.Theme == 'Dark Theme']['Conversion Rate']

# Performing Two sample T-test:

T_stat, p_value = ttest_ind(Conversion_rate_light, Conversion_rate_dark, equal_var = False)
T_stat, p_value 

# ttest_ind returns both the t-statistic and the p-value, so we unpack both.
# The t-statistic is the standardised difference between the two sample means; the p-value is derived from it and is what we compare with alpha.


(np.float64(0.4748494462782632), np.float64(0.6349982678451778))
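
To show where the two numbers returned by `ttest_ind(..., equal_var=False)` come from, here is a minimal sketch that computes Welch's t-statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value by hand, then checks them against SciPy. The samples `x` and `y` are synthetic stand-ins (an assumption), not the notebook's data.

```python
import numpy as np
from scipy.stats import t, ttest_ind

rng = np.random.default_rng(seed=2)
x = rng.uniform(0.01, 0.50, size=480)  # hypothetical sample 1
y = rng.uniform(0.01, 0.50, size=520)  # hypothetical sample 2

# Squared standard error of each sample mean.
vx = x.var(ddof=1) / x.size
vy = y.var(ddof=1) / y.size

# Welch's t-statistic: difference in means over the combined standard error.
t_manual = (x.mean() - y.mean()) / np.sqrt(vx + vy)

# Welch-Satterthwaite approximation of the degrees of freedom.
df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))

# Two-sided p-value from the t-distribution.
p_manual = 2 * t.sf(abs(t_manual), df)

t_scipy, p_scipy = ttest_ind(x, y, equal_var=False)
print(f"manual: t = {t_manual:.6f}, p = {p_manual:.6f}")
print(f"scipy : t = {t_scipy:.6f}, p = {p_scipy:.6f}")
```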

In [21]:
# The p-value must be less than the significance level (alpha = 0.05) to reject the Null Hypothesis.
# Here the p-value is higher than alpha, so we fail to reject the Null Hypothesis (this does not prove that it is true).
# Meaning that there is NO SIGNIFICANT DIFFERENCE between the themes.

#### CONCLUSION:

##### The result of the two-sample t-test gives a p-value of approximately 0.635. 
##### Since this p-value is much greater than our significance level of 0.05, we do not have enough evidence to reject the null hypothesis. 
##### Therefore, we conclude that there is no statistically significant difference in 'CONVERSION RATES' between the Light Theme and Dark Theme based on the data provided.
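
The decision rule used in this conclusion can be written as a small helper. The function name `decide` is hypothetical and only a sketch; it assumes the convention that H0 is rejected when the p-value is strictly below alpha.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # We never "accept" H0; we only fail to find evidence against it.
    return "reject H0" if p_value < alpha else "fail to reject H0"


# p-value of the Conversion Rate test reported above.
print(decide(0.6349982678451778))
```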

In [20]:
# Now I want to do the testing with the feature 'Click Through Rate'.

# Extract 'Click through rate' for both light and dark themes.

Click_through_rate_light = data[data.Theme == 'Light Theme']['Click Through Rate']
Click_through_rate_dark = data[data.Theme == 'Dark Theme']['Click Through Rate']

t_stat1, p_value1 = ttest_ind(Click_through_rate_light, Click_through_rate_dark, equal_var = False)
t_stat1, p_value1


(np.float64(-1.9781708664172253), np.float64(0.04818435371010704))

In [None]:
# NOTE TO BE REMEMBERED:

# As we saw before, the sample means of Conversion Rate, Scroll_Depth, Age and Session_Duration are higher for the LIGHT THEME, Bounce Rate is lower for the LIGHT THEME, and only CLICK THROUGH RATE is higher for the DARK THEME.
# IMPORTANT: The direction of a difference in sample means does not decide the outcome of a hypothesis test. Significance depends on the size of the difference relative to the variability and the sample size.

# In the two tests so far, we FAILED TO REJECT THE NULL HYPOTHESIS for CONVERSION RATE while we REJECTED THE NULL HYPOTHESIS for CLICK THROUGH RATE.
# So every other feature has to be tested separately before we conclude anything about it.

# The conclusion below means that we can REJECT THE NULL HYPOTHESIS for Click Through Rate.
# That is, there is A SIGNIFICANT DIFFERENCE between the themes on this metric.

#### CONCLUSION: 

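##### Here the P-value (0.048) is less than alpha (0.05), hence we REJECT the NULL HYPOTHESIS.
##### Therefore, we conclude that there is a statistically significant difference in 'CLICK THROUGH RATES' between the Light Theme and Dark Theme based on the data provided. The negative t-statistic (Light minus Dark) matches the higher mean Click Through Rate of the Dark Theme seen earlier, although the p-value is only just below alpha.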


In [26]:
# Let's check with the other two features - Bounce rate & Scroll depth. 

# Extract Bounce rate from both light and dark themes:

Bounce_rate_light = data[data.Theme == 'Light Theme']['Bounce Rate']
Bounce_rate_dark = data[data.Theme == 'Dark Theme']['Bounce Rate']

t_stat2, p_value2 = ttest_ind(Bounce_rate_light, Bounce_rate_dark, equal_var = False)
t_stat2,p_value2


(np.float64(-1.2018883310494073), np.float64(0.229692077505148))

In [28]:
Scroll_Depth_light = data[data.Theme == 'Light Theme']['Scroll_Depth']
Scroll_Depth_dark = data[data.Theme == 'Dark Theme']['Scroll_Depth']

t_stat3, p_value3 = ttest_ind(Scroll_Depth_light, Scroll_Depth_dark, equal_var = False)
t_stat3,p_value3

(np.float64(0.7562277864140986), np.float64(0.4496919249484911))
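
The same extract-and-test pattern has now been repeated for four metrics. One way to avoid the repetition is a small helper that loops over the metric names. The function `welch_by_theme` and the `demo` DataFrame below are hypothetical; the demo data is random and only mimics the column names of the dataset.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind


def welch_by_theme(df, metrics, group_col="Theme", a="Light Theme", b="Dark Theme"):
    """Run Welch's two-sample t-test (group a vs. group b) for each metric."""
    rows = []
    for metric in metrics:
        sample_a = df.loc[df[group_col] == a, metric]
        sample_b = df.loc[df[group_col] == b, metric]
        t_stat, p_value = ttest_ind(sample_a, sample_b, equal_var=False)
        rows.append({"Metric": metric, "T_Statistic": t_stat, "P_value": p_value})
    return pd.DataFrame(rows)


# Synthetic stand-in for the real data (assumption: same column names).
rng = np.random.default_rng(seed=3)
demo = pd.DataFrame({
    "Theme": rng.choice(["Light Theme", "Dark Theme"], size=200),
    "Click Through Rate": rng.uniform(0.01, 0.50, size=200),
    "Bounce Rate": rng.uniform(0.20, 0.80, size=200),
})

result = welch_by_theme(demo, ["Click Through Rate", "Bounce Rate"])
print(result)
```

Because each row is built from the metric name and its own test result in the same loop iteration, the labels and statistics cannot get out of order.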

In [32]:
# CREATING A TABLE FOR COMPARISON:
# The statistics are listed in the same order as the metric names:
# T_stat / p_value come from the Conversion Rate test, t_stat1 / p_value1 from the Click Through Rate test.

comparison_table = pd.DataFrame(
    {'Metric' :['Conversion Rate', 'Click Through Rate', 'Bounce Rate', 'Scroll_Depth'],
     'T_Statistic' :[T_stat, t_stat1, t_stat2, t_stat3],
     'P_values' : [p_value, p_value1, p_value2, p_value3]}
)

comparison_table


Unnamed: 0,Metric,T_Statistic,P_values
0,Conversion Rate,0.474849,0.634998
1,Click Through Rate,-1.978171,0.048184
2,Bounce Rate,-1.201888,0.229692
3,Scroll_Depth,0.756228,0.449692


### CONCLUSION:

#### Here the P-values for Conversion Rate (0.635), Bounce Rate (0.230) and Scroll_Depth (0.450) are higher than the significance level (alpha = 0.05).
#### Meaning that we FAIL TO REJECT THE NULL HYPOTHESIS for these metrics: there is NO SIGNIFICANT DIFFERENCE between the two themes.

#### The P-value for 'Click Through Rate' (0.048) is lower than the significance level (alpha = 0.05), so we REJECT THE NULL HYPOTHESIS.
#### Meaning that there is A SIGNIFICANT DIFFERENCE between the two themes on this metric, with the Dark Theme showing the higher mean Click Through Rate.

#### In summary, the theme makes a statistically significant difference only for Click Through Rate. For other key performance indicators like Conversion Rate, Bounce Rate, and Scroll Depth, the choice between a Light Theme and a Dark Theme does not significantly affect user behaviour according to the data provided.


