#### Raviteja Padala
<img align="left" src="in.png" height="20" width="20"/>   https://www.linkedin.com/in/raviteja-padala/ <br>

<img align="left" src="github.png" height="20" width="20"/> https://github.com/raviteja-padala


# <span style='color:Blue'> Objective: </span> To understand and perform A/B testing on conversions, revenue, and clicks data.

In [1]:
#import libraries
import pandas as pd

In [3]:
#import data
df = pd.read_csv("https://raw.githubusercontent.com/raviteja-padala/Datasets/main/clicks_conversions.csv")
df.head()

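| | User ID | Group | Clicks | Conversions | Revenue($) | Age | Gender |
|---|---|---|---|---|---|---|---|
| 0 | 1 | A | 10 | 2 | 50 | 25 | Male |
| 1 | 2 | B | 8 | 1 | 35 | 32 | Female |
| 2 | 3 | A | 12 | 3 | 70 | 41 | Male |
| 3 | 4 | B | 9 | 2 | 45 | 37 | Female |
| 4 | 5 | A | 11 | 2 | 55 | 29 | Male |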


In [4]:
#shape of dataset
df.shape

(200, 7)

In [6]:
#info of dataset
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   User ID      200 non-null    int64 
 1   Group        200 non-null    object
 2   Clicks       200 non-null    int64 
 3   Conversions  200 non-null    int64 
 4   Revenue($)   200 non-null    int64 
 5   Age          200 non-null    int64 
 6   Gender       200 non-null    object
dtypes: int64(5), object(2)
memory usage: 11.1+ KB


In [7]:
#checking null values
df.isnull().sum()

User ID        0
Group          0
Clicks         0
Conversions    0
Revenue($)     0
Age            0
Gender         0
dtype: int64

### Scenario: 
* In the dataset, users in Group A are directed to the control web page and users in Group B are directed to the variant web page.

### Goal:
* To determine whether there is a significant difference in the conversion rates between Group A and Group B.

### Hypothesis:
* Null Hypothesis (H0): There is no significant difference in the conversion rates between Group A and Group B 
* Alternative Hypothesis (HA): There is a significant difference in the conversion rates between Group A and Group B 
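
Stated formally, and assuming the comparison is made on the mean number of conversions per user, the two-sided hypotheses can be written as below. The symbols $\mu_A$ and $\mu_B$ are notation introduced here for the two group means; they do not appear in the dataset.

$$
H_0:\ \mu_A = \mu_B \qquad \text{versus} \qquad H_A:\ \mu_A \neq \mu_B
$$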

<img src= webpage_ab_test.png height = 400 width="800"/>

In [15]:
#checking conversions based on group
df.groupby('Group')['Conversions'].value_counts()

Group  Conversions
A      2              56
       1              27
       3              27
B      2              52
       3              26
       1              12
Name: Conversions, dtype: int64

In [16]:
#Contingency Table
contingency_data = pd.crosstab(df['Group'], df['Conversions'])
contingency_data

| Group \ Conversions | 1 | 2 | 3 |
|---|---|---|---|
| A | 27 | 56 | 27 |
| B | 12 | 52 | 26 |
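
A contingency table like this one is often analysed with a chi-square test of independence. The notebook does not run this test; the sketch below is an illustrative addition that asks whether the distribution of conversion counts (1, 2 or 3) depends on the group. The observed counts are copied from the table above rather than read from the CSV.

```python
import numpy as np
from scipy import stats

# Observed counts copied from the contingency table above.
# Rows: Group A, Group B. Columns: users with 1, 2 or 3 conversions.
observed = np.array([[27, 56, 27],
                     [12, 52, 26]])

# chi2_contingency returns the statistic, the p-value, the degrees of
# freedom and the expected counts under independence.
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.3f}")
print("Expected counts under independence:")
print(expected.round(2))
```

This treats the conversion count as a category, so it ignores the ordering 1 < 2 < 3. The t-test and Mann-Whitney U test used later in the notebook do take that ordering into account.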


In [18]:
# checking conversions of Group-A and Group-B

group_a_data = df[df['Group'] == 'A']['Conversions']
group_b_data = df[df['Group'] == 'B']['Conversions']

In [20]:
# .count() returns the number of rows (users) in each group, not the number of conversions
print(f"Users in group A = {group_a_data.count()}, users in group B = {group_b_data.count()}")

Users in group A = 110, users in group B = 90


In [23]:
# Calculate the mean number of conversions per user in each group
# (this is an average count, not a percentage, so it is printed as a plain number)
mean_conversions_a = df[df['Group'] == 'A']['Conversions'].mean()
mean_conversions_b = df[df['Group'] == 'B']['Conversions'].mean()

# Print the results
print("Mean conversions per user, Group A: {:.2f}".format(mean_conversions_a))
print("Mean conversions per user, Group B: {:.2f}".format(mean_conversions_b))


Mean conversions per user, Group A: 2.00
Mean conversions per user, Group B: 2.16
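
The figures above are averages of the `Conversions` column per user. If a conversion rate in the usual sense is wanted, one common definition is total conversions divided by total clicks. That definition is an assumption here, since the notebook does not define the term. The sketch below applies it to the five rows shown by `df.head()`; the name `sample` is hypothetical, and on the real data the same `groupby` would be run on `df`.

```python
import pandas as pd

# The five rows shown by df.head() above (the full dataset has 200 rows).
sample = pd.DataFrame({
    "Group": ["A", "B", "A", "B", "A"],
    "Clicks": [10, 8, 12, 9, 11],
    "Conversions": [2, 1, 3, 2, 2],
})

# Assumed definition: conversion rate = total conversions / total clicks.
totals = sample.groupby("Group")[["Clicks", "Conversions"]].sum()
totals["Conversion rate"] = totals["Conversions"] / totals["Clicks"]

print(totals)
```

With this definition the result is a proportion between 0 and 1, so formatting it with `{:.2%}` gives a meaningful percentage.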


In [22]:
# Conduct a statistical test to compare the two groups.
# A common choice for A/B testing is the independent two-sample t-test.

#import library
import scipy.stats as stats

#statistical test (ttest_ind assumes equal variances in both groups by default)

t_statistic, p_value = stats.ttest_ind(group_a_data, group_b_data)

print('Test statistic=%.3f, p_value=%.3f' % (t_statistic, p_value))

alpha = 0.05  # significance level
if p_value < alpha:
    print("The results are statistically significant.")
else:
    print("The results are not statistically significant.")

Test statistic=-1.625, p_value=0.106
The results are not statistically significant.
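
The default `ttest_ind` call assumes both groups have the same variance. Because the groups differ in size (110 and 90 users), Welch's t-test, which drops that assumption, is a reasonable cross-check. The sketch below is an addition, not part of the original analysis. It rebuilds the two `Conversions` columns from the value counts shown earlier, so it runs without downloading the CSV; `group_a` and `group_b` are names chosen for this example.

```python
import numpy as np
from scipy import stats

# Rebuild each group's Conversions column from the value counts above.
group_a = np.repeat([1, 2, 3], [27, 56, 27])  # 110 users in Group A
group_b = np.repeat([1, 2, 3], [12, 52, 26])  # 90 users in Group B

# Student's t-test (equal variances assumed), as used in the notebook.
t_student, p_student = stats.ttest_ind(group_a, group_b)

# Welch's t-test: the same call with equal_var=False.
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Student: t = {t_student:.3f}, p = {p_student:.3f}")
print(f"Welch:   t = {t_welch:.3f}, p = {p_welch:.3f}")
```

If the two tests disagree noticeably, that usually points to unequal variances, and Welch's result is generally the safer one to report.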


In [42]:
# Import the necessary library
import scipy.stats as stats

# Perform the Mann-Whitney U test
u_statistic, p_value = stats.mannwhitneyu(df[df['Group'] == 'A']['Conversions'], df[df['Group'] == 'B']['Conversions'])

# Print the results
print("Mann-Whitney U statistic:", u_statistic)
print("p-value:", p_value)


alpha = 0.05  # significance level
if p_value < alpha:
    print("p-value is BELOW significance level, reject null hypothesis, there is significant difference in the conversion rates")
else:
    print("p-value is above significance level, fail to reject null hypothesis, there is no significant difference in the conversion rates")

Mann-Whitney U statistic: 4369.0
p-value: 0.11466275833600062
p-value is above significance level, fail to reject null hypothesis, there is no significant difference in the conversion rates
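
The U statistic can also be turned into an effect size that is easy to read. Dividing U by the number of (A, B) user pairs gives the share of pairs in which the Group A user has more conversions than the Group B user, with ties counted as half. The sketch below is an illustrative addition. It rebuilds the data from the value counts shown earlier and assumes a SciPy version in which `mannwhitneyu` returns the statistic for the first sample (1.7 or later); `prob_a_higher` is a name introduced for this example.

```python
import numpy as np
from scipy import stats

# Rebuild each group's Conversions column from the value counts above.
group_a = np.repeat([1, 2, 3], [27, 56, 27])
group_b = np.repeat([1, 2, 3], [12, 52, 26])

# Two-sided Mann-Whitney U test; U refers to the first sample (Group A).
u_statistic, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Share of (A, B) pairs where A has more conversions, ties counted as half.
prob_a_higher = u_statistic / (len(group_a) * len(group_b))

print(f"U = {u_statistic}, p-value = {p_value:.4f}")
print(f"P(A > B) + 0.5 * P(A == B) = {prob_a_higher:.3f}")
```

A value of 0.5 would mean neither group tends to have higher values; the further the share is from 0.5, the larger the difference between the groups.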


<span style='color:Blue'> **Observation:** </span>

Both p-values (0.106 for the t-test and 0.115 for the Mann-Whitney U test) are greater than the significance level of 0.05, so we fail to reject the null hypothesis. The data do not provide enough evidence of a significant difference in conversions between Group A and Group B.

The observed difference (a mean of 2.00 conversions per user in Group A versus 2.16 in Group B) could plausibly be due to random chance rather than a true difference in the effectiveness of the two pages. The Mann-Whitney U test, which compares the ranks of the values rather than their means, leads to the same conclusion.

It's important to note that failing to reject the null hypothesis does not necessarily imply that the conversion rates are exactly the same in both groups. There could still be some differences, but they are not statistically significant based on the given data.

Further analysis or a larger sample size may be required to obtain more conclusive results. Additionally, consider exploring other metrics or conducting additional experiments to gain a deeper understanding of the impact of different implementations on conversions.
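
To get a rough sense of how much larger the sample would need to be, the usual normal-approximation formula for a two-sample comparison of means can be applied. The sketch below is an illustration under strong assumptions: it takes the observed difference as the true effect, targets 80% power (a conventional choice that does not come from the notebook), and uses equal group sizes. The data are rebuilt from the value counts shown earlier.

```python
import math
import numpy as np
from scipy import stats

# Rebuild each group's Conversions column from the value counts above.
group_a = np.repeat([1, 2, 3], [27, 56, 27])
group_b = np.repeat([1, 2, 3], [12, 52, 26])

# Cohen's d: difference in means divided by the pooled standard deviation.
n_a, n_b = len(group_a), len(group_b)
pooled_var = ((n_a - 1) * group_a.var(ddof=1)
              + (n_b - 1) * group_b.var(ddof=1)) / (n_a + n_b - 2)
cohens_d = (group_b.mean() - group_a.mean()) / math.sqrt(pooled_var)

alpha = 0.05   # significance level used in the notebook
power = 0.80   # assumed target power (not taken from the notebook)
z_alpha = stats.norm.ppf(1 - alpha / 2)
z_beta = stats.norm.ppf(power)

# Normal-approximation sample size per group for a two-sided test.
n_per_group = math.ceil(2 * (z_alpha + z_beta) ** 2 / cohens_d ** 2)

print(f"Cohen's d = {cohens_d:.3f}")
print(f"Approximate users needed per group: {n_per_group}")
```

Because the observed effect is itself a noisy estimate, the result is a planning figure rather than a guarantee.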

In [43]:
#performing AB tests for multiple variables

group_a_ = df[df['Group'] == 'A']
group_b_ = df[df['Group'] == 'B']


In [46]:
variables = ['Conversions', 'Revenue($)', 'Clicks']


In [50]:
for variable in variables:
    group_a_variable = group_a_[variable]
    group_b_variable = group_b_[variable]
    
    t_statistic, p_value = stats.ttest_ind(group_a_variable, group_b_variable)
    
    alpha = 0.05  # significance level
    
    print(f"Variable: {variable}")
    print(f"t_statistic: {t_statistic}, p-value:{p_value}")
    if p_value < alpha:
        print("The results are statistically significant.")
    else:
        print("The results are not statistically significant.")
    print()


Variable: Conversions
t_statistic: -1.62490817309916, p-value:0.10577284999636533
The results are not statistically significant.

Variable: Revenue($)
t_statistic: -1.0770122388449817, p-value:0.2827852639406191
The results are not statistically significant.

Variable: Clicks
t_statistic: 1.7083106768948215, p-value:0.08914585223402866
The results are not statistically significant.
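
Running three tests at a significance level of 0.05 each raises the chance of at least one false positive. A simple, conservative guard is the Bonferroni correction, which divides the significance level by the number of tests. The sketch below is an addition to the original analysis and applies the correction to the three p-values printed above.

```python
# p-values copied from the t-test output above.
p_values = {
    "Conversions": 0.10577284999636533,
    "Revenue($)": 0.2827852639406191,
    "Clicks": 0.08914585223402866,
}

alpha = 0.05
# Bonferroni correction: divide alpha by the number of tests performed.
bonferroni_alpha = alpha / len(p_values)

significant = {name: p < bonferroni_alpha for name, p in p_values.items()}

print(f"Bonferroni-adjusted alpha: {bonferroni_alpha:.4f}")
for name, p in p_values.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: p = {p:.4f} -> {verdict}")
```

Here the correction does not change any verdict, since none of the p-values were below 0.05 to begin with.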



<span style='color:Blue'> **Observation:** </span>

When the results are not statistically significant, it is important to interpret the findings cautiously and avoid making strong conclusions or taking actions solely based on the data. It may be necessary to gather more data or re-evaluate the experimental design or intervention to identify potential improvements or factors that could affect the outcome.

Remember that statistical significance is just one aspect of the analysis, and it is essential to consider the context, effect size, practical significance, and other relevant factors when interpreting the results and making decisions.
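
One way to look at effect size and practical significance together is a confidence interval for the difference in means. The sketch below is an illustrative addition. It computes a 95% interval for the difference in mean conversions per user (Group B minus Group A) using Welch's approximation, with the data rebuilt from the value counts shown earlier; `df_welch` is a name introduced here for the approximate degrees of freedom.

```python
import math
import numpy as np
from scipy import stats

# Rebuild each group's Conversions column from the value counts above.
group_a = np.repeat([1, 2, 3], [27, 56, 27])
group_b = np.repeat([1, 2, 3], [12, 52, 26])

# Difference in mean conversions per user (B minus A).
diff = group_b.mean() - group_a.mean()

# Squared standard error of each group mean.
var_a = group_a.var(ddof=1) / len(group_a)
var_b = group_b.var(ddof=1) / len(group_b)
se = math.sqrt(var_a + var_b)

# Welch-Satterthwaite approximation for the degrees of freedom.
df_welch = (var_a + var_b) ** 2 / (
    var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
)

# 95% confidence interval for the difference.
t_crit = stats.t.ppf(0.975, df_welch)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print(f"Difference (B - A): {diff:.3f}")
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
```

An interval that contains zero is consistent with the non-significant test result, while its width shows how large a real difference the data cannot rule out.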

### Conclusion: We performed A/B testing on the given data and observed no statistically significant difference between Group A and Group B in conversions, revenue, or clicks.


## Thank you for reading all the way to the end.