<a href="https://colab.research.google.com/github/mihirdeo16/ab-testing/blob/main/docs/Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# A/B Testing

A library designed to help data scientists, developers, and data analysts quickly build reports and analyze the performance of A/B tests. Links to GitHub and PyPI can be found below.

+ [GitHub-ab-testing](https://github.com/mihirdeo16/ab-testing)

+ [PyPi-ab-testing](https://pypi.org/project/ab-testing-analysis/)

In [None]:
!pip install ab-testing-analysis

Import the ``ABTest`` class, which provides the methods for building the A/B report.
----
This tutorial uses a fake dataset generated by the ``Dataset`` class, which creates the users and their related information.

In [2]:
# Importing the library
from ab_testing import ABTest
from ab_testing.data import Dataset

In [3]:
# Create a DataFrame of users, their responses, and the group each user belongs to
df = Dataset().data()
df.head()

        Users  Response Group
0  SVYJC9AMI8         0     A
1  Y2LO0CJO8X         0     A
2  H6YDX4KN91         0     A
3  11NP7K88R2         0     A
4  0KFN2MXOYG         0     A


## Data
The DataFrame contains the following information:
+ **Response**: the user's response in the test, either **1** or **0**, indicating whether the user converted or did not convert.
+ **Group**: the group the user belongs to, either **A** or **B**.

In [4]:
print('Total number of the Users/records: ',df.shape[0])
print(f'Number of Groups, {df.Group.nunique()} and their labels {df.Group.unique()}')
print(f'Response types, {df.Response.unique()}','\n')

print('Group "A" response distribution')
print(df[df.Group=='A'].Response.value_counts())
print('\nGroup "B" response distribution')
print(df[df.Group=='B'].Response.value_counts())

Total number of the Users/records:  1000
Number of Groups, 2 and their labels ['A' 'B']
Response types, [0 1] 

Group "A" response distribution
0    387
1    113
Name: Response, dtype: int64

Group "B" response distribution
0    379
1    121
Name: Response, dtype: int64


### Conversion rate report
Class parameters:

+ ``df`` = A DataFrame that has a Users column, a Response column, and a Group column.
+ ``response_column`` = The name of the column holding 1 or 0 for users who converted or did not convert, respectively.
+ ``group_column`` = The name of the column indicating which group, **A** or **B**, each user belongs to.

In [5]:
ab_obj = ABTest(df,response_column='Response',group_column='Group')

print(ab_obj.conversion_rate())

  Conversion Rate Standard Deviation Standard Error
A          22.60%              0.418         0.0187
B          24.20%              0.428         0.0192
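
The figures in this table can be cross-checked by hand from the response counts printed in the Data section (113 of 500 conversions in group A, 121 of 500 in group B). The sketch below is an assumption about the formulas, not the library's source code: it treats each response as a Bernoulli outcome with conversion rate p, so the standard deviation is sqrt(p(1 - p)) and the standard error is that value divided by sqrt(n). The helper name `summarize` is hypothetical.

```python
import math


def summarize(conversions, n):
    """Return (rate, standard deviation, standard error) for one group.

    Assumes each response is a 0/1 Bernoulli outcome (hypothetical helper,
    not part of the ab_testing package).
    """
    rate = conversions / n
    std_dev = math.sqrt(rate * (1 - rate))   # population std. dev. of 0/1 data
    std_err = std_dev / math.sqrt(n)         # std. error of the mean
    return rate, std_dev, std_err


# Counts taken from the response distribution printed in the Data section.
counts = {"A": (113, 500), "B": (121, 500)}

for group, (conversions, n) in counts.items():
    rate, std_dev, std_err = summarize(conversions, n)
    print(f"{group}  {rate:.2%}  {std_dev:.3f}  {std_err:.4f}")
```

Rounded to the precision shown, these values agree with the report above, which suggests, but does not prove, that the library uses the same definitions.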


### Significance test

To determine whether the observed difference between the groups is statistically significant, evaluate the P-Value.

*Note: By default the significance threshold is 0.05, but it can be changed by passing the ``threshold`` parameter, since the appropriate threshold varies between scenarios.*

The ``significance_test`` method provides the following information:
+ The P-Value and z-statistic of the test.
+ Confidence intervals for both groups.
+ A concluding remark on whether the result is significant.


In [9]:
# Run the test with a custom threshold (the default is 0.05)
print(ab_obj.significance_test(threshold=0.58))

z statistic: -0.60	p-value: 0.550
Confidence Interval 95% for A group: 21.56% to 23.64%
Confidence Interval 95% for B group: 23.14% to 25.26%

The Group A able to perform significantly different than group B.
The P-Value of the test is 0.5501462751198115 which is below 0.58, hence Null hypothesis Hₒ can be rejected.
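
The z statistic and p-value above are consistent with a pooled two-proportion z-test on the counts from the Data section. The sketch below is an assumption about the method, not the library's implementation; `two_proportion_z_test` and `wald_interval` are hypothetical helper names, and only the standard library is used.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for the difference of two proportions.

    Hypothetical helper, not part of the ab_testing package.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def wald_interval(conversions, n, z_crit=1.96):
    """Normal-approximation (Wald) interval; z_crit=1.96 gives about 95%."""
    rate = conversions / n
    margin = z_crit * math.sqrt(rate * (1 - rate) / n)
    return rate - margin, rate + margin


z, p_value = two_proportion_z_test(113, 500, 121, 500)
print(f"z statistic: {z:.2f}\tp-value: {p_value:.3f}")

low, high = wald_interval(113, 500)
print(f"95% Wald interval for A: {low:.2%} to {high:.2%}")
```

With a p-value of about 0.55, the difference would not count as significant at the default threshold of 0.05; it is reported as significant here only because ``threshold=0.58`` was passed. The 95% Wald interval from this sketch (roughly 18.9% to 26.3% for group A) is wider than the interval printed above, so the library evidently derives its intervals differently; how it does so is not shown in this tutorial.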


## END