# AB test 

A/B testing is a research method for measuring the effect of a particular change in a product. The test shows which of two versions of the product or offer performs better on the selected metrics and whether the difference is statistically significant.

<ul>
  <li><a href="#creation-of-a-new-test-dataset-with-synthetic-data">Creation of a new test dataset with synthetic data.</a></li>
  <li><a href="#ab-test">AB test.</a></li>
  <li><a href="#additional-tests-in-ab-test">Additional tests in AB Test.</a></li>
  <li><a href="#abn-test">ABn Test.</a></li>
</ul>

In [1]:
import random

from hypex.dataset import Dataset, InfoRole, TreatmentRole, TargetRole
from hypex import ABTest
random.seed(7)

## Creation of a new test dataset with synthetic data. 

To work with our data in HypEx, we first need to convert it into a `Dataset`. It is important to mark the data fields by assigning the appropriate roles:
- FeatureRole: a role for columns that contain features or predictor variables. Our split will be based on them. Applied by default if the role is not specified for the column.
- TreatmentRole: a role for columns that show the treatment or intervention.
- TargetRole: a role for columns that show the target or outcome variable.
- InfoRole: a role for columns that contain information about the data, such as user IDs. 
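
The next cell reads `data.csv`. If that file is not at hand, a stand-in with the same column names could be generated roughly as follows. This is only a sketch: the distributions, value ranges, and the helper names `make_synthetic_rows` and `write_csv` are assumptions made for illustration, not the procedure used to build the original file.

```python
import csv
import random


def make_synthetic_rows(n_rows, seed=7):
    """Return n_rows dicts using the column names seen in the tutorial data."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    rows = []
    for user_id in range(n_rows):
        # Assumed distribution: spends scattered around 490.
        pre_spends = round(rng.gauss(490, 20), 1)
        rows.append({
            "user_id": user_id,
            "signup_month": rng.randint(0, 11),
            "treat": rng.choice([0, 1]),
            "pre_spends": pre_spends,
            "post_spends": round(pre_spends * rng.uniform(0.8, 1.1), 6),
            "age": rng.randint(18, 69),
            "gender": rng.choice(["M", "F"]),
            "industry": rng.choice(["E-commerce", "Logistics"]),
        })
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


rows = make_synthetic_rows(1000)
print(rows[0])
```

Calling `write_csv(rows, "synthetic_data.csv")` would then produce a file whose path can be passed as the `data` argument.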

In [2]:
data = Dataset(
    roles={
        "user_id": InfoRole(int),
        "treat": TreatmentRole(),
        "pre_spends": TargetRole(),
        "post_spends": TargetRole(),
        "gender": TargetRole()
    }, data="data.csv",
)
data

Unnamed: 0,user_id,signup_month,treat,pre_spends,post_spends,age,gender,industry
0,0,0,0,488.0,414.444444,,M,E-commerce
1,1,8,1,512.5,462.222222,26.0,,E-commerce
2,2,7,1,483.0,479.444444,25.0,M,Logistics
3,3,0,0,501.5,424.333333,39.0,M,E-commerce
4,4,1,1,543.0,514.555556,18.0,F,E-commerce
...,...,...,...,...,...,...,...,...
9995,9995,10,1,538.5,450.444444,42.0,M,Logistics
9996,9996,0,0,500.5,430.888889,26.0,F,Logistics
9997,9997,3,1,473.0,534.111111,22.0,F,E-commerce
9998,9998,2,1,495.0,523.222222,67.0,F,E-commerce


In [3]:
data["treat"] = [random.choice([0, 1, 2]) for _ in range(len(data))]
data

Unnamed: 0,user_id,signup_month,treat,pre_spends,post_spends,age,gender,industry
0,0,0,1,488.0,414.444444,,M,E-commerce
1,1,8,0,512.5,462.222222,26.0,,E-commerce
2,2,7,1,483.0,479.444444,25.0,M,Logistics
3,3,0,2,501.5,424.333333,39.0,M,E-commerce
4,4,1,0,543.0,514.555556,18.0,F,E-commerce
...,...,...,...,...,...,...,...,...
9995,9995,10,1,538.5,450.444444,42.0,M,Logistics
9996,9996,0,1,500.5,430.888889,26.0,F,Logistics
9997,9997,3,1,473.0,534.111111,22.0,F,E-commerce
9998,9998,2,1,495.0,523.222222,67.0,F,E-commerce


The data types of the roles can be inferred automatically, as shown below. Fields that were not marked receive the default role, displayed as `Default` in the output.

In [4]:
data.roles

{'user_id': Info(<class 'int'>),
 'treat': Treatment(<class 'int'>),
 'pre_spends': Target(<class 'float'>),
 'post_spends': Target(<class 'float'>),
 'gender': Target(<class 'str'>),
 'signup_month': Default(<class 'int'>),
 'age': Default(<class 'float'>),
 'industry': Default(<class 'str'>)}

## AB test
Next we select one of the pre-assembled pipelines, in our case `ABTest`. A custom pipeline with custom executors can also be created to fit your specific needs and requirements.
We then run the test by passing the prepared `dataset` to `execute`. The code below passes the `Dataset` directly, without an explicit `ExperimentData` wrapper.

In [5]:
test = ABTest()
result = test.execute(data)

### Experiment results
To show a report summarizing the test, we read the `resume` attribute of the experiment output.

It displays the results of the test in the form of a table with the following columns:
- `feature`: name of the target feature, change of which we want to analyze.
- `group`: name of the test group we compare with the control group.
- `TTest pass`: result of the TTest, shown as `OK` if the difference is statistically significant and `NOT OK` otherwise.
- `TTest p-value`: p-value of the TTest, that is, the probability of observing a difference at least as extreme as the one obtained if the null hypothesis is true. The lower the value, the stronger the evidence against the null hypothesis.
- `control mean`: the mean of the feature value across the control group.
- `test mean`: the mean of the feature value across the test group.
- `difference`: the difference between the mean of the test group and the mean of the control group.
- `difference %`: the difference between the mean of the test group and the mean of the control group, expressed as a percentage of the control mean.
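
To make these columns concrete, the sketch below computes them by hand for two small, made-up samples. It is not HypEx's implementation: it uses Welch's t statistic with a normal approximation for the p-value (reasonable only for large groups), and the function name `compare_groups`, the sample values, and the 0.05 threshold are assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_groups(control, test, alpha=0.05):
    """Compute summary columns similar to those in the report table."""
    control_mean = mean(control)
    test_mean = mean(test)
    difference = test_mean - control_mean
    # Difference expressed as a percentage of the control mean.
    difference_pct = difference / control_mean * 100
    # Welch's t statistic: difference divided by its standard error.
    standard_error = sqrt(variance(control) / len(control) + variance(test) / len(test))
    t_stat = difference / standard_error
    # Two-sided p-value from the normal approximation to the t distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {
        "control mean": control_mean,
        "test mean": test_mean,
        "difference": difference,
        "difference %": difference_pct,
        "TTest p-value": p_value,
        "TTest pass": "OK" if p_value < alpha else "NOT OK",
    }


# Hypothetical spend values, chosen only to keep the arithmetic easy to follow.
control = [480.0, 490.0, 500.0, 510.0, 520.0]
test = [485.0, 495.0, 505.0, 515.0, 525.0]
result_row = compare_groups(control, test)
print(result_row)
```

Here the means are 500 and 505, so `difference` is 5 and `difference %` is 1; the p-value is far above 0.05, so the row is labelled `NOT OK`.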

In [6]:
result.resume

Unnamed: 0,feature,group,difference,difference %,TTest pass,TTest p-value
0,pre_spends,1,-0.029607,-0.00608,NOT OK,0.949027
1,pre_spends,2,0.359936,0.073911,NOT OK,0.438027
2,post_spends,1,-0.365406,-0.080695,NOT OK,0.705151
3,post_spends,2,-1.651423,-0.364695,NOT OK,0.086506


The `sizes` attribute shows statistics on the sizes of the groups.

The columns are:
- `control size`: the size of the control group.
- `test size`: the size of the test group.
- `control size %`: the share of the control group in the combined control and test sample of the given comparison.
- `test size %`: the share of the test group in the combined control and test sample of the given comparison.
- `group`: name of the test group.
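
The percentage columns can be checked by hand from the two size columns. The sketch below assumes the shares are taken over control plus test for each comparison and truncated to whole percent; that reading matches the numbers in the output, but it is an inference from them, and `group_shares` is a hypothetical helper.

```python
def group_shares(control_size, test_size):
    """Return (control %, test %) within one control-vs-test comparison."""
    total = control_size + test_size
    # int() truncates toward zero, which reproduces the values in the table.
    return int(100 * control_size / total), int(100 * test_size / total)


print(group_shares(3426, 3323))  # (50, 49)
print(group_shares(3426, 3251))  # (51, 48)
```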

In [7]:
result.sizes

Unnamed: 0,control size,test size,control size %,test size %,group
1,3426,3323,50,49,1
2,3426,3251,51,48,2

The `multitest` attribute shows the p-values after adjustment for multiple comparisons.

The columns are:
- `field`: name of the target feature.
- `test`: name of the statistical test.
- `old p-value`: the p-value before adjustment.
- `new p-value`: the p-value after adjustment.
- `correction`: in the output below, the ratio of the old p-value to the new p-value.
- `rejected`: whether the null hypothesis is rejected after adjustment.
- `group`: name of the test group.


In [8]:
result.multitest

Unnamed: 0,field,test,old p-value,new p-value,correction,rejected,group
0,pre_spends,TTest,0.949027,1.0,0.949027,False,1
1,post_spends,TTest,0.438027,1.0,0.438027,False,1
2,pre_spends,TTest,0.705151,1.0,0.705151,False,2
3,post_spends,TTest,0.086506,0.346025,0.25,False,2


In [9]:
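import os

import pandas as pd

test_name = 'ab-casual'
file_path = 'data_output.xlsx'

# Append to the workbook if it exists, otherwise create it (mode='a' fails on a missing file).
mode = 'a' if os.path.exists(file_path) else 'w'
# if_sheet_exists is only accepted in append mode; 'replace' lets this cell be re-run.
if_sheet_exists = 'replace' if mode == 'a' else None

with pd.ExcelWriter(file_path, engine='openpyxl', mode=mode, if_sheet_exists=if_sheet_exists) as writer:
    result.resume.data.to_excel(writer, sheet_name=f'{test_name}.result.resume.data')
    result.sizes.data.to_excel(writer, sheet_name=f'{test_name}.result.sizes.data')
    result.multitest.data.to_excel(writer, sheet_name=f'{test_name}.result.multitest.data')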



## Additional tests in AB Test 

It is possible to add the u-test and the chi2-test to the pipeline through the `additional_tests` argument.

In [10]:
test = ABTest(additional_tests=['t-test', 'u-test', 'chi2-test'])
result = test.execute(data)

The additional columns are:
- `UTest pass`: result of the UTest, shown as `OK` if the difference is statistically significant and `NOT OK` otherwise.
- `UTest p-value`: p-value of the UTest, that is, the probability of observing a result at least as extreme as the one obtained if the null hypothesis is true. The lower the value, the stronger the evidence against the null hypothesis.
- `Chi2Test pass`: result of the Chi2Test, shown as `OK` if the difference is statistically significant and `NOT OK` otherwise. In the output below it is filled only for the categorical target `gender`.
- `Chi2Test p-value`: p-value of the Chi2Test, interpreted in the same way as the p-values above.
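
For a categorical target such as `gender`, a chi-square test compares the category counts between the control and test groups. The sketch below works through a 2x2 case with made-up counts. It uses Pearson's statistic without a continuity correction, so it may differ from what a given library reports, and `chi2_2x2` is a hypothetical helper rather than a HypEx function.

```python
from math import erfc, sqrt


def chi2_2x2(table):
    """Pearson chi-square test for a 2x2 table of counts (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and category were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With one degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are control and test, columns are M and F.
counts = [[50, 50], [60, 40]]
chi2_stat, chi2_p = chi2_2x2(counts)
print(round(chi2_stat, 4), round(chi2_p, 4))
```

With these counts the statistic is about 2.02 and the p-value about 0.155, so the row would be reported as `NOT OK` at the 0.05 level.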

In [11]:
result.resume

Unnamed: 0,feature,group,difference,difference %,TTest pass,TTest p-value,UTest pass,UTest p-value,Chi2Test pass,Chi2Test p-value
0,pre_spends,1,-0.029607,-0.00608,NOT OK,0.949027,NOT OK,0.514775,,
1,pre_spends,2,0.359936,0.073911,NOT OK,0.438027,NOT OK,0.135823,,
2,post_spends,1,-0.365406,-0.080695,NOT OK,0.705151,NOT OK,0.999945,,
3,post_spends,2,-1.651423,-0.364695,NOT OK,0.086506,NOT OK,0.346024,,
4,gender,1,,,,,,,NOT OK,0.808328
5,gender,2,,,,,,,NOT OK,0.530704


In [12]:
result.multitest

Unnamed: 0,field,test,old p-value,new p-value,correction,rejected,group
0,pre_spends,TTest,0.949027,1.0,0.949027,False,1
1,post_spends,TTest,0.438027,1.0,0.438027,False,1
2,pre_spends,TTest,0.705151,1.0,0.705151,False,2
3,post_spends,TTest,0.086506,0.692049,0.125,False,2
4,pre_spends,UTest,0.514775,1.0,0.514775,False,1
5,post_spends,UTest,0.135823,0.950759,0.142857,False,1
6,pre_spends,UTest,0.999945,1.0,0.999945,False,2
7,post_spends,UTest,0.346024,1.0,0.346024,False,2
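
The `new p-value` column above is consistent with Holm's step-down correction applied to all eight p-values at once. That is an inference from the numbers, not a statement about HypEx's defaults. A minimal sketch of the procedure, with `holm_adjust` as a hypothetical helper:

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1) and capped at 1.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Adjusted values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


# The `old p-value` column from the table above.
old = [0.949027, 0.438027, 0.705151, 0.086506,
       0.514775, 0.135823, 0.999945, 0.346024]
new = holm_adjust(old)
print([round(p, 6) for p in new])
```

The smallest p-value is multiplied by 8 and the second smallest by 7, which gives about 0.692 and 0.951; every other value is capped at 1. Small differences in the last digit come from the rounded inputs.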


In [13]:
result.sizes

Unnamed: 0,control size,test size,control size %,test size %,group
1,3426,3323,50,49,1
2,3426,3251,51,48,2


In [14]:
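import os

import pandas as pd

test_name = 'ab-additional'
file_path = 'data_output.xlsx'

# Append to the workbook if it exists, otherwise create it (mode='a' fails on a missing file).
mode = 'a' if os.path.exists(file_path) else 'w'
# if_sheet_exists is only accepted in append mode; 'replace' lets this cell be re-run.
if_sheet_exists = 'replace' if mode == 'a' else None

with pd.ExcelWriter(file_path, engine='openpyxl', mode=mode, if_sheet_exists=if_sheet_exists) as writer:
    result.resume.data.to_excel(writer, sheet_name=f'{test_name}.result.resume.data')
    result.sizes.data.to_excel(writer, sheet_name=f'{test_name}.result.sizes.data')
    result.multitest.data.to_excel(writer, sheet_name=f'{test_name}.result.multitest.data')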



## ABn Test 

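Finally, when there are more than two groups (here `treat` takes the values 0, 1 and 2), each test group is compared with the control group and the resulting p-values are adjusted for multiple comparisons. The adjustment method is chosen with the `multitest_method` argument.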

In [15]:
test = ABTest(multitest_method="bonferroni")
result = test.execute(data)

In [16]:
result.resume

Unnamed: 0,feature,group,difference,difference %,TTest pass,TTest p-value
0,pre_spends,1,-0.029607,-0.00608,NOT OK,0.949027
1,pre_spends,2,0.359936,0.073911,NOT OK,0.438027
2,post_spends,1,-0.365406,-0.080695,NOT OK,0.705151
3,post_spends,2,-1.651423,-0.364695,NOT OK,0.086506


In [17]:
result.sizes

Unnamed: 0,control size,test size,control size %,test size %,group
1,3426,3323,50,49,1
2,3426,3251,51,48,2


In [18]:
result.multitest

Unnamed: 0,field,test,old p-value,new p-value,correction,rejected,group
0,pre_spends,TTest,0.949027,1.0,0.949027,False,1
1,post_spends,TTest,0.438027,1.0,0.438027,False,1
2,pre_spends,TTest,0.705151,1.0,0.705151,False,2
3,post_spends,TTest,0.086506,0.346025,0.25,False,2
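
The Bonferroni correction requested above multiplies each p-value by the number of tests and caps the result at 1. The sketch below reproduces the `new p-value` column from the `old p-value` column; `bonferroni_adjust` and the 0.05 threshold are assumptions for illustration.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: each multiplied by the test count, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


# The `old p-value` column from the table above.
old_p = [0.949027, 0.438027, 0.705151, 0.086506]
new_p = bonferroni_adjust(old_p)
# A hypothesis is rejected when its adjusted p-value falls below the threshold.
rejected = [p < 0.05 for p in new_p]
print([round(p, 6) for p in new_p], rejected)
```

Only the last p-value stays below 1 after adjustment (about 0.346), and none falls under 0.05, which agrees with the `rejected` column.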


In [19]:
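import os

import pandas as pd

test_name = 'ab-n'
file_path = 'data_output.xlsx'

# Append to the workbook if it exists, otherwise create it (mode='a' fails on a missing file).
mode = 'a' if os.path.exists(file_path) else 'w'
# if_sheet_exists is only accepted in append mode; 'replace' lets this cell be re-run.
if_sheet_exists = 'replace' if mode == 'a' else None

with pd.ExcelWriter(file_path, engine='openpyxl', mode=mode, if_sheet_exists=if_sheet_exists) as writer:
    result.resume.data.to_excel(writer, sheet_name=f'{test_name}.result.resume.data')
    result.sizes.data.to_excel(writer, sheet_name=f'{test_name}.result.sizes.data')
    result.multitest.data.to_excel(writer, sheet_name=f'{test_name}.result.multitest.data')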

