# AB test 

A/B testing is a research method for measuring how people react to a change. It compares two versions of a product or offer and shows which one performs better.

In [6]:
import random

import numpy as np

from hypex.dataset import Dataset, InfoRole, TreatmentRole, TargetRole
from hypex.experiments.ab import ABTest

## Creating a test dataset with synthetic data
It is important to mark the data fields by assigning the appropriate roles:

* FeatureRole: a role for columns that contain features or predictor variables. Our split will be based on them. Applied by default if the role is not specified for the column.
* TreatmentRole: a role for columns that show the treatment or intervention.
* TargetRole: a role for columns that show the target or outcome variable.
* InfoRole: a role for columns that contain information about the data, such as user IDs.

In [7]:
data = Dataset(
    roles={
        "user_id": InfoRole(int),
        "treat": TreatmentRole(),
        "pre_spends": TargetRole(),
        "post_spends": TargetRole(),
        "gender": TargetRole(),
    },
    data="data.csv",
)
data

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry  
0     E-commerce  
1     E-commerce  
2      Logistics  
3     E-com

In [8]:
data["treat"] = [random.choice([0, 1, 2]) for _ in range(len(data))]
data

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      1       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      0       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      1       500.5   430.888889  26.0      F   
9997     9997             3      0       473.0   534.111111  22.0      F   
9998     9998             2      0       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry  
0     E-commerce  
1     E-commerce  
2      Logistics  
3     E-com
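
The assignment above draws from an unseeded random generator, so the `treat` column and every number derived from it will differ between runs. The following is a minimal sketch of a reproducible variant using only the standard library. The seed value `42` and the names `rng`, `n_rows`, and `group_sizes` are assumptions made for illustration and are not part of the original notebook.

```python
import random
from collections import Counter

# A dedicated generator with a fixed seed makes the assignment repeatable.
rng = random.Random(42)  # the seed value is an arbitrary assumption

n_rows = 10_000  # matches the row indices 0..9999 shown in the output above
treat = [rng.choice([0, 1, 2]) for _ in range(n_rows)]

# Count how many rows landed in each group (0 = control, 1 and 2 = test groups).
group_sizes = Counter(treat)
for group in sorted(group_sizes):
    print(group, group_sizes[group])
```

The resulting list could then be assigned to `data["treat"]` in the same way as in the cell above.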

In [9]:
data.roles

{'user_id': Info(<class 'int'>),
 'treat': Treatment(<class 'int'>),
 'pre_spends': Target(<class 'float'>),
 'post_spends': Target(<class 'float'>),
 'gender': Target(<class 'str'>),
 'signup_month': Feature(<class 'int'>),
 'age': Feature(<class 'float'>),
 'industry': Feature(<class 'str'>)}

## Simple AB Test 

The simple pipeline reports group sizes, group differences, and a t-test estimate.

In [10]:
test = ABTest()
result = test.execute(data)

In [11]:
result.resume

       feature group TTest pass  TTest p-value
0   pre_spends     0         OK       0.982127
1  post_spends     0         OK       0.866868

In [12]:
result.sizes 

   control size  test size  control size %  test size % group
0          3315       3364              49           50     1
0          3315       3321              49           50     2

In [13]:
result.difference

   control mean   test mean  difference  difference % group        field
0    487.268024  486.760256   -0.507768     -0.104207     1   pre_spends
0    487.268024  487.257603   -0.010421     -0.002139     1  post_spends
0    452.070789  452.509182    0.438393      0.096974     2   pre_spends
0    452.070789  451.909064   -0.161726     -0.035774     2  post_spends
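
The `difference` and `difference %` columns above are consistent with a plain difference of means and that difference expressed relative to the control mean. The sketch below reproduces the first row under that assumption. The function name `group_difference` is hypothetical, and HypEx's internal implementation may differ.

```python
from statistics import fmean


def group_difference(control, test):
    """Compare the mean of a test group against the mean of a control group."""
    control_mean = fmean(control)
    test_mean = fmean(test)
    difference = test_mean - control_mean
    return {
        "control mean": control_mean,
        "test mean": test_mean,
        "difference": difference,
        # Relative difference, in percent of the control mean.
        "difference %": difference / control_mean * 100,
    }


# Single-element lists stand in for the groups here, so that the means
# equal the values printed in the first row of the table above.
row = group_difference([487.268024], [486.760256])
print(round(row["difference"], 6), round(row["difference %"], 6))
```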

In [14]:
result.multitest

"There was less than three groups or multitest method wasn't provided"

## Additional tests in AB Test 

It is possible to add a U-test and a chi-squared test to the pipeline.

In [15]:
test = ABTest(additional_tests=['t-test', 'u-test', 'chi2-test'])
result = test.execute(data)

In [16]:
result.resume

       feature group TTest pass  TTest p-value UTest pass  UTest p-value  \
0   pre_spends     0         OK       0.982127         OK       0.778838   
1  post_spends     0         OK       0.866868         OK       0.786121   
2       gender     0        NaN            NaN        NaN            NaN   

  Chi2Test pass  Chi2Test p-value  
0           NaN               NaN  
1           NaN               NaN  
2            OK               1.0  
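
In the table above, the chi-squared test is the only one reported for the categorical `gender` column. To illustrate what such a test computes, here is a minimal standard-library sketch of Pearson's chi-squared test of independence on a 2x2 table (group by category), without a continuity correction. The counts are hypothetical and chosen only for illustration, the function name `chi2_2x2` is an assumption, and HypEx's own implementation may differ.

```python
from math import erfc, sqrt


def chi2_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and category were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # With one degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are control/test, columns are two categories.
hypothetical_counts = [[30, 20], [20, 30]]
statistic, p_value = chi2_2x2(hypothetical_counts)
print(statistic, p_value)
```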

In [17]:
result.multitest

"There was less than three groups or multitest method wasn't provided"

In [18]:
result.difference

   control mean   test mean  difference  difference % group        field
0    487.268024  486.760256   -0.507768     -0.104207     1   pre_spends
0    487.268024  487.257603   -0.010421     -0.002139     1  post_spends
0    452.070789  452.509182    0.438393      0.096974     2   pre_spends
0    452.070789  451.909064   -0.161726     -0.035774     2  post_spends

In [19]:
result.sizes

   control size  test size  control size %  test size % group
0          3315       3364              49           50     1
0          3315       3321              49           50     2

## ABn Test 

Finally, we can run a test with multiple groups (an A/B/n test) and apply a multiple-testing correction method.

In [20]:
test = ABTest(multitest_method="bonferroni")
result = test.execute(data)

In [21]:
result.resume

       feature group TTest pass  TTest p-value
0   pre_spends     0         OK       0.982127
1  post_spends     0         OK       0.866868

In [22]:
result.sizes

   control size  test size  control size %  test size % group
0          3315       3364              49           50     1
0          3315       3321              49           50     2

In [23]:
result.difference

   control mean   test mean  difference  difference % group        field
0    487.268024  486.760256   -0.507768     -0.104207     1   pre_spends
0    487.268024  487.257603   -0.010421     -0.002139     1  post_spends
0    452.070789  452.509182    0.438393      0.096974     2   pre_spends
0    452.070789  451.909064   -0.161726     -0.035774     2  post_spends

In [24]:
result.multitest

   correction        field  new p-value  old p-value  rejected   test group
0    0.272660   pre_spends          1.0     0.272660     False  TTest     1
1    0.982127  post_spends          1.0     0.982127     False  TTest     1
2    0.649918   pre_spends          1.0     0.649918     False  TTest     2
3    0.866868  post_spends          1.0     0.866868     False  TTest     2
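
The `new p-value` column above is consistent with the standard Bonferroni correction, in which each p-value is multiplied by the number of comparisons and capped at 1. The sketch below applies that rule to the `old p-value` column. The function name `bonferroni` and the significance level `alpha=0.05` are assumptions, and the code is an illustration rather than HypEx's actual implementation.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and the rejection decisions."""
    m = len(p_values)  # number of comparisons
    adjusted = [min(1.0, p * m) for p in p_values]
    rejected = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, rejected


# The four "old p-value" entries from the table above.
old_p_values = [0.272660, 0.982127, 0.649918, 0.866868]
new_p_values, rejected = bonferroni(old_p_values)
print(new_p_values)  # every value is capped at 1.0
print(rejected)      # no hypothesis is rejected
```

Even the smallest p-value, 0.272660, exceeds 1 after multiplication by four, which is why every adjusted value in the table equals 1.0.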