# AB test 

A/B testing is a research method for measuring how people react to a change. The study compares two versions of a product or offer and shows which one performs better.

In [1]:
from hypex.dataset import Dataset, InfoRole, TreatmentRole, TargetRole
from hypex.experiments.ab import ABTest



## Creating a test dataset from synthetic data
It is important to mark the data fields by assigning the appropriate roles:

* FeatureRole: a role for columns that contain features or predictor variables. The split is based on them. This role is applied by default when no role is specified for a column.
* TreatmentRole: a role for columns that show the treatment or intervention.
* TargetRole: a role for columns that show the target or outcome variable.
* InfoRole: a role for columns that contain information about the data, such as user IDs.
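
The default-role rule above can be illustrated with a small stand-alone sketch. It does not use HypEx: the `Role` class and the `fill_default_roles` helper are hypothetical names introduced only to show the idea that any column without an explicit role falls back to a feature role.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the role classes; not the HypEx implementation.
@dataclass(frozen=True)
class Role:
    name: str


FEATURE = Role("Feature")


def fill_default_roles(columns, explicit_roles):
    """Return a role for every column, defaulting to the feature role."""
    return {col: explicit_roles.get(col, FEATURE) for col in columns}


columns = [
    "user_id", "signup_month", "treat", "pre_spends",
    "post_spends", "age", "gender", "industry",
]
explicit = {
    "user_id": Role("Info"),
    "treat": Role("Treatment"),
    "pre_spends": Role("Target"),
    "post_spends": Role("Target"),
    "gender": Role("Target"),
}

roles = fill_default_roles(columns, explicit)
feature_columns = [col for col, role in roles.items() if role == FEATURE]
print(feature_columns)  # ['signup_month', 'age', 'industry']
```

The three columns that end up as features here are the ones left out of the explicit mapping.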

In [2]:
data = Dataset(
    roles={
        "user_id": InfoRole(int),
        "treat": TreatmentRole(),
        "pre_spends": TargetRole(),
        "post_spends": TargetRole(),
        "gender": TargetRole(),
    },
    data="data.csv",
)
data
data

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      2       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry  
0     E-commerce  
1     E-commerce  
2      Logistics  
3     E-com

In [3]:
data.roles

{'user_id': Info(<class 'int'>),
 'treat': Treatment(<class 'int'>),
 'pre_spends': Target(<class 'float'>),
 'post_spends': Target(<class 'float'>),
 'gender': Target(<class 'str'>),
 'signup_month': Feature(<class 'int'>),
 'age': Feature(<class 'float'>),
 'industry': Feature(<class 'str'>)}

## Simple AB Test 

The simple pipeline reports group sizes, group differences, and a t-test estimate.

In [4]:
test = ABTest()
result = test.execute(data)

In [5]:
result.resume

       feature group TTest pass  TTest p-value
0   pre_spends     0         OK   4.788336e-01
1  post_spends     0     NOT OK   2.739781e-15
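
To make the p-value column more concrete, here is a minimal sketch of a two-sample test on synthetic numbers. It is an assumption-laden illustration, not what HypEx runs internally: it uses Welch's t statistic with a normal approximation for the p-value (reasonable only for large samples), and it assumes a 0.05 significance threshold, which the OK / NOT OK flags above are consistent with. The function name `welch_t_test` and the generated data are hypothetical.

```python
import math
import random
from statistics import fmean, variance


def welch_t_test(control, test):
    """Welch's t statistic with a two-sided normal-approximation p-value."""
    std_err = math.sqrt(variance(control) / len(control) + variance(test) / len(test))
    t_stat = (fmean(test) - fmean(control)) / std_err
    # Two-sided tail probability of a standard normal: P(|Z| > |t|).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value


rng = random.Random(0)
control = [rng.gauss(480, 20) for _ in range(2000)]
same = [rng.gauss(480, 20) for _ in range(2000)]      # no true effect
shifted = [rng.gauss(490, 20) for _ in range(2000)]   # true shift of +10

ALPHA = 0.05  # assumed threshold
for label, sample in [("same", same), ("shifted", shifted)]:
    t_stat, p_value = welch_t_test(control, sample)
    verdict = "OK" if p_value >= ALPHA else "NOT OK"
    print(f"{label}: t={t_stat:.2f}, p={p_value:.3g}, {verdict}")
```

A shift of 10 units with 2000 observations per group yields a very small p-value, so the shifted sample is flagged as different from control.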

In [6]:
result.sizes 

   control size  test size  control size %  test size % group
0          4934       5062              49           50     1
0          4934          4              99            0     2

In [7]:
result.difference

   control mean   test mean  difference  difference % group        field
0    484.908492  489.221059    4.312567      0.889357     1   pre_spends
0    484.908492  490.500000    5.591508      1.153106     1  post_spends
0    420.047066  483.471882   63.424816     15.099455     2   pre_spends
0    420.047066  449.666667   29.619601      7.051496     2  post_spends
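
The columns of this table appear to be related by simple arithmetic: the difference is the test mean minus the control mean, and the percentage is that difference relative to the control mean. The sketch below checks this against the first row of the output; `mean_difference` is a hypothetical helper name, not part of HypEx.

```python
def mean_difference(control_mean, test_mean):
    """Absolute and relative (percent of control) difference between two means."""
    difference = test_mean - control_mean
    difference_pct = difference / control_mean * 100
    return difference, difference_pct


# Values copied from the first row of the table above.
diff, pct = mean_difference(484.908492, 489.221059)
print(f"difference={diff:.4f}, difference %={pct:.4f}")  # roughly 4.3126 and 0.8894
```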

In [8]:
result.multitest

"There was less than three groups or multitest method wasn't provided"

## Additional tests in AB Test 

It is possible to add a u-test and a chi2-test to the pipeline.

In [9]:
test = ABTest(additional_tests=['t-test', 'u-test', 'chi2-test'])
result = test.execute(data)

In [10]:
result.resume

       feature group TTest pass  TTest p-value UTest pass  UTest p-value  \
0   pre_spends     0         OK   4.788336e-01         OK       0.713732   
1  post_spends     0     NOT OK   2.739781e-15         OK       0.125244   
2       gender     0        NaN            NaN        NaN            NaN   

  Chi2Test pass  Chi2Test p-value  
0           NaN               NaN  
1           NaN               NaN  
2            OK               1.0  
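
For a categorical target such as `gender`, a chi-squared test compares the observed category counts per group with the counts expected if group and category were independent. The following is a minimal sketch for a 2x2 table, with no continuity correction and with made-up counts; the function name `chi2_independence_2x2` is hypothetical and the HypEx implementation may differ.

```python
import math


def chi2_independence_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_sums = [a + b, c + d]
    col_sums = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows are groups (control, test), columns are categories.
balanced = [[50, 50], [50, 50]]
skewed = [[60, 40], [40, 60]]

print(chi2_independence_2x2(balanced))  # (0.0, 1.0)
stat, p_value = chi2_independence_2x2(skewed)
print(f"stat={stat:.1f}, p={p_value:.4f}")
```

When the categories are split identically across groups, the statistic is 0 and the p-value is 1.0, as in the `gender` row above.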

In [11]:
result.multitest

"There was less than three groups or multitest method wasn't provided"

In [12]:
result.difference

   control mean   test mean  difference  difference % group        field
0    484.908492  489.221059    4.312567      0.889357     1   pre_spends
0    484.908492  490.500000    5.591508      1.153106     1  post_spends
0    420.047066  483.471882   63.424816     15.099455     2   pre_spends
0    420.047066  449.666667   29.619601      7.051496     2  post_spends

In [13]:
result.sizes

   control size  test size  control size %  test size % group
0          4934       5062              49           50     1
0          4934          4              99            0     2

## ABn Test 

Finally, an A/B test with more than one test group can be evaluated with a multiple-testing correction, chosen through `multitest_method`.

In [14]:
test = ABTest(multitest_method="bonferroni")
result = test.execute(data)

In [15]:
result.resume

       feature group TTest pass  TTest p-value
0   pre_spends     0         OK   4.788336e-01
1  post_spends     0     NOT OK   2.739781e-15

In [16]:
result.sizes

   control size  test size  control size %  test size % group
0          4934       5062              49           50     1
0          4934          4              99            0     2

In [17]:
result.difference

   control mean   test mean  difference  difference % group        field
0    484.908492  489.221059    4.312567      0.889357     1   pre_spends
0    484.908492  490.500000    5.591508      1.153106     1  post_spends
0    420.047066  483.471882   63.424816     15.099455     2   pre_spends
0    420.047066  449.666667   29.619601      7.051496     2  post_spends

In [18]:
result.multitest

   correction        field   new p-value   old p-value  rejected   test group
0    0.250000   pre_spends  8.302799e-30  2.075700e-30      True  TTest     1
1    0.478834  post_spends  1.000000e+00  4.788336e-01     False  TTest     1
2    0.000000   pre_spends  0.000000e+00  0.000000e+00      True  TTest     2
3    0.250000  post_spends  1.095912e-14  2.739781e-15      True  TTest     2
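
The `new p-value` column is consistent with the usual Bonferroni rule: each p-value is multiplied by the number of comparisons (four here) and capped at 1. The sketch below reproduces the column from the `old p-value` column; the 0.05 significance level is an assumption, and `bonferroni` is a hypothetical helper rather than the HypEx implementation.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni-adjust p-values and flag which hypotheses are rejected."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    rejected = [p < alpha for p in adjusted]
    return adjusted, rejected


# Old p-values copied from the table above.
old_p_values = [2.075700e-30, 4.788336e-01, 0.0, 2.739781e-15]
adjusted, rejected = bonferroni(old_p_values)

for old, new, flag in zip(old_p_values, adjusted, rejected):
    print(f"old={old:.6e}  new={new:.6e}  rejected={flag}")
```

The rejected flags come out as True, False, True, True, matching the `rejected` column.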