# AA test tutorial 
An AA test is an important part of a randomized controlled experiment such as an AB test.

An AA test has three objectives: to verify that the applied partitioning method produces uniform (homogeneous) samples, to select the best partition among the available ones, and to verify that the statistical criteria used to check uniformity are applicable.

For example, one underlying hypothesis is that the features do not depend on each other. If this hypothesis does not hold, the AA test will fail.
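The idea of a single AA split can be sketched in plain Python. This is a minimal illustration, not HypEx code: the helper names `aa_split` and `mean_difference_p_value`, the synthetic metric, and the large-sample normal approximation used in place of a t-test are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist, mean, variance


def aa_split(values, seed):
    """Randomly split values into two equal-sized groups (control, test)."""
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_difference_p_value(a, b):
    """Two-sided p-value for equal means (large-sample normal approximation)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Synthetic metric: every value comes from the same distribution,
# so there is no real difference for the test to find.
data_rng = random.Random(0)
spends = [data_rng.gauss(487, 19) for _ in range(10_000)]

control, test = aa_split(spends, seed=42)
p_value = mean_difference_p_value(control, test)
print(f"control mean={mean(control):.2f}, test mean={mean(test):.2f}, p={p_value:.3f}")
```

Because both groups are drawn from the same population, a small p-value here would point to a problem with the split or with the criterion rather than to a real effect.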

<ul>
  <li><a href="#creation-of-a-new-test-dataset-with-synthetic-data">Creation of a new test dataset with synthetic data.</a></li>
  <li><a href="#one-split-of-aa-test">One split of AA test.</a></li>
  <li><a href="#aa-test">AA test.</a></li>
  <li><a href="#aa-test-with-stratification">AA test with stratification.</a></li>
</ul>

In [1]:
from hypex.dataset import Dataset, InfoRole, TreatmentRole, TargetRole, StratificationRole
from hypex import AATest



## Creation of a new test dataset with synthetic data. 

To work with our data in HypEx, we first need to convert it into a `Dataset`. It is important to mark the data fields by assigning the appropriate `roles`:
- FeatureRole: a role for columns that contain features or predictor variables. Our split will be based on them. Applied by default if no role is specified for the column.
- TreatmentRole: a role for columns that show the treatment or intervention.
- TargetRole: a role for columns that show the target or outcome variable.
- InfoRole: a role for columns that contain information about the data, such as user IDs.
- StratificationRole: a role for columns used to stratify the split.
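The cell below loads `data.csv`. A file with the same column names could be generated along the following lines. This is only a sketch: the column names are taken from the output shown below, while the distributions, the missing-value rate, the size of the treatment effect, and the helper name `make_synthetic_rows` are assumptions made for illustration.

```python
import csv
import os
import random
import tempfile

COLUMNS = ["user_id", "signup_month", "treat", "pre_spends",
           "post_spends", "age", "gender", "industry"]


def make_synthetic_rows(n_rows, seed=0, missing_rate=0.1):
    """Build rows of synthetic user data (all distributions are assumed)."""
    rng = random.Random(seed)
    rows = []
    for user_id in range(n_rows):
        treat = rng.randint(0, 1)
        # Assumption: untreated users have signup_month 0.
        signup_month = rng.randint(1, 11) if treat else 0
        pre_spends = round(rng.gauss(487, 19), 1)
        # Assumption: treated users spend more after the treatment.
        post_spends = round(rng.gauss(420, 10) + 50 * treat, 2)
        # An empty string is written to the CSV as a missing value.
        age = rng.randint(18, 69) if rng.random() > missing_rate else ""
        gender = rng.choice(["M", "F"]) if rng.random() > missing_rate else ""
        industry = rng.choice(["E-commerce", "Logistics"])
        values = [user_id, signup_month, treat, pre_spends,
                  post_spends, age, gender, industry]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


rows = make_synthetic_rows(1_000)

# Write to a temporary directory here; use any path you like in practice.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "synthetic_data.csv")
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    with open(path, newline="") as handle:
        n_written = sum(1 for _ in csv.DictReader(handle))

print(n_written)
```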

In [2]:
data = Dataset(
    roles={
        "user_id": InfoRole(int),  # user identifier
        "treat": TreatmentRole(int),
        "pre_spends": TargetRole(),
        "post_spends": TargetRole(),
        "gender": StratificationRole(str),  # used when stratification=True
    },
    data="data.csv",
)
data

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry  
0     E-commerce  
1     E-commerce  
2      Logistics  
3     E-com

## AA test
Next we run the experiment on the prepared dataset. Here we use one of the pre-assembled pipelines, `AATest`.
We can set the number of iterations with `n_iterations`. In this case the random state of each iteration is its index, as the `splitter_id` values `rs 0` to `rs 9` in `res.experiments` show.
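The loop over iterations can be pictured with the plain-Python sketch below. It is not the HypEx implementation: the helper names, the significance level `ALPHA`, the normal-approximation test, and the rule for picking the best split (largest p-value) are assumptions, and HypEx's own scoring and selection may differ.

```python
import math
import random
from statistics import NormalDist, mean, variance

ALPHA = 0.05  # assumed significance level


def split_p_value(values, random_state):
    """Split values in half using random_state and test for equal means."""
    rng = random.Random(random_state)
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    a, b = shuffled[:half], shuffled[half:]
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def run_aa(values, random_states):
    """Return a mapping of random state to p-value, one entry per split."""
    return {rs: split_p_value(values, rs) for rs in random_states}


data_rng = random.Random(0)
spends = [data_rng.gauss(487, 19) for _ in range(2_000)]

# With n_iterations=10 the random states are simply 0..9.
p_values = run_aa(spends, random_states=range(10))

# Share of splits where the test does not reject equality of means.
pass_share = sum(p > ALPHA for p in p_values.values()) / len(p_values)

# Assumed selection rule: the split whose groups look most alike.
best_state = max(p_values, key=p_values.get)
print(f"pass share={pass_share:.2f}, best random state={best_state}")
```

Passing an explicit list such as `[56, 72, 2, 43]` instead of `range(10)` runs the same loop over hand-picked random states, and rerunning with the same state reproduces the same split.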

In [3]:
aa = AATest(n_iterations=10)
res = aa.execute(data)

In [4]:
res.resume

  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature    group  
0  post_spends  control  
1   pre_spends  control  

In [5]:
res.aa_score

                             pass     score
pre_spends TTest control    False  0.653814
post_spends TTest control   False  0.664092
pre_spends KSTest control   False  0.639293
post_spends KSTest control  False  0.637995

In [6]:
res.best_split

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce  control  
1     E-commerce     test  
2  

In [7]:
res.best_split_statistic

       feature    group TTest pass  TTest p-value KSTest pass  KSTest p-value
0   pre_spends  control         OK       0.645746          OK        0.932542
1  post_spends  control         OK       0.357727          OK        0.577046

In [8]:
res.experiments

        splitter_id  pre_spends GroupDifference control mean control  \
0  AASplitter┴rs 0┴                                         487.3801   
1  AASplitter┴rs 1┴                                         487.3333   
2  AASplitter┴rs 2┴                                         487.0445   
3  AASplitter┴rs 3┴                                         486.6742   
4  AASplitter┴rs 4┴                                         487.1970   
5  AASplitter┴rs 5┴                                         486.8953   
6  AASplitter┴rs 6┴                                         487.3100   
7  AASplitter┴rs 7┴                                         487.1805   
8  AASplitter┴rs 8┴                                         487.3882   
9  AASplitter┴rs 9┴                                         487.0735   

   pre_spends GroupDifference test mean control  \
0                                      486.8074   
1                                      486.8542   
2                                      487.1430   
3  

## AA test with random states

We can also adjust some of the preset parameters of the experiment by passing them to `AATest`. For example, here we set the list of random states for which the AA test is run.

In [9]:
aa = AATest(random_states=[56, 72, 2, 43])
res = aa.execute(data)

In [10]:
res.resume

  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature    group  
0  post_spends  control  
1   pre_spends  control  

In [11]:
res.aa_score

                             pass     score
pre_spends TTest control    False  0.652500
post_spends TTest control   False  0.645896
pre_spends KSTest control   False  0.605599
post_spends KSTest control  False  0.533713

In [12]:
res.best_split

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce  control  
1     E-commerce     test  
2  

In [13]:
res.best_split_statistic

       feature    group TTest pass  TTest p-value KSTest pass  KSTest p-value
0   pre_spends  control         OK       0.519864          OK        0.694583
1  post_spends  control         OK       0.998020          OK        0.677788

In [14]:
res.experiments

         splitter_id  pre_spends GroupDifference control mean control  \
0  AASplitter┴rs 56┴                                         486.7993   
1  AASplitter┴rs 72┴                                         486.9723   
2   AASplitter┴rs 2┴                                         487.0445   
3  AASplitter┴rs 43┴                                         487.3606   

   pre_spends GroupDifference test mean control  \
0                                      487.3882   
1                                      487.2152   
2                                      487.1430   
3                                      486.8269   

   pre_spends GroupDifference difference control  \
0                                         0.5889   
1                                         0.2429   
2                                         0.0985   
3                                        -0.5337   

   pre_spends GroupDifference difference % control  \
0                                         0.120974   
1        

## AA test with stratification

Depending on your requirements, the split can be stratified. To do so, assign `StratificationRole` to the relevant columns in the `Dataset` and set `stratification=True`.
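To show what stratification means for the split, here is a minimal plain-Python sketch. It is not HypEx's `AASplitterWithStratification`: the function `stratified_split` and the toy rows are hypothetical. Each stratum is shuffled and halved separately, so both groups end up with nearly the same composition.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, strata_key, random_state):
    """Assign each row to 'control' or 'test', balancing within each stratum."""
    rng = random.Random(random_state)
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[strata_key]].append(row)

    assignment = {}
    # Sort strata by their string form so the result is reproducible.
    for stratum in sorted(by_stratum, key=str):
        members = by_stratum[stratum]
        rng.shuffle(members)
        half = len(members) // 2
        for position, row in enumerate(members):
            assignment[row["user_id"]] = "control" if position < half else "test"
    return assignment


# Toy data: only the columns needed for the split.
data_rng = random.Random(1)
rows = [{"user_id": i, "gender": data_rng.choice(["M", "F"])} for i in range(1_000)]

split = stratified_split(rows, strata_key="gender", random_state=56)
counts = Counter((row["gender"], split[row["user_id"]]) for row in rows)
for key in sorted(counts):
    print(key, counts[key])
```

Within each gender the two groups differ in size by at most one row, which a purely random split does not guarantee.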

In [15]:
aa = AATest(random_states=[56, 72, 2, 43], stratification=True)
res = aa.execute(data)

In [16]:
res.resume

  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature    group  
0  post_spends  control  
1   pre_spends  control  

In [17]:
res.aa_score

                             pass     score
pre_spends TTest control    False  0.388191
post_spends TTest control   False  0.774438
pre_spends KSTest control   False  0.728226
post_spends KSTest control  False  0.660354

In [18]:
res.best_split

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce  control  
1     E-commerce     test  
2  

In [19]:
res.best_split_statistic

       feature    group TTest pass  TTest p-value KSTest pass  KSTest p-value
0   pre_spends  control         OK       0.384576          OK        0.129368
1  post_spends  control         OK       0.801273          OK        0.952433

In [20]:
res.experiments

                           splitter_id  \
0  AASplitterWithStratification┴rs 56┴   
1  AASplitterWithStratification┴rs 72┴   
2   AASplitterWithStratification┴rs 2┴   
3  AASplitterWithStratification┴rs 43┴   

   pre_spends GroupDifference control mean control  \
0                                       487.110444   
1                                       486.922667   
2                                       487.264000   
3                                       487.075889   

   pre_spends GroupDifference test mean control  \
0                                    487.082000   
1                                    487.269778   
2                                    486.928444   
3                                    487.116556   

   pre_spends GroupDifference difference control  \
0                                      -0.028444   
1                                       0.347111   
2                                      -0.335556   
3                                       0.040667   

 

In [21]:
# Increase the number of iterations; each iteration uses its index as the random state
aa = AATest(n_iterations=20)
res = aa.execute(data)

In [22]:
res.resume

  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature    group  
0  post_spends  control  
1   pre_spends  control  

In [23]:
res.aa_score

                             pass     score
pre_spends TTest control    False  0.637037
post_spends TTest control   False  0.633799
pre_spends KSTest control   False  0.604478
post_spends KSTest control  False  0.537745

In [24]:
res.best_split

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce     test  
1     E-commerce     test  
2  

In [25]:
res.best_split_statistic

       feature    group TTest pass  TTest p-value KSTest pass  KSTest p-value
0   pre_spends  control         OK       0.914549          OK        0.577046
1  post_spends  control         OK       0.679226          OK        0.480675

In [26]:
res.experiments

          splitter_id  pre_spends GroupDifference control mean control  \
0    AASplitter┴rs 0┴                                         487.3801   
1    AASplitter┴rs 1┴                                         487.3333   
2    AASplitter┴rs 2┴                                         487.0445   
3    AASplitter┴rs 3┴                                         486.6742   
4    AASplitter┴rs 4┴                                         487.1970   
5    AASplitter┴rs 5┴                                         486.8953   
6    AASplitter┴rs 6┴                                         487.3100   
7    AASplitter┴rs 7┴                                         487.1805   
8    AASplitter┴rs 8┴                                         487.3882   
9    AASplitter┴rs 9┴                                         487.0735   
10  AASplitter┴rs 10┴                                         487.1884   
11  AASplitter┴rs 11┴                                         487.2131   
12  AASplitter┴rs 12┴                 

In [27]:
aa = AATest(n_iterations=20, sample_size=0.2)
res = aa.execute(data)

In [28]:
res.resume

  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature    group  
0  post_spends  control  
1   pre_spends  control  

In [29]:
res.aa_score

                             pass     score
pre_spends TTest control    False  0.526366
post_spends TTest control   False  0.531698
pre_spends KSTest control   False  0.454925
post_spends KSTest control  False  0.557932

In [30]:
res.best_split

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce     test  
1     E-commerce  control  
2  

In [31]:
res.best_split_statistic

       feature    group TTest pass  TTest p-value KSTest pass  KSTest p-value
0   pre_spends  control         OK       0.736843          OK        0.720616
1  post_spends  control         OK       0.936551          OK        0.709537

In [32]:
res.experiments

          splitter_id  pre_spends GroupDifference control mean control  \
0    AASplitter┴rs 0┴                                       487.143722   
1    AASplitter┴rs 1┴                                       487.111556   
2    AASplitter┴rs 2┴                                       487.007556   
3    AASplitter┴rs 3┴                                       487.097944   
4    AASplitter┴rs 4┴                                       487.096944   
5    AASplitter┴rs 5┴                                       487.112333   
6    AASplitter┴rs 6┴                                       487.247500   
7    AASplitter┴rs 7┴                                       487.071000   
8    AASplitter┴rs 8┴                                       487.111333   
9    AASplitter┴rs 9┴                                       487.197778   
10  AASplitter┴rs 10┴                                       487.132167   
11  AASplitter┴rs 11┴                                       487.115000   
12  AASplitter┴rs 12┴                 