# AA test tutorial

An AA test is an important part of a randomized controlled experiment, such as an AB test.

The objectives of the AA test are to verify that the applied partitioning method produces uniform samples, to select the best partition from the available ones, and to verify that the statistical criteria used to check uniformity are applicable.

For example, one such assumption is that the features do not depend on each other. If this assumption does not hold, the AA test will fail.

<ul>
  <li><a href="#creation-of-a-new-test-dataset-with-synthetic-data">Creation of a new test dataset with synthetic data.</a></li>
  <li><a href="#one-split-of-aa-test">One split of AA test.</a></li>
  <li><a href="#aa-test">AA test.</a></li>
  <li><a href="#aa-test-with-stratification">AA test with stratification.</a></li>
</ul>

In [1]:
from hypex.dataset import Dataset, ExperimentData, InfoRole, TreatmentRole, TargetRole
from hypex.experiments.aa import ONE_AA_TEST, AA_TEST, AA_TEST_WITH_STRATIFICATION
from hypex.reporters.aa import AADatasetReporter, AAPassedReporter, AABestSplitReporter
from hypex.splitters import AASplitter


## Creation of a new test dataset with synthetic data. 

To work with our data in HypEx, we first need to convert it into a `Dataset`. It is important to mark the data fields by assigning them the appropriate `roles`:
- FeatureRole: a role for columns that contain features or predictor variables. Our split will be based on them. Applied by default if the role is not specified for the column.
- TreatmentRole: a role for columns that show the treatment or intervention.
- TargetRole: a role for columns that show the target or outcome variable.
- InfoRole: a role for columns that contain information about the data, such as user IDs. 

In [2]:
data = Dataset(
    roles={
        "user_id": InfoRole(float),
        "treat": TreatmentRole(int),
        "pre_spends": TargetRole(),
        "post_spends": TargetRole(),
    },
    data="data.csv",
)
data

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0         0.0             0      0       488.0   414.444444   NaN      M   
1         1.0             8      1       512.5   462.222222  26.0    NaN   
2         2.0             7      1       483.0   479.444444  25.0      M   
3         3.0             0      0       501.5   424.333333  39.0      M   
4         4.0             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995   9995.0            10      1       538.5   450.444444  42.0      M   
9996   9996.0             0      0       500.5   430.888889  26.0      F   
9997   9997.0             3      1       473.0   534.111111  22.0      F   
9998   9998.0             2      1       495.0   523.222222  67.0      F   
9999   9999.0             7      1       508.0   475.888889  38.0      F   

        industry  
0     E-commerce  
1     E-commerce  
2      Logistics  
3     E-com
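
In the call above only four columns receive an explicit role, so the remaining ones fall back to the default feature role. The fallback rule can be pictured with plain Python. This is an illustration only: the string labels and the `resolve_roles` helper are hypothetical stand-ins and are not part of the HypEx API.

```python
# Hypothetical illustration: plain strings stand in for HypEx role objects.
columns = ["user_id", "signup_month", "treat", "pre_spends",
           "post_spends", "age", "gender", "industry"]
explicit_roles = {
    "user_id": "info",
    "treat": "treatment",
    "pre_spends": "target",
    "post_spends": "target",
}


def resolve_roles(columns, explicit_roles, default="feature"):
    """Return a role for every column, using the default where none is given."""
    return {name: explicit_roles.get(name, default) for name in columns}


roles = resolve_roles(columns, explicit_roles)
feature_columns = [name for name, role in roles.items() if role == "feature"]
print(feature_columns)  # ['signup_month', 'age', 'gender', 'industry']
```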

## One split of AA test
Before execution, we wrap the prepared dataset in `ExperimentData` so that experiments can be run on it.
Then we execute a pipeline. In this case we select one of the pre-assembled pipelines, `ONE_AA_TEST`. A custom pipeline with custom executors can also be created to match your specific needs and requirements.

In [5]:
ed = ExperimentData(data)
result = ONE_AA_TEST.execute(ed)
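
Conceptually, a single AA split shuffles the rows, cuts them into a control and a test group, and checks that a metric does not differ between the two. The sketch below shows that idea on synthetic data using only the standard library. It is not the HypEx implementation: the helper names, the 0.05 threshold, and the normal approximation used in place of an exact t-test are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p_value(a, b):
    """Two-sided p-value for equal means (normal approximation to Welch's t-test)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aa_split(values, seed, control_size=0.5):
    """Shuffle the values and cut them into control and test groups."""
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * control_size)
    return shuffled[:cut], shuffled[cut:]


# Synthetic metric: every user is drawn from the same distribution (no treatment).
data_rng = random.Random(0)
spends = [data_rng.gauss(500, 20) for _ in range(10_000)]

control, test = aa_split(spends, seed=1)
p_value = two_sample_p_value(control, test)
print(f"control={len(control)} test={len(test)} p-value={p_value:.3f}")
print("OK" if p_value >= 0.05 else "NOT OK")  # assumed significance level of 0.05
```

Because both groups come from the same population, a significant difference here would point to a problem with the split or the criterion rather than to a real effect.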

**Analysis tables**    
We can access the results of the experiment directly through the `analysis_tables` property of `ExperimentData`. It holds the outcome of each executor's job: each key is an executor state id and each value is the result of that executor's job. This can be useful for debugging or interpretation.

In [6]:
result

{'AASplitter┴┴': {'control':       user_id  signup_month  treat  pre_spends  post_spends   age gender  \
  0         0.0             0      0       488.0   414.444444   NaN      M   
  1         1.0             8      1       512.5   462.222222  26.0    NaN   
  2         2.0             7      1       483.0   479.444444  25.0      M   
  5         5.0             6      1       486.5   486.555556  44.0      M   
  7         7.0            11      1       496.0   432.888889  57.0      M   
  ...       ...           ...    ...         ...          ...   ...    ...   
  9985   9985.0             0      0       484.0   411.333333  52.0      M   
  9987   9987.0             0      0       467.0   431.555556  62.0      M   
  9989   9989.0             6      1       466.5   487.444444  19.0      F   
  9990   9990.0             0      0       490.0   426.000000   NaN      M   
  9994   9994.0             0      0       486.0   423.777778  69.0      F   
  
          industry  
  0     E-com

### Experiment results
To show the report with the summary of the test, we run the `report` method of the reporter associated with the respective test type, the AA test in our case.

It displays the results of the test in the form of a table with the following columns:
- `feature`: name of the target feature, change of which we want to analyze.
- `group`: name of the test group we compare with the control group.
- `TTest pass`: whether the TTest passed, that is, found no significant difference between the groups.
- `TTest p-value`: p-value of the TTest, the probability of obtaining a difference at least as large as the observed one when the null hypothesis is true. The lower the value, the more significant the difference.
- `KSTest pass`: whether the KSTest passed, that is, found no significant difference between the distributions of the groups.
- `KSTest p-value`: p-value of the KSTest, the probability of obtaining a difference at least as large as the observed one when the null hypothesis is true. The lower the value, the more significant the difference.

In [5]:
AADatasetReporter().report(result)

       feature group TTest pass  TTest p-value KSTest pass  KSTest p-value
0   pre_spends     0         OK       0.564800          OK        0.877289
1  post_spends     0         OK       0.404321          OK        0.544187
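
To make the KSTest columns more concrete, the sketch below computes a two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) and an asymptotic p-value from the Kolmogorov series. It is a standard-library illustration under stated assumptions: the function names are made up for this example, no small-sample correction is applied, and HypEx may compute its p-values differently.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest gap between the two empirical distribution functions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_p_value(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:  # the series converges poorly here; the p-value is effectively 1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))


rng = random.Random(0)
base = [rng.gauss(500, 20) for _ in range(2000)]
same = [rng.gauss(500, 20) for _ in range(2000)]
shifted = [rng.gauss(510, 20) for _ in range(2000)]

for label, other in [("same distribution", same), ("shifted by 10", shifted)]:
    d = ks_statistic(base, other)
    p = ks_p_value(d, len(base), len(other))
    print(f"{label}: D={d:.3f} p-value={p:.4f}")
```

Unlike a comparison of means, this statistic reacts to any difference in the shape of the two distributions, which is why the report lists both tests.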

## AA test
Next we execute another pre-assembled pipeline, `AA_TEST`.
We can also adjust some of the preset parameters of the experiment by assigning them to the respective params of the experiment. For example, here we set the range of random states for which we want to run our AA test.
Then we run the experiment on our prepared dataset, wrapped in `ExperimentData`.

In [3]:
aa = AA_TEST
aa.executors[0].params[AASplitter] = {"random_states": range(10)}
res = aa.execute(ExperimentData(data))
res

<hypex.dataset.dataset.ExperimentData at 0x1bd8e563340>
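
Running the split for several random states, as in the cell above, amounts to repeating the single-split check once per seed and counting how many splits pass. The following standard-library sketch shows that loop on synthetic data. The helper name, the 0.05 threshold, and the normal approximation to the t-test are assumptions for illustration and do not reflect how HypEx aggregates its results.

```python
import random
from statistics import NormalDist, mean, variance

ALPHA = 0.05  # assumed significance level


def p_value_for_seed(values, seed):
    """Split the values in half with the given seed and compare the group means."""
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    a, b = shuffled[:half], shuffled[half:]
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


data_rng = random.Random(42)
spends = [data_rng.gauss(500, 20) for _ in range(2_000)]

# One p-value per random state, mirroring random_states=range(10).
p_values = {seed: p_value_for_seed(spends, seed) for seed in range(10)}
passed = sum(p >= ALPHA for p in p_values.values())
print(f"{passed} of {len(p_values)} splits passed at alpha={ALPHA}")
```

When the splitter and the criterion are both valid, roughly a share of `1 - ALPHA` of the splits is expected to pass over many seeds. A much lower share suggests that one of the assumptions behind the test does not hold.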

### Experiment results
To show the report with a summary of the test, we run the `report` method of the reporter associated with the respective test type, the AA test in our case.
`AAPassedReporter` shows the outcome of the tests (OK / NOT OK) across the different random states.

In [4]:
res.groups

{}

In [7]:
AAPassedReporter().report(res)


  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature group  
0  post_spends     0  
1   pre_spends     0  

`AABestSplitReporter` returns the dataset with the best split among those covered by the AA test.

In [8]:
AABestSplitReporter().report(res)

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0         0.0             0      0       488.0   414.444444   NaN      M   
1         1.0             8      1       512.5   462.222222  26.0    NaN   
2         2.0             7      1       483.0   479.444444  25.0      M   
3         3.0             0      0       501.5   424.333333  39.0      M   
4         4.0             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995   9995.0            10      1       538.5   450.444444  42.0      M   
9996   9996.0             0      0       500.5   430.888889  26.0      F   
9997   9997.0             3      1       473.0   534.111111  22.0      F   
9998   9998.0             2      1       495.0   523.222222  67.0      F   
9999   9999.0             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce  control  
1     E-commerce  control  
2  
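
One possible way to rank candidate splits is to score each random state by its weakest p-value across the target columns and keep the state with the highest score. The sketch below applies that rule to synthetic data. The scoring rule, the helper names, and the normal approximation to the t-test are assumptions for illustration, and the criterion HypEx uses to pick its best split may differ.

```python
import random
from statistics import NormalDist, mean, variance


def p_value(a, b):
    """Two-sided p-value for equal means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def split_indices(n, seed):
    """Return shuffled row indices cut into control and test halves."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[: n // 2], idx[n // 2:]


# Synthetic stand-ins for the two target columns.
data_rng = random.Random(7)
n_rows = 2_000
targets = {
    "pre_spends": [data_rng.gauss(500, 20) for _ in range(n_rows)],
    "post_spends": [data_rng.gauss(450, 30) for _ in range(n_rows)],
}


def score(seed):
    """Weakest p-value over all targets for the split produced by this seed."""
    control, test = split_indices(n_rows, seed)
    return min(
        p_value([column[i] for i in control], [column[i] for i in test])
        for column in targets.values()
    )


scores = {seed: score(seed) for seed in range(10)}
best_seed = max(scores, key=scores.get)
print(f"best random state: {best_seed}, weakest p-value: {scores[best_seed]:.3f}")
```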

## AA test with stratification

We then repeat the same steps for the AA test with stratification, additionally setting the share of the control group in the split.

In [9]:
aa = AA_TEST_WITH_STRATIFICATION
aa.executors[0].params[AASplitter] = {"random_states": range(10), "control_size": [0.3]}
res = aa.execute(ExperimentData(data))
res

<hypex.dataset.dataset.ExperimentData at 0x7f212d8a2bc0>
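
A stratified split with a control share of 0.3 can be pictured as splitting each stratum separately, so that every stratum contributes about 30% of its rows to the control group. The sketch below does this with the standard library on synthetic rows. The `stratified_split` helper and the choice of `gender` as the stratification column are assumptions for illustration and say nothing about which columns HypEx stratifies on.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, strata_key, control_size, seed):
    """Split each stratum separately so both groups keep the same proportions."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[strata_key]].append(row)

    control, test = [], []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        cut = round(len(members) * control_size)
        control.extend(members[:cut])
        test.extend(members[cut:])
    return control, test


data_rng = random.Random(3)
rows = [{"user_id": i, "gender": data_rng.choice(["F", "M"])} for i in range(1000)]

control, test = stratified_split(rows, "gender", control_size=0.3, seed=0)
totals = Counter(row["gender"] for row in rows)
in_control = Counter(row["gender"] for row in control)
for gender in sorted(totals):
    share = in_control[gender] / totals[gender]
    print(f"{gender}: {in_control[gender]} of {totals[gender]} in control ({share:.3f})")
```

A purely random split reaches these proportions only on average, while splitting within strata fixes them in every split.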

In [10]:
AAPassedReporter().report(res)


  TTest aa test KSTest aa test TTest best split KSTest best split  result  \
0        NOT OK         NOT OK               OK                OK  NOT OK   
1        NOT OK         NOT OK               OK                OK  NOT OK   

       feature group  
0  post_spends     0  
1   pre_spends     0  

In [11]:
AABestSplitReporter().report(res)

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0         0.0             0      0       488.0   414.444444   NaN      M   
1         1.0             8      1       512.5   462.222222  26.0    NaN   
2         2.0             7      1       483.0   479.444444  25.0      M   
3         3.0             0      0       501.5   424.333333  39.0      M   
4         4.0             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995   9995.0            10      1       538.5   450.444444  42.0      M   
9996   9996.0             0      0       500.5   430.888889  26.0      F   
9997   9997.0             3      1       473.0   534.111111  22.0      F   
9998   9998.0             2      1       495.0   523.222222  67.0      F   
9999   9999.0             7      1       508.0   475.888889  38.0      F   

        industry    split  
0     E-commerce     test  
1     E-commerce     test  
2  