# AA test tutorial 
An AA test is an important part of a randomized controlled experiment such as an AB test.

An AA test has three objectives: to verify that the chosen partitioning method produces homogeneous samples, to select the best of the available partitions, and to verify that the statistical criteria used to check homogeneity are applicable.

For example, the test relies on the hypothesis that the features do not depend on each other. If this hypothesis does not hold, the AA test will fail.
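The idea behind these objectives can be sketched without HypEx. The example below is a minimal illustration using only the Python standard library: it repeatedly splits one synthetic metric into two random halves and counts how often a two-sided z-test reports a difference. Because no treatment is applied, the rejection rate should stay close to the significance level. The helper names `z_test_pvalue` and `aa_false_positive_rate` are hypothetical, and the z-test is a stand-in for whatever criteria the library applies.

```python
import random
from statistics import NormalDist, mean, variance


def z_test_pvalue(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aa_false_positive_rate(values, n_iterations=200, alpha=0.05):
    """Share of random 50/50 splits in which the test rejects equality."""
    rejections = 0
    for seed in range(n_iterations):
        shuffled = values[:]
        random.Random(seed).shuffle(shuffled)  # one reproducible split per seed
        half = len(shuffled) // 2
        if z_test_pvalue(shuffled[:half], shuffled[half:]) < alpha:
            rejections += 1
    return rejections / n_iterations


# Synthetic metric: no treatment is applied, so both halves share one distribution.
rng = random.Random(0)
spends = [rng.gauss(500, 30) for _ in range(1000)]
rate = aa_false_positive_rate(spends)
print(f"false positive rate: {rate:.3f}")
```

A rate far above `alpha` would suggest that either the split or the chosen criterion is unsuitable for the data.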

<ul>
  <li><a href="#creation-of-a-new-test-dataset-with-synthetic-data">Creation of a new test dataset with synthetic data.</a></li>
  <li><a href="#one-split-of-aa-test">One split of AA test.</a></li>
  <li><a href="#aa-test">AA test.</a></li>
  <li><a href="#aa-test-with-stratification">AA test with stratification.</a></li>
</ul>

In [1]:
from hypex.dataset import Dataset, InfoRole, TreatmentRole, TargetRole, StratificationRole
from hypex.experiments.aa import AATest

## Creation of a new test dataset with synthetic data. 

It is important to mark the data fields by assigning the appropriate roles:
- FeatureRole: a role for columns that contain features or predictor variables. Our split will be based on them. Applied by default if the role is not specified for the column.
- TreatmentRole: a role for columns that show the treatment or intervention.
- TargetRole: a role for columns that show the target or outcome variable.
- InfoRole: a role for columns that contain information about the data, such as user IDs.
- StratificationRole: a role for columns used to stratify the split.

In [2]:
data = Dataset(
    roles={
        "user_id": InfoRole(int),
        "treat": TreatmentRole(int),
        "pre_spends": TargetRole(),
        "post_spends": TargetRole(),
        "gender": StratificationRole(str),
    },
    data="data.csv",
)
data

      user_id  signup_month  treat  pre_spends  post_spends   age gender  \
0           0             0      0       488.0   414.444444   NaN      M   
1           1             8      1       512.5   462.222222  26.0    NaN   
2           2             7      1       483.0   479.444444  25.0      M   
3           3             0      0       501.5   424.333333  39.0      M   
4           4             1      1       543.0   514.555556  18.0      F   
...       ...           ...    ...         ...          ...   ...    ...   
9995     9995            10      1       538.5   450.444444  42.0      M   
9996     9996             0      0       500.5   430.888889  26.0      F   
9997     9997             3      1       473.0   534.111111  22.0      F   
9998     9998             2      1       495.0   523.222222  67.0      F   
9999     9999             7      1       508.0   475.888889  38.0      F   

        industry  
0     E-commerce  
1     E-commerce  
2      Logistics  
3     E-com

## AA test
We can set the number of iterations for a simple run. In this case, the random state of each split is the iteration number.

In [3]:
aa = AATest(n_iterations=10)
res = aa.execute(data)

TypeError: 'builtin_function_or_method' object is not iterable

In [None]:
res.resume

In [None]:
res.aa_score

In [None]:
res.best_split

In [None]:
res.best_split_statistic

In [None]:
res.experiments

## AA test with random states

We can also set the random state for each run explicitly. In this case, the number of runs equals the number of random states.
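To illustrate why explicit random states are useful, the sketch below builds one split per seed with the standard library and shows that a seed always reproduces the same split. It does not use HypEx. The helpers `split_by_seed` and `mean_gap` are hypothetical, and choosing the seed with the smallest gap in means is an assumed selection rule for illustration, not necessarily the one the library uses for its best split.

```python
import random
from statistics import mean


def split_by_seed(values, seed):
    """Shuffle a copy of `values` with the given seed and cut it in half."""
    shuffled = values[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_gap(split):
    """Absolute difference between the two group means."""
    control, test = split
    return abs(mean(control) - mean(test))


rng = random.Random(0)
spends = [rng.gauss(500, 30) for _ in range(1000)]

random_states = [56, 72, 2, 43]
splits = {seed: split_by_seed(spends, seed) for seed in random_states}

# The same seed always reproduces the same split.
assert split_by_seed(spends, 56) == splits[56]

# Assumed selection rule: keep the seed with the smallest gap in means.
best_seed = min(random_states, key=lambda seed: mean_gap(splits[seed]))
print(best_seed, round(mean_gap(splits[best_seed]), 3))
```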

In [None]:
aa = AATest(random_states=[56, 72, 2, 43])
res = aa.execute(data)

In [None]:
res.resume

In [None]:
res.aa_score

In [None]:
res.best_split

In [None]:
res.best_split_statistic

In [None]:
res.experiments

## AA test with stratification

Depending on your needs, you can stratify the data. To do so, set `stratification=True` and assign a StratificationRole to a column in the Dataset.
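As a rough picture of what stratification achieves, the sketch below splits synthetic rows in half inside each stratum, so both groups keep the same share of every stratum value. It is a standard-library illustration under simple assumptions (a single categorical column, a 50/50 split, no missing values), and `stratified_split` and `share_of_f` are hypothetical helpers rather than part of HypEx.

```python
import random
from collections import defaultdict


def stratified_split(rows, stratum_key, seed):
    """Split rows 50/50 inside each stratum so both groups keep its share."""
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[stratum_key]].append(row)

    rng = random.Random(seed)
    control, test = [], []
    for stratum in sorted(by_stratum, key=str):  # fixed order keeps runs reproducible
        members = by_stratum[stratum]
        rng.shuffle(members)
        half = len(members) // 2
        control.extend(members[:half])
        test.extend(members[half:])
    return control, test


def share_of_f(group):
    """Fraction of rows in the group whose gender is "F"."""
    return sum(row["gender"] == "F" for row in group) / len(group)


# Synthetic rows: every fourth user is "F", the rest are "M".
rows = [{"user_id": i, "gender": "F" if i % 4 == 0 else "M"} for i in range(1000)]
control, test = stratified_split(rows, "gender", seed=56)

print(share_of_f(control), share_of_f(test))  # prints 0.25 0.25
```

With a plain random split the two shares would only match approximately; splitting within each stratum makes them match exactly whenever the stratum sizes are even.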

In [None]:
aa = AATest(random_states=[56, 72, 2, 43], stratification=True)
res = aa.execute(data)

In [None]:
res.resume

In [None]:
res.aa_score

In [None]:
res.best_split

In [None]:
res.best_split_statistic

In [None]:
res.experiments