# User guide "How to use the auto_ab library"

In [None]:
import sys, yaml, os, json
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

sys.path.append('../')
from auto_ab import ABTest, Splitter, VarianceReduction, Graphics

## Loading config file

The config file is in *yaml* format and is located in the root of the library.
Later in this guide, the loaded config is available via the *config* variable.
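For orientation, the keys that the cells in this guide read from *config* can be written out as a Python dictionary. The key names are taken from the code in this guide; every value is a placeholder assumption rather than a library default, and `missing_keys` is a hypothetical helper, not part of auto_ab.

```python
# Keys read from `config` in this guide. All values are illustrative
# assumptions, not defaults shipped with auto_ab.
example_config = {
    'splitter': {'split_rate': 0.5},
    'hypothesis': {'alpha': 0.05, 'beta': 0.2, 'alternative': 'two-sided',
                   'strategy': 'delta_method',  # the only strategy named in this guide
                   'n_buckets': 50},
    'data': {'id_col': 'id', 'target': 'height_prev',
             'numerator': 'clicks_prev', 'denominator': 'sessions_prev'},
    'metric': {'metric_type': 'solid'},
    'simulation': {'n_iter': 100, 'split_rates': [0.1, 0.3, 0.5],
                   'increment': {'vars': [1, 2, 3, 4, 5], 'extra_params': []}},
    'result': {'to_csv': True, 'csv_path': 'mde_result.csv'},
}

REQUIRED_KEYS = [
    'splitter.split_rate',
    'hypothesis.alpha', 'hypothesis.beta', 'hypothesis.alternative',
    'hypothesis.strategy', 'hypothesis.n_buckets',
    'data.id_col', 'data.target', 'data.numerator', 'data.denominator',
    'metric.metric_type',
    'simulation.n_iter', 'simulation.split_rates',
    'simulation.increment.vars', 'simulation.increment.extra_params',
    'result.to_csv', 'result.csv_path',
]


def missing_keys(config: dict, required=REQUIRED_KEYS) -> list:
    """Return the dotted paths from `required` that are absent in `config`."""
    missing = []
    for path in required:
        node = config
        for part in path.split('.'):
            if not isinstance(node, dict) or part not in node:
                missing.append(path)
                break
            node = node[part]
    return missing


print(missing_keys(example_config))  # []
```

Running such a check right after loading the file makes a misspelled or missing key fail early instead of in the middle of a simulation.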

In [None]:
project_dir = os.path.dirname(os.path.abspath(''))
config_file = os.path.join(project_dir, 'config.yaml')
try:
    with open(config_file, 'r') as file:
        config = yaml.safe_load(file)
except yaml.YAMLError as exc:
    print(exc)
    sys.exit(1)
except Exception as e:
    print(f'Error reading the config file: {e}')
    sys.exit(1)
    

gf = Graphics()

## Loading dataset

- **sex, married, country** — features independent of the experiment
- **weight_now, noise_now** — features during experiment
- **height_now** — target during experiment if continuous metric
- **clicks_now, sessions_now** — numerator and denominator of ratio metric during experiment

- **weight_prev, noise_prev** — features before experiment
- **height_prev** — target before experiment if continuous metric
- **clicks_prev, sessions_prev** — numerator and denominator of ratio metric before experiment

In [None]:
data = pd.read_csv(os.path.join(project_dir, 'data/internal/guide/data.csv'))
data.head(10)

# Experiment design

An experiment consists of 5 stages:
1. **Preparation for the experiment** — calculate the best MDE
2. **Experiment** — log data during the experiment (this stage is not yet implemented in the library)
3. **A/A test** — A/A test, or splitter test
4. **Variance reduction** — reduce the variance of the metrics collected during the experiment
5. **A/B test analysis** — the actual A/B test using a chosen statistical technique

# 1. Continuous metric
# 1.1. Preparation for the experiment
## Loading dataset

- **sex, married, country, weight_prev, noise_prev** — features
- **height_prev** — target

In [None]:
data_1 = data[['id', 'sex', 'married', 'country', 'weight_prev', 'noise_prev', 'height_prev']]
data_1.head()

## Initialization of splitter

If you are going to run an MDE simulation, the **split_rate** parameter can be omitted, as it will be set on the splitter during the simulation.

In [None]:
splitter = Splitter(split_rate=config['splitter']['split_rate'])

### Custom splitter

A custom splitter can be passed as the second parameter.
It must add a 'group' column with two possible values: 'A' or 'B'.
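A custom splitter that satisfies this contract might look like the sketch below. The signature mirrors the stub used in this guide; treating `split_rate` as the share of rows assigned to group 'A' and the fixed random seed are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def my_splitter(X: pd.DataFrame = None,
                target: str = None,
                split_rate: float = None) -> pd.DataFrame:
    """Randomly assign each row to group 'A' or 'B'.

    Assumption: split_rate is the share of rows sent to group 'A'.
    `target` is accepted only to match the expected signature.
    """
    rng = np.random.default_rng(42)          # fixed seed for reproducibility
    out = X.copy()                           # do not modify the caller's frame
    n_a = int(round(len(out) * split_rate))  # size of group 'A'
    labels = np.array(['A'] * n_a + ['B'] * (len(out) - n_a))
    out['group'] = rng.permutation(labels)   # shuffle labels across rows
    return out


demo = pd.DataFrame({'id': range(10),
                     'height_prev': np.linspace(160, 190, 10)})
print(my_splitter(demo, target='height_prev', split_rate=0.5)['group'].value_counts())
```

Shuffling a fixed list of labels, rather than drawing each label independently, keeps the group sizes exact for a given split rate.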

In [None]:
# def my_splitter(X: pd.DataFrame = None,
#                 target: str = None,
#                 split_rate: float = None) -> pd.DataFrame:
#   # some splitter logic here
#     pass

# splitter = Splitter(split_rate=config['splitter']['split_rate'],
#                    custom_splitter=my_splitter)

## Initialization of A/B test

Here
- **alpha** — significance level
- **beta** — probability of type II error
- **alternative** — 'less', 'greater', 'two-sided'
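To see how **alpha**, **beta** and **alternative** interact, the textbook normal-approximation formula for the per-group sample size of a two-sample comparison of means can be sketched as follows. This is a generic illustration assuming equal group sizes and a known standard deviation; `sample_size_per_group` is a hypothetical helper, not an auto_ab function.

```python
import math
from statistics import NormalDist


def sample_size_per_group(alpha: float, beta: float, sigma: float, mde: float,
                          alternative: str = 'two-sided') -> int:
    """Per-group n needed to detect a shift `mde` with power 1 - beta."""
    z = NormalDist()
    # A two-sided test splits alpha between both tails.
    if alternative == 'two-sided':
        z_alpha = z.inv_cdf(1 - alpha / 2)
    else:
        z_alpha = z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(1 - beta)
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / mde ** 2
    return math.ceil(n)


print(sample_size_per_group(alpha=0.05, beta=0.2, sigma=10, mde=5))  # 63
```

Lowering either alpha or beta increases the required sample size, and a one-sided alternative needs fewer observations than a two-sided one for the same alpha.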

In [None]:
ab = ABTest(alpha=config['hypothesis']['alpha'], 
            beta=config['hypothesis']['beta'],
            alternative=config['hypothesis']['alternative'])

### Set loaded dataset as analyzed

Here
- **id_col** — id column of a dataset
- **target** — name of the target (metric) column

In [None]:
ab.use_dataset(data_1, id_col=config['data']['id_col'],
              target=config['data']['target'])

### Set previously defined splitter for test

Assign defined splitter to the test.

In [None]:
ab.splitter = splitter

### Set list of split rates for MDE exploration

Set a list of split rates between control/treatment you are going to test.

In [None]:
ab.split_rates = config['simulation']['split_rates']

### Set list of increments for MDE exploration

Here
- **inc_var** — list of increments, e.g. [1, 2, 3, 4, 5]
- **extra_params** — extra parameters for increment, currently not used in analysis

In [None]:
ab.set_increment(inc_var=config['simulation']['increment']['vars'],
                extra_params=config['simulation']['increment']['extra_params'])

### Create metric which you want to compare

In the example below, we want to compare 10th percentile of control and treatment distributions.
Metric function must return a single value over a set of numbers.

In [None]:
def metric(X: np.ndarray) -> float:
    # 10th percentile, as described above
    return np.quantile(X, 0.1)

### MDE simulation to find the best combination of split rate and increment

Here
- **n_iter** — number of iterations of simulation
- **n_boot_samples** — set if you chose bootstrap hypothesis testing
- **metric_type** — metric type: ratio or solid (continuous)
- **metric** — Python function that computes the tested metric (quantile, median, mean, etc.) or a custom one
- **strategy** — strategy of hypothesis testing
- **strata** — strata column name for variance reduction
- **strata_weights** — weights of each unique value in strata column as a dictionary
- **to_csv** — whether or not to save the result to csv file
- **csv_path** — path to the newly created csv file

In [None]:
res = ab.mde_simulation(n_iter=config['simulation']['n_iter'],
                       metric_type=config['metric']['metric_type'],
                       metric=metric,
                       strategy=config['hypothesis']['strategy'],
                       to_csv=config['result']['to_csv'],
                       csv_path=config['result']['csv_path'])

### Print simulation log

Here
- **first key** — split rate
- **second key** — increment
- **value** — share of rejected H0

In [None]:
print(json.dumps(res, indent=4))
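The shape of this log can be reproduced with a small stand-alone simulation. The sketch below is not the library's implementation: it assumes that the split rate is the share of the control group, that the increment is added to every treatment value, and that the hypothesis is checked with a two-sided z-test on means. `rejects_h0` and `simulate_mde` are hypothetical names.

```python
import numpy as np
from statistics import NormalDist


def rejects_h0(control, treatment, alpha):
    """Two-sided z-test on the difference of means (normal approximation)."""
    se = np.sqrt(control.var(ddof=1) / len(control)
                 + treatment.var(ddof=1) / len(treatment))
    z = (treatment.mean() - control.mean()) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def simulate_mde(values, split_rates, increments, n_iter=200, alpha=0.05, seed=0):
    """Return {split_rate: {increment: share of iterations with H0 rejected}}."""
    rng = np.random.default_rng(seed)
    log = {}
    for rate in split_rates:
        log[rate] = {}
        n_control = int(len(values) * rate)  # assumption: rate = control share
        for inc in increments:
            rejected = 0
            for _ in range(n_iter):
                shuffled = rng.permutation(values)
                control = shuffled[:n_control]
                treatment = shuffled[n_control:] + inc  # artificial effect
                rejected += rejects_h0(control, treatment, alpha)
            log[rate][inc] = rejected / n_iter
    return log


heights = np.random.default_rng(1).normal(175, 10, size=1000)
log = simulate_mde(heights, split_rates=[0.3, 0.5], increments=[1, 3])
print(log)
```

Each value is an empirical power estimate, so the smallest increment whose share reaches 1 - beta for a given split rate is a reasonable MDE candidate.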

### Visualize simulation log in plot

In [None]:
gf.plot_simulation_log(config['result']['csv_path'])

# 1.2. Experiment

During this step, the dataset of outcomes is gathered; once the experiment ends, it is ready for analysis.

# 1.3. A/A test

Strictly speaking, an A/A test must be run before or in parallel with the A/B test. Here, let's assume that the A/B test has already finished and we now need to make sure that the splitter works correctly.

In [None]:
data_aa = data[['height_now']]
data_aa.head()

In [None]:
splitter = Splitter(split_rate=0.5)
res = splitter.aa_test(X=data_aa, target='height_now', alpha=0.05, n_iter=100)
print(f'Share of iterations when control and treatment groups are equal: {res}')
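The idea behind such a check can be sketched without the library: split the same data at random many times, test the two halves against each other, and count how often H0 is not rejected. The function name, the z-test on means and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def aa_share_equal(values, split_rate=0.5, alpha=0.05, n_iter=500, seed=0):
    """Share of random splits in which the two groups are not significantly different."""
    rng = np.random.default_rng(seed)
    n_a = int(len(values) * split_rate)
    not_rejected = 0
    for _ in range(n_iter):
        shuffled = rng.permutation(values)
        a, b = shuffled[:n_a], shuffled[n_a:]
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        p_value = 2 * (1 - NormalDist().cdf(abs(a.mean() - b.mean()) / se))
        not_rejected += p_value >= alpha
    return not_rejected / n_iter


heights = np.random.default_rng(1).normal(175, 10, size=2000)
share = aa_share_equal(heights)
print(f'Share of iterations when control and treatment groups are equal: {share}')
```

For a sound splitter, this share should be close to 1 - alpha; a noticeably lower value suggests that the split itself introduces a difference between groups.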

# 1.4. Variance reduction

## Loading dataset generated during A/B test

Here
- **height_now** — experiment metric during experiment
- **height_prev** — experiment metric before experiment
- **weight_now** — highly correlated feature with metric during experiment
- **weight_prev** — highly correlated feature with metric before experiment
- **noise_now** — feature during experiment that is just noise
- **noise_prev** — feature before experiment that is just noise
- **group** — group column

In [None]:
# .copy() avoids modifying a view of `data` when the increment is assigned below
data_vr = data[['noise_prev', 'weight_prev', 'height_prev', 'noise_now', 'weight_now', 'height_now', 'group']].copy()

## Initial distribution of tested metrics

In [None]:
gf.plot_distributions(data_vr, 'height_now', 'group', 50)

As can be seen, distributions are identical.

## Add increment to the treatment group

In [None]:
ab = ABTest(alpha=config['hypothesis']['alpha'], 
            beta=config['hypothesis']['beta'],
            alternative=config['hypothesis']['alternative'])

treatment = data_vr.loc[data_vr.group == 'B', 'height_now']
treatment_increased = ab._add_increment('solid', treatment, 5)
data_vr.loc[data_vr.group == 'B', 'height_now'] = treatment_increased

gf.plot_distribution(treatment_increased, bins=50)

## Initial control and increased treatment distribution

In [None]:
gf.plot_distributions(data_vr, 'height_now', 'group', 50)

## Use CUPED to reduce variance

After execution, a new column, **height_now_cuped**, is added.

In [None]:
vr = VarianceReduction()
data_vr_cuped = vr.cuped(data_vr, target='height_now', groups='group', covariate='height_prev')
print(data_vr_cuped.head())
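The standard CUPED adjustment can be sketched in a few lines: the target is corrected by its pre-experiment covariate, scaled by theta = cov(Y, X) / var(X). This sketch uses a single pooled theta and synthetic data; how `vr.cuped` handles groups internally is not shown in this guide, and `cuped_adjust` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def cuped_adjust(df: pd.DataFrame, target: str, covariate: str) -> pd.DataFrame:
    """Add `<target>_cuped` = Y - theta * (X - mean(X))."""
    theta = np.cov(df[target], df[covariate])[0, 1] / df[covariate].var(ddof=1)
    out = df.copy()
    out[f'{target}_cuped'] = df[target] - theta * (df[covariate] - df[covariate].mean())
    return out


rng = np.random.default_rng(0)
height_prev = rng.normal(175, 10, 1000)
demo = pd.DataFrame({'height_prev': height_prev,
                     'height_now': height_prev + rng.normal(0, 3, 1000)})
adjusted = cuped_adjust(demo, target='height_now', covariate='height_prev')
print(adjusted[['height_now', 'height_now_cuped']].var())
```

Centering the covariate keeps the mean of the adjusted metric equal to the original mean, so only the variance changes.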

In [None]:
gf.plot_distributions(data_vr_cuped, 'height_now_cuped', 'group', 50)

As can be seen, the variance decreased: the control distribution narrowed from the range **160 to 190** to **170 to 180**, and the treatment distribution from **165 to 195** to **175 to 185**.

## Use CUPAC to reduce variance

Below, a model is fitted to predict the covariate for the experiment period.
After execution, a new column, **target_pred**, is added.

In [None]:
data_vr_cupac = vr.cupac(data_vr, target_prev='height_prev', target_now='height_now',
               factors_prev=['weight_prev'],
               factors_now=['weight_now'], groups='group')

In [None]:
print(data_vr_cupac.head())
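One way to read the call above is: fit a model on pre-experiment data, predict the target for the experiment period, and use that prediction as the CUPED covariate. The sketch below follows this reading with a plain least-squares model on synthetic data. The model choice and all numbers are assumptions; the library's actual model is not described in this guide.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000
weight_prev = rng.normal(70, 8, n)
weight_now = weight_prev + rng.normal(0, 1, n)
df = pd.DataFrame({
    'weight_prev': weight_prev,
    'height_prev': 100 + weight_prev + rng.normal(0, 3, n),
    'weight_now': weight_now,
    'height_now': 100 + weight_now + rng.normal(0, 3, n),
})

# 1. Fit a linear model on pre-experiment data: height_prev ~ weight_prev.
X_prev = np.column_stack([np.ones(n), df['weight_prev']])
coef, *_ = np.linalg.lstsq(X_prev, df['height_prev'].to_numpy(), rcond=None)

# 2. Predict the target for the experiment period from the current factors.
X_now = np.column_stack([np.ones(n), df['weight_now']])
df['target_pred'] = X_now @ coef

# 3. Apply the CUPED adjustment with the prediction as the covariate.
theta = np.cov(df['height_now'], df['target_pred'])[0, 1] / df['target_pred'].var(ddof=1)
df['height_now_cuped'] = (df['height_now']
                          - theta * (df['target_pred'] - df['target_pred'].mean()))

print(df['height_now'].var(), df['height_now_cuped'].var())
```

The better the model predicts the target, the larger the variance reduction, which is why a highly correlated factor such as weight is used here rather than the noise feature.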

In [None]:
gf.plot_distributions(data_vr_cupac, 'height_now_cuped', 'group', 50)

As can be seen, the variance decreased: the left edge of the distribution moved **from 160 to 170** and the right edge **from 190 to 180**.

# 1.5. A/B test analysis

The metric tested in the experiment is the 10th percentile.

In [None]:
def metric(X: np.ndarray) -> float:
    return np.quantile(X, 0.1)

In [None]:
ab = ABTest(alpha=config['hypothesis']['alpha'],
            beta=config['hypothesis']['beta'],
            alternative=config['hypothesis']['alternative'])

control = data_vr_cuped.loc[data_vr_cuped.group == 'A', 'height_now_cuped'].to_numpy()
treatment = data_vr_cuped.loc[data_vr_cuped.group == 'B', 'height_now_cuped'].to_numpy()

is_rejected = ab.test_hypothesis_buckets(control, treatment, metric, n_buckets=config['hypothesis']['n_buckets'])
result = 'rejected' if is_rejected == 1 else 'not rejected'
print(f'H0: {result}')
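A bucket-based test for an arbitrary metric can be sketched as follows: shuffle each group, split it into buckets, compute the metric per bucket, and compare the bucket-level values with a z-test on their means. This illustrates the general technique under these assumptions; it is not necessarily what `test_hypothesis_buckets` does internally, and `bucket_test` is a hypothetical name.

```python
import numpy as np
from statistics import NormalDist


def bucket_test(control, treatment, metric, n_buckets=50, alpha=0.05, seed=0):
    """Return 1 if H0 (equal metric in both groups) is rejected, else 0."""
    rng = np.random.default_rng(seed)

    def bucket_metrics(x):
        # Random, nearly equal-sized buckets; one metric value per bucket.
        buckets = np.array_split(rng.permutation(x), n_buckets)
        return np.array([metric(b) for b in buckets])

    m_c, m_t = bucket_metrics(control), bucket_metrics(treatment)
    se = np.sqrt(m_c.var(ddof=1) / n_buckets + m_t.var(ddof=1) / n_buckets)
    z = (m_t.mean() - m_c.mean()) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return int(p_value < alpha)


rng = np.random.default_rng(1)
control = rng.normal(175, 5, 5000)
treatment = rng.normal(180, 5, 5000)
print(bucket_test(control, treatment, lambda x: np.quantile(x, 0.1)))
```

Bucketing turns a metric without a simple variance formula, such as a quantile, into a set of approximately normal bucket-level values that a standard test can handle.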

# 2. Ratio metric
# 2.1. Preparation for the experiment
## Loading dataset

- **sex, married, country** — features
- **clicks_prev, sessions_prev** — numerator and denominator of ratio metric

In [None]:
data_2 = data[['id', 'sex', 'married', 'country', 'clicks_prev', 'sessions_prev']]
data_2.head()

## Initialization of splitter

If you are going to run an MDE simulation, the **split_rate** parameter can be omitted, as it will be set on the splitter during the simulation.

In [None]:
splitter = Splitter(split_rate=config['splitter']['split_rate'])

### Custom splitter

A custom splitter can be passed as the second parameter.
It must add a 'group' column with two possible values: 'A' or 'B'.

In [None]:
# def my_splitter(X: pd.DataFrame = None,
#                 target: str = None,
#                 split_rate: float = None) -> pd.DataFrame:
#   # some splitter logic here
#     pass

# splitter = Splitter(split_rate=config['splitter']['split_rate'],
#                    custom_splitter=my_splitter)

## Initialization of A/B test

Here
- **alpha** — significance level
- **beta** — probability of type II error
- **alternative** — 'less', 'greater', 'two-sided'

In [None]:
ab = ABTest(alpha=config['hypothesis']['alpha'], 
            beta=config['hypothesis']['beta'],
            alternative=config['hypothesis']['alternative'])

### Set loaded dataset as analyzed

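Here
- **id_col** — id column of a dataset
- **numerator** — name of the ratio's numerator column
- **denominator** — name of the ratio's denominator column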

In [None]:
ab.use_dataset(data_2, id_col=config['data']['id_col'],
              numerator=config['data']['numerator'],
              denominator=config['data']['denominator'])

### Set previously defined splitter for test

Assign defined splitter to the test.

In [None]:
ab.splitter = splitter

### Set list of split rates for MDE exploration

Set a list of split rates between control/treatment you are going to test.

In [None]:
ab.split_rates = config['simulation']['split_rates']

### Set list of increments for MDE exploration

Here
- **inc_var** — list of increments, e.g. [1, 2, 3, 4, 5]
- **extra_params** — extra parameters for increment, currently not used in analysis

**Note**: if *numerator + inc_var > denominator*, the increment is chosen randomly such that *numerator <= denominator*.

In [None]:
ab.set_increment(inc_var=config['simulation']['increment']['vars'],
                extra_params=config['simulation']['increment']['extra_params'])

### Metric to compare

In most cases, the "means" of ratio metrics are compared.

### MDE simulation to find the best combination of split rate and increment

Here
- **n_iter** — number of iterations of simulation
- **n_boot_samples** — set if you chose bootstrap hypothesis testing
- **metric_type** — metric type: ratio or solid (continuous)
- **metric** — Python function that computes the tested metric (quantile, median, mean, etc.) or a custom one
- **strategy** — strategy of hypothesis testing
- **strata** — strata column name for variance reduction
- **strata_weights** — weights of each unique value in strata column as a dictionary
- **to_csv** — whether or not to save the result to csv file
- **csv_path** — path to the newly created csv file

In [None]:
res = ab.mde_simulation(n_iter=config['simulation']['n_iter'],
                       metric_type='ratio',
                       strategy='delta_method',
                       to_csv=True,
                       csv_path='../data/internal/guide/ratio_mde.csv')

### Print simulation log

Here
- **first key** — split rate
- **second key** — increment
- **value** — share of rejected H0

In [None]:
print(json.dumps(res, indent=4))

### Visualize simulation log in plot

In [None]:
gf.plot_simulation_log('../data/internal/guide/ratio_mde.csv')

# 2.2. Experiment

During this step, the dataset of outcomes is gathered; once the experiment ends, it is ready for analysis.

# 2.3. A/A test

Strictly speaking, an A/A test must be run before or in parallel with the A/B test. Here, let's assume that the A/B test has already finished and we now need to make sure that the splitter works correctly.

In [None]:
data_aa = data[['clicks_now', 'sessions_now']]
data_aa.head()

In [None]:
splitter = Splitter(split_rate=0.5)
res = splitter.aa_test(X=data_aa, numerator='clicks_now', denominator='sessions_now', 
                       metric_type='ratio', alpha=0.05, n_iter=100)
print(f'Share of iterations when control and treatment groups are equal: {res}')

# 2.4. Variance reduction

As far as I know, variance reduction is not applicable to a ratio metric directly. However, once the ratio metric has been linearized and is represented as a continuous metric, variance reduction can be applied as usual.

In [None]:
# ab = ABTest(alpha=config['hypothesis']['alpha'], 
#             beta=config['hypothesis']['beta'],
#             alternative=config['hypothesis']['alternative'])

# ab.use_dataset(some_data, id_col=config['data']['id_col'],
#               numerator=config['data']['numerator'],
#               denominator=config['data']['denominator'])

# ab.linearization()

After linearization, a new continuous metric is added to the dataset under the name **'numerator_denominator'**, where *numerator* and *denominator* are the names of the ratio's numerator and denominator columns.

The new **'numerator_denominator'** column is already set as the target.
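A common form of linearization replaces each user's ratio with numerator - k * denominator, where k is the ratio of totals in the control group. The sketch below shows this form on synthetic data. Whether `ab.linearization()` uses exactly this formula is an assumption, and `linearize` is a hypothetical helper that only follows the column naming described above.

```python
import numpy as np
import pandas as pd


def linearize(df: pd.DataFrame, numerator: str, denominator: str,
              group_col: str = 'group', control: str = 'A') -> pd.DataFrame:
    """Add a per-row continuous metric: numerator - k * denominator."""
    ctrl = df[df[group_col] == control]
    k = ctrl[numerator].sum() / ctrl[denominator].sum()  # control-group ratio
    out = df.copy()
    out[f'{numerator}_{denominator}'] = df[numerator] - k * df[denominator]
    return out


rng = np.random.default_rng(0)
sessions = rng.integers(1, 20, size=1000)
demo = pd.DataFrame({
    'clicks_now': rng.binomial(sessions, 0.3),
    'sessions_now': sessions,
    'group': rng.choice(['A', 'B'], size=1000),
})
lin = linearize(demo, 'clicks_now', 'sessions_now')
print(lin.head())
```

By construction, the linearized metric averages to zero in the control group, so a non-zero treatment mean points in the same direction as the change in the ratio.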

# 2.5. A/B test analysis

The tested metric is the *ratio mean*.
Here we go back to the ratio metric instead of the continuous one derived in the previous step.

In [None]:
data_ab = data[['id', 'clicks_now', 'sessions_now', 'group']]

ab = ABTest(alpha=config['hypothesis']['alpha'],
            beta=config['hypothesis']['beta'],
            alternative=config['hypothesis']['alternative'])

ab.use_dataset(data_ab, id_col=config['data']['id_col'],
              numerator='clicks_now',
              denominator='sessions_now')

is_rejected = ab.delta_method()
result = 'rejected' if is_rejected == 1 else 'not rejected'
print(f'H0: {result}')
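The delta method approximates the variance of a ratio of means from the variances and covariance of its numerator and denominator, after which the two groups can be compared with a z-test. The sketch below is a generic version on synthetic data, assuming a two-sided alternative; the function names are hypothetical and it is not a copy of `ab.delta_method()`.

```python
import numpy as np
from statistics import NormalDist


def ratio_mean_and_var(num, den):
    """Ratio of means and its delta-method variance estimate."""
    n = len(num)
    mu_d = den.mean()
    ratio = num.mean() / mu_d
    cov = np.cov(num, den)  # 2x2 sample covariance matrix
    var = (cov[0, 0] - 2 * ratio * cov[0, 1] + ratio ** 2 * cov[1, 1]) / (n * mu_d ** 2)
    return ratio, var


def delta_method_test(num_a, den_a, num_b, den_b, alpha=0.05):
    """Return 1 if H0 (equal ratios) is rejected, else 0."""
    r_a, v_a = ratio_mean_and_var(num_a, den_a)
    r_b, v_b = ratio_mean_and_var(num_b, den_b)
    z = (r_b - r_a) / np.sqrt(v_a + v_b)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return int(p_value < alpha)


rng = np.random.default_rng(0)
den_a = rng.integers(1, 20, size=5000)
den_b = rng.integers(1, 20, size=5000)
num_a = rng.binomial(den_a, 0.30)  # control: 30% click rate per session
num_b = rng.binomial(den_b, 0.40)  # treatment: 40% click rate per session
print(delta_method_test(num_a, den_a, num_b, den_b))
```

The covariance term matters because clicks and sessions of the same user are correlated; ignoring it would misstate the variance of the ratio.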