# Import

In [2]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from blockeval.analysis import block_summary, weighted_avg_test, comparison_test
from blockeval.utils import campaign_simulation

In [3]:
# options
pd.options.display.float_format = '{:.4f}'.format

# Data preparation

The campaign data has to be stored in a pandas dataframe with a row for each individual and the following columns:
- **block**: block labels
- **treatment**: treatment indicator (0 for control, 1 for treated)
- **outcome**: campaign outcome (binary 0 or 1, or continuous)
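
As a minimal sketch of this layout, the snippet below builds a tiny dataframe by hand and checks that the three required columns are present. The block labels `A` and `B` and the name `campaign_data` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data: one row per individual, two blocks,
# each containing treated (1) and control (0) individuals.
campaign_data = pd.DataFrame({
    "block": ["A", "A", "A", "B", "B", "B"],
    "treatment": [1, 1, 0, 1, 0, 0],
    "outcome": [1, 0, 0, 1, 1, 0],
})

# Check that the three required columns are present.
required = {"block", "treatment", "outcome"}
missing = required - set(campaign_data.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")
```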

To illustrate the dataframe structure, we simulate an uplift campaign with four blocks:

In [4]:
uplift_data = campaign_simulation(
    blocks=['public_refractory', 'public_persuadable', 'private_refractory', 'private_persuadable'],
    block_sizes=[100000, 100000, 100000, 100000],
    treatment_probas=[0.5, 0.8, 0.5, 0.8],
    control_means=[0.15, 0.15, 0.15, 0.15],
    treatment_effects=[0, 0.04, 0, 0.04],
)

Higher-level groups can be derived from the block labels:

In [5]:
uplift_data['segment'] = np.where(uplift_data['block'].str.contains('public'), 'public', 'private')
uplift_data['segment'] = pd.Categorical(uplift_data['segment'], categories=['public', 'private'], ordered=True)
uplift_data['persona'] = np.where(uplift_data['block'].str.contains('refractory'), 'refractory', 'persuadable')
uplift_data['persona'] = pd.Categorical(uplift_data['persona'], categories=['refractory', 'persuadable'], ordered=True)

The campaign data should look similar to the extract below with at least the three columns **outcome**, **treatment**, and **block**:

In [6]:
uplift_data.head()

|   | outcome | treatment | block | segment | persona |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1 | public_refractory | public | refractory |
| 1 | 1 | 1 | public_refractory | public | refractory |
| 2 | 0 | 1 | public_refractory | public | refractory |
| 3 | 0 | 1 | public_refractory | public | refractory |
| 4 | 0 | 1 | public_refractory | public | refractory |


Importantly, each block is expected to contain both a treated and a control group, as in the simulated uplift campaign:

![UpliftDesign](figures/UpliftDesign.PNG)
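
One way to verify this requirement before running the analysis is sketched below with plain pandas. The helper name `blocks_missing_an_arm` and the toy data are hypothetical and not part of blockeval.

```python
import pandas as pd

def blocks_missing_an_arm(data: pd.DataFrame) -> list:
    """Return the labels of blocks that lack a treated or a control group."""
    # Count the distinct treatment values (0 and/or 1) observed in each block.
    arms_per_block = data.groupby("block")["treatment"].nunique()
    return arms_per_block[arms_per_block < 2].index.tolist()

# Hypothetical toy data: block B contains treated individuals only.
toy = pd.DataFrame({
    "block": ["A", "A", "B", "B"],
    "treatment": [1, 0, 1, 1],
    "outcome": [1, 0, 0, 1],
})
print(blocks_missing_an_arm(toy))
```

An empty list means every block has both groups.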

# Block Summary

We first summarize the design and results by block. The optional *keep_cols* argument keeps information about higher-level groups in the summary:

In [7]:
block_summary(uplift_data, keep_cols=['segment', 'persona'])

|   | block | segment | persona | eff | treated_mean | control_mean | treated_size | control_size | block_size | treatment_proba |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | public_refractory | public | refractory | 0.0005 | 0.1491 | 0.1486 | 50000 | 50000 | 100000 | 0.5 |
| 1 | public_persuadable | public | persuadable | 0.0397 | 0.1914 | 0.1517 | 80000 | 20000 | 100000 | 0.8 |
| 2 | private_refractory | private | refractory | 0.0026 | 0.1522 | 0.1496 | 50000 | 50000 | 100000 | 0.5 |
| 3 | private_persuadable | private | persuadable | 0.0424 | 0.1889 | 0.1465 | 80000 | 20000 | 100000 | 0.8 |


We can see from the 'eff' column that the treatment effects are close to 4 percentage points (pp) in the two persuadable blocks and close to 0pp in the refractory ones, which matches the simulated effects.
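
The summary columns above are consistent with 'eff' = 'treated_mean' - 'control_mean' and 'treatment_proba' = 'treated_size' / 'block_size'. The sketch below reproduces these columns with plain pandas on hypothetical toy data. It illustrates the arithmetic only and is not blockeval's implementation.

```python
import pandas as pd

# Hypothetical toy data with two blocks.
toy = pd.DataFrame({
    "block": ["A"] * 4 + ["B"] * 4,
    "treatment": [1, 1, 0, 0, 1, 1, 1, 0],
    "outcome": [1, 0, 0, 0, 1, 1, 0, 0],
})

# Mean outcome and group size per (block, treatment) cell,
# pivoted so that the columns are the treatment values 0 and 1.
grouped = toy.groupby(["block", "treatment"])["outcome"]
means = grouped.mean().unstack("treatment")
sizes = grouped.size().unstack("treatment")

summary = pd.DataFrame({
    "treated_mean": means[1],
    "control_mean": means[0],
    "treated_size": sizes[1],
    "control_size": sizes[0],
})
summary["eff"] = summary["treated_mean"] - summary["control_mean"]
summary["block_size"] = summary["treated_size"] + summary["control_size"]
summary["treatment_proba"] = summary["treated_size"] / summary["block_size"]
print(summary)
```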

# Treatment Effects

The p-value and confidence interval of the treatment effects can be obtained for each block:

In [8]:
weighted_avg_test(uplift_data, group_by=['block'])

|   | block | eff | treated_mean | control_mean | treated_size | control_size | group_size | eff_se | z | p_value | ci_low | ci_upp | incremental |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | public_refractory | 0.0005 | 0.1491 | 0.1486 | 50000 | 50000 | 100000 | 0.0023 | 0.2399 | 0.8104 | -0.0039 | 0.005 | 54.0 |
| 1 | public_persuadable | 0.0397 | 0.1914 | 0.1517 | 80000 | 20000 | 100000 | 0.0029 | 13.7169 | 0.0 | 0.034 | 0.0454 | 3968.75 |
| 2 | private_refractory | 0.0026 | 0.1522 | 0.1496 | 50000 | 50000 | 100000 | 0.0023 | 1.1308 | 0.2581 | -0.0019 | 0.007 | 256.0 |
| 3 | private_persuadable | 0.0424 | 0.1889 | 0.1465 | 80000 | 20000 | 100000 | 0.0029 | 14.8234 | 0.0 | 0.0368 | 0.048 | 4236.25 |


Results show statistically significant effects for the persuadable blocks (p-values below 0.0001) but not for the refractory ones (p-values of 0.81 and 0.26). The 'incremental' column estimates the additional conversions that would be obtained if the whole group, treated and control alike, were exposed to the intervention ('incremental' = 'eff' * 'group_size').
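
For a binary outcome, the 'eff_se', 'z', and confidence-interval values in the table above are consistent with an unpooled two-sample z-test on the difference in proportions. The sketch below reproduces the 'public_persuadable' row from its rounded means. It is an inference from the displayed numbers, not a confirmed description of how `weighted_avg_test` works internally, and the function name `effect_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def effect_test(treated_mean, control_mean, treated_size, control_size, alpha=0.05):
    """Unpooled z-test for a difference in proportions (binary outcome assumed)."""
    eff = treated_mean - control_mean
    # Variance of each sample proportion is p * (1 - p) / n; the two groups are independent.
    eff_se = sqrt(
        treated_mean * (1 - treated_mean) / treated_size
        + control_mean * (1 - control_mean) / control_size
    )
    z = eff / eff_se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 for alpha = 0.05
    return {
        "eff": eff,
        "eff_se": eff_se,
        "z": z,
        "p_value": p_value,
        "ci_low": eff - z_crit * eff_se,
        "ci_upp": eff + z_crit * eff_se,
    }

# Rounded inputs taken from the 'public_persuadable' row.
result = effect_test(0.1914, 0.1517, 80000, 20000)
print(result)
```

Because the inputs are the rounded means shown in the table, the results agree with the reported row only up to rounding. A continuous outcome would need the sample variances instead of p * (1 - p).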

Results can be rolled up to groups of blocks. Here the treatment effects are rolled up to the persona level by changing the *group_by* argument:

In [17]:
weighted_avg_test(uplift_data, group_by=['persona'])

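|   | persona | eff | treated_mean | control_mean | treated_size | control_size | group_size | eff_se | z | p_value | ci_low | ci_upp | incremental |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | refractory | 0.0015 | 0.1507 | 0.1491 | 100000 | 100000 | 200000 | 0.0016 | 0.971 | 0.3316 | -0.0016 | 0.0047 | 310.0 |
| 1 | persuadable | 0.041 | 0.1901 | 0.1491 | 160000 | 40000 | 200000 | 0.002 | 20.1758 | 0.0 | 0.037 | 0.045 | 8205.0 |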


For the refractory persona, the treatment effect is 0.15pp with a 95% CI of [-0.16pp, 0.47pp], which includes zero. For the persuadable persona, it is 4.10pp with a 95% CI of [3.70pp, 4.50pp].
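
The persona-level figures above are consistent with a block-size-weighted average of the block-level effects, with the standard error combined under independence between blocks. The sketch below reproduces the refractory row from the rounded block-level values. The function name `roll_up` is hypothetical, and the weighting scheme is inferred from the displayed numbers rather than taken from blockeval's source.

```python
from math import sqrt

def roll_up(effs, ses, sizes):
    """Combine block-level effects into one effect, weighting by block size."""
    total = sum(sizes)
    weights = [n / total for n in sizes]
    eff = sum(w * e for w, e in zip(weights, effs))
    # Blocks contain disjoint individuals, so the variances add with squared weights.
    eff_se = sqrt(sum((w * s) ** 2 for w, s in zip(weights, ses)))
    return eff, eff_se

# Rounded block-level values for the two refractory blocks.
eff, eff_se = roll_up(
    effs=[0.0005, 0.0026],
    ses=[0.0023, 0.0023],
    sizes=[100000, 100000],
)
print(round(eff, 4), round(eff_se, 4))
```

Weighting by block size rather than pooling all individuals keeps the unequal treatment probabilities across blocks (0.5 versus 0.8 here) from distorting the combined estimate.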

The overall treatment effect, with its p-value and confidence interval, can be obtained by omitting the *group_by* argument:

In [16]:
weighted_avg_test(uplift_data)

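|   | group | eff | treated_mean | control_mean | treated_size | control_size | group_size | eff_se | z | p_value | ci_low | ci_upp | incremental |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | all | 0.0213 | 0.1704 | 0.1491 | 260000 | 140000 | 400000 | 0.0013 | 16.469 | 0.0 | 0.0188 | 0.0238 | 8515.0 |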


# Comparing Treatment Effects

Treatment effects can be compared between blocks. The first block 'public_refractory' is used as the reference:

In [11]:
comparison_test(uplift_data, compare_along='block')

|   | eff_delta | variant_grp | reference_grp | variant_size | reference_size | eff_se | z | p_value | ci_low | ci_upp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.0391 | public_persuadable | public_refractory | 100000 | 100000 | 0.0037 | 10.6785 | 0.0 | 0.032 | 0.0463 |
| 1 | 0.002 | private_refractory | public_refractory | 100000 | 100000 | 0.0032 | 0.6327 | 0.5269 | -0.0042 | 0.0083 |
| 2 | 0.0418 | private_persuadable | public_refractory | 100000 | 100000 | 0.0036 | 11.4958 | 0.0 | 0.0347 | 0.049 |


Effects can also be compared at a higher level by changing the *compare_along* argument. Here the first persona category, 'refractory', serves as the reference:

In [9]:
comparison_test(uplift_data, compare_along='persona')

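|   | eff_delta | variant_grp | reference_grp | variant_size | reference_size | eff_se | z | p_value | ci_low | ci_upp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.0395 | persuadable | refractory | 200000 | 200000 | 0.0026 | 15.2699 | 0.0 | 0.0344 | 0.0445 |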


Results show that the treatment effect for the persuadable persona exceeds that of the refractory persona by 3.95pp, with a 95% CI of [3.44pp, 4.45pp].
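
The comparison output is consistent with a z-test on the difference between two independent effect estimates: 'eff_delta' is the variant effect minus the reference effect, and its standard error is the square root of the sum of the two squared standard errors. The sketch below applies this to the rounded persona-level values. The function name `compare_effects` is hypothetical, and the formula is an inference from the displayed numbers, not a confirmed description of `comparison_test`.

```python
from math import sqrt
from statistics import NormalDist

def compare_effects(eff_variant, se_variant, eff_reference, se_reference, alpha=0.05):
    """z-test for the difference between two independent effect estimates."""
    eff_delta = eff_variant - eff_reference
    # The two groups share no individuals, so the variances of the estimates add.
    eff_se = sqrt(se_variant ** 2 + se_reference ** 2)
    z = eff_delta / eff_se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "eff_delta": eff_delta,
        "eff_se": eff_se,
        "z": z,
        "p_value": p_value,
        "ci_low": eff_delta - z_crit * eff_se,
        "ci_upp": eff_delta + z_crit * eff_se,
    }

# Rounded persona-level effects and standard errors: persuadable versus refractory.
comparison = compare_effects(0.0410, 0.0020, 0.0015, 0.0016)
print(comparison)
```

Since the inputs are rounded to four decimals, the output matches the reported comparison only approximately.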