## ERUPT under simulated random assignment

In [13]:
%load_ext autoreload
%autoreload 2
import os, sys
import warnings
warnings.filterwarnings('ignore') # suppress sklearn deprecation warnings for now..

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# check whether dowhy, causaltune, and FLAML are installed; if not, fall back to running them from source
root_path = os.path.realpath('../..')
try:
    import causaltune
except ModuleNotFoundError:
    sys.path.append(os.path.join(root_path, "auto-causality"))

try:
    import dowhy
except ModuleNotFoundError:
    sys.path.append(os.path.join(root_path, "dowhy"))

try:
    import flaml
except ModuleNotFoundError:
    sys.path.append(os.path.join(root_path, "FLAML"))

from causaltune import CausalTune
from causaltune.datasets import generate_non_random_dataset
from causaltune.erupt import DummyPropensity, ERUPT


In [14]:
# this makes the notebook expand to full width of the browser window
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

In [15]:
%%javascript

// turn off scrollable windows for large output
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}


## Loading data and model training

In [None]:
# load toy dataset with non-random assignment and apply standard pre-processing
cd = generate_non_random_dataset()
cd.preprocess_dataset()

In [None]:
display(cd.data.head())

## Random ERUPT

Below we demonstrate how to use Estimated Response Under Proposed Treatment (ERUPT) to estimate the average response we would have observed had the treatment been assigned randomly. Recall that the dataset used in this example is constructed so that the treatment propensity is a function of a unit's covariates, which means the raw sample mean of the outcome does not answer this question directly.
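To make the idea concrete, here is a minimal sketch of an ERUPT-style estimate written directly in NumPy. It is not causaltune's implementation: the helper `erupt_sketch`, the synthetic data, and the choice of a self-normalised inverse-propensity weighting are all assumptions made for illustration. The sketch keeps only the units whose observed treatment agrees with the proposed policy and reweights them by the inverse probability of the treatment they actually received.

```python
import numpy as np


def erupt_sketch(outcome, treatment, policy, p_treated):
    """Hypothetical helper: self-normalised inverse-propensity estimate of the
    mean outcome under `policy`, for a binary treatment coded as 0/1."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment)
    policy = np.asarray(policy)
    p_treated = np.asarray(p_treated, dtype=float)

    # probability of the treatment each unit actually received
    p_actual = np.where(treatment == 1, p_treated, 1.0 - p_treated)

    # units whose actual treatment disagrees with the policy get weight zero
    weights = (treatment == policy) / p_actual
    return float(np.sum(weights * outcome) / np.sum(weights))


# synthetic data (an assumption, unrelated to the notebook's dataset):
# the propensity depends on the covariate x, and so does the treatment effect
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-x))
t = rng.binomial(1, p)
y = t * (1.0 + x) + rng.normal(scale=0.1, size=n)

# random assignment policy, as in the notebook cell below
policy = rng.integers(0, 2, size=n)

estimate = erupt_sketch(y, t, policy, p)
naive = float(y.mean())
print(f"ERUPT-style estimate under random assignment: {estimate:.3f}")
print(f"Naive mean of the observed outcome:           {naive:.3f}")
```

In this synthetic setup the mean outcome under truly random assignment is 0.5 by construction, while the naive sample mean is pulled upwards because units with large `x` are both more likely to be treated and benefit more from treatment.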

In [48]:
use_df = cd.data.copy()

# define some variables to be used below
propensities = use_df['propensity']
actual_treatment = use_df['T']
outcome = use_df['Y']

# define the random assignment policy
random_policy = np.random.randint(0,2, size=len(use_df))

In [58]:
# define a propensity model that will simply return the propensities when calling predict_proba
propensity_model = DummyPropensity(p=propensities, treatment=actual_treatment)

# obtain ERUPT under random policy
e = ERUPT(treatment_name='T', propensity_model=propensity_model)

random_erupt = e.score(df=use_df, outcome=outcome, policy=random_policy)
print(f'Average effect had treatment been assigned randomly: {np.round(random_erupt, 4)}')

Average effect had treatment been assigned randomly: 0.8301


For more details on the implementation, consult [Hitsch and Misra (2018)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3111957). Note also that the treatment is assumed to take integer values from 0 to n.
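One plausible reason for coding treatments as integers is that the treatment value can then double as a column index into a matrix of predicted propensities. The sketch below illustrates that convention with made-up numbers. The helper `propensity_of_actual` and the assumption that column `j` holds the probability of treatment `j` are hypothetical and are not taken from causaltune's source.

```python
import numpy as np


def propensity_of_actual(proba, treatment):
    """Hypothetical helper: pick, for each unit, the predicted probability of
    the treatment it actually received.

    proba     : array of shape (n_units, n_treatments); column j is assumed
                to hold P(T = j | X)
    treatment : integer array of shape (n_units,) with values 0..n_treatments-1
    """
    proba = np.asarray(proba, dtype=float)
    treatment = np.asarray(treatment, dtype=int)
    # use the integer treatment value as a per-row column index
    return np.take_along_axis(proba, treatment[:, None], axis=1).ravel()


# three units, three treatment arms (made-up numbers)
proba = np.array([
    [0.2, 0.5, 0.3],
    [0.6, 0.1, 0.3],
    [0.1, 0.1, 0.8],
])
treatment = np.array([1, 0, 2])

p_actual = propensity_of_actual(proba, treatment)
print(p_actual)
```

With string or non-contiguous treatment labels, this kind of lookup would first need a mapping from labels to column positions.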