[Reference](https://medium.com/@monosr/robust-experimentation-impact-analysis-using-causal-ml-python-package-136db5d7f921)

In [1]:
!pip install causalml
# conda install -c conda-forge causalml

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting causalml
  Downloading causalml-0.12.3.tar.gz (406 kB)
Collecting shap
  Downloading shap-0.40.0-cp37-cp37m-manylinux2010_x86_64.whl (564 kB)
Collecting pygam
  Downloading pygam-0.8.0-py2.py3-none-any.whl (1.8 MB)
Collecting pyro-ppl
  Downloading pyro_ppl-1.8.1-py3-none-any.whl (718 kB)
Collecting pyro-api>=0.1.1
  Downloading pyro_api-0.1.2-py3-none-any.whl (11 kB)
Collecting slicer==0.0.7
  Downloading slicer-0.0.7-py3-none-any.whl (14 kB)
Building wheels for collected packages: causalml
  Building wheel for causalml (setup.py) ... done
  Created wheel for causalml: filename=causalml-0.12.3-cp37-cp37m-linux_x86_64.whl size

- n : Number of samples
- p : Number of covariates (i.e. number of independent variables)
- y : Outcome array (i.e. the synthetic outcome of the experiment); here the outcome is a continuous variable
- X : Independent variables, an array of shape (n, p)
- w : Treatment flag; 1 signifies treatment, 0 signifies control (named treatment_flags in the code)
- tau : Individual treatment effect, ITE (named ite_list in the code)
- b : Expected outcome (named exp_outcome_list in the code)
- e : Propensity of receiving treatment
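
To make the roles of these quantities concrete, here is a minimal NumPy sketch of a generator that returns the same six items. The function name toy_synthetic_data and the specific formulas for b, e and tau are illustrative assumptions, not the causalml implementation; the point is only how the propensity drives the treatment flag and how the treatment effect enters the outcome.

```python
import numpy as np

def toy_synthetic_data(n=1000, p=5, sigma=1.0, seed=0):
    """Hypothetical generator mirroring the roles of y, X, w, tau, b, e."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n, p))        # covariates, shape (n, p)
    b = X[:, 2] + 0.5 * X[:, 3]         # expected outcome (illustrative form)
    e = np.clip(X[:, 0], 0.1, 0.9)      # propensity P(w = 1 | X), kept away from 0 and 1
    tau = X[:, 0]                       # individual treatment effect (illustrative form)
    w = rng.binomial(1, e)              # treatment flag drawn from the propensity
    # Treated units sit tau/2 above b and control units tau/2 below it,
    # so the expected treated-minus-control difference for a unit is tau.
    y = b + (w - 0.5) * tau + sigma * rng.normal(size=n)
    return y, X, w, tau, b, e

y_toy, X_toy, w_toy, tau_toy, b_toy, e_toy = toy_synthetic_data(n=200, p=5, seed=1)
print(X_toy.shape, y_toy.shape, int(w_toy.sum()))
```

Because e depends on the covariates in this sketch, treated and control groups differ systematically, which is the situation causal estimators are meant to handle.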

In [2]:
import warnings
warnings.filterwarnings("ignore")

import numpy as np
from causalml.dataset import synthetic_data

y, X, treatment_flags, ite_list, exp_outcome_list, e = synthetic_data(mode=1, n=10_000, p=5, sigma=1.0)

print(f"Sample outcome: {y[:5]}")
print(f"Sample independent variables: {X[:5]}")

print(f"Treatment Data Count = {np.count_nonzero(treatment_flags)}")
print(f"Control Data Count = {len(y) - np.count_nonzero(treatment_flags)}")

print(f"Sample ITE: {ite_list[:5]}")

Sample outcome: [2.48178454 0.74136288 1.87345124 2.48919831 2.83421542]
Sample independent variables: [[0.76337241 0.20833775 0.52620044 0.7577642  0.54801578]
 [0.12601318 0.79215693 0.43063064 0.26831812 0.53385855]
 [0.38403465 0.09895808 0.28230967 0.27129277 0.50887985]
 [0.75858701 0.68508042 0.04587426 0.92242968 0.27756016]
 [0.83849571 0.92451621 0.52464629 0.33065857 0.85826711]]
Treatment Data Count = 5088
Control Data Count = 4912
Sample ITE: [0.48585508 0.45908505 0.24149637 0.72183372 0.88150596]
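
The printed samples hint at how the ITE is built in this mode: each value looks like the average of the first two covariates in the same row. The sketch below checks that guess against the five rows shown above. The rule is an inference from this output, not something stated in the source, so treat it as an assumption about mode=1.

```python
import numpy as np

# First five covariate rows and ITE values copied from the output above
X_sample = np.array([
    [0.76337241, 0.20833775, 0.52620044, 0.7577642, 0.54801578],
    [0.12601318, 0.79215693, 0.43063064, 0.26831812, 0.53385855],
    [0.38403465, 0.09895808, 0.28230967, 0.27129277, 0.50887985],
    [0.75858701, 0.68508042, 0.04587426, 0.92242968, 0.27756016],
    [0.83849571, 0.92451621, 0.52464629, 0.33065857, 0.85826711],
])
ite_sample = np.array([0.48585508, 0.45908505, 0.24149637, 0.72183372, 0.88150596])

# Assumed rule: ITE = mean of the first two covariates
ite_guess = X_sample[:, :2].mean(axis=1)

# Allow for the rounding in the printed values
matches = np.allclose(ite_guess, ite_sample, atol=1e-7)
print(f"Rule reproduces the printed ITE values: {matches}")
```

On the full arrays, np.mean(ite_list) gives the true average effect in the simulated data, which is a useful yardstick for any estimate computed later.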


In [3]:
from causalml.inference.meta import LRSRegressor

lr = LRSRegressor()
print(lr)

LRSRegressor(model=<causalml.inference.meta.slearner.StatsmodelsOLS object at 0x7fb26e36b4d0>)


In [4]:
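# estimate_ate returns the ATE point estimate with its lower and upper confidence bounds
te, lb, ub = lr.estimate_ate(X, treatment_flags, y)
# ate_alpha is the significance level: 0.05 corresponds to a 95% confidence interval
print(f"Confidence Level Alpha = {lr.ate_alpha}")
print(f"ATE = {np.round(te[0], 2)}, Range : {np.round(lb[0], 2)}:{np.round(ub[0], 2)}")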

Confidence Level Alpha = 0.05
ATE = 0.68, Range : 0.63:0.73
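
For intuition about what this estimate represents, here is a rough NumPy sketch of a linear-regression ATE: regress the outcome on the treatment flag plus covariates and read off the treatment coefficient. This is my understanding of the general idea behind LRSRegressor (its printout above shows it wraps an OLS model), not a copy of the library code. The helper name ols_ate, the classical (non-robust) standard errors, the fixed 1.96 multiplier and the demo data are all assumptions, so its intervals may differ from what causalml reports.

```python
import numpy as np

def ols_ate(X, w, y, z=1.96):
    """Treatment coefficient from OLS of y on [w, X, 1], with a normal-approximation interval."""
    design = np.column_stack([w, X, np.ones(len(y))])   # treatment flag first, intercept last
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)   # least-squares fit
    resid = y - design @ coef
    dof = design.shape[0] - design.shape[1]             # residual degrees of freedom
    sigma2 = resid @ resid / dof                        # residual variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)     # classical OLS covariance matrix
    se = np.sqrt(cov[0, 0])                             # standard error of the treatment coefficient
    ate = coef[0]
    return ate, ate - z * se, ate + z * se

# Made-up randomized experiment with a known effect of 0.5, for illustration only
rng = np.random.default_rng(0)
n = 5000
X_demo = rng.uniform(size=(n, 2))
w_demo = rng.binomial(1, 0.5, size=n)
y_demo = 1.0 + 2.0 * X_demo[:, 0] + 0.5 * w_demo + rng.normal(size=n)

ate, lb_demo, ub_demo = ols_ate(X_demo, w_demo, y_demo)
print(f"ATE = {ate:.2f}, 95% interval: {lb_demo:.2f} to {ub_demo:.2f}")
```

With alpha = 0.05 the multiplier 1.96 yields a 95% interval. The interval only reflects sampling noise under the linear model, so it can miss the true effect when the outcome is nonlinear in the covariates or treatment assignment depends on them.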
