## Average Treatment Effect on Patients' Outcome

Import required libraries.

In [1]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from causalinference import CausalModel

Load the dataset and record the type of each column. W is the treatment indicator and Y is the outcome.

In [2]:
# Load the dataset
df = pd.read_csv("mydata.csv")

# Contextual information
binary_columns = ['X5', 'W', 'Y']
categorical_columns = ['X6', 'X8']
numeric_columns = ['X1', 'X2', 'X3', 'X4', 'X7', 'X9']

Define the covariates and the treatment indicator. No manual split into treatment and control groups is needed, because the treatment indicator is passed directly to the causal model.

In [3]:
# Define covariates and treatment indicator
covariates = ['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9']
treatment_indicator = 'W'
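
As a quick consistency check, the column-type lists can be compared against the covariate list. This is a minimal sketch rather than part of the original analysis; the names `typed`, `untyped`, and `non_covariates` are introduced here for illustration.

```python
# Column lists copied from the cells above.
binary_columns = ['X5', 'W', 'Y']
categorical_columns = ['X6', 'X8']
numeric_columns = ['X1', 'X2', 'X3', 'X4', 'X7', 'X9']
covariates = ['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9']

# Every column that has been assigned a type.
typed = set(binary_columns) | set(categorical_columns) | set(numeric_columns)

# Covariates with no declared type (should be empty).
untyped = set(covariates) - typed

# Typed columns that are not covariates (should be the treatment and outcome).
non_covariates = typed - set(covariates)

print(sorted(untyped), sorted(non_covariates))
```

If `untyped` is empty and `non_covariates` contains only W and Y, every covariate has a declared type and neither the treatment nor the outcome has been included among the covariates.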

Create the preprocessing pipeline using standard scaling of numeric columns and one-hot encoding of categorical columns.

In [4]:
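preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_columns),
        ('cat', OneHotEncoder(), categorical_columns),
    ],
    # Columns not listed above (the binary covariate X5) are kept unchanged
    remainder='passthrough',
    # Always return a dense array, even if the one-hot block is sparse
    sparse_threshold=0
)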
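
To see what this kind of preprocessor produces, the sketch below applies a smaller one to a toy DataFrame. The data values and the names `toy`, `toy_preprocessor`, and `toy_X` are made up for illustration. `sparse_threshold=0` is set here so that the result is always a dense array.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one numeric, one binary, one categorical column.
toy = pd.DataFrame({
    "X1": [1.0, 2.0, 3.0, 4.0],
    "X5": [0, 1, 0, 1],
    "X6": ["a", "b", "a", "c"],
})

toy_preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["X1"]),
        ("cat", OneHotEncoder(), ["X6"]),
    ],
    remainder="passthrough",  # X5 is passed through unchanged
    sparse_threshold=0,       # force a dense result
)

toy_X = np.asarray(toy_preprocessor.fit_transform(toy))

# Output columns: scaled X1, one indicator per level of X6, then X5.
print(toy_X.shape)
```

The output columns follow the order of the transformers, with passed-through columns last, so the column order of the transformed matrix differs from the order of the input DataFrame.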

Apply the preprocessor to the covariates and store the result in a DataFrame X, whose underlying NumPy array is later passed to the "causalinference" package.

In [5]:
# Transform X for causalinference package requirements
X = df[covariates]
X = preprocessor.fit_transform(X)
X = pd.DataFrame(X)

Configure the causal model.

In [6]:
# Configure causal model
cm = CausalModel(Y = df['Y'].values, D = df['W'].values, X = X.values)

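Estimate the average treatment effect (ATE) using nearest-neighbor matching on the covariate vectors (Abadie & Imbens, 2006). The results also report the average treatment effect on the controls (ATC) and on the treated (ATT), each with a standard error and a 95% confidence interval.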
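
The idea behind matching can be sketched in a few lines of NumPy. The function below is a simplified illustration, not the implementation used by the package: it assumes one match per unit, matching with replacement, plain Euclidean distance, and no bias adjustment, and it computes no standard errors. The function name `matching_ate` and the toy data are hypothetical.

```python
import numpy as np

def matching_ate(Y, D, X):
    """One-nearest-neighbor matching with replacement (simplified sketch)."""
    Y = np.asarray(Y, dtype=float)
    D = np.asarray(D).astype(bool)
    X = np.asarray(X, dtype=float)

    treated = np.flatnonzero(D)
    control = np.flatnonzero(~D)

    # Squared Euclidean distances: rows are treated units, columns are controls.
    diff = X[treated][:, None, :] - X[control][None, :, :]
    dist = (diff ** 2).sum(axis=2)

    # Impute each unit's unobserved outcome from its closest opposite-group unit.
    y0_for_treated = Y[control[dist.argmin(axis=1)]]
    y1_for_control = Y[treated[dist.argmin(axis=0)]]

    att = np.mean(Y[treated] - y0_for_treated)
    atc = np.mean(y1_for_control - Y[control])
    # The ATE is the group-size-weighted average of ATT and ATC.
    ate = (treated.size * att + control.size * atc) / Y.size
    return ate, atc, att

# Toy data: each treated unit has one nearby control, and treatment adds 2.
X_toy = np.array([[0.0], [0.1], [1.0], [1.1]])
D_toy = np.array([0, 1, 0, 1])
Y_toy = np.array([1.0, 3.0, 2.0, 4.0])

ate, atc, att = matching_ate(Y_toy, D_toy, X_toy)
print(ate, atc, att)
```

The pairwise distance array grows with the product of the two group sizes, so this sketch is only practical for small samples.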

In [7]:
cm.est_via_matching()

print(cm.estimates)


Treatment Effect Estimates: Matching

                     Est.       S.e.          z      P>|z|      [95% Conf. int.]
--------------------------------------------------------------------------------
           ATE      0.194      0.014     13.456      0.000      0.166      0.222
           ATC      0.200      0.017     11.573      0.000      0.166      0.233
           ATT      0.187      0.017     10.918      0.000      0.154      0.221
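
The columns of the table are related by simple arithmetic. The sketch below assumes that the interval is the usual normal approximation, estimate ± 1.96 × standard error, and recovers a less-rounded standard error from the printed estimate and z-statistic. Because the inputs are themselves rounded, the result should only be expected to agree with the table to about the third decimal place.

```python
# Values copied from the ATE row of the table above.
est = 0.194
z_stat = 13.456

# z = est / se, so a less-rounded standard error can be backed out.
se = est / z_stat

# Two-sided 95% critical value of the standard normal distribution.
z_crit = 1.96

lower = est - z_crit * se
upper = est + z_crit * se
print(f"{se:.3f} {lower:.3f} {upper:.3f}")
```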



References:

Abadie, A., & Imbens, G. W. (2006). Large sample properties of matching estimators for average treatment effects. Econometrica, 74(1), 235-267. doi:10.1111/j.1468-0262.2006.00655.x