# Propensity Score & DML

- Matching

- IPW, AIPW, Doubly Robust Estimator

Source: https://matheusfacure.github.io/python-causality-handbook/11-Propensity-Score.html

In [1]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np
from causalinference import CausalModel

In [2]:
data = pd.read_csv("./data/learning_mindset.csv")
data.sample(5, random_state=5)

| | schoolid | intervention | achievement_score | success_expect | ethnicity | gender | frst_in_family | school_urbanicity | school_mindset | school_achievement | school_ethnic_minority | school_poverty | school_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 259 | 73 | 1 | 1.480828 | 5 | 1 | 2 | 0 | 1 | -0.462945 | 0.652608 | -0.515202 | -0.169849 | 0.173954 |
| 3435 | 76 | 0 | -0.987277 | 5 | 13 | 1 | 1 | 4 | 0.334544 | 0.648586 | -1.310927 | 0.224077 | -0.426757 |
| 9963 | 4 | 0 | -0.15234 | 5 | 2 | 2 | 1 | 0 | -2.289636 | 0.190797 | 0.875012 | -0.724801 | 0.761781 |
| 4488 | 67 | 0 | 0.358336 | 6 | 14 | 1 | 0 | 4 | -1.115337 | 1.053089 | 0.315755 | 0.054586 | 1.862187 |
| 2637 | 16 | 1 | 1.36092 | 6 | 4 | 1 | 0 | 1 | -0.538975 | 1.433826 | -0.033161 | -0.982274 | 1.591641 |


Computing the Propensity Score

In [3]:
categ = ["ethnicity", "gender", "school_urbanicity"]
cont = ["school_mindset", "school_achievement", "school_ethnic_minority", "school_poverty", "school_size"]

data_with_categ = pd.concat([
    data.drop(columns=categ), # dataset without the categorical features
    pd.get_dummies(data[categ], columns=categ, drop_first=False)# categorical features converted to dummies
], axis=1)

In [4]:
from sklearn.linear_model import LogisticRegression

T = 'intervention'
Y = 'achievement_score'
X = data_with_categ.columns.drop(['schoolid', T, Y])

ps_model = LogisticRegression(C=1e6).fit(data_with_categ[X], data_with_categ[T])

data_ps = data.assign(propensity_score=ps_model.predict_proba(data_with_categ[X])[:, 1])
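
Before using these scores as weights, it helps to check that treated and untreated units cover a similar range of propensity scores. The sketch below is a minimal illustration on synthetic data: the covariates, the coefficients, and the `overlap_summary` helper are all assumptions made for this example and are not part of the notebook. With the real data, one would pass `data_ps["propensity_score"]` and `data_ps["intervention"]` instead.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression


def overlap_summary(ps, t):
    """Min/mean/max propensity score per treatment group (hypothetical helper)."""
    frame = pd.DataFrame({"propensity_score": ps, "treatment": t})
    return frame.groupby("treatment")["propensity_score"].agg(["min", "mean", "max"])


# Synthetic covariates and treatment assignment (made-up coefficients)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(2000, 2))
p_true = 1 / (1 + np.exp(-(0.5 * X_demo[:, 0] - 0.5 * X_demo[:, 1])))
t_demo = rng.binomial(1, p_true)

# Estimated propensity scores from a logistic regression
ps_demo = LogisticRegression().fit(X_demo, t_demo).predict_proba(X_demo)[:, 1]

summary = overlap_summary(ps_demo, t_demo)
print(summary)
```

If the minimum or maximum in either group sits very close to 0 or 1, the inverse weights built from those scores can become very large.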

## Building IPW Weights and Estimating the ATE

In [5]:
import pandas as pd
from sklearn.linear_model import LogisticRegression, Ridge
from causallib.estimation.standardization import Standardization
from causallib.estimation.ipw import IPW
from causallib.estimation.doubly_robust import AIPW

Vanilla IPW

In [6]:
weight_t = 1/data_ps.query("intervention==1")["propensity_score"]
weight_nt = 1/(1-data_ps.query("intervention==0")["propensity_score"])
print("Original Sample Size", data.shape[0])
print("Treated Population Sample Size", sum(weight_t))
print("Untreated Population Sample Size", sum(weight_nt))

Original Sample Size 10391
Treated Population Sample Size 10387.611324207002
Untreated Population Sample Size 10391.506162305861


In [7]:
weight = ((data_ps["intervention"]-data_ps["propensity_score"]) /
          (data_ps["propensity_score"]*(1-data_ps["propensity_score"])))

y1 = sum(data_ps.query("intervention==1")["achievement_score"]*weight_t) / len(data)
y0 = sum(data_ps.query("intervention==0")["achievement_score"]*weight_nt) / len(data)

ate = np.mean(weight * data_ps["achievement_score"])

print("Y1:", y1)
print("Y0:", y0)
print("ATE", ate)

Y1: 0.25981027799629486
Y0: -0.12903052783749974
ATE 0.38884080583379527


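Interpreting the results:
1. The ATE estimate implies that treated individuals score about 0.39 standard deviations higher on achievement_score than comparable untreated peers. (Because achievement_score is standardized, differences are read in standard deviation units.)
2. If nobody had received the treatment, the overall achievement level would be about 0.13 standard deviations lower than it is now.
3. If everyone had received the treatment (the seminar), the overall achievement level would be about 0.26 standard deviations higher.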
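
The two computations in cell In [7] are the same estimator written two ways: for a treated unit the signed weight $ (T - \hat{e}) / (\hat{e}(1 - \hat{e})) $ reduces to $ 1/\hat{e} $, and for an untreated unit it reduces to $ -1/(1 - \hat{e}) $. The following sketch checks this identity numerically on synthetic data. The propensity scores, the effect size of 0.4, and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sim = 5000
e_sim = rng.uniform(0.1, 0.9, size=n_sim)        # assumed known propensity scores
t_sim = rng.binomial(1, e_sim)                   # treatment drawn with probability e_sim
y_sim = 0.4 * t_sim + rng.normal(size=n_sim)     # outcome with a made-up effect of 0.4

# Form 1: weighted mean of each potential outcome, then their difference
y1_sim = np.sum(y_sim[t_sim == 1] / e_sim[t_sim == 1]) / n_sim
y0_sim = np.sum(y_sim[t_sim == 0] / (1 - e_sim[t_sim == 0])) / n_sim

# Form 2: one signed weight (T - e) / (e * (1 - e)) applied to every unit
ate_single = np.mean(y_sim * (t_sim - e_sim) / (e_sim * (1 - e_sim)))

print("Y1 - Y0:", y1_sim - y0_sim)
print("ATE    :", ate_single)
```

Both lines print the same number up to floating-point error, which is why `y1 - y0` in the notebook equals the reported ATE.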

## Doubly Robust Estimator & AIPW

Source: https://causallib.readthedocs.io/en/latest/causallib.estimation.doubly_robust.html?highlight=doubly

In [12]:
from sklearn.model_selection import KFold
from causallib.estimation.ipw import IPW
from causallib.estimation.doubly_robust import AIPW
from causallib.estimation.standardization import Standardization
from sklearn.linear_model import LogisticRegression, LinearRegression, Ridge

In [13]:
data = pd.read_csv("./data/learning_mindset.csv")
y = data["achievement_score"]
a = data["intervention"]
X = pd.get_dummies(
    data[["school_mindset","school_achievement","school_ethnic_minority",
          "school_poverty","school_size","ethnicity","gender","school_urbanicity"]],
    drop_first=False
)

The DR estimator needs both an outcome model and IPW weights:   
- Outcome model: predict Y (achievement_score) with Ridge regression (L2 penalty)   
- IPW: logistic regression
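
To show how the two ingredients combine, here is a minimal hand-written sketch of the textbook AIPW formula on synthetic data. It is not causallib's internal implementation, and the data-generating process, the true effect of 2, and the `_demo` variable names are assumptions for the example. Each potential-outcome mean is the outcome model's prediction plus an inverse-probability-weighted correction from the residuals.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
n_demo = 5000
X_demo = rng.normal(size=(n_demo, 2))
e_true = 1 / (1 + np.exp(-0.8 * X_demo[:, 0]))
a_demo = rng.binomial(1, e_true)
# Made-up outcome with a true ATE of 2
y_demo = 2.0 * a_demo + X_demo[:, 0] + X_demo[:, 1] + rng.normal(size=n_demo)

# Outcome models fitted separately per treatment arm, predicted for everyone
m1 = LinearRegression().fit(X_demo[a_demo == 1], y_demo[a_demo == 1]).predict(X_demo)
m0 = LinearRegression().fit(X_demo[a_demo == 0], y_demo[a_demo == 0]).predict(X_demo)

# Propensity model
e_hat = LogisticRegression(max_iter=1000).fit(X_demo, a_demo).predict_proba(X_demo)[:, 1]

# AIPW: model prediction plus weighted residual correction
mu1_demo = np.mean(m1 + a_demo * (y_demo - m1) / e_hat)
mu0_demo = np.mean(m0 + (1 - a_demo) * (y_demo - m0) / (1 - e_hat))
ate_dr_demo = mu1_demo - mu0_demo
print("ATE (manual AIPW):", ate_dr_demo)
```

The estimate stays consistent if either the outcome model or the propensity model is correctly specified, which is the sense in which the estimator is "doubly robust".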

In [14]:
outcome_model = Standardization(learner=Ridge(alpha=1.0))
weight_model  = IPW(learner=LogisticRegression(max_iter=1000),
                    clip_min=0.01, clip_max=0.99, use_stabilized=True)

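When estimating the propensity score, max_iter is set to a sufficiently large value (1000) so that the numerical optimizer can converge.   
Clipping (clip_min=0.01, clip_max=0.99) restricts $ \hat{e} $ to [0.01, 0.99], which tames extreme weights.  
use_stabilized=True additionally rescales the weights by the marginal prevalence of each treatment group.   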
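
The effect of clipping and stabilization can be seen on a handful of numbers. The helper below, `ipw_weights`, is a hypothetical function written for this illustration and does not reproduce causallib's code. It clips the propensity score, inverts the probability of the treatment actually received, and optionally multiplies by the treatment prevalence.

```python
import numpy as np


def ipw_weights(e_hat, a, clip_min=0.01, clip_max=0.99, stabilized=True):
    """Illustrative IPW weights (hypothetical helper, not causallib's implementation)."""
    e_clipped = np.clip(e_hat, clip_min, clip_max)
    # Probability of the treatment each unit actually received
    p_received = np.where(a == 1, e_clipped, 1 - e_clipped)
    w = 1 / p_received
    if stabilized:
        # Rescale by the marginal prevalence of the received treatment
        prevalence = np.where(a == 1, a.mean(), 1 - a.mean())
        w = w * prevalence
    return w


e_demo = np.array([0.001, 0.2, 0.5, 0.8, 0.999])   # made-up propensity scores
a_demo = np.array([1, 0, 1, 0, 1])                 # made-up treatment indicators

w_raw = ipw_weights(e_demo, a_demo, clip_min=0.0, clip_max=1.0, stabilized=False)
w_clipped = ipw_weights(e_demo, a_demo, stabilized=False)
w_stab = ipw_weights(e_demo, a_demo, stabilized=True)

print("raw       :", w_raw)
print("clipped   :", w_clipped)
print("stabilized:", w_stab)
```

The first unit is treated with a propensity of 0.001, so its raw weight is 1000; clipping caps it at 100, and stabilization with a treated share of 0.6 brings it to 60.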

Using the outcome model and the IPW weight model defined above, we fit the (vanilla) AIPW estimator.

In [15]:
dr = AIPW(outcome_model=outcome_model, weight_model=weight_model, overlap_weighting=False)
dr.fit(X, a, y)

In [24]:
pop_outcomes = dr.estimate_population_outcome(X, a, y)   # index {0,1}
mu1, mu0 = pop_outcomes[1], pop_outcomes[0]
ate = dr.estimate_effect(mu1, mu0, agg="population")["diff"]
print("μ1 (A=1):", mu1)
print("μ0 (A=0):", mu0)
print("ATE (DR, vanilla):", ate)

μ1 (A=1): 0.30329930792804977
μ0 (A=0): -0.1471130386820095
ATE (DR, vanilla): 0.45041234661005924


The results are interpreted the same way as for IPW.  

The AIPW estimate above works well when the propensity scores have sufficient overlap.   
When overlap is poor, the overlap_weighting=True option is worth trying.

In [27]:
dr_overlap = AIPW(outcome_model=outcome_model,
                  weight_model=weight_model,
                  overlap_weighting=True)
dr_overlap.fit(X, a, y)

In [29]:
pop_outcomes_ov = dr_overlap.estimate_population_outcome(X, a, y)
ate_ov = dr_overlap.estimate_effect(pop_outcomes_ov[1], pop_outcomes_ov[0],
                                    agg="population")["diff"]
print(" ATE(DR, overlap-weighting):", ate_ov)

 ATE(DR, overlap-weighting): 0.3467411782171299


To summarize:
- ATE (IPW): 0.38884080583379527   
- ATE (DR, vanilla): 0.45041234661005924   
- ATE (DR, overlap-weighting): 0.3467411782171299    

The ATE estimate differs depending on the method used.  
Which method to use should be decided based on how the estimated propensity scores $ \hat{e} $ are distributed.

## IPW vs AIPW
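Here we compare three ways of computing IPW against AIPW.   

Like AIPW, IPW can be computed in several ways.    
We introduce three additional IPW weighting schemes provided by the causalpy library, then give code that compares and visualizes the results of the four approaches.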

Source: https://causallib.readthedocs.io/en/latest/causallib.estimation.doubly_robust.html?highlight=doubly

In [21]:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import causalpy as cp

In [22]:
df1 = pd.DataFrame(
    np.random.multivariate_normal([0.5, 1], [[2, 1], [1, 1]], size=10000),
    columns=["x1", "x2"],
)
df1["trt"] = np.where(
    -0.5 + 0.25 * df1["x1"] + 0.75 * df1["x2"] + np.random.normal(0, 1, size=10000) > 0,
    1,
    0,
)
TREATMENT_EFFECT = 2
df1["outcome"] = (
    TREATMENT_EFFECT * df1["trt"]
    + df1["x1"]
    + df1["x2"]
    + np.random.normal(0, 1, size=10000)
)
df1.head()

| | x1 | x2 | trt | outcome |
|---|---|---|---|---|
| 0 | 2.667887 | 2.993667 | 1 | 7.208272 |
| 1 | -0.19729 | 1.234554 | 0 | 0.042709 |
| 2 | 1.325639 | 1.079826 | 0 | 0.341103 |
| 3 | 0.80901 | 0.722534 | 1 | 2.87739 |
| 4 | -0.02919 | -0.340142 | 0 | -0.823492 |
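
Because the simulated treatment effect is known (`TREATMENT_EFFECT = 2`), this data-generating process is a convenient benchmark. The sketch below regenerates comparable data with a fixed seed and contrasts a naive difference in means with a simple normalized IPW estimate. It uses plain NumPy and scikit-learn rather than the causalpy API, and the seed, the clipping bounds, and the variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_sim = 10000
x_sim = rng.multivariate_normal([0.5, 1], [[2, 1], [1, 1]], size=n_sim)
latent = -0.5 + 0.25 * x_sim[:, 0] + 0.75 * x_sim[:, 1] + rng.normal(0, 1, size=n_sim)
trt_sim = (latent > 0).astype(int)
outcome_sim = 2 * trt_sim + x_sim[:, 0] + x_sim[:, 1] + rng.normal(0, 1, size=n_sim)

# Naive comparison: confounded, because x1 and x2 drive both treatment and outcome
naive = outcome_sim[trt_sim == 1].mean() - outcome_sim[trt_sim == 0].mean()

# Propensity scores from a logistic regression, clipped to avoid extreme weights
e_hat = LogisticRegression(max_iter=1000).fit(x_sim, trt_sim).predict_proba(x_sim)[:, 1]
e_hat = np.clip(e_hat, 0.01, 0.99)
w = np.where(trt_sim == 1, 1 / e_hat, 1 / (1 - e_hat))

# Normalized IPW: weighted mean per arm (np.average divides by the sum of weights)
ipw = (np.average(outcome_sim[trt_sim == 1], weights=w[trt_sim == 1])
       - np.average(outcome_sim[trt_sim == 0], weights=w[trt_sim == 0]))

print("Naive difference in means:", naive)
print("Normalized IPW estimate  :", ipw)
```

The naive difference overstates the effect because treated units tend to have larger `x1` and `x2`, while the weighted estimate lands much closer to the true value of 2.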


- Double Machine Learning (can be used like a nonparametric version of regression)
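
As a rough sketch of why Double Machine Learning behaves like a nonparametric regression, the example below applies the partialling-out idea on synthetic data: flexible models predict the treatment and the outcome from the covariates using cross-fitting, and the effect is the slope of the outcome residuals on the treatment residuals. The data-generating process, the true effect of 2, the choice of `GradientBoostingRegressor`, and the `_dml` names are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_dml = 2000
X_dml = rng.normal(size=(n_dml, 3))
# Continuous treatment and outcome, both depending nonlinearly on X (made-up functions)
t_dml = np.sin(X_dml[:, 0]) + 0.5 * X_dml[:, 1] + rng.normal(size=n_dml)
y_dml = 2.0 * t_dml + X_dml[:, 0] ** 2 + X_dml[:, 1] + rng.normal(size=n_dml)

# Cross-fitted predictions: each unit is predicted by a model that never saw it
t_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X_dml, t_dml, cv=5)
y_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X_dml, y_dml, cv=5)

# Residual-on-residual regression (no intercept) gives the effect estimate
t_res = t_dml - t_hat
y_res = y_dml - y_hat
theta_hat = np.sum(t_res * y_res) / np.sum(t_res ** 2)
print("Estimated effect:", theta_hat)
```

No functional form for the confounders has to be specified by hand; only the final residual regression is linear in the treatment.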