# Robustness Check for PSM

The results of propensity score matching (PSM) can be unstable (King and Nielsen, 2019). To help you gauge how unstable your results are, the package provides the following checks:
* Total number of observations remaining after matching. Avoid discarding too many observations.
* Maximum number of times a single control observation is reused. If one control observation is matched too many times, something might be wrong with the propensity score estimation model.
* Sensitivity test, following Rosenbaum (2005), "Sensitivity analysis in observational studies". Start with a list of gamma values beginning at 1, such as [1, 2, 3, ...], and find the smallest gamma at which the "p-val upper bound" rises above 0.05.
* Placebo test, which verifies that the ATE disappears when the treatment is randomly assigned.


### Data

The data contain the following types of variables:
* Covariates, which we will denote with `X`
* Treatment, which we will denote with `T`
* Responses, which we will denote with `Y`

`T` must be a binary variable that contains only 0/1 values.
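If the treatment is stored as a label rather than as 0/1, it can be recoded before matching. The following is a minimal sketch with pandas; the `group` column, its labels, and the helper `check_binary_treatment` are hypothetical and are not part of `causalmatch`.

```python
import pandas as pd

def check_binary_treatment(series: pd.Series) -> None:
    """Raise ValueError unless the series contains only 0/1 values (hypothetical helper)."""
    values = set(series.dropna().unique())
    if not values <= {0, 1}:
        raise ValueError(f"treatment must be 0/1, found: {sorted(values)}")

# Toy data: the treatment is stored as a string label (assumed column name 'group').
raw = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "group": ["treated", "control", "control", "treated"],
})

# Recode the label into a 0/1 integer dummy, then validate it.
raw["treatment"] = (raw["group"] == "treated").astype(int)
check_binary_treatment(raw["treatment"])
```

Running the check before constructing the matching object surfaces encoding problems early, rather than inside the propensity score model.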


In [1]:
import causalmatch as causalmatch
from causalmatch import matching,gen_test_data
print('current version is: ',causalmatch.__version__)
import sys
print(sys.executable)
print(sys.version)
print(sys.version_info)
import pandas as pd
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression


df, rand_continuous, rand_true_param, param_te , rand_treatment, rand_error = gen_test_data(n = 10000, c_ratio=0.5)
X = ['c_1', 'c_2', 'c_3', 'd_1', 'gender']
y = ['y', 'y2']
id = 'user_id'

# The treatment variable must be a 0/1 dummy variable.
# If it is stored as a string, convert it to a 0/1 integer first.
T = 'treatment'

current version is:  0.0.5
/usr/local/bin/python3.12
3.12.2 (v3.12.2:6abddd9f6a, Feb  6 2024, 17:02:06) [Clang 13.0.0 (clang-1300.0.29.30)]
sys.version_info(major=3, minor=12, micro=2, releaselevel='final', serial=0)


In [2]:
# -------------------------------------------------------- #
# STEP 1: initialize matching object
match_obj = matching(data = df,
                     T = T,
                     y = ['y'],
                     X = X,
                     id = id)

# -------------------------------------------------------- #
# STEP 2: propensity score matching without trimming any observations
match_obj.psm(n_neighbors = 1, 
              model = LogisticRegression(), 
              trim_percentage = 0, 
              caliper = 1) 




In [3]:
# -------------------------------------------------------- #
# STEP 3: Sensitivity test, following Rosenbaum (2005), "Sensitivity analysis in observational studies".
#         Start with a list of gamma values beginning at 1, such as [1, 2, 3, ...], and
#         find the smallest gamma at which the "p-val upper bound" rises above 0.05.
#         !!!!! only works for PSM  !!!!!

gamma = [1, 1.5, 2, 2.5, 3]
match_obj.sensitivity_test(gamma = gamma)


| | Wilcoxon-statistic | gamma | stat upper bound | stat_lower_bound | z-score upper bound | z-score lower bound | p-val upper bound | p-val lower bound | y |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 7670016.0 | 1.0 | 6496126.5 | 6496126.5 | 11.173314 | 11.173314 | 0.0 | 0.0 | y |
| 1 | 7670016.0 | 1.5 | 7795351.8 | 5196901.2 | | | | | y |
| 2 | 7670016.0 | 2.0 | 8661502.0 | 4330751.0 | | | | | y |
| 3 | 7670016.0 | 2.5 | 9280180.71 | 3712072.29 | | | | | y |
| 4 | 7670016.0 | 3.0 | 9744189.75 | 3248063.25 | | | | | y |
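To make the bound columns easier to read, here is a minimal sketch of the textbook normal-approximation Rosenbaum bounds for the Wilcoxon signed-rank statistic. It is not taken from the `causalmatch` source, and the library's implementation may differ. The function name `rosenbaum_bounds` is hypothetical, and the number of matched pairs (5097) is an assumption inferred from the gamma = 1 row, where the expected statistic equals n(n+1)/4 = 6496126.5.

```python
import math

def rosenbaum_bounds(t_stat: float, n_pairs: int, gamma: float) -> dict:
    """Normal-approximation bounds for the signed-rank statistic (hypothetical helper)."""
    p_plus = gamma / (1 + gamma)    # largest chance a pair's treated unit has the higher outcome
    p_minus = 1 / (1 + gamma)       # smallest such chance
    rank_sum = n_pairs * (n_pairs + 1) / 2
    # p_plus * (1 - p_plus) equals p_minus * (1 - p_minus), so both bounds share one variance.
    var = p_plus * (1 - p_plus) * n_pairs * (n_pairs + 1) * (2 * n_pairs + 1) / 6
    sd = math.sqrt(var)
    stat_upper = p_plus * rank_sum  # upper bound on the expected statistic under the null
    stat_lower = p_minus * rank_sum # lower bound on the expected statistic under the null
    z_for_p_upper = (t_stat - stat_upper) / sd
    z_for_p_lower = (t_stat - stat_lower) / sd
    # One-sided p-value 1 - Phi(z), written with the complementary error function.
    p_upper = 0.5 * math.erfc(z_for_p_upper / math.sqrt(2))
    p_lower = 0.5 * math.erfc(z_for_p_lower / math.sqrt(2))
    return {
        "gamma": gamma,
        "stat_upper": stat_upper,
        "stat_lower": stat_lower,
        "z_for_p_upper": z_for_p_upper,
        "z_for_p_lower": z_for_p_lower,
        "p_upper": p_upper,
        "p_lower": p_lower,
    }

# Statistic from the table above; n_pairs = 5097 is inferred, not reported by the library.
results = [rosenbaum_bounds(7670016.0, 5097, g) for g in [1, 1.5, 2, 2.5, 3]]
for row in results:
    print(row["gamma"], round(row["stat_upper"], 2), round(row["stat_lower"], 2))
```

Under these assumptions the sketch reproduces the "stat upper bound" and "stat_lower_bound" columns and the gamma = 1 z-score shown in the table. In this formulation the upper bound on the p-value grows with gamma, so the quantity of interest is the smallest gamma at which it exceeds 0.05.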


In [4]:
# -------------------------------------------------------- #
# STEP 4: placebo treatment, verifies that the effect disappears when the treatment is replaced with a placebo.
#         Returns a NumPy array of ATE estimates, one per bootstrap sample (shape b x 1, default b = 100).
ate_array = match_obj.placebo_treatment()
print('empirical critical value: ',
      match_obj.critical_value_lb, 
      match_obj.critical_value_ub)


# The test passes when the estimated ATE lies outside the placebo critical values.
ate_hat = match_obj.ate().iloc[0]['ate']
b_pass_placebo_test = 'Fail' if match_obj.critical_value_lb <= ate_hat <= match_obj.critical_value_ub else 'Pass'
print('if pass placebo test: ', b_pass_placebo_test)


empirical critical value:  -0.31556721533936816 0.2844152580884073
if pass placebo test:  Pass
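The idea behind the placebo test can be sketched independently of the library. The example below is an illustration under simplifying assumptions: it shuffles the treatment labels and uses a plain difference in means as the effect estimate, whereas `placebo_treatment()` may re-estimate the effect differently. The function `placebo_band` and the simulated data are hypothetical.

```python
import numpy as np

def placebo_band(y, t, n_boot=100, alpha=0.05, seed=0):
    """Empirical critical values of the effect under randomly reassigned treatment."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    t = np.asarray(t)
    placebo_ates = np.empty(n_boot)
    for b in range(n_boot):
        fake_t = rng.permutation(t)  # placebo: same group sizes, labels shuffled
        placebo_ates[b] = y[fake_t == 1].mean() - y[fake_t == 0].mean()
    lb, ub = np.quantile(placebo_ates, [alpha / 2, 1 - alpha / 2])
    return placebo_ates, lb, ub

# Simulated data with a true effect of 0.5 (assumption for illustration only).
rng = np.random.default_rng(42)
t = rng.integers(0, 2, size=2000)
y = 0.5 * t + rng.normal(size=2000)

ate_hat = y[t == 1].mean() - y[t == 0].mean()
placebo_ates, lb, ub = placebo_band(y, t)

# Pass when the real estimate lies outside the band of placebo estimates.
passed = not (lb <= ate_hat <= ub)
print('empirical critical value: ', lb, ub)
print('if pass placebo test: ', 'Pass' if passed else 'Fail')
```

A real effect should fall outside the range of estimates produced by placebo assignments. If it falls inside, the estimate cannot be distinguished from noise.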


In [5]:
# -------------------------------------------------------- #
# STEP 5: Robustness check
#  -- "1. Total % of obs remained: " shows whether too many observations were discarded.
#  --    "Control % obs remained: " is greater than 100% in this example because we set
#         "trim_percentage = 0" and "caliper = 1", which means we keep all observations.
#  --    "The most repeated times of a control obs:" is the largest number of times a single
#        control observation was matched to different treated observations. If an observation is
#        used too many times, something might be off with your propensity score model or your dataset.

match_obj.robust_check(gamma)


             Robustness Check Output Table for Dep. Variable y              
                                                            coef    P>|t|   
  Average Treatment Effect:                                 0.52     0.00%  
                                                                            
  1. Total % of obs remained:                            101.94%      Pass  
   -- Treated % obs remained:                            100.00%      Pass  
   -- Control % obs remained:                            103.96%      Pass  
      -- The most repeated times of a control obs:             9         -  
  2. Sensitivity test result, Gamma statistics               1.0      Fail  
  3. Placebo Test Result conf. interval              -0.32,0.28,      Pass  
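The "most repeated times of a control obs" figure can be illustrated on a small table of matched pairs. This is a sketch only: the `pairs` frame and its column names are hypothetical, and it does not show how `robust_check` computes the value internally.

```python
import pandas as pd

# Hypothetical matched pairs from 1-nearest-neighbor matching with replacement.
pairs = pd.DataFrame({
    "treated_id": [1, 2, 3, 4, 5],
    "control_id": [101, 102, 101, 103, 101],
})

# Count how often each control observation was used; value_counts sorts in descending order.
reuse_counts = pairs["control_id"].value_counts()
most_reused_id = reuse_counts.index[0]
max_reuse = int(reuse_counts.iloc[0])

print('most reused control:', most_reused_id, 'times used:', max_reuse)
```

A large maximum relative to the number of treated observations suggests that few controls have propensity scores close to those of the treated group.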
